[jira] Commented: (LUCENENET-366) Spellchecker issues

2010-05-30 Thread Digy (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENENET-366?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12873430#action_12873430
 ] 

Digy commented on LUCENENET-366:


Hi Ben,
I committed SpellChecker 2.9.2.
Can you test it?

DIGY

 Spellchecker issues
 ---

 Key: LUCENENET-366
 URL: https://issues.apache.org/jira/browse/LUCENENET-366
 Project: Lucene.Net
  Issue Type: Bug
Reporter: Ben West
Priority: Minor
 Attachments: LUCENENET-366-spellcheck29.patch, LUCENENET-366.patch, 
 LuceneNet-SpellcheckFixes.patch, spellcheck-2.9-upgrade.patch, 
 spellcheck-29.patch


 There are several issues with the spellchecker:
 - It doesn't do duplicate checking across updates (so the same word is often 
 indexed many, many times)
 - The n-gram fields are stored as well as indexed, which increases the size 
 of the index by several orders of magnitude and provides no benefit
 - Some deprecated functions are used, which slows it down
 - Some methods aren't commented fully
 I will attach a patch that fixes these.
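The duplicate-checking fix in the first bullet can be sketched in plain Java. This is an illustrative model only, using a hypothetical `DedupIndexer` rather than the actual Lucene.Net SpellChecker API: a set of already-indexed words is consulted on every update, so re-feeding the same word indexes it only once.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch of duplicate checking across updates; names are
// illustrative, not the real SpellChecker API.
public class DedupIndexer {
    private final Set<String> indexedWords = new HashSet<>();

    // Returns the words that were actually added (duplicates are skipped).
    public List<String> indexWords(List<String> words) {
        List<String> added = new ArrayList<>();
        for (String w : words) {
            if (indexedWords.add(w)) { // add() returns false if already present
                added.add(w);
            }
        }
        return added;
    }
}
```

In the real spellchecker the membership test would be a term lookup against the spell index rather than an in-memory set.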

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (LUCENE-2295) Create a MaxFieldLengthAnalyzer to wrap any other Analyzer and provide the same functionality as MaxFieldLength provided on IndexWriter

2010-05-30 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12873397#action_12873397
 ] 

Michael McCandless commented on LUCENE-2295:


bq. Further investigations showed that there is some difference between using 
this filter/analyzer and the current setting in IndexWriter. IndexWriter uses 
the given MaxFieldLength as the maximum value for all instances of the same 
field name. So if you add 100 fields foo (each with 1,000 terms) and have the 
default of 10,000 tokens, DocInverter will index 10 of these field instances 
(10,000 terms in total) and the rest will be suppressed.

In LUCENE-2450 I'm experimenting with having multi-valued fields be handled 
entirely by an analyzer stage, i.e., the logical concatenation of tokens (with 
gaps) would be hidden from IW, and IW would think it's dealing with a single 
token stream.  In this model, if you then appended the new 
LimitTokenCountFilter to the end, I think it'd result in the same behavior as 
maxFieldLength today.

But, even before we eventually switch to that model... can't we still deprecate 
(on 3x) IW's maxFieldLength (remove from trunk) now?  I realize the limiting is 
different (applying the limit pre vs post concatenation), but I think the 
javadocs can explain this difference?  I think it's unlikely apps are relying 
on this specific interaction of truncation and multi-valued fields...
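The limiting behavior discussed above (capping the token count after the logical concatenation of multi-valued fields) can be modeled in a few lines of plain Java. The class below is a sketch over a generic token iterator, not the actual Lucene TokenFilter API:

```java
import java.util.Iterator;
import java.util.NoSuchElementException;

// Illustrative model of the LimitTokenCountFilter idea: a wrapper that stops
// the stream after maxTokens have been produced. Applied after concatenation,
// this caps the total token count across all values of a field.
public class LimitingStream implements Iterator<String> {
    private final Iterator<String> in;
    private final int maxTokens;
    private int produced = 0;

    public LimitingStream(Iterator<String> in, int maxTokens) {
        this.in = in;
        this.maxTokens = maxTokens;
    }

    @Override public boolean hasNext() {
        return produced < maxTokens && in.hasNext();
    }

    @Override public String next() {
        if (!hasNext()) throw new NoSuchElementException();
        produced++;
        return in.next();
    }
}
```

Note the difference the comment raises: IndexWriter's maxFieldLength applies per concatenated field total as well, but only after summing whole field instances, whereas a post-concatenation filter like this cuts off mid-instance.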

 Create a MaxFieldLengthAnalyzer to wrap any other Analyzer and provide the 
 same functionality as MaxFieldLength provided on IndexWriter
 ---

 Key: LUCENE-2295
 URL: https://issues.apache.org/jira/browse/LUCENE-2295
 Project: Lucene - Java
  Issue Type: Improvement
  Components: contrib/analyzers
Reporter: Shai Erera
Assignee: Uwe Schindler
 Fix For: 3.1, 4.0

 Attachments: LUCENE-2295-trunk.patch, LUCENE-2295.patch


 A spinoff from LUCENE-2294. Instead of asking the user to specify the 
 requested MFL limit on IndexWriter, we can get rid of this setting entirely 
 by providing an Analyzer which will wrap any other Analyzer and its 
 TokenStream with a TokenFilter that keeps track of the number of tokens 
 produced and stops when the limit has been reached.
 This will remove any count tracking in IW's indexing, which is done even if 
 UNLIMITED was specified for MFL.
 Let's try to do it for 3.1.



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] Commented: (SOLR-1934) branch-1.4 should use Lucene 2.9.2

2010-05-30 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1934?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12873400#action_12873400
 ] 

Michael McCandless commented on SOLR-1934:
--

bq. I could do this again (together with 3.0.2 like the last time)?

That sounds great!  Thanks Uwe.

bq. We would need some merging to get 3.0.2 and 2.9.3 same bugfix level?

Yes.  In fact there are some marked-as-4.0 bugs that I'd like to get fixed and 
backported, too, e.g. LUCENE-2311.  I'll go mark that one...

 branch-1.4 should use Lucene 2.9.2
 --

 Key: SOLR-1934
 URL: https://issues.apache.org/jira/browse/SOLR-1934
 Project: Solr
  Issue Type: Bug
Affects Versions: 1.4
Reporter: Hoss Man
Assignee: Hoss Man
 Fix For: 1.4.1


 The lucene jars on branch-1.4 should be upgraded to 2.9.2 in anticipation of 
 a Solr 1.4.1 release.




[jira] Resolved: (LUCENE-2283) Possible Memory Leak in StoredFieldsWriter

2010-05-30 Thread Michael McCandless (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-2283?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael McCandless resolved LUCENE-2283.


Resolution: Fixed

 Possible Memory Leak in StoredFieldsWriter
 --

 Key: LUCENE-2283
 URL: https://issues.apache.org/jira/browse/LUCENE-2283
 Project: Lucene - Java
  Issue Type: Bug
  Components: Index
Affects Versions: 2.4.1
Reporter: Tim Smith
Assignee: Michael McCandless
 Fix For: 2.9.3, 3.0.2, 3.1, 4.0

 Attachments: LUCENE-2283.patch, LUCENE-2283.patch, LUCENE-2283.patch


 StoredFieldsWriter creates a pool of PerDoc instances.
 This pool will grow but is never reclaimed by any mechanism.
 Furthermore, each PerDoc instance contains a RAMFile. This RAMFile is also 
 never truncated and will only ever grow (as far as I can tell).
 When feeding documents with a large number of stored fields (or one large 
 dominating stored field) this can result in memory being consumed in the 
 RAMFile but never reclaimed. Eventually, each pooled PerDoc could grow very 
 large, even if large documents are rare.
 Seems like there should be some attempt to reclaim memory from the PerDoc[] 
 instance pool (or otherwise limit the size of RAMFiles that are cached), etc.
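The reclamation the report asks for can be sketched as a pool that simply refuses to cache oversized buffers, so one huge document cannot pin a large RAMFile forever. The class and threshold below are illustrative, not Lucene internals:

```java
import java.util.ArrayDeque;

// Hypothetical buffer pool that drops oversized buffers on release instead of
// caching them, bounding the memory retained by the pool.
public class BufferPool {
    private static final int MAX_POOLED_BYTES = 1 << 16; // illustrative 64 KB cap
    private final ArrayDeque<byte[]> pool = new ArrayDeque<>();

    public byte[] acquire(int size) {
        byte[] b = pool.poll();
        // Reuse a pooled buffer if it is large enough, else allocate fresh.
        return (b != null && b.length >= size) ? b : new byte[size];
    }

    public void release(byte[] buffer) {
        // Drop oversized buffers so memory from rare huge docs is reclaimed.
        if (buffer.length <= MAX_POOLED_BYTES) {
            pool.push(buffer);
        }
    }

    public int pooledCount() { return pool.size(); }
}
```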




[jira] Commented: (SOLR-1934) branch-1.4 should use Lucene 2.9.2

2010-05-30 Thread Uwe Schindler (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1934?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12873404#action_12873404
 ] 

Uwe Schindler commented on SOLR-1934:
-

I checked commit logs and CHANGES.txt of Lucene 2.9 and 3.0 branches: I need to 
merge LUCENE-2319, LUCENE-2476 and LUCENE-2281 from 3.0 branch to 2.9, then we 
have same bugfix level. Working on it...

Please make sure that you synchronize CHANGES.txt entries to be in same order 
and formatting in both branches.

 branch-1.4 should use Lucene 2.9.2
 --

 Key: SOLR-1934
 URL: https://issues.apache.org/jira/browse/SOLR-1934
 Project: Solr
  Issue Type: Bug
Affects Versions: 1.4
Reporter: Hoss Man
Assignee: Hoss Man
 Fix For: 1.4.1


 The lucene jars on branch-1.4 should be upgraded to 2.9.2 in anticipation of 
 a Solr 1.4.1 release.




[jira] Updated: (LUCENE-2476) Constructor of IndexWriter lets runtime exceptions pop up, while keeping the writeLock obtained

2010-05-30 Thread Uwe Schindler (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-2476?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uwe Schindler updated LUCENE-2476:
--

Fix Version/s: 2.9.3

 Constructor of IndexWriter lets runtime exceptions pop up, while keeping the 
 writeLock obtained
 

 Key: LUCENE-2476
 URL: https://issues.apache.org/jira/browse/LUCENE-2476
 Project: Lucene - Java
  Issue Type: Bug
  Components: Store
Affects Versions: 3.0.1
Reporter: Cservenak, Tamas
Assignee: Michael McCandless
Priority: Blocker
 Fix For: 2.9.3, 3.0.2, 3.1, 4.0

 Attachments: LUCENE-2476.patch


 The constructor of IndexWriter lets runtime exceptions pop up, while keeping 
 the writeLock obtained.
 The init method in IndexWriter catches IOException only (I got a 
 NegativeArraySize error by reading up a _corrupt_ index), and now there is no 
 way to recover, since the writeLock will be kept obtained. Moreover, I don't 
 have an IndexWriter instance either, to grab the lock somehow, since the 
 init() method is called from the IndexWriter constructor.
 Either broaden the catch to all exceptions, or at least provide some 
 circumvention to clean up. In my case, I'd like to fall back, just delete 
 the corrupted index from disk and recreate it, but that is impossible, since 
 NativeFSLockFactory's LOCK_HELD entry about the obtained WriteLock is _never_ 
 cleaned out and there is no (at least apparent) way to clean it out forcibly. 
 I can't create a new IndexWriter, since it will always fail with 
 LockObtainFailedException.
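The requested fix can be sketched with a plain `ReentrantLock` standing in for the directory write lock (the class and method names below are hypothetical, not the real IndexWriter): broaden the catch to all throwables and release the lock before rethrowing, so a later attempt can still obtain it.

```java
import java.util.concurrent.locks.ReentrantLock;

// Illustrative sketch: never leave the write lock held when construction fails.
public class SafeWriter {
    static final ReentrantLock WRITE_LOCK = new ReentrantLock();

    public SafeWriter(boolean corrupt) {
        WRITE_LOCK.lock();
        try {
            init(corrupt);
        } catch (Throwable t) {   // broadened from catching IOException only
            WRITE_LOCK.unlock();  // release before rethrowing
            throw t;
        }
        // On success the writer keeps the lock, as IndexWriter does.
    }

    private void init(boolean corrupt) {
        // Stands in for reading the index; a corrupt index throws an
        // unchecked exception that the old code would not catch.
        if (corrupt) throw new NegativeArraySizeException("corrupt index");
    }
}
```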




[jira] Updated: (LUCENE-2319) IndexReader # doCommit - typo nit about v3.0 in trunk

2010-05-30 Thread Uwe Schindler (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-2319?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uwe Schindler updated LUCENE-2319:
--

Fix Version/s: 2.9.3
   (was: 2.9.2)

Sorry, wrong version.

 IndexReader # doCommit - typo nit about v3.0 in trunk
 -

 Key: LUCENE-2319
 URL: https://issues.apache.org/jira/browse/LUCENE-2319
 Project: Lucene - Java
  Issue Type: Improvement
Reporter: Kay Kay
Assignee: Michael McCandless
 Fix For: 2.9.3, 3.0.2, 4.0

 Attachments: LUCENE-2319.patch


 Trunk is already at 3.0.1+, but the documentation says "In 3.0, this will 
 become ...". Since we are already past 3.0, it might as well be removed.




[jira] Updated: (LUCENE-2319) IndexReader # doCommit - typo nit about v3.0 in trunk

2010-05-30 Thread Uwe Schindler (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-2319?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uwe Schindler updated LUCENE-2319:
--

Fix Version/s: (was: 2.9.3)

 IndexReader # doCommit - typo nit about v3.0 in trunk
 -

 Key: LUCENE-2319
 URL: https://issues.apache.org/jira/browse/LUCENE-2319
 Project: Lucene - Java
  Issue Type: Improvement
Reporter: Kay Kay
Assignee: Michael McCandless
 Fix For: 3.0.2, 4.0

 Attachments: LUCENE-2319.patch


 Trunk is already at 3.0.1+, but the documentation says "In 3.0, this will 
 become ...". Since we are already past 3.0, it might as well be removed.




[jira] Commented: (LUCENE-2319) IndexReader # doCommit - typo nit about v3.0 in trunk

2010-05-30 Thread Uwe Schindler (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12873406#action_12873406
 ] 

Uwe Schindler commented on LUCENE-2319:
---

Hmm, removing the version again: this was a doc update for 3.0 only, so it 
cannot be backported to 2.9.

 IndexReader # doCommit - typo nit about v3.0 in trunk
 -

 Key: LUCENE-2319
 URL: https://issues.apache.org/jira/browse/LUCENE-2319
 Project: Lucene - Java
  Issue Type: Improvement
Reporter: Kay Kay
Assignee: Michael McCandless
 Fix For: 3.0.2, 4.0

 Attachments: LUCENE-2319.patch


 Trunk is already at 3.0.1+, but the documentation says "In 3.0, this will 
 become ...". Since we are already past 3.0, it might as well be removed.




[jira] Updated: (LUCENE-2281) Add doBeforeFlush to IndexWriter

2010-05-30 Thread Uwe Schindler (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-2281?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uwe Schindler updated LUCENE-2281:
--

Fix Version/s: 2.9.3

 Add doBeforeFlush to IndexWriter
 

 Key: LUCENE-2281
 URL: https://issues.apache.org/jira/browse/LUCENE-2281
 Project: Lucene - Java
  Issue Type: Improvement
  Components: Index
Reporter: Shai Erera
Assignee: Michael McCandless
 Fix For: 2.9.3, 3.0.2, 4.0

 Attachments: LUCENE-2281.patch


 IndexWriter has doAfterFlush which can be overridden by extensions in order 
 to perform operations after flush has been called. Since flush is final, one 
 can only override doAfterFlush. This issue will handle two things:
 # Make doAfterFlush protected, instead of package-private, to allow for 
 easier extensibility of IW.
 # Add doBeforeFlush which will be called by flush before it starts, to allow 
 extensions to perform any operations before flush begins.
 Will post a patch shortly.
 BTW, any chance to get it out in 3.0.1?
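The proposal follows the classic template-method pattern. A minimal sketch (an illustrative class, not IndexWriter itself) with the two hooks named in the issue:

```java
// Template-method sketch: flush() is final and brackets the work with
// protected hooks that subclasses may override.
public class FlushingWriter {
    public final StringBuilder log = new StringBuilder();

    public final void flush() {
        doBeforeFlush();       // proposed new hook, called before flushing
        log.append("flush;");  // the actual flush work would happen here
        doAfterFlush();        // existing hook, made protected
    }

    protected void doBeforeFlush() {}
    protected void doAfterFlush() {}
}
```

A subclass can then override either hook without being able to alter the flush sequence itself, which is the point of keeping flush final.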




[jira] Issue Comment Edited: (SOLR-1934) branch-1.4 should use Lucene 2.9.2

2010-05-30 Thread Uwe Schindler (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1934?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12873404#action_12873404
 ] 

Uwe Schindler edited comment on SOLR-1934 at 5/30/10 7:22 AM:
--

I checked commit logs and CHANGES.txt of Lucene 2.9 and 3.0 branches: I need to 
merge LUCENE-2476 and LUCENE-2281 from 3.0 branch to 2.9, then we have same 
bugfix level. Working on it...

Please make sure that you synchronize CHANGES.txt entries to be in same order 
and formatting in both branches.

  was (Author: thetaphi):
I checked commit logs and CHANGES.txt of Lucene 2.9 and 3.0 branches: I 
need to merge LUCENE-2319, LUCENE-2476 and LUCENE-2281 from 3.0 branch to 2.9, 
then we have same bugfix level. Working on it...

Please make sure that you synchronize CHANGES.txt entries to be in same order 
and formatting in both branches.
  
 branch-1.4 should use Lucene 2.9.2
 --

 Key: SOLR-1934
 URL: https://issues.apache.org/jira/browse/SOLR-1934
 Project: Solr
  Issue Type: Bug
Affects Versions: 1.4
Reporter: Hoss Man
Assignee: Hoss Man
 Fix For: 1.4.1


 The lucene jars on branch-1.4 should be upgraded to 2.9.2 in anticipation of 
 a Solr 1.4.1 release.




[jira] Issue Comment Edited: (SOLR-1934) branch-1.4 should use Lucene 2.9.2

2010-05-30 Thread Uwe Schindler (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1934?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12873404#action_12873404
 ] 

Uwe Schindler edited comment on SOLR-1934 at 5/30/10 7:24 AM:
--

I checked commit logs and CHANGES.txt of Lucene 2.9 and 3.0 branches: I need to 
merge LUCENE-2473, LUCENE-2476 and LUCENE-2281 from 3.0 branch to 2.9, then we 
have same bugfix level. Working on it...

Please make sure that you synchronize CHANGES.txt entries to be in same order 
and formatting in both branches.

  was (Author: thetaphi):
I checked commit logs and CHANGES.txt of Lucene 2.9 and 3.0 branches: I 
need to merge LUCENE-2476 and LUCENE-2281 from 3.0 branch to 2.9, then we have 
same bugfix level. Working on it...

Please make sure that you synchronize CHANGES.txt entries to be in same order 
and formatting in both branches.
  
 branch-1.4 should use Lucene 2.9.2
 --

 Key: SOLR-1934
 URL: https://issues.apache.org/jira/browse/SOLR-1934
 Project: Solr
  Issue Type: Bug
Affects Versions: 1.4
Reporter: Hoss Man
Assignee: Hoss Man
 Fix For: 1.4.1


 The lucene jars on branch-1.4 should be upgraded to 2.9.2 in anticipation of 
 a Solr 1.4.1 release.




[jira] Issue Comment Edited: (SOLR-1934) branch-1.4 should use Lucene 2.9.2

2010-05-30 Thread Uwe Schindler (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1934?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12873404#action_12873404
 ] 

Uwe Schindler edited comment on SOLR-1934 at 5/30/10 7:27 AM:
--

I checked commit logs and CHANGES.txt of Lucene 2.9 and 3.0 branches: I need to 
merge  LUCENE-2476 and LUCENE-2281 from 3.0 branch to 2.9, then we have same 
bugfix level. Working on it...

Please make sure that you synchronize CHANGES.txt entries to be in same order 
and formatting in both branches.

  was (Author: thetaphi):
I checked commit logs and CHANGES.txt of Lucene 2.9 and 3.0 branches: I 
need to merge LUCENE-2473, LUCENE-2476 and LUCENE-2281 from 3.0 branch to 2.9, 
then we have same bugfix level. Working on it...

Please make sure that you synchronize CHANGES.txt entries to be in same order 
and formatting in both branches.
  
 branch-1.4 should use Lucene 2.9.2
 --

 Key: SOLR-1934
 URL: https://issues.apache.org/jira/browse/SOLR-1934
 Project: Solr
  Issue Type: Bug
Affects Versions: 1.4
Reporter: Hoss Man
Assignee: Hoss Man
 Fix For: 1.4.1


 The lucene jars on branch-1.4 should be upgraded to 2.9.2 in anticipation of 
 a Solr 1.4.1 release.




[jira] Commented: (LUCENE-2480) Remove support for pre-3.0 indexes

2010-05-30 Thread Shai Erera (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12873408#action_12873408
 ] 

Shai Erera commented on LUCENE-2480:


Solr tests pass too ... dunno what happened, but they started to pass. And the 
dataimport tests do not seem to be affected by this issue. So I'd like 
to commit this soon.

 Remove support for pre-3.0 indexes
 --

 Key: LUCENE-2480
 URL: https://issues.apache.org/jira/browse/LUCENE-2480
 Project: Lucene - Java
  Issue Type: Improvement
  Components: Index
Reporter: Shai Erera
Priority: Minor
 Fix For: 4.0

 Attachments: LUCENE-2480.patch, LUCENE-2480.patch, LUCENE-2480.patch, 
 LUCENE-2480.patch


 We should remove support for 2.x (and 1.9) indexes in 4.0. It seems that 
 nothing can be done in 3x because there is no special code which handles 1.9, 
 so we'll leave it there. This issue should cover:
 # Remove the .zip indexes
 # Remove the unnecessary code from SegmentInfo and SegmentInfos. Mike 
 suggests we compare the version headers at the top of SegmentInfos, in 2.9.x 
 vs 3.0.x, to see which ones can go.
 # remove FORMAT_PRE from FieldInfos
 # Remove old format from TermVectorsReader
 If you know of other places where code can be removed, then please post a 
 comment here.
 I don't know when I'll have time to handle it, definitely not in the next few 
 days. So if someone wants to take a stab at it, be my guest.




[jira] Updated: (LUCENE-2360) speedup recycling of per-doc RAM

2010-05-30 Thread Michael McCandless (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-2360?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael McCandless updated LUCENE-2360:
---

Fix Version/s: 2.9.3
   (was: 4.0)

 speedup recycling of per-doc RAM
 

 Key: LUCENE-2360
 URL: https://issues.apache.org/jira/browse/LUCENE-2360
 Project: Lucene - Java
  Issue Type: Improvement
  Components: Index
Affects Versions: 3.1
Reporter: Michael McCandless
Assignee: Michael McCandless
 Fix For: 2.9.3

 Attachments: LUCENE-2360.patch


 Robert found one source of slowness when indexing tiny docs, where we use 
 List.toArray to recycle the byte[] buffers used by per-doc doc store state 
 (stored field, term vectors).  This was added in LUCENE-2283, so not yet 
 released.




[jira] Commented: (LUCENE-2476) Constructor of IndexWriter lets runtime exceptions pop up, while keeping the writeLock obtained

2010-05-30 Thread Uwe Schindler (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12873409#action_12873409
 ] 

Uwe Schindler commented on LUCENE-2476:
---

Merged to 2.9 revision: 949507


 Constructor of IndexWriter lets runtime exceptions pop up, while keeping the 
 writeLock obtained
 

 Key: LUCENE-2476
 URL: https://issues.apache.org/jira/browse/LUCENE-2476
 Project: Lucene - Java
  Issue Type: Bug
  Components: Store
Affects Versions: 3.0.1
Reporter: Cservenak, Tamas
Assignee: Michael McCandless
Priority: Blocker
 Fix For: 2.9.3, 3.0.2, 3.1, 4.0

 Attachments: LUCENE-2476.patch


 The constructor of IndexWriter lets runtime exceptions pop up, while keeping 
 the writeLock obtained.
 The init method in IndexWriter catches IOException only (I got a 
 NegativeArraySize error by reading up a _corrupt_ index), and now there is no 
 way to recover, since the writeLock will be kept obtained. Moreover, I don't 
 have an IndexWriter instance either, to grab the lock somehow, since the 
 init() method is called from the IndexWriter constructor.
 Either broaden the catch to all exceptions, or at least provide some 
 circumvention to clean up. In my case, I'd like to fall back, just delete 
 the corrupted index from disk and recreate it, but that is impossible, since 
 NativeFSLockFactory's LOCK_HELD entry about the obtained WriteLock is _never_ 
 cleaned out and there is no (at least apparent) way to clean it out forcibly. 
 I can't create a new IndexWriter, since it will always fail with 
 LockObtainFailedException.




[jira] Updated: (LUCENE-2300) IndexWriter should never pool readers for external segments

2010-05-30 Thread Michael McCandless (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-2300?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael McCandless updated LUCENE-2300:
---

Fix Version/s: 2.9.3

 IndexWriter should never pool readers for external segments
 ---

 Key: LUCENE-2300
 URL: https://issues.apache.org/jira/browse/LUCENE-2300
 Project: Lucene - Java
  Issue Type: Bug
Reporter: Michael McCandless
Assignee: Michael McCandless
 Fix For: 2.9.3, 4.0

 Attachments: LUCENE-2300.patch


 E.g. when addIndexes is called, it enrolls external segment infos, which are 
 then merged.  But merging will simply ask the pool for the readers, and if 
 writer is pooling (NRT reader has been pooled) it incorrectly pools these 
 readers.
 It shouldn't break anything but it's a waste because these readers are only 
 used for merging, once, and they are not opened by NRT reader.




[jira] Resolved: (LUCENE-2480) Remove support for pre-3.0 indexes

2010-05-30 Thread Shai Erera (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-2480?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shai Erera resolved LUCENE-2480.


  Assignee: Shai Erera
Resolution: Fixed

Thanks Mike and Earwin for your help !

Committed revision 949509.

 Remove support for pre-3.0 indexes
 --

 Key: LUCENE-2480
 URL: https://issues.apache.org/jira/browse/LUCENE-2480
 Project: Lucene - Java
  Issue Type: Improvement
  Components: Index
Reporter: Shai Erera
Assignee: Shai Erera
Priority: Minor
 Fix For: 4.0

 Attachments: LUCENE-2480.patch, LUCENE-2480.patch, LUCENE-2480.patch, 
 LUCENE-2480.patch


 We should remove support for 2.x (and 1.9) indexes in 4.0. It seems that 
 nothing can be done in 3x because there is no special code which handles 1.9, 
 so we'll leave it there. This issue should cover:
 # Remove the .zip indexes
 # Remove the unnecessary code from SegmentInfo and SegmentInfos. Mike 
 suggests we compare the version headers at the top of SegmentInfos, in 2.9.x 
 vs 3.0.x, to see which ones can go.
 # remove FORMAT_PRE from FieldInfos
 # Remove old format from TermVectorsReader
 If you know of other places where code can be removed, then please post a 
 comment here.
 I don't know when I'll have time to handle it, definitely not in the next few 
 days. So if someone wants to take a stab at it, be my guest.




[jira] Updated: (LUCENE-2161) Some concurrency improvements for NRT

2010-05-30 Thread Michael McCandless (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-2161?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael McCandless updated LUCENE-2161:
---

Fix Version/s: 2.9.3

 Some concurrency improvements for NRT
 -

 Key: LUCENE-2161
 URL: https://issues.apache.org/jira/browse/LUCENE-2161
 Project: Lucene - Java
  Issue Type: Improvement
  Components: Index
Reporter: Michael McCandless
Assignee: Michael McCandless
Priority: Minor
 Fix For: 2.9.3, 4.0

 Attachments: LUCENE-2161.patch


 Some concurrency improvements for NRT
 I found and fixed some silly thread bottlenecks that affect NRT:
   * Multi/DirectoryReader.numDocs is synchronized, I think so only 1
 thread computes numDocs if it's -1.  I removed this sync, and made
 numDocs volatile, instead.  Yes, multiple threads may compute the
 numDocs for the first time, but I think that's harmless?
   * Fixed BitVector's ctor to set count to 0 on creating a new BV, and
 clone to copy the count over; this saves CPU computing the count
 unnecessarily.
   * Also strengthened assertions done in SR, testing the delete docs
 count.
 I also found an annoying thread bottleneck that happens, due to CMS.
 Whenever CMS hits the max running merges (default changed from 3 to 1
 recently), and the merge policy now wants to launch another merge, it
 forces the incoming thread to wait until one of the BG threads
 finishes.
 This is a basic crude throttling mechanism -- you force the mutators
 (whoever is causing new segments to appear) to stop, so that merging
 can catch up.
 Unfortunately, when stressing NRT, that thread is the one that's
 opening a new NRT reader.
 So, the first serious problem happens when you call .reopen() on your
 NRT reader -- this call simply forwards to IW.getReader if the reader
 was an NRT reader.  But, because DirectoryReader.doReopen is
 synchronized, this had the horrible effect of holding the monitor lock
 on your main IR.  In my test, this blocked all searches (since each
 search uses incRef/decRef, still sync'd until LUCENE-2156, at least).
 I fixed this by making doReopen only sync'd on this if it's not simply
 forwarding to getWriter.  So that's a good step forward.
 This prevents searches from being blocked while trying to reopen to a
 new NRT.
 However... it doesn't fix the problem that when an immense merge is
 off and running, opening an NRT reader could hit a tremendous delay
 because CMS blocks it.  The BalancedSegmentMergePolicy should help
 here... by avoiding such immense merges.
 But, I think we should also pursue an improvement to CMS.  EG, if it
 has 2 merges running, where one is huge and one is tiny, it ought to
 increase thread priority of the tiny one.  I think with such a change
 we could increase the max thread count again, to prevent this
 starvation.  I'll open a separate issue.
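The numDocs change in the first bullet above relies on a benign race: several threads may compute the value the first time, but they all compute the same number, so the synchronization can be dropped in favor of a volatile field. A minimal sketch (an illustrative class, not DirectoryReader itself):

```java
// Illustrative model of replacing a synchronized lazy computation with a
// volatile field and a benign race: last writer wins, and all writers agree.
public class CountingReader {
    private final int[] segmentDocCounts;
    private volatile int numDocs = -1; // -1 means "not yet computed"

    public CountingReader(int[] segmentDocCounts) {
        this.segmentDocCounts = segmentDocCounts;
    }

    public int numDocs() {
        int n = numDocs;
        if (n == -1) {            // unsynchronized check: duplicate work is
            n = 0;                // possible but harmless
            for (int c : segmentDocCounts) n += c;
            numDocs = n;          // publish for later callers
        }
        return n;
    }
}
```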




[jira] Updated: (LUCENE-2142) FieldCache.getStringIndex should not throw exception if term count exceeds doc count

2010-05-30 Thread Michael McCandless (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-2142?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael McCandless updated LUCENE-2142:
---

Fix Version/s: 2.9.3

 FieldCache.getStringIndex should not throw exception if term count exceeds 
 doc count
 

 Key: LUCENE-2142
 URL: https://issues.apache.org/jira/browse/LUCENE-2142
 Project: Lucene - Java
  Issue Type: Bug
  Components: Search
Reporter: Michael McCandless
Assignee: Michael McCandless
Priority: Minor
 Fix For: 2.9.3, 4.0


 Spinoff of LUCENE-2133/LUCENE-831.
 Currently FieldCache cannot handle more than one value per field.
 We may someday want to fix that... but until that day:
 FieldCache.getStringIndex currently does a simplistic check to try to
 catch when you've accidentally allowed more than one term per field,
 by testing if the number of unique terms exceeds the number of
 documents.
 The problem is, this is not a perfect check, in that it allows false
 negatives (you could have more than one term per field for some docs
 and the check won't catch you).
 Further, the exception thrown is the unchecked RuntimeException.
 So this means... you could happily think all is good, until some day,
 well into production, once you've updated enough docs, suddenly the
 check will catch you and throw an unhandled exception, stopping all
 searches [that need to sort by this string field] in their tracks.
 It's not gracefully degrading.
 I think we should simply remove the test, ie, if you have more terms
 than docs then the terms simply overwrite one another.
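The proposed overwrite semantics can be sketched as follows: when building the per-document ords, a later term for the same document simply replaces the earlier one instead of triggering the exception. Plain arrays stand in for the real FieldCache structures; all names are illustrative.

```java
// Hypothetical sketch of building a StringIndex-style ord array where a later
// term for a document overwrites the earlier one rather than throwing.
public class StringIndexBuilder {
    // docTermPairs[i] = {docId, termOrd}, visited in term order.
    public static int[] build(int numDocs, int[][] docTermPairs) {
        int[] ordByDoc = new int[numDocs];
        for (int[] pair : docTermPairs) {
            ordByDoc[pair[0]] = pair[1]; // overwrite: last term per doc wins
        }
        return ordByDoc;
    }
}
```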




[jira] Resolved: (LUCENE-2141) Make String and StringIndex in field cache more RAM efficient

2010-05-30 Thread Michael McCandless (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-2141?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael McCandless resolved LUCENE-2141.


Fix Version/s: (was: 4.0)
   Resolution: Duplicate

Dup of LUCENE-2380.

 Make String and StringIndex in field cache more RAM efficient
 -

 Key: LUCENE-2141
 URL: https://issues.apache.org/jira/browse/LUCENE-2141
 Project: Lucene - Java
  Issue Type: Improvement
  Components: Search
Reporter: Michael McCandless

 Once flex has landed and LUCENE-1990 is done, we should improve the RAM 
 efficiency of String and StringIndex.
 The text data can be stored in native UTF8 (saves decode when loading), and 
 as byte[] blocks (saves GC load and high RAM overhead of individual strings).
 And with packed unsigned ints we can save a lot for cases that don't have that 
 many unique string values.
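
The byte[]-block idea can be sketched like this (hypothetical class; real code would use packed ints for the offsets rather than a plain int[]):

```java
// Sketch of the byte[]-block layout: all values' UTF-8 bytes concatenated
// into one array, with per-ordinal offsets, instead of one String object
// per value (saves GC load and per-String overhead).
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

public class PackedStrings {
  final byte[] blob;      // all values' UTF-8 bytes, concatenated
  final int[] offsets;    // value i lives at blob[offsets[i] .. offsets[i+1])

  PackedStrings(String[] values) {
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    offsets = new int[values.length + 1];
    for (int i = 0; i < values.length; i++) {
      byte[] utf8 = values[i].getBytes(StandardCharsets.UTF_8);
      out.write(utf8, 0, utf8.length);
      offsets[i + 1] = out.size();
    }
    blob = out.toByteArray();
  }

  String get(int ord) {
    return new String(blob, offsets[ord], offsets[ord + 1] - offsets[ord],
        StandardCharsets.UTF_8);
  }
}
```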




[jira] Commented: (LUCENE-2481) Enhance SnapshotDeletionPolicy to allow taking multiple snapshots

2010-05-30 Thread Shai Erera (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12873418#action_12873418
 ] 

Shai Erera commented on LUCENE-2481:


Another thing about IDs - they are more user-level API than IndexCommit.

So Earwin, I agree that we can have snapshot() and release(IndexCommit) to 
achieve the same functionality (need to be careful w/ multiple snapshots over 
the same IC). But for users, passing in an IndexCommit is not too friendly. 
Also, one customer of the API already uses the ID to encode some information 
that's interesting to him (e.g. name of the process + timestamp) which shows 
why IDs should remain. It's kind of like the commitUserData given to commits. 
Another reason is sharing the ID between two different code segments. It's 
easier to share a String ID than an IndexCommit. And what exactly is 
IndexCommit, and how does it translate to a key? Just the segmentsFileName? See 
- that's too low-level an implementation detail IMO ...
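
A minimal sketch of the String-ID flavor being argued for (names are hypothetical, not the committed API): a registry keyed by ID is what makes a snapshot shareable between two code segments.

```java
// Hypothetical registry, not the real SnapshotDeletionPolicy: ids let
// callers name snapshots without passing an IndexCommit around.
import java.util.HashMap;
import java.util.Map;

public class SnapshotRegistry {
  private final Map<String, String> idToSegmentsFile = new HashMap<>();

  // snapshot(String): name/identify the snapshot at registration time
  public void snapshot(String id, String segmentsFileName) {
    if (idToSegmentsFile.containsKey(id)) {
      throw new IllegalStateException("snapshot id already in use: " + id);
    }
    idToSegmentsFile.put(id, segmentsFileName);
  }

  // release(String): drop the snapshot so its files may be deleted again
  public void release(String id) {
    if (idToSegmentsFile.remove(id) == null) {
      throw new IllegalStateException("no snapshot: " + id);
    }
  }

  public boolean isSnapshotted(String segmentsFileName) {
    return idToSegmentsFile.containsValue(segmentsFileName);
  }
}
```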

 Enhance SnapshotDeletionPolicy to allow taking multiple snapshots
 -

 Key: LUCENE-2481
 URL: https://issues.apache.org/jira/browse/LUCENE-2481
 Project: Lucene - Java
  Issue Type: Improvement
  Components: Index
Reporter: Shai Erera
Assignee: Shai Erera
Priority: Minor
 Fix For: 3.1, 4.0

 Attachments: LUCENE-2481-3x.patch


 A spin off from here: 
 http://www.gossamer-threads.com/lists/lucene/java-dev/99161?do=post_view_threaded#99161
 I will:
 # Replace snapshot() with snapshot(String), so that one can name/identify the 
 snapshot
 # Add some supporting methods, like release(String), getSnapshots() etc.
 # Some unit tests of course.
 This is mostly written already - I want to contribute it. I've also written a 
 PersistentSDP, which persists the snapshots on stable storage (a Lucene index 
 in this case) to support opening an IW with existing snapshots already, so 
 they don't get deleted. If it's interesting, I can contribute it as well.
 Porting my patch to the new API. Should post it soon.




[jira] Updated: (LUCENE-2468) reopen on NRT reader should share readers w/ unchanged segments

2010-05-30 Thread Michael McCandless (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-2468?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael McCandless updated LUCENE-2468:
---

Fix Version/s: 2.9.3

 reopen on NRT reader should share readers w/ unchanged segments
 ---

 Key: LUCENE-2468
 URL: https://issues.apache.org/jira/browse/LUCENE-2468
 Project: Lucene - Java
  Issue Type: Bug
Reporter: Yonik Seeley
Assignee: Michael McCandless
 Fix For: 2.9.3, 3.1, 4.0

 Attachments: CacheTest.java, DeletionAwareConstantScoreQuery.java, 
 LUCENE-2468.patch, LUCENE-2468.patch, LUCENE-2468.patch


 A reopen on an NRT reader doesn't seem to share readers for those segments 
 that are unchanged.
 http://search.lucidimagination.com/search/document/9f0335d480d2e637/nrt_and_caching_based_on_indexreader




[jira] Reopened: (LUCENE-2135) IndexReader.close should forcefully evict entries from FieldCache

2010-05-30 Thread Michael McCandless (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-2135?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael McCandless reopened LUCENE-2135:



backport

 IndexReader.close should forcefully evict entries from FieldCache
 -

 Key: LUCENE-2135
 URL: https://issues.apache.org/jira/browse/LUCENE-2135
 Project: Lucene - Java
  Issue Type: Bug
  Components: Index
Reporter: Michael McCandless
Assignee: Michael McCandless
 Fix For: 2.9.3, 4.0

 Attachments: LUCENE-2135.patch, LUCENE-2135.patch, LUCENE-2135.patch


 Spinoff of java-user thread heap memory issues when sorting by a string 
 field.
 We rely on WeakHashMap to hold our FieldCache, keyed by reader.  But this 
 lacks immediacy on releasing the reference, after a reader is closed.
 WeakHashMap can't free the key until the reader is no longer referenced by 
 the app. And, apparently, WeakHashMap has a further impl detail that requires 
 invoking one of its methods for it to notice that a key has just become only 
 weakly reachable.
 To fix this, I think on IR.close we should evict entries from the FieldCache, 
 as long as the sub-readers are truly closed (refCount dropped to 0).
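
The proposed fix can be sketched roughly like so (hypothetical names, not the actual FieldCache internals): eviction becomes an explicit call tied to refCount instead of relying on WeakHashMap noticing a dead key.

```java
// Sketch only: a cache keyed by reader that is purged explicitly on close,
// instead of waiting for WeakHashMap to notice the key died.
import java.util.HashMap;
import java.util.Map;

public class ReaderCache {
  private final Map<Object, Object> cache = new HashMap<>();

  public void put(Object readerKey, Object entry) {
    cache.put(readerKey, entry);
  }

  public Object get(Object readerKey) {
    return cache.get(readerKey);
  }

  // to be called from IndexReader.close(): only evict once the reader is
  // truly closed, i.e. its refCount has dropped to 0
  public void purgeOnClose(Object readerKey, int refCount) {
    if (refCount == 0) {
      cache.remove(readerKey);
    }
  }
}
```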




[jira] Updated: (LUCENE-2135) IndexReader.close should forcefully evict entries from FieldCache

2010-05-30 Thread Michael McCandless (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-2135?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael McCandless updated LUCENE-2135:
---

Fix Version/s: 2.9.3

 IndexReader.close should forcefully evict entries from FieldCache
 -

 Key: LUCENE-2135
 URL: https://issues.apache.org/jira/browse/LUCENE-2135
 Project: Lucene - Java
  Issue Type: Bug
  Components: Index
Reporter: Michael McCandless
Assignee: Michael McCandless
 Fix For: 2.9.3, 4.0

 Attachments: LUCENE-2135.patch, LUCENE-2135.patch, LUCENE-2135.patch


 Spinoff of java-user thread heap memory issues when sorting by a string 
 field.
 We rely on WeakHashMap to hold our FieldCache, keyed by reader.  But this 
 lacks immediacy on releasing the reference, after a reader is closed.
 WeakHashMap can't free the key until the reader is no longer referenced by 
 the app. And, apparently, WeakHashMap has a further impl detail that requires 
 invoking one of its methods for it to notice that a key has just become only 
 weakly reachable.
 To fix this, I think on IR.close we should evict entries from the FieldCache, 
 as long as the sub-readers are truly closed (refCount dropped to 0).




[jira] Reopened: (LUCENE-2119) If you pass Integer.MAX_VALUE as 2nd param to search(Query, int) you hit unexpected NegativeArraySizeException

2010-05-30 Thread Michael McCandless (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-2119?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael McCandless reopened LUCENE-2119:



backport

 If you pass Integer.MAX_VALUE as 2nd param to search(Query, int) you hit 
 unexpected NegativeArraySizeException
 --

 Key: LUCENE-2119
 URL: https://issues.apache.org/jira/browse/LUCENE-2119
 Project: Lucene - Java
  Issue Type: Bug
  Components: Search
Reporter: Michael McCandless
Assignee: Michael McCandless
Priority: Minor
 Fix For: 2.9.3, 4.0

 Attachments: LUCENE-2119.patch


 Note that this is a nonsense value to pass in, since our PQ impl allocates 
 the array up front.
 It's because PQ takes 1+ this value (which overflows to a negative size), and 
 attempts to allocate that.  We should bounds check it, and drop PQ size by one 
 in this case.
 Better, maybe: in IndexSearcher, if that n is ever > maxDoc(), set it to 
 maxDoc().
 This trips users up fairly often because they assume our PQ doesn't 
 statically pre-allocate (a reasonable assumption...).
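
A sketch of the suggested guard (assumed shape, not the committed fix): clamp the requested n to maxDoc() before sizing the queue, since 1 + Integer.MAX_VALUE overflows.

```java
// Hypothetical bounds check -- not Lucene's actual IndexSearcher code.
public class TopNClamp {
  static int safeQueueSize(int n, int maxDoc) {
    if (n < 0) {
      throw new IllegalArgumentException("n must be >= 0: " + n);
    }
    // 1 + Integer.MAX_VALUE wraps negative, so bound by maxDoc first
    return Math.min(n, maxDoc);
  }
}
```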




[jira] Updated: (LUCENE-2119) If you pass Integer.MAX_VALUE as 2nd param to search(Query, int) you hit unexpected NegativeArraySizeException

2010-05-30 Thread Michael McCandless (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-2119?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael McCandless updated LUCENE-2119:
---

Fix Version/s: 2.9.3

 If you pass Integer.MAX_VALUE as 2nd param to search(Query, int) you hit 
 unexpected NegativeArraySizeException
 --

 Key: LUCENE-2119
 URL: https://issues.apache.org/jira/browse/LUCENE-2119
 Project: Lucene - Java
  Issue Type: Bug
  Components: Search
Reporter: Michael McCandless
Assignee: Michael McCandless
Priority: Minor
 Fix For: 2.9.3, 4.0

 Attachments: LUCENE-2119.patch


 Note that this is a nonsense value to pass in, since our PQ impl allocates 
 the array up front.
 It's because PQ takes 1+ this value (which overflows to a negative size), and 
 attempts to allocate that.  We should bounds check it, and drop PQ size by one 
 in this case.
 Better, maybe: in IndexSearcher, if that n is ever > maxDoc(), set it to 
 maxDoc().
 This trips users up fairly often because they assume our PQ doesn't 
 statically pre-allocate (a reasonable assumption...).




[jira] Reopened: (LUCENE-2060) CMS should default its maxThreadCount to 1 (not 3)

2010-05-30 Thread Michael McCandless (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-2060?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael McCandless reopened LUCENE-2060:



backport

 CMS should default its maxThreadCount to 1 (not 3)
 --

 Key: LUCENE-2060
 URL: https://issues.apache.org/jira/browse/LUCENE-2060
 Project: Lucene - Java
  Issue Type: Improvement
  Components: Index
Reporter: Michael McCandless
Assignee: Michael McCandless
Priority: Minor
 Fix For: 2.9.3, 3.0


 From rough experience, I think the current default of 3 is too large.  I 
 think we get the most bang for the buck going from 0 to 1.
 I think this will especially impact optimize on an index with many segments 
 -- in this case the MergePolicy happily exposes concurrency (multiple pending 
 merges), and CMS will happily launch 3 threads to carry that out.




[jira] Reopened: (LUCENE-2046) IndexReader.isCurrent incorrectly returns false after writer.prepareCommit has been called

2010-05-30 Thread Michael McCandless (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-2046?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael McCandless reopened LUCENE-2046:



backport

 IndexReader.isCurrent incorrectly returns false after writer.prepareCommit 
 has been called
 --

 Key: LUCENE-2046
 URL: https://issues.apache.org/jira/browse/LUCENE-2046
 Project: Lucene - Java
  Issue Type: Bug
  Components: Index
Reporter: Michael McCandless
Assignee: Michael McCandless
Priority: Minor
 Fix For: 3.0

 Attachments: LUCENE-2046.patch


 Spinoff from thread 2 phase commit with external data on java-user.
 The IndexReader should not see the index as changed, after a prepareCommit 
 has been called but before commit is called.




[jira] Updated: (LUCENE-2046) IndexReader.isCurrent incorrectly returns false after writer.prepareCommit has been called

2010-05-30 Thread Michael McCandless (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-2046?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael McCandless updated LUCENE-2046:
---

Fix Version/s: 2.9.3

 IndexReader.isCurrent incorrectly returns false after writer.prepareCommit 
 has been called
 --

 Key: LUCENE-2046
 URL: https://issues.apache.org/jira/browse/LUCENE-2046
 Project: Lucene - Java
  Issue Type: Bug
  Components: Index
Reporter: Michael McCandless
Assignee: Michael McCandless
Priority: Minor
 Fix For: 2.9.3, 3.0

 Attachments: LUCENE-2046.patch


 Spinoff from thread 2 phase commit with external data on java-user.
 The IndexReader should not see the index as changed, after a prepareCommit 
 has been called but before commit is called.




[jira] Reopened: (LUCENE-2424) FieldDoc.toString only returns super.toString

2010-05-30 Thread Michael McCandless (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-2424?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael McCandless reopened LUCENE-2424:



backport

 FieldDoc.toString only returns super.toString
 -

 Key: LUCENE-2424
 URL: https://issues.apache.org/jira/browse/LUCENE-2424
 Project: Lucene - Java
  Issue Type: Bug
  Components: Search
Affects Versions: 3.0.1
 Environment: Mac OSX
Reporter: Stephen Green
Assignee: Michael McCandless
Priority: Minor
 Fix For: 2.9.3, 3.0.2, 3.1, 4.0

 Attachments: LUCENE-2424.patch


 The FieldDoc.toString method very carefully builds a StringBuffer sb 
 containing the information for the FieldDoc instance and then just returns 
 super.toString() instead of sb.toString()




[jira] Updated: (LUCENE-2424) FieldDoc.toString only returns super.toString

2010-05-30 Thread Michael McCandless (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-2424?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael McCandless updated LUCENE-2424:
---

Fix Version/s: 2.9.3
   3.0.2
   3.1

 FieldDoc.toString only returns super.toString
 -

 Key: LUCENE-2424
 URL: https://issues.apache.org/jira/browse/LUCENE-2424
 Project: Lucene - Java
  Issue Type: Bug
  Components: Search
Affects Versions: 3.0.1
 Environment: Mac OSX
Reporter: Stephen Green
Assignee: Michael McCandless
Priority: Minor
 Fix For: 2.9.3, 3.0.2, 3.1, 4.0

 Attachments: LUCENE-2424.patch


 The FieldDoc.toString method very carefully builds a StringBuffer sb 
 containing the information for the FieldDoc instance and then just returns 
 super.toString() instead of sb.toString()




[jira] Reopened: (LUCENE-2421) Hardening of NativeFSLock

2010-05-30 Thread Michael McCandless (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-2421?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael McCandless reopened LUCENE-2421:



backport

 Hardening of NativeFSLock
 -

 Key: LUCENE-2421
 URL: https://issues.apache.org/jira/browse/LUCENE-2421
 Project: Lucene - Java
  Issue Type: Improvement
  Components: Index
Reporter: Shai Erera
Assignee: Shai Erera
 Fix For: 2.9.3, 3.0.2, 3.1, 4.0

 Attachments: LUCENE-2421.patch, LUCENE-2421.patch, LUCENE-2421.patch, 
 LUCENE-2421.patch


 NativeFSLock creates a test lock file whose name might collide w/ another 
 JVM that is running. Very unlikely, but still it happened a couple of times 
 already, since the tests were parallelized. This may result in a false 
 exception thrown from release(), when the lock file's delete() is called and 
 returns false, because the file does not exist (already deleted by another 
 JVM). In addition, release() should give delete() a second attempt if it 
 fails, since the file may be held temporarily by another process (like 
 AntiVirus). The proposed changes are:
 1) Use ManagementFactory.getRuntimeMXBean().getName() as part of the test 
 lock name (should include the process id)
 2) In release(), if delete() fails, check if the file indeed exists. If it 
 does, attempt a re-delete() a few ms later.
 3) If (2) still fails, throw an exception. Alternatively, we can attempt a 
 deleteOnExit.
 I'll post a patch later today.
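
The proposed hardening, as a hedged sketch (hypothetical helper, not the committed patch):

```java
// Steps 1-3 above in miniature.  A file another JVM already removed
// counts as a successful release rather than an error.
import java.io.File;
import java.lang.management.ManagementFactory;

public class LockRelease {
  // (1) fold the JVM's runtime name (contains the pid) into the test lock name
  static String testLockName(String base) {
    return base + "-" + ManagementFactory.getRuntimeMXBean().getName();
  }

  // (2)+(3) delete with one retry after a short pause
  static boolean release(File lockFile) {
    if (lockFile.delete()) return true;
    if (!lockFile.exists()) return true;    // deleted by another JVM already
    try {                                   // e.g. AntiVirus holding the file
      Thread.sleep(50);
    } catch (InterruptedException ie) {
      Thread.currentThread().interrupt();
    }
    return lockFile.delete() || !lockFile.exists();
  }
}
```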




[jira] Reopened: (LUCENE-2417) Fix IndexCommit hashCode() and equals() to be consistent

2010-05-30 Thread Michael McCandless (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-2417?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael McCandless reopened LUCENE-2417:



backport

 Fix IndexCommit hashCode() and equals() to be consistent
 

 Key: LUCENE-2417
 URL: https://issues.apache.org/jira/browse/LUCENE-2417
 Project: Lucene - Java
  Issue Type: Bug
  Components: Index
Reporter: Shai Erera
Assignee: Shai Erera
 Fix For: 2.9.3, 3.0.2, 3.1, 4.0

 Attachments: LUCENE-2417.patch


 IndexCommit's impl of hashCode() and equals() is inconsistent. One uses Dir + 
 version and the other uses Dir + equals. According to hashCode()'s javadoc, 
 if o1.equals(o2), then o1.hashCode() == o2.hashCode(). Simple fix, and I'll 
 add a test case.
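
The contract being fixed, in miniature (hypothetical key class; the exact fields Lucene uses may differ): equals() and hashCode() must be derived from the same fields, so equal objects always hash equally.

```java
// Illustration only -- not the real IndexCommit.  Both methods use
// directory + version, keeping the hashCode()/equals() contract.
public class CommitKey {
  final String directory;
  final long version;

  CommitKey(String directory, long version) {
    this.directory = directory;
    this.version = version;
  }

  @Override public boolean equals(Object o) {
    if (!(o instanceof CommitKey)) return false;
    CommitKey c = (CommitKey) o;
    return directory.equals(c.directory) && version == c.version;
  }

  @Override public int hashCode() {
    return 31 * directory.hashCode() + Long.hashCode(version);
  }
}
```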




[jira] Updated: (LUCENE-2417) Fix IndexCommit hashCode() and equals() to be consistent

2010-05-30 Thread Michael McCandless (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-2417?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael McCandless updated LUCENE-2417:
---

Fix Version/s: 2.9.3
   3.0.2

 Fix IndexCommit hashCode() and equals() to be consistent
 

 Key: LUCENE-2417
 URL: https://issues.apache.org/jira/browse/LUCENE-2417
 Project: Lucene - Java
  Issue Type: Bug
  Components: Index
Reporter: Shai Erera
Assignee: Shai Erera
 Fix For: 2.9.3, 3.0.2, 3.1, 4.0

 Attachments: LUCENE-2417.patch


 IndexCommit's impl of hashCode() and equals() is inconsistent. One uses Dir + 
 version and the other uses Dir + equals. According to hashCode()'s javadoc, 
 if o1.equals(o2), then o1.hashCode() == o2.hashCode(). Simple fix, and I'll 
 add a test case.




[jira] Updated: (LUCENE-2104) IndexWriter.unlock does nothing if NativeFSLockFactory is used

2010-05-30 Thread Michael McCandless (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-2104?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael McCandless updated LUCENE-2104:
---

Fix Version/s: 2.9.3
   3.0.2

 IndexWriter.unlock does nothing if NativeFSLockFactory is used
 ---

 Key: LUCENE-2104
 URL: https://issues.apache.org/jira/browse/LUCENE-2104
 Project: Lucene - Java
  Issue Type: Bug
  Components: Store
Affects Versions: 2.9, 2.9.1, 3.0
Reporter: Shai Erera
Assignee: Uwe Schindler
Priority: Minor
 Fix For: 2.9.3, 3.0.2, 3.1, 4.0

 Attachments: LUCENE-2104.patch, LUCENE-2104.patch, LUCENE-2104.patch


 If NativeFSLockFactory is used, IndexWriter.unlock will return, silently 
 doing nothing. The reason is that NativeFSLockFactory's makeLock always 
 creates a new NativeFSLock. NativeFSLock's release first checks that its lock 
 is not null, but that lock is non-null only if obtain() has been called. So 
 release actually does nothing, and so IndexWriter.unlock does not delete the 
 lock, or fail w/ exception.
 This is only a problem in NativeFSLock, and not in other Lock 
 implementations, at least as I was able to see.
 Need to think first how to reproduce in a test, and then fix it. I'll work on 
 it.
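
A toy reproduction of the bug pattern (not Lucene's actual classes): a release() guarded by a lock-handle null check is a silent no-op unless obtain() ran first, which is exactly the situation when IndexWriter.unlock makes a fresh lock.

```java
// Toy reproduction only.  A freshly made lock skips the release body:
// nothing is deleted and no error is raised.
public class BuggyLock {
  private Object channelLock;   // only set by obtain()

  public void obtain() {
    channelLock = new Object();
  }

  public boolean release() {
    if (channelLock == null) {
      return false;             // fresh lock: silently does nothing
    }
    channelLock = null;
    return true;
  }
}
```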




[jira] Reopened: (LUCENE-2104) IndexWriter.unlock does nothing if NativeFSLockFactory is used

2010-05-30 Thread Michael McCandless (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-2104?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael McCandless reopened LUCENE-2104:



backport

 IndexWriter.unlock does nothing if NativeFSLockFactory is used
 ---

 Key: LUCENE-2104
 URL: https://issues.apache.org/jira/browse/LUCENE-2104
 Project: Lucene - Java
  Issue Type: Bug
  Components: Store
Affects Versions: 2.9, 2.9.1, 3.0
Reporter: Shai Erera
Assignee: Uwe Schindler
Priority: Minor
 Fix For: 2.9.3, 3.0.2, 3.1, 4.0

 Attachments: LUCENE-2104.patch, LUCENE-2104.patch, LUCENE-2104.patch


 If NativeFSLockFactory is used, IndexWriter.unlock will return, silently 
 doing nothing. The reason is that NativeFSLockFactory's makeLock always 
 creates a new NativeFSLock. NativeFSLock's release first checks that its lock 
 is not null, but that lock is non-null only if obtain() has been called. So 
 release actually does nothing, and so IndexWriter.unlock does not delete the 
 lock, or fail w/ exception.
 This is only a problem in NativeFSLock, and not in other Lock 
 implementations, at least as I was able to see.
 Need to think first how to reproduce in a test, and then fix it. I'll work on 
 it.




[jira] Updated: (LUCENE-2397) SnapshotDeletionPolicy.snapshot() throws NPE if no commits happened

2010-05-30 Thread Michael McCandless (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-2397?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael McCandless updated LUCENE-2397:
---

Fix Version/s: 2.9.3
   3.0.2

 SnapshotDeletionPolicy.snapshot() throws NPE if no commits happened
 ---

 Key: LUCENE-2397
 URL: https://issues.apache.org/jira/browse/LUCENE-2397
 Project: Lucene - Java
  Issue Type: Improvement
  Components: Index
Reporter: Shai Erera
Assignee: Shai Erera
Priority: Minor
 Fix For: 2.9.3, 3.0.2, 3.1, 4.0

 Attachments: LUCENE-2397.patch


 SDP throws NPE if no commits occurred and snapshot() was called. I will 
 replace it w/ throwing IllegalStateException. I'll also move TestSDP from 
 o.a.l to o.a.l.index. I'll post a patch soon.
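
The intended behavior, as a minimal sketch (hypothetical shape, not the real SnapshotDeletionPolicy): fail fast with IllegalStateException instead of an NPE when no commit exists yet.

```java
// Hypothetical sketch of the proposed change.
public class Sdp {
  private Object lastCommit;    // stays null until a commit happens

  public void onCommit(Object commit) {
    lastCommit = commit;
  }

  public Object snapshot() {
    if (lastCommit == null) {
      throw new IllegalStateException("cannot snapshot: no commit has occurred yet");
    }
    return lastCommit;
  }
}
```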




[jira] Reopened: (LUCENE-2397) SnapshotDeletionPolicy.snapshot() throws NPE if no commits happened

2010-05-30 Thread Michael McCandless (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-2397?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael McCandless reopened LUCENE-2397:



backport

 SnapshotDeletionPolicy.snapshot() throws NPE if no commits happened
 ---

 Key: LUCENE-2397
 URL: https://issues.apache.org/jira/browse/LUCENE-2397
 Project: Lucene - Java
  Issue Type: Improvement
  Components: Index
Reporter: Shai Erera
Assignee: Shai Erera
Priority: Minor
 Fix For: 2.9.3, 3.0.2, 3.1, 4.0

 Attachments: LUCENE-2397.patch


 SDP throws NPE if no commits occurred and snapshot() was called. I will 
 replace it w/ throwing IllegalStateException. I'll also move TestSDP from 
 o.a.l to o.a.l.index. I'll post a patch soon.




[jira] Updated: (LUCENE-2356) Enable setting the terms index divisor used by IndexWriter whenever it opens internal readers

2010-05-30 Thread Michael McCandless (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-2356?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael McCandless updated LUCENE-2356:
---

Fix Version/s: 2.9.3
   3.0.2
   3.1

 Enable setting the terms index divisor used by IndexWriter whenever it opens 
 internal readers
 -

 Key: LUCENE-2356
 URL: https://issues.apache.org/jira/browse/LUCENE-2356
 Project: Lucene - Java
  Issue Type: Bug
  Components: Index
Reporter: Michael McCandless
 Fix For: 2.9.3, 3.0.2, 3.1, 4.0


 Opening a placeholder issue... if all the refactoring being discussed doesn't 
 make this possible, then we should add a setting to IWC to do so.
 Apps with very large numbers of unique terms must set the terms index divisor 
 to control RAM usage.
 (NOTE: flex's RAM terms dict index RAM usage is more efficient, so this will 
 help such apps).
 But, when IW resolves deletes internally it always uses the default terms 
 index divisor of 1, and the app cannot change that.  Though one workaround is 
 to call getReader(termInfosIndexDivisor), which will pool the reader with the 
 right divisor.
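
For intuition, the effect of a terms index divisor can be toy-modeled (made-up numbers, not Lucene internals): with divisor d, roughly every d-th indexed term is held in RAM.

```java
// Toy model only -- the real savings depend on the terms dict layout.
public class DivisorDemo {
  static int termsHeldInRam(int indexedTerms, int divisor) {
    // with divisor d, roughly every d-th term of the terms index is loaded
    return (indexedTerms + divisor - 1) / divisor;   // ceil(indexedTerms / d)
  }
}
```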




[jira] Reopened: (LUCENE-2300) IndexWriter should never pool readers for external segments

2010-05-30 Thread Michael McCandless (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-2300?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael McCandless reopened LUCENE-2300:



backport

 IndexWriter should never pool readers for external segments
 ---

 Key: LUCENE-2300
 URL: https://issues.apache.org/jira/browse/LUCENE-2300
 Project: Lucene - Java
  Issue Type: Bug
Reporter: Michael McCandless
Assignee: Michael McCandless
 Fix For: 2.9.3, 4.0

 Attachments: LUCENE-2300.patch


 E.g. when addIndexes is called, it enrolls external segment infos, which are 
 then merged.  But merging will simply ask the pool for the readers, and if 
 writer is pooling (NRT reader has been pooled) it incorrectly pools these 
 readers.
 It shouldn't break anything but it's a waste because these readers are only 
 used for merging, once, and they are not opened by NRT reader.




RE: svn commit: r949525 - in /lucene/dev/trunk/lucene/src/java/org/apache/lucene: analysis/CharTokenizer.java util/VirtualMethod.java

2010-05-30 Thread Uwe Schindler
What was the reason for the changes in VirtualMethod?

-
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de


 -Original Message-
 From: k...@apache.org [mailto:k...@apache.org]
 Sent: Sunday, May 30, 2010 5:02 PM
 To: comm...@lucene.apache.org
 Subject: svn commit: r949525 - in
 /lucene/dev/trunk/lucene/src/java/org/apache/lucene:
 analysis/CharTokenizer.java util/VirtualMethod.java
 
 Author: koji
 Date: Sun May 30 15:02:06 2010
 New Revision: 949525
 
 URL: http://svn.apache.org/viewvc?rev=949525&view=rev
 Log:
 fix typo
 
 Modified:
 lucene/dev/trunk/lucene/src/java/org/apache/lucene/analysis/CharTokenizer.java
 lucene/dev/trunk/lucene/src/java/org/apache/lucene/util/VirtualMethod.java
 
 Modified: lucene/dev/trunk/lucene/src/java/org/apache/lucene/analysis/CharTokenizer.java
 URL: http://svn.apache.org/viewvc/lucene/dev/trunk/lucene/src/java/org/apache/lucene/analysis/CharTokenizer.java?rev=949525&r1=949524&r2=949525&view=diff
 ==
 
 --- lucene/dev/trunk/lucene/src/java/org/apache/lucene/analysis/CharTokenizer.java (original)
 +++ lucene/dev/trunk/lucene/src/java/org/apache/lucene/analysis/CharTokenizer.java Sun May 30 15:02:06 2010
 @@ -237,7 +237,7 @@ public abstract class CharTokenizer exte
 * /p
 */
protected boolean isTokenChar(int c) {
 -    throw new UnsupportedOperationException("since LUCENE_3_1 subclasses of CharTokenizer must implement isTokenChar(int)");
 +    throw new UnsupportedOperationException("since LUCENE_31 subclasses of CharTokenizer must implement isTokenChar(int)");
}
 
/**
 
 Modified: lucene/dev/trunk/lucene/src/java/org/apache/lucene/util/VirtualMethod.java
 URL: http://svn.apache.org/viewvc/lucene/dev/trunk/lucene/src/java/org/apache/lucene/util/VirtualMethod.java?rev=949525&r1=949524&r2=949525&view=diff
 ==
 
 --- lucene/dev/trunk/lucene/src/java/org/apache/lucene/util/VirtualMethod.java (original)
 +++ lucene/dev/trunk/lucene/src/java/org/apache/lucene/util/VirtualMethod.java Sun May 30 15:02:06 2010
 @@ -83,8 +83,8 @@ public final class VirtualMethod<C> {
        "VirtualMethod instances must be singletons and therefore " +
        "assigned to static final members in the same class, they use as baseClass ctor param."
      );
 -    } catch (NoSuchMethodException nsme) {
 -      throw new IllegalArgumentException(baseClass.getName() + " has no such method: " + nsme.getMessage());
 +    } catch (NoSuchMethodException name) {
 +      throw new IllegalArgumentException(baseClass.getName() + " has no such method: " + name.getMessage());
      }
    }
 
 






[jira] Reopened: (LUCENE-2045) FNFE hit when creating an empty index and infoStream is on

2010-05-30 Thread Michael McCandless (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-2045?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael McCandless reopened LUCENE-2045:



backport

 FNFE hit when creating an empty index and infoStream is on
 --

 Key: LUCENE-2045
 URL: https://issues.apache.org/jira/browse/LUCENE-2045
 Project: Lucene - Java
  Issue Type: Bug
  Components: Index
Reporter: Michael McCandless
Assignee: Michael McCandless
Priority: Minor
 Fix For: 4.0


 Shai just reported this on the dev list.  Simple test:
 {code}
 Directory dir = new RAMDirectory();
 IndexWriter writer = new IndexWriter(dir, new SimpleAnalyzer(), 
 MaxFieldLength.UNLIMITED);
 writer.setInfoStream(System.out);
 writer.addDocument(new Document());
 writer.commit();
 writer.close();
 {code}
 hits this:
 {code}
 Exception in thread "main" java.io.FileNotFoundException: _0.prx
 at org.apache.lucene.store.RAMDirectory.fileLength(RAMDirectory.java:149)
 at 
 org.apache.lucene.index.DocumentsWriter.segmentSize(DocumentsWriter.java:1150)
 at org.apache.lucene.index.DocumentsWriter.flush(DocumentsWriter.java:587)
 at 
 org.apache.lucene.index.IndexWriter.doFlushInternal(IndexWriter.java:3572)
 at org.apache.lucene.index.IndexWriter.doFlush(IndexWriter.java:3483)
 at org.apache.lucene.index.IndexWriter.flush(IndexWriter.java:3474)
 at org.apache.lucene.index.IndexWriter.addDocument(IndexWriter.java:1940)
 at org.apache.lucene.index.IndexWriter.addDocument(IndexWriter.java:1894)
 {code}
 Turns out it's just silly -- this is actually an issue I've already fixed on 
 the flex (LUCENE-1458) branch.  DocumentsWriter has its own method to 
 enumerate the flushed files and compute their size, but really it shouldn't 
 do that -- it should use SegmentInfo's method, instead.
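The fix Mike describes (let SegmentInfo enumerate its own files instead of DocumentsWriter keeping a second, divergent copy of the size computation) can be illustrated with a toy model. Everything below is a hypothetical stand-in, not Lucene's actual API:

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class SegmentSize {
  // Toy stand-in for a Directory: file name -> length in bytes.
  static final Map<String, Long> FILES = new LinkedHashMap<>();
  static {
    FILES.put("_0.fdt", 120L);
    FILES.put("_0.fdx", 16L);
    // Note: no "_0.prx" -- an empty segment may lack some files, which is
    // what tripped DocumentsWriter's hard-coded enumeration (the FNFE above).
    FILES.put("_1.fdt", 300L);
  }

  // One authoritative size computation that only visits files that actually
  // exist, analogous to delegating to SegmentInfo's own method.
  static long segmentSizeInBytes(String segmentName) {
    long total = 0;
    for (Map.Entry<String, Long> e : FILES.entrySet()) {
      if (e.getKey().startsWith(segmentName + ".")) {
        total += e.getValue();
      }
    }
    return total;
  }

  public static void main(String[] args) {
    System.out.println(segmentSizeInBytes("_0")); // 136
  }
}
```

The design point: keeping a single owner for the "which files belong to this segment" logic means a missing optional file cannot desynchronize two implementations.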

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.





[jira] Updated: (LUCENE-2045) FNFE hit when creating an empty index and infoStream is on

2010-05-30 Thread Michael McCandless (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-2045?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael McCandless updated LUCENE-2045:
---

Fix Version/s: 2.9.3
   3.0.2
   3.1

 FNFE hit when creating an empty index and infoStream is on
 --

 Key: LUCENE-2045
 URL: https://issues.apache.org/jira/browse/LUCENE-2045
 Project: Lucene - Java
  Issue Type: Bug
  Components: Index
Reporter: Michael McCandless
Assignee: Michael McCandless
Priority: Minor
 Fix For: 2.9.3, 3.0.2, 3.1, 4.0


 Shai just reported this on the dev list.  Simple test:
 {code}
 Directory dir = new RAMDirectory();
 IndexWriter writer = new IndexWriter(dir, new SimpleAnalyzer(), 
 MaxFieldLength.UNLIMITED);
 writer.setInfoStream(System.out);
 writer.addDocument(new Document());
 writer.commit();
 writer.close();
 {code}
 hits this:
 {code}
 Exception in thread "main" java.io.FileNotFoundException: _0.prx
 at org.apache.lucene.store.RAMDirectory.fileLength(RAMDirectory.java:149)
 at 
 org.apache.lucene.index.DocumentsWriter.segmentSize(DocumentsWriter.java:1150)
 at org.apache.lucene.index.DocumentsWriter.flush(DocumentsWriter.java:587)
 at 
 org.apache.lucene.index.IndexWriter.doFlushInternal(IndexWriter.java:3572)
 at org.apache.lucene.index.IndexWriter.doFlush(IndexWriter.java:3483)
 at org.apache.lucene.index.IndexWriter.flush(IndexWriter.java:3474)
 at org.apache.lucene.index.IndexWriter.addDocument(IndexWriter.java:1940)
 at org.apache.lucene.index.IndexWriter.addDocument(IndexWriter.java:1894)
 {code}
 Turns out it's just silly -- this is actually an issue I've already fixed on 
 the flex (LUCENE-1458) branch.  DocumentsWriter has its own method to 
 enumerate the flushed files and compute their size, but really it shouldn't 
 do that -- it should use SegmentInfo's method, instead.




Re: svn commit: r949525 - in /lucene/dev/trunk/lucene/src/java/org/apache/lucene: analysis/CharTokenizer.java util/VirtualMethod.java

2010-05-30 Thread Koji Sekiguchi

Uh, sorry, I thought you had typed 's' instead of 'a' ('s' is the key next to 'a').
Now I'm aware that "nsme" stands for NoSuchMethodException...

(10/05/31 0:37), Uwe Schindler wrote:

What was the reason for the changes in VirtualMethod?

-
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de


   

-Original Message-
From: k...@apache.org [mailto:k...@apache.org]
Sent: Sunday, May 30, 2010 5:02 PM
To: comm...@lucene.apache.org
Subject: svn commit: r949525 - in
/lucene/dev/trunk/lucene/src/java/org/apache/lucene:
analysis/CharTokenizer.java util/VirtualMethod.java

Author: koji
Date: Sun May 30 15:02:06 2010
New Revision: 949525

URL: http://svn.apache.org/viewvc?rev=949525view=rev
Log:
fix typo

Modified:
lucene/dev/trunk/lucene/src/java/org/apache/lucene/analysis/CharTokenizer.java
lucene/dev/trunk/lucene/src/java/org/apache/lucene/util/VirtualMethod.java

Modified: lucene/dev/trunk/lucene/src/java/org/apache/lucene/analysis/CharTokenizer.java
URL: http://svn.apache.org/viewvc/lucene/dev/trunk/lucene/src/java/org/apache/lucene/analysis/CharTokenizer.java?rev=949525&r1=949524&r2=949525&view=diff
==============================================================================
--- lucene/dev/trunk/lucene/src/java/org/apache/lucene/analysis/CharTokenizer.java (original)
+++ lucene/dev/trunk/lucene/src/java/org/apache/lucene/analysis/CharTokenizer.java Sun May 30 15:02:06 2010
@@ -237,7 +237,7 @@ public abstract class CharTokenizer exte
    * </p>
    */
   protected boolean isTokenChar(int c) {
-    throw new UnsupportedOperationException("since LUCENE_3_1 subclasses of CharTokenizer must implement isTokenChar(int)");
+    throw new UnsupportedOperationException("since LUCENE_31 subclasses of CharTokenizer must implement isTokenChar(int)");
   }
 
   /**

Modified: lucene/dev/trunk/lucene/src/java/org/apache/lucene/util/VirtualMethod.java
URL: http://svn.apache.org/viewvc/lucene/dev/trunk/lucene/src/java/org/apache/lucene/util/VirtualMethod.java?rev=949525&r1=949524&r2=949525&view=diff
==============================================================================
--- lucene/dev/trunk/lucene/src/java/org/apache/lucene/util/VirtualMethod.java (original)
+++ lucene/dev/trunk/lucene/src/java/org/apache/lucene/util/VirtualMethod.java Sun May 30 15:02:06 2010
@@ -83,8 +83,8 @@ public final class VirtualMethod<C> {
         "VirtualMethod instances must be singletons and therefore " +
         "assigned to static final members in the same class, they use as baseClass ctor param."
       );
-    } catch (NoSuchMethodException nsme) {
-      throw new IllegalArgumentException(baseClass.getName() + " has no such method: " + nsme.getMessage());
+    } catch (NoSuchMethodException name) {
+      throw new IllegalArgumentException(baseClass.getName() + " has no such method: " + name.getMessage());
     }
   }


 



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org


   



--
http://www.rondhuit.com/en/





[jira] Resolved: (LUCENE-2045) FNFE hit when creating an empty index and infoStream is on

2010-05-30 Thread Michael McCandless (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-2045?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael McCandless resolved LUCENE-2045.


Resolution: Fixed

Oh, already fixed on 2.9.x, 3.0.x.

 FNFE hit when creating an empty index and infoStream is on
 --

 Key: LUCENE-2045
 URL: https://issues.apache.org/jira/browse/LUCENE-2045
 Project: Lucene - Java
  Issue Type: Bug
  Components: Index
Reporter: Michael McCandless
Assignee: Michael McCandless
Priority: Minor
 Fix For: 2.9.3, 3.0.2, 3.1, 4.0


 Shai just reported this on the dev list.  Simple test:
 {code}
 Directory dir = new RAMDirectory();
 IndexWriter writer = new IndexWriter(dir, new SimpleAnalyzer(), 
 MaxFieldLength.UNLIMITED);
 writer.setInfoStream(System.out);
 writer.addDocument(new Document());
 writer.commit();
 writer.close();
 {code}
 hits this:
 {code}
 Exception in thread "main" java.io.FileNotFoundException: _0.prx
 at org.apache.lucene.store.RAMDirectory.fileLength(RAMDirectory.java:149)
 at 
 org.apache.lucene.index.DocumentsWriter.segmentSize(DocumentsWriter.java:1150)
 at org.apache.lucene.index.DocumentsWriter.flush(DocumentsWriter.java:587)
 at 
 org.apache.lucene.index.IndexWriter.doFlushInternal(IndexWriter.java:3572)
 at org.apache.lucene.index.IndexWriter.doFlush(IndexWriter.java:3483)
 at org.apache.lucene.index.IndexWriter.flush(IndexWriter.java:3474)
 at org.apache.lucene.index.IndexWriter.addDocument(IndexWriter.java:1940)
 at org.apache.lucene.index.IndexWriter.addDocument(IndexWriter.java:1894)
 {code}
 Turns out it's just silly -- this is actually an issue I've already fixed on 
 the flex (LUCENE-1458) branch.  DocumentsWriter has its own method to 
 enumerate the flushed files and compute their size, but really it shouldn't 
 do that -- it should use SegmentInfo's method, instead.




Re: Push for a Solr 1.4.1 Bug Fix Release?

2010-05-30 Thread Robert Muir
On Sat, May 29, 2010 at 7:19 PM, Chris Hostetter
hossman_luc...@fucit.orgwrote:


 Alright cool ... it would have been nice to get at least one more person
 to chime in and say I'll help test


if you make an RC, i will help try to break it.

-- 
Robert Muir
rcm...@gmail.com


RE: svn commit: r949525 - in /lucene/dev/trunk/lucene/src/java/org/apache/lucene: analysis/CharTokenizer.java util/VirtualMethod.java

2010-05-30 Thread Uwe Schindler
I just wondered why you changed a totally unrelated thing...

-
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de


 -Original Message-
 From: Koji Sekiguchi [mailto:k...@r.email.ne.jp]
 Sent: Sunday, May 30, 2010 5:50 PM
 To: dev@lucene.apache.org
 Subject: Re: svn commit: r949525 - in
 /lucene/dev/trunk/lucene/src/java/org/apache/lucene:
 analysis/CharTokenizer.java util/VirtualMethod.java
 
 Uh, sorry, I thought you typed 's' instead of 'a'. (next key of 'a') Now I'm
 aware of nsme stands for NoSuchMethodException...
 
 (10/05/31 0:37), Uwe Schindler wrote:
  What was the reason for the changes in VirtualMethod?
 
  -
  Uwe Schindler
  H.-H.-Meier-Allee 63, D-28213 Bremen
  http://www.thetaphi.de
  eMail: u...@thetaphi.de
 
 
 
  -Original Message-
  From: k...@apache.org [mailto:k...@apache.org]
  Sent: Sunday, May 30, 2010 5:02 PM
  To: comm...@lucene.apache.org
  Subject: svn commit: r949525 - in
  /lucene/dev/trunk/lucene/src/java/org/apache/lucene:
  analysis/CharTokenizer.java util/VirtualMethod.java
 
  Author: koji
  Date: Sun May 30 15:02:06 2010
  New Revision: 949525
 
  URL: http://svn.apache.org/viewvc?rev=949525view=rev
  Log:
  fix typo
 
  Modified:
 
 
 lucene/dev/trunk/lucene/src/java/org/apache/lucene/analysis/CharToken
  iz
  er.java
 
 
 lucene/dev/trunk/lucene/src/java/org/apache/lucene/util/VirtualMethod
  .ja
  va
 
  Modified:
 
 lucene/dev/trunk/lucene/src/java/org/apache/lucene/analysis/CharToken
  iz
  er.java
  URL:
 
 http://svn.apache.org/viewvc/lucene/dev/trunk/lucene/src/java/org/apa
  ch
 
 e/lucene/analysis/CharTokenizer.java?rev=949525r1=949524r2=949525v
  iew=diff
 
 ==
  
  ---
 
 lucene/dev/trunk/lucene/src/java/org/apache/lucene/analysis/CharToken
  iz
  er.java (original)
  +++
 
 lucene/dev/trunk/lucene/src/java/org/apache/lucene/analysis/CharToken
  iz
  er.java Sun May 30 15:02:06 2010
  @@ -237,7 +237,7 @@ public abstract class CharTokenizer exte
     * </p>
     */
    protected boolean isTokenChar(int c) {
  -    throw new UnsupportedOperationException("since LUCENE_3_1 subclasses of CharTokenizer must implement isTokenChar(int)");
  +    throw new UnsupportedOperationException("since LUCENE_31 subclasses of CharTokenizer must implement isTokenChar(int)");
    }
 
    /**
 
  Modified:
 
 lucene/dev/trunk/lucene/src/java/org/apache/lucene/util/VirtualMethod
  .ja
  va
  URL:
 
 http://svn.apache.org/viewvc/lucene/dev/trunk/lucene/src/java/org/apa
  ch
 
 e/lucene/util/VirtualMethod.java?rev=949525r1=949524r2=949525view
  =diff
 
 ==
  
  ---
 
 lucene/dev/trunk/lucene/src/java/org/apache/lucene/util/VirtualMethod
  .ja
  va (original)
  +++
 
 lucene/dev/trunk/lucene/src/java/org/apache/lucene/util/VirtualMethod
  .ja
  va Sun May 30 15:02:06 2010
  @@ -83,8 +83,8 @@ public final class VirtualMethod<C> {
          "VirtualMethod instances must be singletons and therefore " +
          "assigned to static final members in the same class, they use as baseClass ctor param."
        );
  -    } catch (NoSuchMethodException nsme) {
  -      throw new IllegalArgumentException(baseClass.getName() + " has no such method: " + nsme.getMessage());
  +    } catch (NoSuchMethodException name) {
  +      throw new IllegalArgumentException(baseClass.getName() + " has no such method: " + name.getMessage());
      }
    }
 
 
 
 
 
  -
  To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For
  additional commands, e-mail: dev-h...@lucene.apache.org
 
 
 
 
 
 --
 http://www.rondhuit.com/en/
 
 
 -
 To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
 For additional commands, e-mail: dev-h...@lucene.apache.org






[jira] Commented: (SOLR-1852) enablePositionIncrements=true can cause searches to fail when they are parsed as phrase queries

2010-05-30 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12873454#action_12873454
 ] 

Robert Muir commented on SOLR-1852:
---

I am willing to do the backport here if people want this in 1.4.1, just let me 
know.


 enablePositionIncrements=true can cause searches to fail when they are 
 parsed as phrase queries
 -

 Key: SOLR-1852
 URL: https://issues.apache.org/jira/browse/SOLR-1852
 Project: Solr
  Issue Type: Bug
Affects Versions: 1.4
Reporter: Peter Wolanin
Assignee: Robert Muir
 Attachments: SOLR-1852.patch, SOLR-1852_testcase.patch


 Symptom: searching for a string like a domain name containing a '.', the Solr 
 1.4 analyzer tells me that I will get a match, but when I enter the search 
 either in the client or directly in Solr, the search fails. 
 test string:  Identi.ca
 queries that fail:  IdentiCa, Identi.ca, Identi-ca
 query that matches: Identi ca
 schema in use is:
 http://drupalcode.org/viewvc/drupal/contributions/modules/apachesolr/schema.xml?revision=1.1.2.1.2.34content-type=text%2Fplainview=copathrev=DRUPAL-6--1
 Screen shots:
 analysis:  http://img.skitch.com/20100327-nt1uc1ctykgny28n8bgu99h923.png
 dismax search: http://img.skitch.com/20100327-byiduuiry78caka7q5smsw7fp.png
 dismax search: http://img.skitch.com/20100327-gckm8uhjx3t7px31ygfqc2ugdq.png
 standard search: http://img.skitch.com/20100327-usqyqju1d12ymcpb2cfbtdwyh.png
 Whether or not the bug appears is determined by the surrounding text:
 would be great to have support for Identi.ca on the follow block
 fails to match Identi.ca, but putting the content on its own or in another 
 sentence:
 Support Identi.ca
 the search matches.  Testing suggests the word for is the problem, and it 
 looks like the bug occurs when a stop word precedes a word that is split up 
 using the word delimiter filter.
 Setting enablePositionIncrements=false in the stop filter and reindexing 
 causes the searches to match.
 According to Mark Miller in #solr, this bug appears to be fixed already in 
 Solr trunk, either due to the upgraded lucene or changes to the 
 WordDelimiterFactory
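A self-contained sketch, deliberately simplified from the real analysis chain (the tokenizer, stop list, and helper names here are made up for illustration), of how enablePositionIncrements leaves a gap at a removed stop word:

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class PositionGap {
  static final List<String> STOPWORDS = Arrays.asList("for", "the", "on");

  // Assign a position to each surviving token. With position increments
  // enabled, removing a stop word leaves a hole in the numbering; with
  // them disabled, surviving tokens are renumbered contiguously.
  static Map<String, Integer> positions(String text, boolean enablePositionIncrements) {
    Map<String, Integer> out = new LinkedHashMap<>();
    int pos = -1;
    for (String token : text.toLowerCase().split("[\\s.]+")) {
      if (STOPWORDS.contains(token)) {
        if (enablePositionIncrements) pos++; // hole at the stop word's slot
        continue;
      }
      pos++;
      out.put(token, pos);
    }
    return out;
  }

  public static void main(String[] args) {
    // "Identi.ca" splits into "identi" and "ca", like the word delimiter filter.
    System.out.println(positions("support for Identi.ca", true));  // {support=0, identi=2, ca=3}
    System.out.println(positions("support for Identi.ca", false)); // {support=0, identi=1, ca=2}
  }
}
```

A phrase query only matches when both the index-time and query-time chains agree on positions; when one side numbers tokens with the gap and the other without (as happened here), the phrase positions disagree and the search fails.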




[jira] Created: (SOLR-1935) BaseResponseWriter neglects to add SolrDocument in DocList isStreamingDocs=false

2010-05-30 Thread Chris A. Mattmann (JIRA)
BaseResponseWriter neglects to add SolrDocument in DocList isStreamingDocs=false


 Key: SOLR-1935
 URL: https://issues.apache.org/jira/browse/SOLR-1935
 Project: Solr
  Issue Type: Bug
  Components: Response Writers
Affects Versions: 1.5
 Environment: working on SOLR-1925, i noticed this
Reporter: Chris A. Mattmann


There is a bug near line 126/127 in BaseResponseWriter.java in the 
isStreamingDocs() == false section for the DocList case. The SolrDocuments 
aren't being added back to the list object for return. I noticed this while I 
was working on SOLR-1925. Simple patch to fix, attached.




[jira] Updated: (SOLR-1935) BaseResponseWriter neglects to add SolrDocument in DocList isStreamingDocs=false

2010-05-30 Thread Chris A. Mattmann (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1935?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris A. Mattmann updated SOLR-1935:


Attachment: SOLR-1935.Mattmann.053010.patch.txt

 BaseResponseWriter neglects to add SolrDocument in DocList 
 isStreamingDocs=false
 

 Key: SOLR-1935
 URL: https://issues.apache.org/jira/browse/SOLR-1935
 Project: Solr
  Issue Type: Bug
  Components: Response Writers
Affects Versions: 1.5
 Environment: working on SOLR-1925, i noticed this
Reporter: Chris A. Mattmann
 Attachments: SOLR-1935.Mattmann.053010.patch.txt


 There is a bug near line 126/127 in BaseResponseWriter.java in the 
 isStreamingDocs() == false section for the DocList case. The SolrDocuments 
 aren't being added back to the list object for return. I noticed this while I 
 was working on SOLR-1925. Simple patch to fix, attached.




[jira] Updated: (SOLR-1925) CSV Response Writer

2010-05-30 Thread Chris A. Mattmann (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1925?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris A. Mattmann updated SOLR-1925:


Attachment: SOLR-1925.Mattmann.053010.patch.txt

Okey dok, here's the patch, I'll post some sample queries and response writer 
config to show how it's used in one sec.

 CSV Response Writer
 ---

 Key: SOLR-1925
 URL: https://issues.apache.org/jira/browse/SOLR-1925
 Project: Solr
  Issue Type: New Feature
  Components: Response Writers
 Environment: indep. of env.
Reporter: Chris A. Mattmann
 Fix For: Next

 Attachments: SOLR-1925.Mattmann.053010.patch.txt


 As part of some work I'm doing, I put together a CSV Response Writer. It 
 currently takes all the docs resultant from a query and then outputs their 
 metadata in simple CSV format. The use of a delimiter is configurable (by 
 default if there are multiple values for a particular field they are 
 separated with a | symbol).




Re: SolrCloud integration roadmap

2010-05-30 Thread Simon Willnauer
On Sun, May 30, 2010 at 6:03 PM, olivier sallou
olivier.sal...@gmail.com wrote:
 Hi,
 I'd like to know when SolrCloud feature will be released in Solr ? I saw a
 Jira track about this to integrate in trunk but I cannot see related
 roadmap.
The patch might be integrated into trunk shortly, I assume - a release
isn't that near right now, for various reasons.
If you really need this feature, you should probably join and help
push it forward; there is lots of work to do on the indexing side
of things.

simon
 I definitely need this feature; I was going to develop it myself as an additional
 layer above Solr (at least partially for my needs) just before reading a
 wiki article about it.

 Regards

 Olivier





[jira] Commented: (SOLR-1925) CSV Response Writer

2010-05-30 Thread Chris A. Mattmann (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12873457#action_12873457
 ] 

Chris A. Mattmann commented on SOLR-1925:
-

Hey Guys:

Here are some samples on how to call it:

This example queries Solr for children's hospital, turns on CSV output, and 
requests the fields site_id and agency_name
{code}
curl "http://localhost:8080/solr/select/?q=children%27s%20AND%20hospital&version=2.2&start=0&rows=10&indent=on&wt=csv&fl=site_id,id,agency_name"
{code}

This example queries Solr for children's hospital and turns on CSV output, 
requests the fields site_id and agency_name, and then changes the default 
delimeter to semi-colon (only in the context of this request)
{code}
curl "http://localhost:8080/solr211/select/?q=children%27s%20AND%20hospital&version=2.2&start=0&rows=10&indent=on&wt=csv&fl=site_id,id,agency_name&delimeter=;"
{code}

This example queries Solr for children's hospital and turns on CSV output, 
requests the fields site_id and agency_name, and then specifies (by turning 
Excel off) that CR LF should be left inside of the fields and not replaced 
(only in the context of this request):

{code}
curl "http://localhost:8080/solr211/select/?q=children%27s%20AND%20hospital&version=2.2&start=0&rows=10&indent=on&wt=csv&fl=site_id,id,agency_name&excel=false"
{code}
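The default multi-value behavior described in the issue (multiple field values joined with a `|` symbol, delimiter configurable) can be sketched as follows; `joinMultiValue` and `csvQuote` are illustrative names, not the patch's actual code:

```java
import java.util.Arrays;
import java.util.List;

public class CsvMultiValue {
  // Join a multi-valued field with a configurable delimiter, then apply
  // minimal CSV quoting so the joined cell survives inside a
  // comma-separated row.
  static String joinMultiValue(List<String> values, String delimiter) {
    return csvQuote(String.join(delimiter, values));
  }

  // Quote a cell only when it contains a comma, quote, or newline;
  // embedded quotes are doubled, per the usual CSV convention.
  static String csvQuote(String cell) {
    if (cell.contains(",") || cell.contains("\"") || cell.contains("\n")) {
      return "\"" + cell.replace("\"", "\"\"") + "\"";
    }
    return cell;
  }

  public static void main(String[] args) {
    List<String> agencies = Arrays.asList("County Health", "State, Dept. of Health");
    System.out.println(joinMultiValue(agencies, "|")); // quoted: second value has a comma
    System.out.println(joinMultiValue(agencies, ";")); // delimiter swapped per request
  }
}
```

This mirrors the request-level `delimeter` parameter shown in the curl examples above: the delimiter is per-request state, while the quoting rule stays fixed.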

 CSV Response Writer
 ---

 Key: SOLR-1925
 URL: https://issues.apache.org/jira/browse/SOLR-1925
 Project: Solr
  Issue Type: New Feature
  Components: Response Writers
 Environment: indep. of env.
Reporter: Chris A. Mattmann
 Fix For: Next

 Attachments: SOLR-1925.Mattmann.053010.patch.txt


 As part of some work I'm doing, I put together a CSV Response Writer. It 
 currently takes all the docs resultant from a query and then outputs their 
metadata in simple CSV format. The use of a delimiter is configurable (by 
 default if there are multiple values for a particular field they are 
 separated with a | symbol).




[jira] Issue Comment Edited: (SOLR-1925) CSV Response Writer

2010-05-30 Thread Chris A. Mattmann (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12873457#action_12873457
 ] 

Chris A. Mattmann edited comment on SOLR-1925 at 5/30/10 1:01 PM:
--

Hey Guys:

Here are some samples on how to call it:

This example queries Solr for children's hospital, turns on CSV output, and 
requests the fields site_id and agency_name
{code}
curl "http://localhost:8080/solr/select/?q=children%27s%20AND%20hospital&version=2.2&start=0&rows=10&indent=on&wt=csv&fl=site_id,id,agency_name"
{code}

This example queries Solr for children's hospital and turns on CSV output, 
requests the fields site_id and agency_name, and then changes the default 
delimeter to semi-colon (only in the context of this request)
{code}
curl "http://localhost:8080/solr/select/?q=children%27s%20AND%20hospital&version=2.2&start=0&rows=10&indent=on&wt=csv&fl=site_id,id,agency_name&delimeter=;"
{code}

This example queries Solr for children's hospital and turns on CSV output, 
requests the fields site_id and agency_name, and then specifies (by turning 
Excel off) that CR LF should be left inside of the fields and not replaced 
(only in the context of this request):

{code}
curl "http://localhost:8080/solr/select/?q=children%27s%20AND%20hospital&version=2.2&start=0&rows=10&indent=on&wt=csv&fl=site_id,id,agency_name&excel=false"
{code}

  was (Author: chrismattmann):
Hey Guys:

Here are some samples on how to call it:

This example queries Solr for children's hospital, turns on CSV output, and 
requests the fields site_id and agency_name
{code}
curl "http://localhost:8080/solr/select/?q=children%27s%20AND%20hospital&version=2.2&start=0&rows=10&indent=on&wt=csv&fl=site_id,id,agency_name"
{code}

This example queries Solr for children's hospital and turns on CSV output, 
requests the fields site_id and agency_name, and then changes the default 
delimeter to semi-colon (only in the context of this request)
{code}
curl "http://localhost:8080/solr211/select/?q=children%27s%20AND%20hospital&version=2.2&start=0&rows=10&indent=on&wt=csv&fl=site_id,id,agency_name&delimeter=;"
{code}

This example queries Solr for children's hospital and turns on CSV output, 
requests the fields site_id and agency_name, and then specifies (by turning 
Excel off) that CR LF should be left inside of the fields and not replaced 
(only in the context of this request):

{code}
curl "http://localhost:8080/solr211/select/?q=children%27s%20AND%20hospital&version=2.2&start=0&rows=10&indent=on&wt=csv&fl=site_id,id,agency_name&excel=false"
{code}
  
 CSV Response Writer
 ---

 Key: SOLR-1925
 URL: https://issues.apache.org/jira/browse/SOLR-1925
 Project: Solr
  Issue Type: New Feature
  Components: Response Writers
 Environment: indep. of env.
Reporter: Chris A. Mattmann
 Fix For: Next

 Attachments: SOLR-1925.Mattmann.053010.patch.txt


 As part of some work I'm doing, I put together a CSV Response Writer. It 
 currently takes all the docs resultant from a query and then outputs their 
 metadata in simple CSV format. The use of a delimiter is configurable (by 
 default if there are multiple values for a particular field they are 
 separated with a | symbol).




[jira] Commented: (SOLR-1852) enablePositionIncrements=true can cause searches to fail when they are parsed as phrase queries

2010-05-30 Thread Peter Wolanin (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12873461#action_12873461
 ] 

Peter Wolanin commented on SOLR-1852:
-


Yes, I'd propose to have this in 1.4.1 since it's a pretty serious bug in the 
places where it manifests.

 enablePositionIncrements=true can cause searches to fail when they are 
 parsed as phrase queries
 -

 Key: SOLR-1852
 URL: https://issues.apache.org/jira/browse/SOLR-1852
 Project: Solr
  Issue Type: Bug
Affects Versions: 1.4
Reporter: Peter Wolanin
Assignee: Robert Muir
 Attachments: SOLR-1852.patch, SOLR-1852_testcase.patch


 Symptom: searching for a string like a domain name containing a '.', the Solr 
 1.4 analyzer tells me that I will get a match, but when I enter the search 
 either in the client or directly in Solr, the search fails. 
 test string:  Identi.ca
 queries that fail:  IdentiCa, Identi.ca, Identi-ca
 query that matches: Identi ca
 schema in use is:
 http://drupalcode.org/viewvc/drupal/contributions/modules/apachesolr/schema.xml?revision=1.1.2.1.2.34content-type=text%2Fplainview=copathrev=DRUPAL-6--1
 Screen shots:
 analysis:  http://img.skitch.com/20100327-nt1uc1ctykgny28n8bgu99h923.png
 dismax search: http://img.skitch.com/20100327-byiduuiry78caka7q5smsw7fp.png
 dismax search: http://img.skitch.com/20100327-gckm8uhjx3t7px31ygfqc2ugdq.png
 standard search: http://img.skitch.com/20100327-usqyqju1d12ymcpb2cfbtdwyh.png
 Whether or not the bug appears is determined by the surrounding text:
 would be great to have support for Identi.ca on the follow block
 fails to match Identi.ca, but putting the content on its own or in another 
 sentence:
 Support Identi.ca
 the search matches.  Testing suggests the word for is the problem, and it 
 looks like the bug occurs when a stop word precedes a word that is split up 
 using the word delimiter filter.
 Setting enablePositionIncrements=false in the stop filter and reindexing 
 causes the searches to match.
 According to Mark Miller in #solr, this bug appears to be fixed already in 
 Solr trunk, either due to the upgraded lucene or changes to the 
 WordDelimiterFactory




RE: svn commit: r949525 - in /lucene/dev/trunk/lucene/src/java/org/apache/lucene: analysis/CharTokenizer.java util/VirtualMethod.java

2010-05-30 Thread Uwe Schindler
DONE!

-
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de


 -Original Message-
 From: Uwe Schindler [mailto:u...@thetaphi.de]
 Sent: Sunday, May 30, 2010 7:17 PM
 To: dev@lucene.apache.org
 Subject: RE: svn commit: r949525 - in
 /lucene/dev/trunk/lucene/src/java/org/apache/lucene:
 analysis/CharTokenizer.java util/VirtualMethod.java
 
 Can you revert it again (only VirtualMethod)?
 
 -
 Uwe Schindler
 H.-H.-Meier-Allee 63, D-28213 Bremen
 http://www.thetaphi.de
 eMail: u...@thetaphi.de
 
 
  -Original Message-
  From: Koji Sekiguchi [mailto:k...@r.email.ne.jp]
  Sent: Sunday, May 30, 2010 5:50 PM
  To: dev@lucene.apache.org
  Subject: Re: svn commit: r949525 - in
  /lucene/dev/trunk/lucene/src/java/org/apache/lucene:
  analysis/CharTokenizer.java util/VirtualMethod.java
 
  Uh, sorry, I thought you typed 's' instead of 'a'. (next key of 'a')
  Now I'm aware of nsme stands for NoSuchMethodException...
 
  (10/05/31 0:37), Uwe Schindler wrote:
   What was the reason for the changes in VirtualMethod?
  
   -
   Uwe Schindler
   H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de
   eMail: u...@thetaphi.de
  
  
  
   -Original Message-
   From: k...@apache.org [mailto:k...@apache.org]
   Sent: Sunday, May 30, 2010 5:02 PM
   To: comm...@lucene.apache.org
   Subject: svn commit: r949525 - in
   /lucene/dev/trunk/lucene/src/java/org/apache/lucene:
   analysis/CharTokenizer.java util/VirtualMethod.java
  
   Author: koji
   Date: Sun May 30 15:02:06 2010
   New Revision: 949525
  
   URL: http://svn.apache.org/viewvc?rev=949525view=rev
   Log:
   fix typo
  
   Modified:
  
  
  lucene/dev/trunk/lucene/src/java/org/apache/lucene/analysis/CharToken
   iz
   er.java
  
  
  lucene/dev/trunk/lucene/src/java/org/apache/lucene/util/VirtualMethod
   .ja
   va
  
   Modified:
  
  lucene/dev/trunk/lucene/src/java/org/apache/lucene/analysis/CharToken
   iz
   er.java
   URL:
  
  http://svn.apache.org/viewvc/lucene/dev/trunk/lucene/src/java/org/apa
   ch
  
 
 e/lucene/analysis/CharTokenizer.java?rev=949525r1=949524r2=949525v
   iew=diff
  
 
 ==
   
   ---
  
  lucene/dev/trunk/lucene/src/java/org/apache/lucene/analysis/CharToken
   iz
   er.java (original)
   +++
  
  lucene/dev/trunk/lucene/src/java/org/apache/lucene/analysis/CharToken
   iz
   er.java Sun May 30 15:02:06 2010
   @@ -237,7 +237,7 @@ public abstract class CharTokenizer exte
      * </p>
      */
     protected boolean isTokenChar(int c) {
   -    throw new UnsupportedOperationException("since LUCENE_3_1 subclasses of CharTokenizer must implement isTokenChar(int)");
   +    throw new UnsupportedOperationException("since LUCENE_31 subclasses of CharTokenizer must implement isTokenChar(int)");
     }
  
     /**
  
   Modified:
  
  lucene/dev/trunk/lucene/src/java/org/apache/lucene/util/VirtualMethod
   .ja
   va
   URL:
  
  http://svn.apache.org/viewvc/lucene/dev/trunk/lucene/src/java/org/apa
   ch
  
 
 e/lucene/util/VirtualMethod.java?rev=949525r1=949524r2=949525view
   =diff
  
 
 ==
   
   ---
  
  lucene/dev/trunk/lucene/src/java/org/apache/lucene/util/VirtualMethod
   .ja
   va (original)
   +++
  
  lucene/dev/trunk/lucene/src/java/org/apache/lucene/util/VirtualMethod
   .ja
   va Sun May 30 15:02:06 2010
   @@ -83,8 +83,8 @@ public final class VirtualMethodC  {
   VirtualMethod instances must be singletons and therefore  +
   assigned to static final members in the same class, they
   use as baseClass ctor param.
 );
   -} catch (NoSuchMethodException nsme) {
   -  throw new IllegalArgumentException(baseClass.getName() +  has
 no
   such method: +nsme.getMessage());
   +} catch (NoSuchMethodException name) {
   +  throw new IllegalArgumentException(baseClass.getName() +  has
   + no
   such method: +name.getMessage());
 }
   }
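
For context on the hunk above: VirtualMethod walks a class hierarchy via reflection to find where a method is (re-)declared, and the NoSuchMethodException catch fires when the base class itself lacks the method. A minimal sketch of that style of check, with invented names (this is not Lucene's actual VirtualMethod class):

```java
// Hypothetical sketch: walk up from a subclass looking for a re-declaration
// of a method that must exist on the base class.
public class OverrideCheck {
    static boolean overrides(Class<?> baseClass, Class<?> subclass,
                             String method, Class<?>... params) {
        try {
            // The method must exist on the base class; this is where the
            // NoSuchMethodException -> IllegalArgumentException mapping fires.
            baseClass.getDeclaredMethod(method, params);
        } catch (NoSuchMethodException nsme) {
            throw new IllegalArgumentException(
                baseClass.getName() + " has no such method: " + nsme.getMessage());
        }
        for (Class<?> c = subclass; c != null && c != baseClass; c = c.getSuperclass()) {
            try {
                c.getDeclaredMethod(method, params);
                return true;  // declared again below the base class
            } catch (NoSuchMethodException nsme) {
                // not declared at this level; keep walking up
            }
        }
        return false;
    }

    static class Base { void doWork() {} }
    static class Child extends Base { @Override void doWork() {} }
    static class Plain extends Base {}

    public static void main(String[] args) {
        System.out.println(overrides(Base.class, Child.class, "doWork"));  // true
        System.out.println(overrides(Base.class, Plain.class, "doWork"));  // false
    }
}
```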
  
  
  
  
  
   -
   To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For
   additional commands, e-mail: dev-h...@lucene.apache.org
  
  
  
 
 
  --
  http://www.rondhuit.com/en/
 
 
  -
  To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
  For additional commands, e-mail: dev-h...@lucene.apache.org
 
 
 
 -
 To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
 For additional commands, e-mail: dev-h...@lucene.apache.org



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] Commented: (LUCENE-2480) Remove support for pre-3.0 indexes

2010-05-30 Thread Earwin Burrfoot (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12873465#action_12873465
 ] 

Earwin Burrfoot commented on LUCENE-2480:
-

Wow! So fast! :)

bq. You didn't remove the .zip indexes?
Your patch didn't remove them for me, patch file format can't handle binary 
stuff.

bq. You removed the code from TestBackwardsCompatibilty .. So I don't think 
we should remove that piece of code.
Several lines later there sits a piece of absolutely identical commented-out 
code. The only difference is that it doesn't use preLockless in method names.
If someone really needs it later, the code I left over suits him no less than 
the code I removed.

bq. TestIndexFileDeleter failed 'cause of deletable file - fixed it.
Cool! It's very strange I missed the failure.

bq. Renamed FORMAT_FLEX_POSTING to FORMAT_4_0
FORMAT_4_0_FLEX_POSTINGS? I loved the self-descriptiveness :)

And, yup. Bye-bye to the byte.

 Remove support for pre-3.0 indexes
 --

 Key: LUCENE-2480
 URL: https://issues.apache.org/jira/browse/LUCENE-2480
 Project: Lucene - Java
  Issue Type: Improvement
  Components: Index
Reporter: Shai Erera
Assignee: Shai Erera
Priority: Minor
 Fix For: 4.0

 Attachments: LUCENE-2480.patch, LUCENE-2480.patch, LUCENE-2480.patch, 
 LUCENE-2480.patch


 We should remove support for 2.x (and 1.9) indexes in 4.0. It seems that 
 nothing can be done in 3x because there is no special code which handles 1.9, 
 so we'll leave it there. This issue should cover:
 # Remove the .zip indexes
 # Remove the unnecessary code from SegmentInfo and SegmentInfos. Mike 
 suggests we compare the version headers at the top of SegmentInfos, in 2.9.x 
 vs 3.0.x, to see which ones can go.
 # remove FORMAT_PRE from FieldInfos
 # Remove old format from TermVectorsReader
 If you know of other places where code can be removed, then please post a 
 comment here.
 I don't know when I'll have time to handle it, definitely not in the next few 
 days. So if someone wants to take a stab at it, be my guest.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] Commented: (LUCENE-2480) Remove support for pre-3.0 indexes

2010-05-30 Thread Shai Erera (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12873466#action_12873466
 ] 

Shai Erera commented on LUCENE-2480:


bq. Your patch didn't remove them for me, patch file format can't handle binary 
stuff.

Strange, there were lines in my patch file which indicated they were removed, 
but they were absent from yours. Anyway, they're removed now.

bq. Several lines later there sits a piece of absolutely identical 
commented-out code. 

I don't see it.  All I see is a comment and two methods that are used to 
generate the old indexes.

bq. FORMAT_4_0_FLEX_POSTINGS

While I don't mind that sort of descriptiveness, it doesn't fit in all cases, 
such as this one -- this format relates both to flex postings and to that byte 
removed from the SegmentInfo :). So I think we should keep the names simple and 
have a useful javadoc. These are the sort of constants no one really cares 
about in daily work, only when handling file-format backwards-compatibility 
stuff :).

Also, if by chance (like this case, again) the file format version gets changed 
twice, by two different people, with a long interval between the two changes, a 
FORMAT_4_0 should alert you that it's used for the unreleased index, so you 
don't accidentally bump it again (here, jumping from -9 to -11).
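
As an aside for readers: the convention discussed here (negative format constants that count down as the format evolves, with a guard on the supported range) can be sketched as follows; the constant values and names are invented for illustration, not Lucene's real ones:

```java
// Illustrative only: invented values mimicking the convention discussed above
// (index format versions are negative ints that count DOWN over time).
public class FormatCheck {
    static final int FORMAT_4_0 = -10;   // newest format (most negative)
    static final int FORMAT_OLDEST = -6; // oldest format still readable

    // Readable iff the value lies between the oldest supported format and the
    // current one; anything more negative was written by a newer Lucene.
    static boolean canRead(int format) {
        return format <= FORMAT_OLDEST && format >= FORMAT_4_0;
    }

    public static void main(String[] args) {
        System.out.println(canRead(-8));   // true: inside the supported window
        System.out.println(canRead(-5));   // false: too old
        System.out.println(canRead(-11));  // false: written by a newer version
    }
}
```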

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] Commented: (LUCENE-2480) Remove support for pre-3.0 indexes

2010-05-30 Thread Earwin Burrfoot (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12873469#action_12873469
 ] 

Earwin Burrfoot commented on LUCENE-2480:
-

bq. Strange, there were lines in my patch file which indicated they are 
removed, but were absent from yours.
{code}
Index: lucene/src/test/org/apache/lucene/index/index.19.cfs.zip
===
Cannot display: file marked as a binary type.
svn:mime-type = application/octet-stream
Index: lucene/src/test/org/apache/lucene/index/index.19.nocfs.zip
===
Cannot display: file marked as a binary type.
svn:mime-type = application/octet-stream
Index: lucene/src/test/org/apache/lucene/index/index.20.cfs.zip
===
Cannot display: file marked as a binary type.
svn:mime-type = application/octet-stream
{code}
These lines in your patch mean that diff noticed the differences but failed to 
express them - nothing more. You could have modified all these files instead of 
deleting them, and gotten the same result.

bq. I don't see it.
Ctrl+F, testCreateCFS
Ok, they're not completely identical. But I don't really care either way.

bq.  FORMAT stuff .
Ok.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: SolrCloud integration roadmap

2010-05-30 Thread Mark Miller

On 5/30/10 6:03 PM, olivier sallou wrote:

Hi,
I'd like to know when the SolrCloud feature will be released in Solr. I saw
a Jira issue about integrating it into trunk, but I cannot find a related
roadmap.
I definitely need this feature; I was about to develop it myself as an
additional layer above Solr (at least partially, for my needs) just
before reading a wiki article about it.

Regards

Olivier


Hey Olivier - Just got back from vacation and I hope to get the first 
phase of SolrCloud committed to trunk very soon - I'll see if I can't 
make it happen this week.


--
- Mark

http://www.lucidimagination.com

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] Commented: (LUCENE-2161) Some concurrency improvements for NRT

2010-05-30 Thread Shay Banon (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2161?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12873475#action_12873475
 ] 

Shay Banon commented on LUCENE-2161:


Mike, is there a reason why this is not backported to 3.0.2?

 Some concurrency improvements for NRT
 -

 Key: LUCENE-2161
 URL: https://issues.apache.org/jira/browse/LUCENE-2161
 Project: Lucene - Java
  Issue Type: Improvement
  Components: Index
Reporter: Michael McCandless
Assignee: Michael McCandless
Priority: Minor
 Fix For: 2.9.3, 4.0

 Attachments: LUCENE-2161.patch


 Some concurrency improvements for NRT
 I found & fixed some silly thread bottlenecks that affect NRT:
   * Multi/DirectoryReader.numDocs is synchronized, I think so only 1
 thread computes numDocs if it's -1.  I removed this sync, and made
 numDocs volatile, instead.  Yes, multiple threads may compute the
 numDocs for the first time, but I think that's harmless?
   * Fixed BitVector's ctor to set count to 0 on creating a new BV, and
 clone to copy the count over; this saves CPU computing the count
 unnecessarily.
   * Also strengthened assertions done in SR, testing the delete docs
 count.
 I also found an annoying thread bottleneck that happens, due to CMS.
 Whenever CMS hits the max running merges (default changed from 3 to 1
 recently), and the merge policy now wants to launch another merge, it
 forces the incoming thread to wait until one of the BG threads
 finishes.
 This is a basic crude throttling mechanism -- you force the mutators
 (whoever is causing new segments to appear) to stop, so that merging
 can catch up.
 Unfortunately, when stressing NRT, that thread is the one that's
 opening a new NRT reader.
 So, the first serious problem happens when you call .reopen() on your
 NRT reader -- this call simply forwards to IW.getReader if the reader
 was an NRT reader.  But, because DirectoryReader.doReopen is
 synchronized, this had the horrible effect of holding the monitor lock
 on your main IR.  In my test, this blocked all searches (since each
 search uses incRef/decRef, still sync'd until LUCENE-2156, at least).
 I fixed this by making doReopen only sync'd on this if it's not simply
 forwarding to getWriter.  So that's a good step forward.
 This prevents searches from being blocked while trying to reopen to a
 new NRT.
 However... it doesn't fix the problem that when an immense merge is
 off and running, opening an NRT reader could hit a tremendous delay
 because CMS blocks it.  The BalancedSegmentMergePolicy should help
 here... by avoiding such immense merges.
 But, I think we should also pursue an improvement to CMS.  EG, if it
 has 2 merges running, where one is huge and one is tiny, it ought to
 increase thread priority of the tiny one.  I think with such a change
 we could increase the max thread count again, to prevent this
 starvation.  I'll open a separate issue
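
 The numDocs change described above is the classic "benign race" idiom: drop
 the lock and let multiple threads race to compute the same idempotent cached
 value. A hedged sketch of the idiom (stand-in code, not Lucene's actual
 DirectoryReader):

{code}
// Sketch of caching a lazily computed count in a volatile field with no
// synchronization: threads may race, but all compute the same value.
public class CachedCount {
    private volatile int numDocs = -1;  // -1 means "not computed yet"

    private int computeNumDocs() {
        // stand-in for summing sub-reader doc counts; cheap and idempotent
        return 42;
    }

    public int numDocs() {
        int n = numDocs;
        if (n == -1) {
            n = computeNumDocs();
            numDocs = n;  // volatile write publishes the cached value
        }
        return n;
    }

    public static void main(String[] args) {
        System.out.println(new CachedCount().numDocs());
    }
}
{code}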

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] Updated: (SOLR-1925) CSV Response Writer

2010-05-30 Thread Chris A. Mattmann (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1925?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris A. Mattmann updated SOLR-1925:


Attachment: SOLR-1925.Mattmann.053010.patch.2.txt

- small bug in my initial patch where the field col headers didn't respect the 
delimeter parameter. Fixed.

 CSV Response Writer
 ---

 Key: SOLR-1925
 URL: https://issues.apache.org/jira/browse/SOLR-1925
 Project: Solr
  Issue Type: New Feature
  Components: Response Writers
 Environment: indep. of env.
Reporter: Chris A. Mattmann
 Fix For: Next

 Attachments: SOLR-1925.Mattmann.053010.patch.2.txt, 
 SOLR-1925.Mattmann.053010.patch.txt


 As part of some work I'm doing, I put together a CSV Response Writer. It 
 currently takes all the docs resultant from a query and then outputs their 
 metadata in simple CSV format. The use of a delimeter is configurable (by 
 default if there are multiple values for a particular field they are 
 separated with a | symbol).

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] Commented: (SOLR-1925) CSV Response Writer

2010-05-30 Thread Erik Hatcher (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12873496#action_12873496
 ] 

Erik Hatcher commented on SOLR-1925:


shouldn't that be spelled delimiter?  ;)   or are we hossifying this thing?

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] Commented: (SOLR-1925) CSV Response Writer

2010-05-30 Thread Chris A. Mattmann (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12873498#action_12873498
 ] 

Chris A. Mattmann commented on SOLR-1925:
-

haha crap i can't spell. Hold on let me fix it... ;) I caught some typos in the 
Javadoc too, so fixing those too!

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] Updated: (SOLR-1925) CSV Response Writer

2010-05-30 Thread Chris A. Mattmann (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1925?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris A. Mattmann updated SOLR-1925:


Attachment: SOLR-1925.Mattmann.053010.patch.3.txt

Fix typos: nice catch, Erik!

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] Issue Comment Edited: (SOLR-1925) CSV Response Writer

2010-05-30 Thread Chris A. Mattmann (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12873457#action_12873457
 ] 

Chris A. Mattmann edited comment on SOLR-1925 at 5/30/10 5:54 PM:
--

Hey Guys:

Here are some samples on how to call it:

This example queries Solr for children's hospital, turns on CSV output, and 
requests the fields site_id and agency_name
{code}
curl "http://localhost:8080/solr/select/?q=children%27s%20AND%20hospital&version=2.2&start=0&rows=10&indent=on&wt=csv&fl=site_id,id,agency_name"
{code}

This example queries Solr for children's hospital and turns on CSV output, 
requests the fields site_id and agency_name, and then changes the default 
delimiter to semi-colon (only in the context of this request)
{code}
curl "http://localhost:8080/solr/select/?q=children%27s%20AND%20hospital&version=2.2&start=0&rows=10&indent=on&wt=csv&fl=site_id,id,agency_name&delimiter=;"
{code}

This example queries Solr for children's hospital and turns on CSV output, 
requests the fields site_id and agency_name, and then specifies (by turning 
Excel off) that CR LF should be left inside of the fields and not replaced 
(only in the context of this request):

{code}
curl "http://localhost:8080/solr/select/?q=children%27s%20AND%20hospital&version=2.2&start=0&rows=10&indent=on&wt=csv&fl=site_id,id,agency_name&excel=false"
{code}

  was (Author: chrismattmann):
Hey Guys:

Here are some samples on how to call it:

This example queries Solr for children's hospital, turns on CSV output, and 
requests the fields site_id and agency_name
{code}
curl "http://localhost:8080/solr/select/?q=children%27s%20AND%20hospital&version=2.2&start=0&rows=10&indent=on&wt=csv&fl=site_id,id,agency_name"
{code}

This example queries Solr for children's hospital and turns on CSV output, 
requests the fields site_id and agency_name, and then changes the default 
delimeter to semi-colon (only in the context of this request)
{code}
curl "http://localhost:8080/solr/select/?q=children%27s%20AND%20hospital&version=2.2&start=0&rows=10&indent=on&wt=csv&fl=site_id,id,agency_name&delimeter=;"
{code}

This example queries Solr for children's hospital and turns on CSV output, 
requests the fields site_id and agency_name, and then specifies (by turning 
Excel off) that CR LF should be left inside of the fields and not replaced 
(only in the context of this request):

{code}
curl "http://localhost:8080/solr/select/?q=children%27s%20AND%20hospital&version=2.2&start=0&rows=10&indent=on&wt=csv&fl=site_id,id,agency_name&excel=false"
{code}
  
-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: Push for a Solr 1.4.1 Bug Fix Release?

2010-05-30 Thread Koji Sekiguchi

(10/05/30 14:08), Chris Hostetter wrote:

FYI...

: ## 9 Bugs w/fixes on the 1.5 branch that seem serious enough
: ## that they warrant a 1.4.1 bug-fix release...

...those 9 bugs have been merged to branch-1.4.  I'll work on the
remainders listed below (which includes upgrading the lucene jars) tomorrow
or Monday

: https://issues.apache.org/jira/browse/SOLR-1522
: https://issues.apache.org/jira/browse/SOLR-1538
: https://issues.apache.org/jira/browse/SOLR-1558
: https://issues.apache.org/jira/browse/SOLR-1563
: https://issues.apache.org/jira/browse/SOLR-1579
: https://issues.apache.org/jira/browse/SOLR-1580
: https://issues.apache.org/jira/browse/SOLR-1582
: https://issues.apache.org/jira/browse/SOLR-1596
: https://issues.apache.org/jira/browse/SOLR-1651

   https://issues.apache.org/jira/browse/SOLR-1934


-Hoss


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

   

I'll backport the following soon if there are no objections:

* SOLR-1748, SOLR-1747, SOLR-1746, SOLR-1745, SOLR-1744: Streams and Readers
retrieved from ContentStreams are not closed in various places, resulting
in file descriptor leaks.
(Christoff Brill, Mark Miller)

Koji

--
http://www.rondhuit.com/en/



[jira] Updated: (SOLR-1747) DumpRequestHandler doesn't close Stream

2010-05-30 Thread Koji Sekiguchi (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1747?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Koji Sekiguchi updated SOLR-1747:
-

Fix Version/s: 1.4.1

Committed revision 949651.

backported to 1.4 branch for 1.4.1


 DumpRequestHandler doesn't close Stream
 ---

 Key: SOLR-1747
 URL: https://issues.apache.org/jira/browse/SOLR-1747
 Project: Solr
  Issue Type: Bug
Affects Versions: 1.4
Reporter: Mark Miller
Assignee: Mark Miller
Priority: Minor
 Fix For: 1.4.1, 1.5, 3.1, 4.0

 Attachments: SOLR-1747.patch


 {code}
 stream.add( stream, IOUtils.toString( content.getStream() ) );
 {code}
 IOUtils.toString won't close the stream for you.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] Updated: (SOLR-1746) CommonsHttpSolrServer passes a ContentStream reader to IOUtils.copy, but doesnt close it.

2010-05-30 Thread Koji Sekiguchi (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1746?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Koji Sekiguchi updated SOLR-1746:
-

Fix Version/s: 1.4.1

Committed revision 949651.

backported to 1.4 branch for 1.4.1


 CommonsHttpSolrServer passes a ContentStream reader to IOUtils.copy, but 
 doesnt close it.
 -

 Key: SOLR-1746
 URL: https://issues.apache.org/jira/browse/SOLR-1746
 Project: Solr
  Issue Type: Bug
Affects Versions: 1.4
Reporter: Mark Miller
Assignee: Mark Miller
 Fix For: 1.4.1, 1.5, 3.1, 4.0


 IOUtils.copy will not close your reader for you:
 {code}
 @Override
 protected void sendData(OutputStream out)
 throws IOException {
   IOUtils.copy(c.getReader(), out);
 }
 {code}
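
 The general fix pattern for this family of leaks (SOLR-1744 through
 SOLR-1748) is to close the reader in a finally block once the copy
 completes. A hedged sketch using plain java.io (this is not the committed
 Solr patch, just an illustration of the pattern):
{code}
import java.io.*;

public class CopyAndClose {
    // Copies everything from reader to out, then closes the reader -- the
    // step that IOUtils.copy/IOUtils.toString leave to the caller.
    static void copyThenClose(Reader reader, Writer out) throws IOException {
        try {
            char[] buf = new char[8192];
            int n;
            while ((n = reader.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
        } finally {
            reader.close();  // runs even if the copy throws
        }
    }

    public static void main(String[] args) throws IOException {
        StringWriter out = new StringWriter();
        copyThenClose(new StringReader("hello"), out);
        System.out.println(out.toString());
    }
}
{code}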

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] Updated: (SOLR-1745) MoreLikeThisHandler gets a Reader from a ContentStream and doesn't close it

2010-05-30 Thread Koji Sekiguchi (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1745?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Koji Sekiguchi updated SOLR-1745:
-

Fix Version/s: 1.4.1

Committed revision 949651.

backported to 1.4 branch for 1.4.1


 MoreLikeThisHandler gets a Reader from a ContentStream and doesn't close it
 ---

 Key: SOLR-1745
 URL: https://issues.apache.org/jira/browse/SOLR-1745
 Project: Solr
  Issue Type: Bug
Affects Versions: 1.4
Reporter: Mark Miller
Assignee: Mark Miller
 Fix For: 1.4.1, 1.5, 3.1, 4.0

 Attachments: SOLR-1745.patch




-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] Updated: (SOLR-1744) Streams retrieved from ContenStream#getStream are not always closed

2010-05-30 Thread Koji Sekiguchi (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1744?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Koji Sekiguchi updated SOLR-1744:
-

Fix Version/s: 1.4.1

Committed revision 949651.

backported to 1.4 branch for 1.4.1


 Streams retrieved from ContenStream#getStream are not always closed
 ---

 Key: SOLR-1744
 URL: https://issues.apache.org/jira/browse/SOLR-1744
 Project: Solr
  Issue Type: Bug
Affects Versions: 1.4
Reporter: Mark Miller
Assignee: Mark Miller
 Fix For: 1.4.1, 1.5, 3.1, 4.0

 Attachments: SOLR-1744.patch


 Doesn't look like BinaryUpdateRequestHandler or CommonsHttpSolrServer close 
 streams.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: Push for a Solr 1.4.1 Bug Fix Release?

2010-05-30 Thread Bill Au
+1

I can help test any RC too.

Bill

On Sun, May 30, 2010 at 7:03 PM, Koji Sekiguchi k...@r.email.ne.jp wrote:

  (10/05/30 14:08), Chris Hostetter wrote:

 FYI...

 : ## 9 Bugs w/fixes on the 1.5 branch that seem serious enough
 : ## that they warrant a 1.4.1 bug-fix release...

 ...those 9 bugs have been merged to branch-1.4.  I'll work on the
 remainders listed below (which includes upgrading the lucene jars) tomorrow
 or Monday

 : https://issues.apache.org/jira/browse/SOLR-1522
 : https://issues.apache.org/jira/browse/SOLR-1538
 : https://issues.apache.org/jira/browse/SOLR-1558
 : https://issues.apache.org/jira/browse/SOLR-1563
 : https://issues.apache.org/jira/browse/SOLR-1579
 : https://issues.apache.org/jira/browse/SOLR-1580
 : https://issues.apache.org/jira/browse/SOLR-1582
 : https://issues.apache.org/jira/browse/SOLR-1596
 : https://issues.apache.org/jira/browse/SOLR-1651

   https://issues.apache.org/jira/browse/SOLR-1934


 -Hoss


 -
 To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
 For additional commands, e-mail: dev-h...@lucene.apache.org

I'll backport the following soon if there are no objections:

 * SOLR-1748, SOLR-1747, SOLR-1746, SOLR-1745, SOLR-1744: Streams and
 Readers
 retrieved from ContentStreams are not closed in various places, resulting
 in file descriptor leaks.
 (Christoff Brill, Mark Miller)

 Koji

 -- http://www.rondhuit.com/en/




[jira] Updated: (LUCENE-2481) Enhance SnapshotDeletionPolicy to allow taking multiple snapshots

2010-05-30 Thread Shai Erera (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-2481?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shai Erera updated LUCENE-2481:
---

Attachment: LUCENE-2481-3x.patch

Some javadoc updates and member renames (as Mike suggested). I plan to commit 
this shortly.

 Enhance SnapshotDeletionPolicy to allow taking multiple snapshots
 -

 Key: LUCENE-2481
 URL: https://issues.apache.org/jira/browse/LUCENE-2481
 Project: Lucene - Java
  Issue Type: Improvement
  Components: Index
Reporter: Shai Erera
Assignee: Shai Erera
Priority: Minor
 Fix For: 3.1, 4.0

 Attachments: LUCENE-2481-3x.patch, LUCENE-2481-3x.patch


 A spin off from here: 
 http://www.gossamer-threads.com/lists/lucene/java-dev/99161?do=post_view_threaded#99161
 I will:
 # Replace snapshot() with snapshot(String), so that one can name/identify the 
 snapshot
 # Add some supporting methods, like release(String), getSnapshots() etc.
 # Some unit tests of course.
 This is mostly written already - I want to contribute it. I've also written a 
 PersistentSDP, which persists the snapshots on stable storage (a Lucene index 
 in this case) to support opening an IW with existing snapshots already, so 
 they don't get deleted. If it's interesting, I can contribute it as well.
 Porting my patch to the new API. Should post it soon.
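 The named-snapshot bookkeeping described above might look roughly like this
 (a hedged sketch with invented names; the real SnapshotDeletionPolicy wraps
 an IndexDeletionPolicy and deals in IndexCommits, not strings):
{code}
import java.util.HashMap;
import java.util.Map;

// Sketch: map snapshot ids to the commit (segments file) they pin, so the
// deletion policy can refuse to delete a pinned commit.
public class NamedSnapshots {
    private final Map<String, String> snapshots = new HashMap<String, String>();

    public synchronized void snapshot(String id, String segmentsFile) {
        if (snapshots.containsKey(id)) {
            throw new IllegalStateException("snapshot '" + id + "' already exists");
        }
        snapshots.put(id, segmentsFile);
    }

    public synchronized void release(String id) {
        if (snapshots.remove(id) == null) {
            throw new IllegalStateException("no snapshot named '" + id + "'");
        }
    }

    // A commit may be deleted only if no snapshot pins it.
    public synchronized boolean isDeletable(String segmentsFile) {
        return !snapshots.containsValue(segmentsFile);
    }

    public static void main(String[] args) {
        NamedSnapshots s = new NamedSnapshots();
        s.snapshot("backup-1", "segments_3");
        System.out.println(s.isDeletable("segments_3"));  // false: pinned
        s.release("backup-1");
        System.out.println(s.isDeletable("segments_3"));  // true
    }
}
{code}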

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org