[jira] Commented: (LUCENE-2863) Updating a document loses its fields that are only indexed; NumericField tries are also completely lost

2011-01-12 Thread Shai Erera (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2863?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12981160#action_12981160
 ] 

Shai Erera commented on LUCENE-2863:


If you want to update documents, you should store them in their entirety 
somewhere: either in the Lucene index (as stored fields, all of them), in a DB, 
or someplace else. This is how updateDocument currently works.
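
For illustration, a minimal sketch of that pattern, assuming the application 
can re-fetch the original values from some external store (fetchSource() and 
its fields are hypothetical, not a Lucene API):

{code:java}
// Rebuild the FULL document from the original source values, then update.
SourceRecord src = fetchSource("HO1234"); // hypothetical application-side lookup
Document ndoc = new Document();
ndoc.add(new Field("ID", src.id, Store.YES, Index.NOT_ANALYZED_NO_NORMS));
ndoc.add(new Field("PATTERN", src.pattern, Store.NO, Index.NOT_ANALYZED_NO_NORMS));
ndoc.add(new NumericField("LAT", Store.YES, true).setDoubleValue(src.lat));
ndoc.add(new NumericField("LNG", Store.YES, true).setDoubleValue(src.lng));
ndoc.add(new Field("FINAL", "FINAL", Store.YES, Index.NOT_ANALYZED_NO_NORMS));
// updateDocument() atomically deletes every doc matching the term and adds ndoc.
writer.updateDocument(new Term("ID", "HO1234"), ndoc);
{code}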

> Updating a document loses its fields that are only indexed; NumericField 
> tries are also completely lost
> ---
>
> Key: LUCENE-2863
> URL: https://issues.apache.org/jira/browse/LUCENE-2863
> Project: Lucene - Java
>  Issue Type: Bug
>  Components: Store
>Affects Versions: 3.0.2, 3.0.3
> Environment: WindowsXP, Java1.6.20 using a RamDirectory
>Reporter: Tamas Sandor
>Priority: Blocker
>
> I have a code snippet (see below) which creates a new document with standard 
> (stored, indexed) fields, *not-stored, indexed-only* fields and some 
> *NumericFields*. Then it updates the document by adding a new string field. 
> The result is that all fields that are indexed but not stored, and in 
> particular the trie tokens of NumericFields, are completely lost from the 
> index after an update or delete/add.
> {code:java}
> Directory ramDir = new RAMDirectory();
> IndexWriter writer = new IndexWriter(ramDir, new WhitespaceAnalyzer(), 
> MaxFieldLength.UNLIMITED);
> Document doc = new Document();
> doc.add(new Field("ID", "HO1234", Store.YES, Index.NOT_ANALYZED_NO_NORMS));
> doc.add(new Field("PATTERN", "HELLO", Store.NO, Index.NOT_ANALYZED_NO_NORMS));
> doc.add(new NumericField("LAT", Store.YES, 
> true).setDoubleValue(51.48826603066d));
> doc.add(new NumericField("LNG", Store.YES, 
> true).setDoubleValue(-0.08913399651646614d));
> writer.addDocument(doc);
> doc = new Document();
> doc.add(new Field("ID", "HO", Store.YES, Index.NOT_ANALYZED_NO_NORMS));
> doc.add(new Field("PATTERN", "BELLO", Store.NO, Index.NOT_ANALYZED_NO_NORMS));
> doc.add(new NumericField("LAT", Store.YES, 
> true).setDoubleValue(101.48826603066d));
> doc.add(new NumericField("LNG", Store.YES, 
> true).setDoubleValue(-100.08913399651646614d));
> writer.addDocument(doc);
> Term t = new Term("ID", "HO1234");
> Query q = new TermQuery(t);
> IndexSearcher searcher = new IndexSearcher(writer.getReader());
> TopDocs hits = searcher.search(q, 1);
> if (hits.scoreDocs.length > 0) {
>   Document ndoc = searcher.doc(hits.scoreDocs[0].doc);
>   ndoc.add(new Field("FINAL", "FINAL", Store.YES, 
> Index.NOT_ANALYZED_NO_NORMS));
>   writer.updateDocument(t, ndoc);
> //  writer.deleteDocuments(q);
> //  writer.addDocument(ndoc);
> } else {
>   LOG.info("Couldn't find the document via the query");
> }
> searcher = new IndexSearcher(writer.getReader());
> hits = searcher.search(new TermQuery(new Term("PATTERN", "HELLO")), 1);
> LOG.info("_hits HELLO:" + hits.totalHits); // should be 1 but it's 0
> writer.close();
> {code}
> I also have a bounding-box query based on *NumericRangeQuery*; after the 
> document update it no longer returns any hits.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] Closed: (LUCENE-2863) Updating a document loses its fields that are only indexed; NumericField tries are also completely lost

2011-01-12 Thread Shai Erera (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-2863?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shai Erera closed LUCENE-2863.
--

Resolution: Not A Problem

This is not the sort of discussion we should be having in JIRA - that's why we 
have the user list. Closing, as it's neither a bug nor a feature/enhancement 
proposal.

> Updating a document loses its fields that are only indexed; NumericField 
> tries are also completely lost
> ---
>
> Key: LUCENE-2863
> URL: https://issues.apache.org/jira/browse/LUCENE-2863
> Project: Lucene - Java
>  Issue Type: Bug
>  Components: Store
>Affects Versions: 3.0.2, 3.0.3
> Environment: WindowsXP, Java1.6.20 using a RamDirectory
>Reporter: Tamas Sandor
>Priority: Blocker
>
> I have a code snippet (see below) which creates a new document with standard 
> (stored, indexed) fields, *not-stored, indexed-only* fields and some 
> *NumericFields*. Then it updates the document by adding a new string field. 
> The result is that all fields that are indexed but not stored, and in 
> particular the trie tokens of NumericFields, are completely lost from the 
> index after an update or delete/add.
> {code:java}
> Directory ramDir = new RAMDirectory();
> IndexWriter writer = new IndexWriter(ramDir, new WhitespaceAnalyzer(), 
> MaxFieldLength.UNLIMITED);
> Document doc = new Document();
> doc.add(new Field("ID", "HO1234", Store.YES, Index.NOT_ANALYZED_NO_NORMS));
> doc.add(new Field("PATTERN", "HELLO", Store.NO, Index.NOT_ANALYZED_NO_NORMS));
> doc.add(new NumericField("LAT", Store.YES, 
> true).setDoubleValue(51.48826603066d));
> doc.add(new NumericField("LNG", Store.YES, 
> true).setDoubleValue(-0.08913399651646614d));
> writer.addDocument(doc);
> doc = new Document();
> doc.add(new Field("ID", "HO", Store.YES, Index.NOT_ANALYZED_NO_NORMS));
> doc.add(new Field("PATTERN", "BELLO", Store.NO, Index.NOT_ANALYZED_NO_NORMS));
> doc.add(new NumericField("LAT", Store.YES, 
> true).setDoubleValue(101.48826603066d));
> doc.add(new NumericField("LNG", Store.YES, 
> true).setDoubleValue(-100.08913399651646614d));
> writer.addDocument(doc);
> Term t = new Term("ID", "HO1234");
> Query q = new TermQuery(t);
> IndexSearcher searcher = new IndexSearcher(writer.getReader());
> TopDocs hits = searcher.search(q, 1);
> if (hits.scoreDocs.length > 0) {
>   Document ndoc = searcher.doc(hits.scoreDocs[0].doc);
>   ndoc.add(new Field("FINAL", "FINAL", Store.YES, 
> Index.NOT_ANALYZED_NO_NORMS));
>   writer.updateDocument(t, ndoc);
> //  writer.deleteDocuments(q);
> //  writer.addDocument(ndoc);
> } else {
>   LOG.info("Couldn't find the document via the query");
> }
> searcher = new IndexSearcher(writer.getReader());
> hits = searcher.search(new TermQuery(new Term("PATTERN", "HELLO")), 1);
> LOG.info("_hits HELLO:" + hits.totalHits); // should be 1 but it's 0
> writer.close();
> {code}
> I also have a bounding-box query based on *NumericRangeQuery*; after the 
> document update it no longer returns any hits.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] Commented: (LUCENE-2831) Revise Weight#scorer & Filter#getDocIdSet API to pass Readers context

2011-01-12 Thread Simon Willnauer (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12981156#action_12981156
 ] 

Simon Willnauer commented on LUCENE-2831:
-

I committed the latest patch in revision 1058431. I think we are done here - 
yay!

> Revise Weight#scorer & Filter#getDocIdSet API to pass Readers context
> -
>
> Key: LUCENE-2831
> URL: https://issues.apache.org/jira/browse/LUCENE-2831
> Project: Lucene - Java
>  Issue Type: Improvement
>  Components: Search
>Affects Versions: 4.0
>Reporter: Simon Willnauer
>Assignee: Simon Willnauer
> Fix For: 4.0
>
> Attachments: LUCENE-2831-nuke-SolrIndexReader.patch, 
> LUCENE-2831-nuke-SolrIndexReader.patch, LUCENE-2831.patch, LUCENE-2831.patch, 
> LUCENE-2831.patch, LUCENE-2831.patch, LUCENE-2831.patch, LUCENE-2831.patch, 
> LUCENE-2831_transition_to_atomicCtx.patch, 
> LUCENE-2831_transition_to_atomicCtx.patch, 
> LUCENE-2831_transition_to_atomicCtx.patch
>
>
> Spinoff from LUCENE-2694 - instead of passing a reader into Weight#scorer(IR, 
> boolean, boolean), we should/could revise the API and pass in a struct that 
> has the parent reader, the sub reader, and the ord of that sub. The ord 
> mapping plus the context with its parent would make several issues way 
> easier. See LUCENE-2694, LUCENE-2348 and LUCENE-2829 to name a few.
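
For context, a rough sketch of the kind of struct the summary describes; the 
names here are illustrative assumptions, not necessarily the committed API:

{code:java}
// Hypothetical context passed to Weight#scorer / Filter#getDocIdSet in place
// of a bare IndexReader: the composite parent, the sub reader, and its ord.
public final class ReaderContext {
  public final IndexReader parent; // top-level (composite) reader
  public final IndexReader reader; // the sub (segment) reader to search
  public final int ord;            // position of this sub within the parent
  public ReaderContext(IndexReader parent, IndexReader reader, int ord) {
    this.parent = parent;
    this.reader = reader;
    this.ord = ord;
  }
}
{code}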

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] Commented: (LUCENE-2863) Updating a document loses its fields that are only indexed; NumericField tries are also completely lost

2011-01-12 Thread Tamas Sandor (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2863?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12981154#action_12981154
 ] 

Tamas Sandor commented on LUCENE-2863:
--

Yeah, but how can I add the indexed fields back (the tries of _LAT_, _LNG_ and 
the _PATTERN_ field)?
{{document.getFields()}} would give my old fields back as a 
{{List<Fieldable>}}, but the Javadoc says:
{quote}
Note that fields which are not stored are not available in documents retrieved 
from the index, e.g. Searcher.doc(int) or IndexReader.document(int).
{quote}
So this won't work either:
{code:java}
doc = searcher.doc(hits.scoreDocs[0].doc);
Document ndoc = new Document();
for (Fieldable field : doc.getFields()) {
  ndoc.add(field);
}
ndoc.add(new Field("FINAL", "FINAL", Store.YES, Index.NOT_ANALYZED_NO_NORMS));
writer.updateDocument(t, ndoc);
{code}
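
Indeed, continuing the snippet above, the retrieved Document only carries the 
stored values, so the copy loop cannot restore PATTERN or the trie terms; a 
quick way to see that:

{code:java}
Document stored = searcher.doc(hits.scoreDocs[0].doc);
for (Fieldable f : stored.getFields()) {
  System.out.println(f.name() + " => " + f.stringValue());
}
// Prints ID, LAT and LNG only: PATTERN (Store.NO) is gone entirely, and the
// LAT/LNG values come back as plain stored fields, not as indexed trie tokens.
{code}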

> Updating a document loses its fields that are only indexed; NumericField 
> tries are also completely lost
> ---
>
> Key: LUCENE-2863
> URL: https://issues.apache.org/jira/browse/LUCENE-2863
> Project: Lucene - Java
>  Issue Type: Bug
>  Components: Store
>Affects Versions: 3.0.2, 3.0.3
> Environment: WindowsXP, Java1.6.20 using a RamDirectory
>Reporter: Tamas Sandor
>Priority: Blocker
>
> I have a code snippet (see below) which creates a new document with standard 
> (stored, indexed) fields, *not-stored, indexed-only* fields and some 
> *NumericFields*. Then it updates the document by adding a new string field. 
> The result is that all fields that are indexed but not stored, and in 
> particular the trie tokens of NumericFields, are completely lost from the 
> index after an update or delete/add.
> {code:java}
> Directory ramDir = new RAMDirectory();
> IndexWriter writer = new IndexWriter(ramDir, new WhitespaceAnalyzer(), 
> MaxFieldLength.UNLIMITED);
> Document doc = new Document();
> doc.add(new Field("ID", "HO1234", Store.YES, Index.NOT_ANALYZED_NO_NORMS));
> doc.add(new Field("PATTERN", "HELLO", Store.NO, Index.NOT_ANALYZED_NO_NORMS));
> doc.add(new NumericField("LAT", Store.YES, 
> true).setDoubleValue(51.48826603066d));
> doc.add(new NumericField("LNG", Store.YES, 
> true).setDoubleValue(-0.08913399651646614d));
> writer.addDocument(doc);
> doc = new Document();
> doc.add(new Field("ID", "HO", Store.YES, Index.NOT_ANALYZED_NO_NORMS));
> doc.add(new Field("PATTERN", "BELLO", Store.NO, Index.NOT_ANALYZED_NO_NORMS));
> doc.add(new NumericField("LAT", Store.YES, 
> true).setDoubleValue(101.48826603066d));
> doc.add(new NumericField("LNG", Store.YES, 
> true).setDoubleValue(-100.08913399651646614d));
> writer.addDocument(doc);
> Term t = new Term("ID", "HO1234");
> Query q = new TermQuery(t);
> IndexSearcher searcher = new IndexSearcher(writer.getReader());
> TopDocs hits = searcher.search(q, 1);
> if (hits.scoreDocs.length > 0) {
>   Document ndoc = searcher.doc(hits.scoreDocs[0].doc);
>   ndoc.add(new Field("FINAL", "FINAL", Store.YES, 
> Index.NOT_ANALYZED_NO_NORMS));
>   writer.updateDocument(t, ndoc);
> //  writer.deleteDocuments(q);
> //  writer.addDocument(ndoc);
> } else {
>   LOG.info("Couldn't find the document via the query");
> }
> searcher = new IndexSearcher(writer.getReader());
> hits = searcher.search(new TermQuery(new Term("PATTERN", "HELLO")), 1);
> LOG.info("_hits HELLO:" + hits.totalHits); // should be 1 but it's 0
> writer.close();
> {code}
> I also have a bounding-box query based on *NumericRangeQuery*; after the 
> document update it no longer returns any hits.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Solr-3.x - Build # 226 - Failure

2011-01-12 Thread Apache Hudson Server
Build: https://hudson.apache.org/hudson/job/Solr-3.x/226/

All tests passed

Build Log (for compile errors):
[...truncated 20277 lines...]



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] Updated: (SOLR-445) XmlUpdateRequestHandler bad documents mid batch aborts rest of batch

2011-01-12 Thread Erick Erickson (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-445?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erick Erickson updated SOLR-445:


Attachment: solr-445.xml
SOLR-445.patch

Here's a cut at an improvement, at least.

The attached XML file contains an <add> packet with a number of documents 
illustrating a number of errors. The XML file can be POSTed to Solr for 
indexing via the post.jar file so you can see the output.

This patch attempts to report back to the user the following for each document 
that failed:
1> the ordinal position in the file where the error occurred (e.g. the first, 
second, etc. <doc> tag).
2> the <uniqueKey> if available.
3> the error.

The general idea is to accrue the errors in a StringBuilder and eventually 
re-throw the error after processing as far as possible.
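
A rough sketch of that flow (docs and processor are assumed to be in scope; 
toAddCommand() is a hypothetical helper, and this is not the patch itself):

{code:java}
StringBuilder errors = new StringBuilder();
int ord = 0;
for (SolrInputDocument doc : docs) {
  ord++;
  try {
    processor.processAdd(toAddCommand(doc));
  } catch (Exception e) {
    errors.append("doc #").append(ord)                  // ordinal position
          .append(" id=").append(doc.getFieldValue("id"))
          .append(" | ").append(e.getMessage()).append('\n');
  }
}
if (errors.length() > 0) {
  // processed as far as possible; now re-throw the accrued errors
  throw new SolrException(SolrException.ErrorCode.BAD_REQUEST, errors.toString());
}
{code}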

Issues:
1> the reported format in the log file is kind of hard to read. I 
pipe-delimited the various <doc> entries, but they run together in a Windows 
DOS window. What happens on Unix I'm not quite sure. Suggestions welcome.
2> From the original post, rolling this back will be tricky. Very tricky. The 
autocommit feature makes it indeterminate what's been committed to the index, 
so I don't know how to even approach rolling back everything.
3> The intent here is to give the user a clue where to start when figuring out 
what document(s) failed so they don't have to guess.
4> Tests fail, but I have no clue why. I checked out a new copy of trunk and 
that fails as well, so I don't think that this patch is the cause of the 
errors. But let's not commit this until we can be sure.
5> What do you think about limiting the number of docs that fail before 
quitting? One could imagine some ratio (say 10%) that have to fail before 
quitting (with some safeguards, like don't bother calculating the ratio until 
20 docs had been processed or...). Or an absolute number. Should this be a 
parameter? Or hard-coded? The assumption here is that if 10 (or 100 or..) docs 
fail, there's something pretty fundamentally wrong and it's a waste to keep on. 
I don't have any strong feelings here; I can argue it either way.
6> Sorry, all, but I reflexively hit the reformat keystrokes so the raw patch 
may be hard to read. But I'm pretty well in the camp that you *have* to 
reformat as you go or the code will be held hostage to the last person who 
*didn't* format properly. I'm pretty sure I'm using the right codestyle.xml 
file, but let me know if not.
7> I doubt that this has any bearing on, say, SolrJ indexing. Should that be 
another bug (or is there one already)? Anybody got a clue where I'd look for 
that since I'm in the area anyway?

Erick

> XmlUpdateRequestHandler bad documents mid batch aborts rest of batch
> 
>
> Key: SOLR-445
> URL: https://issues.apache.org/jira/browse/SOLR-445
> Project: Solr
>  Issue Type: Bug
>  Components: update
>Affects Versions: 1.3
>Reporter: Will Johnson
>Assignee: Erick Erickson
> Fix For: Next
>
> Attachments: SOLR-445.patch, solr-445.xml
>
>
> Has anyone run into the problem of handling bad documents / failures mid 
> batch? I.e.:
> <add>
>   <doc>
>     <field name="id">1</field>
>   </doc>
>   <doc>
>     <field name="id">2</field>
>     <field name="date">I_AM_A_BAD_DATE</field>
>   </doc>
>   <doc>
>     <field name="id">3</field>
>   </doc>
> </add>
> Right now Solr adds the first doc and then aborts.  It would seem like it 
> should either fail the entire batch or log a message/return a code and then 
> continue on to add doc 3.  Option 1 would seem to be much harder to 
> accomplish and possibly require more memory while Option 2 would require more 
> information to come back from the API.  I'm about to dig into this but I 
> thought I'd ask to see if anyone had any suggestions, thoughts or comments.   
>  

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Lucene-trunk - Build # 1424 - Failure

2011-01-12 Thread Apache Hudson Server
Build: https://hudson.apache.org/hudson/job/Lucene-trunk/1424/

All tests passed

Build Log (for compile errors):
[...truncated 16732 lines...]



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] Assigned: (SOLR-445) XmlUpdateRequestHandler bad documents mid batch aborts rest of batch

2011-01-12 Thread Erick Erickson (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-445?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erick Erickson reassigned SOLR-445:
---

Assignee: Erick Erickson

> XmlUpdateRequestHandler bad documents mid batch aborts rest of batch
> 
>
> Key: SOLR-445
> URL: https://issues.apache.org/jira/browse/SOLR-445
> Project: Solr
>  Issue Type: Bug
>  Components: update
>Affects Versions: 1.3
>Reporter: Will Johnson
>Assignee: Erick Erickson
> Fix For: Next
>
>
> Has anyone run into the problem of handling bad documents / failures mid 
> batch? I.e.:
> <add>
>   <doc>
>     <field name="id">1</field>
>   </doc>
>   <doc>
>     <field name="id">2</field>
>     <field name="date">I_AM_A_BAD_DATE</field>
>   </doc>
>   <doc>
>     <field name="id">3</field>
>   </doc>
> </add>
> Right now Solr adds the first doc and then aborts.  It would seem like it 
> should either fail the entire batch or log a message/return a code and then 
> continue on to add doc 3.  Option 1 would seem to be much harder to 
> accomplish and possibly require more memory while Option 2 would require more 
> information to come back from the API.  I'm about to dig into this but I 
> thought I'd ask to see if anyone had any suggestions, thoughts or comments.   
>  

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] Commented: (SOLR-2282) Distributed Support for Search Result Clustering

2011-01-12 Thread Koji Sekiguchi (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12981100#action_12981100
 ] 

Koji Sekiguchi commented on SOLR-2282:
--

Thanks, Robert! I committed the fix. (I still couldn't reproduce the Hudson 
problem on my Mac when I comment out @Ignore in 
DistributedClusteringComponentTest.java.)

> Distributed Support for Search Result Clustering
> 
>
> Key: SOLR-2282
> URL: https://issues.apache.org/jira/browse/SOLR-2282
> Project: Solr
>  Issue Type: New Feature
>  Components: contrib - Clustering
>Affects Versions: 1.4, 1.4.1
>Reporter: Koji Sekiguchi
>Assignee: Koji Sekiguchi
>Priority: Minor
> Fix For: 3.1, 4.0
>
> Attachments: SOLR-2282.patch, SOLR-2282.patch, SOLR-2282.patch, 
> SOLR-2282.patch, SOLR-2282.patch, SOLR-2282_test.patch
>
>
> Brad Giaccio contributed a patch for this in SOLR-769. I'd like to 
> incorporate it.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] Commented: (LUCENE-2723) Speed up Lucene's low level bulk postings read API

2011-01-12 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2723?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12981083#action_12981083
 ] 

Robert Muir commented on LUCENE-2723:
-

I merged us up to yesterday (1052991:1057836), but stopped at the Pulsing codec 
rewrite :)

Mike, can you assist in merging r1057897? Besides requiring a lot of beer, 
there is a danger of screwing it up, since we have to re-implement its bulk 
postings enum.


> Speed up Lucene's low level bulk postings read API
> --
>
> Key: LUCENE-2723
> URL: https://issues.apache.org/jira/browse/LUCENE-2723
> Project: Lucene - Java
>  Issue Type: Improvement
>  Components: Index
>Reporter: Michael McCandless
>Assignee: Michael McCandless
> Fix For: 4.0
>
> Attachments: LUCENE-2723-termscorer.patch, 
> LUCENE-2723-termscorer.patch, LUCENE-2723-termscorer.patch, 
> LUCENE-2723.patch, LUCENE-2723.patch, LUCENE-2723.patch, LUCENE-2723.patch, 
> LUCENE-2723.patch, LUCENE-2723_bulkvint.patch, LUCENE-2723_facetPerSeg.patch, 
> LUCENE-2723_facetPerSeg.patch, LUCENE-2723_openEnum.patch, 
> LUCENE-2723_termscorer.patch, LUCENE-2723_wastedint.patch
>
>
> Spinoff from LUCENE-1410.
> The flex DocsEnum has a simple bulk-read API that reads the next chunk
> of docs/freqs.  But it's a poor fit for intblock codecs like FOR/PFOR
> (from LUCENE-1410).  This is not unlike sucking coffee through those
> tiny plastic coffee stirrers they hand out on airplanes that,
> surprisingly, also happen to function as a straw.
> As a result we see no perf gain from using FOR/PFOR.
> I had hacked up a fix for this, described in my blog post at
> http://chbits.blogspot.com/2010/08/lucene-performance-with-pfordelta-codec.html
> I'm opening this issue to get that work to a committable point.
> So... I've worked out a new bulk-read API to address the performance
> bottleneck.  It has some big changes over the current bulk-read API:
>   * You can now also bulk-read positions (but not payloads), but I
>  have yet to cut over positional queries.
>   * The buffer contains doc deltas, not absolute values, for docIDs
> and positions (freqs are absolute).
>   * Deleted docs are not filtered out.
>   * The doc & freq buffers need not be "aligned".  For fixed intblock
> codecs (FOR/PFOR) they will be, but for varint codecs (Simple9/16,
> Group varint, etc.) they won't be.
> It's still a work in progress...
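
To make the delta convention concrete, a hypothetical consumer loop; the names 
(count, docDeltas, freqs, skipDocs, collect) are illustrative, not the actual 
flex API:

{code:java}
// docDeltas/freqs are the chunk buffers described above; count is the chunk size.
int docID = 0;
for (int i = 0; i < count; i++) {
  docID += docDeltas[i];  // buffer holds deltas; accumulate to recover docIDs
  int freq = freqs[i];    // freqs are absolute, not deltas
  if (skipDocs == null || !skipDocs.get(docID)) { // deletions are NOT pre-filtered
    collect(docID, freq); // hypothetical consumer
  }
}
{code}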

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] Commented: (SOLR-2314) replicate/index.jsp UI does not honor java system properties enable.master, enable.slave

2011-01-12 Thread will milspec (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12981080#action_12981080
 ] 

will milspec commented on SOLR-2314:


Correction:
Where the bug says "-Dsolr.enable.master" it should have said  
'-Denable.master'  (similarly for slave)



> replicate/index.jsp UI does not honor java system properties enable.master, 
> enable.slave
> 
>
> Key: SOLR-2314
> URL: https://issues.apache.org/jira/browse/SOLR-2314
> Project: Solr
>  Issue Type: Bug
>  Components: web gui
>Affects Versions: 1.4.1
> Environment: jdk 1.6.0.23  ; both jetty and jboss/tomcat. 
>Reporter: will milspec
>Priority: Minor
>
> Summary:
> ==
> - Admin UI replication/index.jsp checks for master or slave with the 
> following code:
>if ("true".equals(detailsMap.get("isSlave"))) 
> -  if slave, replication/index.jsp displays the "Master" and "Poll 
> Intervals", etc. sections (everything up to "Cores")
> - if false, replication/index.jsp does not display the "Master", "Poll 
> Intervals" sections 
> -This "slave check/UI difference" works correctly if the solrconfig.xml has a 
>  "slave" but not "master" section or vice versa
> Expected results:
> ==
> Same UI difference would occur in the following scenario:
>a) solrconfig.xml has both master and slave entries
>b) use java.properties (-Dsolr.enable.master -Dsolr.enable.slave) to set 
> "master" or "slave" at runtime
> *OR*
> c) use solrcore.properties  to set "master" and "slave" at runtime
> Actual results:
> ==
> If solrconfig.xml has both master and slave entries, replication/index.jsp 
> shows both "master" and "slave" section regardless of system.properties or 
> solrcore.properties

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] Created: (SOLR-2314) replicate/index.jsp UI does not honor java system properties enable.master, enable.slave

2011-01-12 Thread will milspec (JIRA)
replicate/index.jsp UI does not honor java system properties enable.master, 
enable.slave


 Key: SOLR-2314
 URL: https://issues.apache.org/jira/browse/SOLR-2314
 Project: Solr
  Issue Type: Bug
  Components: web gui
Affects Versions: 1.4.1
 Environment: jdk 1.6.0.23  ; both jetty and jboss/tomcat. 
Reporter: will milspec
Priority: Minor


Summary:
==
- Admin UI replication/index.jsp checks for master or slave with the following 
code:
   if ("true".equals(detailsMap.get("isSlave"))) 
-  if slave, replication/index.jsp displays the "Master" and "Poll Intervals", 
etc. sections (everything up to "Cores")
- if false, replication/index.jsp does not display the "Master", "Poll 
Intervals" sections 
-This "slave check/UI difference" works correctly if the solrconfig.xml has a  
"slave" but not "master" section or vice versa

Expected results:
==
Same UI difference would occur in the following scenario:
   a) solrconfig.xml has both master and slave entries
   b) use java.properties (-Dsolr.enable.master -Dsolr.enable.slave) to set 
"master" or "slave" at runtime

*OR*
c) use solrcore.properties  to set "master" and "slave" at runtime

Actual results:
==
If solrconfig.xml has both master and slave entries, replication/index.jsp 
shows both "master" and "slave" section regardless of system.properties or 
solrcore.properties
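
A hedged sketch of the kind of check the JSP could perform to honor the 
runtime override (property names per the correction elsewhere in this thread; 
the code is hypothetical, not the actual index.jsp):

{code:java}
// detailsMap as populated by replication/index.jsp from the details command;
// an absent -Denable.slave leaves the solrconfig.xml-derived flag in charge.
boolean isSlave = "true".equals(detailsMap.get("isSlave"))
    && Boolean.parseBoolean(System.getProperty("enable.slave", "true"));
{code}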

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] Resolved: (LUCENE-2846) omitTF is viral, but omitNorms is anti-viral.

2011-01-12 Thread Robert Muir (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-2846?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Muir resolved LUCENE-2846.
-

Resolution: Fixed

Committed revision 1058367.

I deprecated the dangerous setNorm(float) method in 3.x in revision 1058370, 
instead pointing at setNorm(byte) and Similarity.encodeNormValue(), so you can 
ensure your Similarity is always used (rather than Similarity.getDefault()).
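
On 3.x the encouraged pattern is then something like this minimal sketch 
(reader and docID are assumed context):

{code:java}
Similarity sim = new DefaultSimilarity(); // or your custom Similarity
byte norm = sim.encodeNormValue(1.5f);    // encode with YOUR Similarity
reader.setNorm(docID, "field", norm);     // byte variant, not the deprecated
                                          // float variant
{code}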


> omitTF is viral, but omitNorms is anti-viral.
> -
>
> Key: LUCENE-2846
> URL: https://issues.apache.org/jira/browse/LUCENE-2846
> Project: Lucene - Java
>  Issue Type: Improvement
>Reporter: Robert Muir
> Fix For: 4.0
>
> Attachments: LUCENE-2846.patch, LUCENE-2846.patch, LUCENE-2846.patch, 
> LUCENE-2846.patch
>
>
> omitTF is viral: if you add document 1 with field "foo" as omitTF, and then 
> document 2 has field "foo" without omitTF, they are both treated as omitTF.
> But omitNorms is the opposite: if you have a million documents with field 
> "foo" with omitNorms, and then add just one document without omitting norms, 
> you suddenly have a million 'real norms'.
> I think it would be good for omitNorms to be viral too, just for consistency, 
> and also to prevent huge byte[]'s.
> But another option is to make omitTF anti-viral, which is more "schemaless", 
> I guess.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Lucene-3.x - Build # 240 - Failure

2011-01-12 Thread Apache Hudson Server
Build: https://hudson.apache.org/hudson/job/Lucene-3.x/240/

All tests passed

Build Log (for compile errors):
[...truncated 21057 lines...]



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] Commented: (LUCENE-2856) Create IndexWriter event listener, specifically for merges

2011-01-12 Thread Jason Rutherglen (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2856?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12981019#action_12981019
 ] 

Jason Rutherglen commented on LUCENE-2856:
--

I separated out a ReaderListener because it's tied to the ReaderPool, which 
will eventually exist externally to IW.

> Create IndexWriter event listener, specifically for merges
> --
>
> Key: LUCENE-2856
> URL: https://issues.apache.org/jira/browse/LUCENE-2856
> Project: Lucene - Java
>  Issue Type: Improvement
>  Components: Index
>Affects Versions: 4.0
>Reporter: Jason Rutherglen
> Attachments: LUCENE-2856.patch, LUCENE-2856.patch, LUCENE-2856.patch
>
>
> The issue will allow users to monitor merges occurring within IndexWriter 
> using a callback notifier event listener.  This can be used by external 
> applications such as Solr to monitor large segment merges.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[ANNOUNCE] Lucene.Net proposal submitted

2011-01-12 Thread Troy Howard
All,

This is a brief note to announce that the Lucene.Net proposal to move
back to the Incubator was submitted today on the
gene...@incubator.apache.org mailing list.

To follow the discussion, please sign up for that mailing list or view
the archives (with slight delay) at:

http://mail-archives.apache.org/mod_mbox/incubator-general/

Thanks,
Troy


Re: call python from java - what strategy do you use?

2011-01-12 Thread Andi Vajda


On Wed, 12 Jan 2011, Roman Chyla wrote:


Oh, never mind, I found it out:

It is loaded by Python, so PYTHONPATH or other ways must be used.
Also I had to change the exports:
inside sorlpie_java/__init__.py
I added the line:

from emql import Emql

Then, in Java I can do:

   PythonVM vm = PythonVM.start("sorlpie_java");
EMQL em = (EMQL)vm.instantiate("solrpie_java", "Emql");


Yep, that's the one!

Andi..



em.javaTestPrint();
em.pythonTestPrint();
System.out.println(em.emql_status());

And I get:

java is printing
some status

Funny, the pythonTestPrint() never prints anything.


Cheers,

  roman



On Wed, Jan 12, 2011 at 11:20 PM, Roman Chyla  wrote:

Hi Andi,

Thanks for the help - now I was able to run the Java side and load the
PythonVM. I then built the Python egg; after a bit of fiddling with
parameters, it seems OK. I can import the JCC-wrapped Python class and
call it:

In [1]: from solrpie_java import emql

In [2]: em = emql.Emql()

In [3]: em.javaTestPrint()
java is printing

In [4]: em.pythonTestPrint()
just a test

But I haven't found out how to call the same from Java.

The egg builds fine; it is named solrpie_java and contains one Python module:

==

from solrpie_java import initVM, CLASSPATH, EMQL

initVM(CLASSPATH)


class Emql(EMQL):
   '''
   classdocs
   '''

   def __init__(self):
       super(Emql, self).__init__()
       print '__init__'


   def init(self, me):
       print self, me
       return 'init'
   def emql_refresh(self, tid, type):
       print self, tid, type
       return 'refresh'
   def emql_status(self):
       return "some status"

   def pythonTestPrint(self):
       print 'just a test'


The corresponding java class looks like this:


public class EMQL {

          private long pythonObject;

          public EMQL()
          {
          }

          public void pythonExtension(long pythonObject)
          {
              this.pythonObject = pythonObject;
          }
          public long pythonExtension()
          {
              return this.pythonObject;
          }

          public void finalize()
              throws Throwable
          {
              pythonDecRef();
          }

          public void javaTestPrint() {
                  System.out.println("java is printing");
          }

          public native void pythonDecRef();

          // the methods implemented in python
          public native String init(EMQL me);
          public native String emql_refresh(String tid, String type);
          public native String emql_status();

          public native void pythonTestPrint();


}

===

I tried running it as:

               PythonVM vm = PythonVM.start("sorlpie_java");
               EMQL em = new EMQL();
               em.javaTestPrint();
               em.pythonTestPrint();

I get this:

java is printing
Exception in thread "main" java.lang.UnsatisfiedLinkError:
rca.pythonvm.EMQL.pythonTestPrint()V
       at rca.pythonvm.EMQL.pythonTestPrint(Native Method)
       at rca.solr.JettyRunnerPythonVM.start(JettyRunnerPythonVM.java:60)
       at rca.solr.JettyRunnerPythonVM.main(JettyRunnerPythonVM.java:148)

I understand that Java cannot find the linked C++ method, but I don't
know how to fix that.
If I try:

PythonVM vm = PythonVM.start("sorlpie_java");
Object m = vm.instantiate("emql", "Emql");

I get:

org.apache.jcc.PythonException: No module named emql
ImportError: No module named emql

       at org.apache.jcc.PythonVM.instantiate(Native Method)
       at rca.solr.JettyRunnerPythonVM.start(JettyRunnerPythonVM.java:56)
       at rca.solr.JettyRunnerPythonVM.main(JettyRunnerPythonVM.java:148)

I tried various combinations of instantiation, and of setting the
classpath or -Djava.library.path, but no success. What am I doing wrong?

Thank you,

 roman



On Wed, Jan 12, 2011 at 7:55 PM, Andi Vajda  wrote:


On Wed, 12 Jan 2011, Roman Chyla wrote:


Hi Andi, all,

I tried to implement the PythonVM wrapping on Mac 10.6 with JDK
1.6.22; JCC is freshly built in shared mode, v2.6. The Python is
the standard Python distributed with Mac OS X.

When I try to run the Java side, it throws an error when it gets to:

static {
      System.loadLibrary("jcc");
  }

I am getting this error:

Exception in thread "main" java.lang.UnsatisfiedLinkError:

/Library/Python/2.6/site-packages/JCC-2.6-py2.6-macosx-10.6-universal.egg/libjcc.dylib:
Symbol not found: _PyExc_RuntimeError   Referenced from:


That's because Python's shared library wasn't found. The reason is that, by
default, Python's shared lib is not on JCC's link line, because normally JCC is
loaded into a Python process and the dynamic linker thus finds the symbols
needed inside the process.

Here, since you're not starting inside a Python process, you need to add
'-framework Python' to JCC's LFLAGS in setup.py so that the dynamic linker
can find the Python VM shared lib and load it.

Andi..



/Library/Python/2.6/site-packages/JCC-2

Re: call python from java - what strategy do you use?

2011-01-12 Thread Andi Vajda


 Hi Roman,

On Wed, 12 Jan 2011, Roman Chyla wrote:


Thanks for the help - now I was able to run the Java side and load the
PythonVM. I then built the Python egg; after a bit of fiddling with
parameters, it seems OK. I can import the JCC-wrapped Python class and
call it:

In [1]: from solrpie_java import emql


Why are you calling your class EMQL? (This name was just an example culled 
from my code.)



In [2]: em = emql.Emql()

In [3]: em.javaTestPrint()
java is printing

In [4]: em.pythonTestPrint()
just a test

But I haven't found out how to call the same from Java.


Ah, yes, I forgot to tell you how to pull that in.
In Java, you import that 'EMQL' java class and instantiate it by way of the 
PythonVM instance's instantiate() call:


import org.blah.blah.EMQL;
import org.apache.jcc.PythonVM;

.

PythonVM vm = PythonVM.get();

emql = (EMQL) vm.instantiate("jemql.emql", "emql");
... call method on emql instance just created ...

The instantiate("foo", "bar") method in effect asks Python to run
  "from foo import bar"
  "return bar()"

Andi..


Re: call python from java - what strategy do you use?

2011-01-12 Thread Roman Chyla
Oh, never mind, I found it out:

It is loaded by Python, so PYTHONPATH or other ways must be used.
Also I had to change the exports:
inside sorlpie_java/__init__.py
I added the line:

from emql import Emql

Then, in Java I can do:

PythonVM vm = PythonVM.start("sorlpie_java");
EMQL em = (EMQL)vm.instantiate("solrpie_java", "Emql");

em.javaTestPrint();
em.pythonTestPrint();
System.out.println(em.emql_status());

And I get:

java is printing
some status

Funny, the pythonTestPrint() never prints anything.


Cheers,

   roman



On Wed, Jan 12, 2011 at 11:20 PM, Roman Chyla  wrote:
> Hi Andi,
>
> Thanks for the help - now I was able to run the Java side and load the
> PythonVM. I then built the Python egg; after a bit of fiddling with
> parameters, it seems OK. I can import the JCC-wrapped Python class and
> call it:
>
> In [1]: from solrpie_java import emql
>
> In [2]: em = emql.Emql()
>
> In [3]: em.javaTestPrint()
> java is printing
>
> In [4]: em.pythonTestPrint()
> just a test
>
> But I haven't found out how to call the same from Java.
>
> The egg builds fine; it is named solrpie_java and contains one Python 
> module:
>
> ==
>
> from solrpie_java import initVM, CLASSPATH, EMQL
>
> initVM(CLASSPATH)
>
>
> class Emql(EMQL):
>    '''
>    classdocs
>    '''
>
>    def __init__(self):
>        super(Emql, self).__init__()
>        print '__init__'
>
>
>    def init(self, me):
>        print self, me
>        return 'init'
>    def emql_refresh(self, tid, type):
>        print self, tid, type
>        return 'refresh'
>    def emql_status(self):
>        return "some status"
>
>    def pythonTestPrint(self):
>        print 'just a test'
>
> 
> The corresponding java class looks like this:
>
>
> public class EMQL {
>
>           private long pythonObject;
>
>           public EMQL()
>           {
>           }
>
>           public void pythonExtension(long pythonObject)
>           {
>               this.pythonObject = pythonObject;
>           }
>           public long pythonExtension()
>           {
>               return this.pythonObject;
>           }
>
>           public void finalize()
>               throws Throwable
>           {
>               pythonDecRef();
>           }
>
>           public void javaTestPrint() {
>                   System.out.println("java is printing");
>           }
>
>           public native void pythonDecRef();
>
>           // the methods implemented in python
>           public native String init(EMQL me);
>           public native String emql_refresh(String tid, String type);
>           public native String emql_status();
>
>           public native void pythonTestPrint();
>
>
> }
>
> ===
>
> I tried running it as:
>
>                PythonVM vm = PythonVM.start("sorlpie_java");
>                EMQL em = new EMQL();
>                em.javaTestPrint();
>                em.pythonTestPrint();
>
> I get this:
>
> java is printing
> Exception in thread "main" java.lang.UnsatisfiedLinkError:
> rca.pythonvm.EMQL.pythonTestPrint()V
>        at rca.pythonvm.EMQL.pythonTestPrint(Native Method)
>        at rca.solr.JettyRunnerPythonVM.start(JettyRunnerPythonVM.java:60)
>        at rca.solr.JettyRunnerPythonVM.main(JettyRunnerPythonVM.java:148)
>
> I understand that Java cannot find the linked C++ method, but I don't
> know how to fix that.
> If I try:
>
> PythonVM vm = PythonVM.start("sorlpie_java");
> Object m = vm.instantiate("emql", "Emql");
>
> I get:
>
> org.apache.jcc.PythonException: No module named emql
> ImportError: No module named emql
>
>        at org.apache.jcc.PythonVM.instantiate(Native Method)
>        at rca.solr.JettyRunnerPythonVM.start(JettyRunnerPythonVM.java:56)
>        at rca.solr.JettyRunnerPythonVM.main(JettyRunnerPythonVM.java:148)
>
> I tried various combinations of instantiation, and of setting the
> classpath or -Djava.library.path, but no success. What am I doing wrong?
>
> Thank you,
>
>  roman
>
>
>
> On Wed, Jan 12, 2011 at 7:55 PM, Andi Vajda  wrote:
>>
>> On Wed, 12 Jan 2011, Roman Chyla wrote:
>>
>>> Hi Andi, all,
>>>
>>> I tried to implement the PythonVM wrapping on Mac 10.6 with JDK
>>> 1.6.22; JCC is freshly built in shared mode, v2.6. The Python is
>>> the standard Python distributed with Mac OS X.
>>>
>>> When I try to run the java, it throws an error when it gets to:
>>>
>>> static {
>>>       System.loadLibrary("jcc");
>>>   }
>>>
>>> I am getting this error:
>>>
>>> Exception in thread "main" java.lang.UnsatisfiedLinkError:
>>>
>>> /Library/Python/2.6/site-packages/JCC-2.6-py2.6-macosx-10.6-universal.egg/libjcc.dylib:
>>> Symbol not found: _PyExc_RuntimeError   Referenced from:
>>
>> That's because Python's shared library wasn't found. The reason is that, by
>> default, Python's shared lib is not on JCC's link line because normally JCC is
>> loaded into a Python process and the dynamic linker thus finds the symbols
>> needed inside th

[jira] Assigned: (SOLR-2312) CloudSolrServer -- calling add(Collection<SolrInputDocument> docs) throws NPE.

2011-01-12 Thread Mark Miller (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-2312?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Miller reassigned SOLR-2312:
-

Assignee: Mark Miller

> CloudSolrServer -- calling add(Collection<SolrInputDocument> docs) throws NPE.
> --
>
> Key: SOLR-2312
> URL: https://issues.apache.org/jira/browse/SOLR-2312
> Project: Solr
>  Issue Type: Bug
>  Components: SolrCloud
>Affects Versions: 4.0
> Environment: Mac OSX  v10.5.8
> java version "1.6.0_22"
> Java(TM) SE Runtime Environment (build 1.6.0_22-b04-307-9M3263)
> Java HotSpot(TM) 64-Bit Server VM (build 17.1-b03-307, mixed mode)
>Reporter: Stan Burnitt
>Assignee: Mark Miller
>Priority: Critical
> Fix For: 4.0
>
>
> Cannot index documents using the CloudSolrServer.
> Below is a code snippet that reproduces the error.
> {code:borderStyle=solid}
> @Test
> public void jiraTestCase() {
>   CloudSolrServer solrj = null;
>
>   try {
>   solrj = new 
> CloudSolrServer("your.zookeeper.localdomain:2181");
>   // Also tried creating CloudSolrServer using 
> alternative contstuctor below...
>   // public CloudSolrServer(String zkHost, 
> LBHttpSolrServer lbServer)
>   //
>   // LBHttpSolrServer lbHttpSolrServer = new 
> LBHttpSolrServer("http://solr.localdomain:8983/solr");
>   // solrj = new 
> CloudSolrServer("your.zookeeper.localdomain:2181", lbHttpSolrServer);
>   //
>   // (Same result -- NPE @ line 105 in 
> CloudSolrServer.java)
>   solrj.setDefaultCollection("your-collection");
>   solrj.setZkClientTimeout(5000);
>   solrj.setZkConnectTimeout(5000);
>   final Collection<SolrInputDocument> batch = new 
> ArrayList<SolrInputDocument>();
>   SolrInputDocument doc = new SolrInputDocument();
>   doc.addField("id", 1L, 1.0f);
>   doc.addField("title", "Document A");
>   doc.addField("description", "Test document");
>   batch.add(doc);
>   doc = new SolrInputDocument();
>   doc.addField("id", 2L, 1.0f);
>   doc.addField("title", "Document B");
>   doc.addField("description", "Another test 
> document");
>   batch.add(doc);
>   solrj.add(batch);
>   } catch (Exception e) {
>   log.error(e.getMessage(), e);
>   Assert.fail("java.lang.NullPointerException: 
> null \n"
>   + " at 
> org.apache.solr.client.solrj.impl.CloudSolrServer.request(CloudSolrServer.java:105)
>  \n"
>   + " Line 105:  NULL request object here 
> --> String collection = request.getParams().get(\"collection\", 
> defaultCollection);");
>   } finally {
>   solrj.close();
>   }
> }
> {code} 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] Commented: (LUCENE-2863) Updating a document loses its fields that are only indexed; NumericField tries are also completely lost

2011-01-12 Thread Earwin Burrfoot (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2863?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12980965#action_12980965
 ] 

Earwin Burrfoot commented on LUCENE-2863:
-

updateDocument() is an atomic version of deleteDocument() + addDocument(), 
nothing more,

and there's nothing surprising about losing your fields if you delete the doc 
and don't add them back later.
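
Conceptually, using the names from the snippet below, a sketch of that 
equivalence:

{code:java}
// updateDocument(t, ndoc) behaves like this pair, executed atomically:
writer.deleteDocuments(t); // drop every document matching the term...
writer.addDocument(ndoc);  // ...then add the new doc; only ITS fields exist
// nothing from the deleted document is carried over or merged in.
{code}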

> Updating a document loses its fields that are only indexed; NumericField 
> tries are also completely lost
> ---
>
> Key: LUCENE-2863
> URL: https://issues.apache.org/jira/browse/LUCENE-2863
> Project: Lucene - Java
>  Issue Type: Bug
>  Components: Store
>Affects Versions: 3.0.2, 3.0.3
> Environment: WindowsXP, Java1.6.20 using a RamDirectory
>Reporter: Tamas Sandor
>Priority: Blocker
>
> I have a code snippet (see below) which creates a new document with standard 
> (stored, indexed) fields, *not-stored, indexed-only* fields and some 
> *NumericFields*. Then it updates the document by adding a new string field. 
> The result is that all fields that are indexed but not stored, and in 
> particular the trie tokens of NumericFields, are completely lost from the 
> index after an update or delete/add.
> {code:java}
> Directory ramDir = new RAMDirectory();
> IndexWriter writer = new IndexWriter(ramDir, new WhitespaceAnalyzer(), 
> MaxFieldLength.UNLIMITED);
> Document doc = new Document();
> doc.add(new Field("ID", "HO1234", Store.YES, Index.NOT_ANALYZED_NO_NORMS));
> doc.add(new Field("PATTERN", "HELLO", Store.NO, Index.NOT_ANALYZED_NO_NORMS));
> doc.add(new NumericField("LAT", Store.YES, 
> true).setDoubleValue(51.48826603066d));
> doc.add(new NumericField("LNG", Store.YES, 
> true).setDoubleValue(-0.08913399651646614d));
> writer.addDocument(doc);
> doc = new Document();
> doc.add(new Field("ID", "HO", Store.YES, Index.NOT_ANALYZED_NO_NORMS));
> doc.add(new Field("PATTERN", "BELLO", Store.NO, Index.NOT_ANALYZED_NO_NORMS));
> doc.add(new NumericField("LAT", Store.YES, 
> true).setDoubleValue(101.48826603066d));
> doc.add(new NumericField("LNG", Store.YES, 
> true).setDoubleValue(-100.08913399651646614d));
> writer.addDocument(doc);
> Term t = new Term("ID", "HO1234");
> Query q = new TermQuery(t);
> IndexSearcher searcher = new IndexSearcher(writer.getReader());
> TopDocs hits = searcher.search(q, 1);
> if (hits.scoreDocs.length > 0) {
>   Document ndoc = searcher.doc(hits.scoreDocs[0].doc);
>   ndoc.add(new Field("FINAL", "FINAL", Store.YES, 
> Index.NOT_ANALYZED_NO_NORMS));
>   writer.updateDocument(t, ndoc);
> //  writer.deleteDocuments(q);
> //  writer.addDocument(ndoc);
> } else {
>   LOG.info("Couldn't find the document via the query");
> }
> searcher = new IndexSearcher(writer.getReader());
> hits = searcher.search(new TermQuery(new Term("PATTERN", "HELLO")), 1);
> LOG.info("_hits HELLO:" + hits.totalHits); // should be 1 but it's 0
> writer.close();
> {code}
> I also have a bounding-box query based on *NumericRangeQuery*; after the 
> document update it no longer returns any hits.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] Updated: (LUCENE-2863) Updating a document loses its fields that are only indexed; NumericField tries are also completely lost

2011-01-12 Thread Tamas Sandor (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-2863?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tamas Sandor updated LUCENE-2863:
-

Priority: Blocker  (was: Major)

> Updating a document loses its fields that are only indexed; NumericField 
> tries are also completely lost
> ---
>
> Key: LUCENE-2863
> URL: https://issues.apache.org/jira/browse/LUCENE-2863
> Project: Lucene - Java
>  Issue Type: Bug
>  Components: Store
>Affects Versions: 3.0.2, 3.0.3
> Environment: WindowsXP, Java1.6.20 using a RamDirectory
>Reporter: Tamas Sandor
>Priority: Blocker
>
> I have a code snippet (see below) which creates a new document with standard 
> (stored, indexed) fields, *not-stored, indexed-only* fields and some 
> *NumericFields*. Then it updates the document by adding a new string field. 
> The result is that all fields that are indexed but not stored, and in 
> particular the trie tokens of NumericFields, are completely lost from the 
> index after an update or delete/add.
> {code:java}
> Directory ramDir = new RAMDirectory();
> IndexWriter writer = new IndexWriter(ramDir, new WhitespaceAnalyzer(), 
> MaxFieldLength.UNLIMITED);
> Document doc = new Document();
> doc.add(new Field("ID", "HO1234", Store.YES, Index.NOT_ANALYZED_NO_NORMS));
> doc.add(new Field("PATTERN", "HELLO", Store.NO, Index.NOT_ANALYZED_NO_NORMS));
> doc.add(new NumericField("LAT", Store.YES, 
> true).setDoubleValue(51.48826603066d));
> doc.add(new NumericField("LNG", Store.YES, 
> true).setDoubleValue(-0.08913399651646614d));
> writer.addDocument(doc);
> doc = new Document();
> doc.add(new Field("ID", "HO", Store.YES, Index.NOT_ANALYZED_NO_NORMS));
> doc.add(new Field("PATTERN", "BELLO", Store.NO, Index.NOT_ANALYZED_NO_NORMS));
> doc.add(new NumericField("LAT", Store.YES, 
> true).setDoubleValue(101.48826603066d));
> doc.add(new NumericField("LNG", Store.YES, 
> true).setDoubleValue(-100.08913399651646614d));
> writer.addDocument(doc);
> Term t = new Term("ID", "HO1234");
> Query q = new TermQuery(t);
> IndexSearcher searcher = new IndexSearcher(writer.getReader());
> TopDocs hits = searcher.search(q, 1);
> if (hits.scoreDocs.length > 0) {
>   Document ndoc = searcher.doc(hits.scoreDocs[0].doc);
>   ndoc.add(new Field("FINAL", "FINAL", Store.YES, 
> Index.NOT_ANALYZED_NO_NORMS));
>   writer.updateDocument(t, ndoc);
> //  writer.deleteDocuments(q);
> //  writer.addDocument(ndoc);
> } else {
>   LOG.info("Couldn't find the document via the query");
> }
> searcher = new IndexSearcher(writer.getReader());
> hits = searcher.search(new TermQuery(new Term("PATTERN", "HELLO")), 1);
> LOG.info("_hits HELLO:" + hits.totalHits); // should be 1 but it's 0
> writer.close();
> {code}
> I also have a bounding-box query based on *NumericRangeQuery*; after the 
> document update it no longer returns any hits.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] Created: (LUCENE-2863) Updating a document loses its fields that are only indexed; NumericField tries are also completely lost

2011-01-12 Thread Tamas Sandor (JIRA)
Updating a document loses its fields that are only indexed; NumericField 
tries are also completely lost
---

 Key: LUCENE-2863
 URL: https://issues.apache.org/jira/browse/LUCENE-2863
 Project: Lucene - Java
  Issue Type: Bug
  Components: Store
Affects Versions: 3.0.3, 3.0.2
 Environment: WindowsXP, Java1.6.20 using a RamDirectory
Reporter: Tamas Sandor


I have a code snippet (see below) which creates a new document with standard 
(stored, indexed) fields, *not-stored, indexed-only* fields and some 
*NumericFields*. Then it updates the document by adding a new string field. The 
result is that all fields that are indexed but not stored, and in particular 
the trie tokens of NumericFields, are completely lost from the index after an 
update or delete/add.
{code:java}
Directory ramDir = new RAMDirectory();
IndexWriter writer = new IndexWriter(ramDir, new WhitespaceAnalyzer(), 
MaxFieldLength.UNLIMITED);
Document doc = new Document();
doc.add(new Field("ID", "HO1234", Store.YES, Index.NOT_ANALYZED_NO_NORMS));
doc.add(new Field("PATTERN", "HELLO", Store.NO, Index.NOT_ANALYZED_NO_NORMS));
doc.add(new NumericField("LAT", Store.YES, 
true).setDoubleValue(51.48826603066d));
doc.add(new NumericField("LNG", Store.YES, 
true).setDoubleValue(-0.08913399651646614d));
writer.addDocument(doc);
doc = new Document();
doc.add(new Field("ID", "HO", Store.YES, Index.NOT_ANALYZED_NO_NORMS));
doc.add(new Field("PATTERN", "BELLO", Store.NO, Index.NOT_ANALYZED_NO_NORMS));
doc.add(new NumericField("LAT", Store.YES, 
true).setDoubleValue(101.48826603066d));
doc.add(new NumericField("LNG", Store.YES, 
true).setDoubleValue(-100.08913399651646614d));
writer.addDocument(doc);

Term t = new Term("ID", "HO1234");
Query q = new TermQuery(t);
IndexSearcher searcher = new IndexSearcher(writer.getReader());
TopDocs hits = searcher.search(q, 1);
if (hits.scoreDocs.length > 0) {
  Document ndoc = searcher.doc(hits.scoreDocs[0].doc);
  ndoc.add(new Field("FINAL", "FINAL", Store.YES, 
Index.NOT_ANALYZED_NO_NORMS));
  writer.updateDocument(t, ndoc);
//  writer.deleteDocuments(q);
//  writer.addDocument(ndoc);
} else {
  LOG.info("Couldn't find the document via the query");
}

searcher = new IndexSearcher(writer.getReader());
hits = searcher.search(new TermQuery(new Term("PATTERN", "HELLO")), 1);
LOG.info("_hits HELLO:" + hits.totalHits); // should be 1 but it's 0

writer.close();
{code}

I also have a bounding-box query based on *NumericRangeQuery*; after the 
document update it no longer returns any hits.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] Updated: (LUCENE-2831) Revise Weight#scorer & Filter#getDocIdSet API to pass Readers context

2011-01-12 Thread Simon Willnauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-2831?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Simon Willnauer updated LUCENE-2831:


Attachment: LUCENE-2831-nuke-SolrIndexReader.patch

updated to trunk

> Revise Weight#scorer & Filter#getDocIdSet API to pass Readers context
> -
>
> Key: LUCENE-2831
> URL: https://issues.apache.org/jira/browse/LUCENE-2831
> Project: Lucene - Java
>  Issue Type: Improvement
>  Components: Search
>Affects Versions: 4.0
>Reporter: Simon Willnauer
>Assignee: Simon Willnauer
> Fix For: 4.0
>
> Attachments: LUCENE-2831-nuke-SolrIndexReader.patch, 
> LUCENE-2831-nuke-SolrIndexReader.patch, LUCENE-2831.patch, LUCENE-2831.patch, 
> LUCENE-2831.patch, LUCENE-2831.patch, LUCENE-2831.patch, LUCENE-2831.patch, 
> LUCENE-2831_transition_to_atomicCtx.patch, 
> LUCENE-2831_transition_to_atomicCtx.patch, 
> LUCENE-2831_transition_to_atomicCtx.patch
>
>
> Spinoff from LUCENE-2694 - instead of passing a reader into Weight#scorer(IR, 
> boolean, boolean), we should/could revise the API and pass in a struct that 
> has the parent reader, the sub reader, and the ord of that sub. The ord 
> mapping plus the context with its parent would make several issues way 
> easier. See LUCENE-2694, LUCENE-2348 and LUCENE-2829 to name a few.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] Commented: (LUCENE-2723) Speed up Lucene's low level bulk postings read API

2011-01-12 Thread Simon Willnauer (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2723?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12980949#action_12980949
 ] 

Simon Willnauer commented on LUCENE-2723:
-

The blocker has been committed - we should merge, though!

> Speed up Lucene's low level bulk postings read API
> --
>
> Key: LUCENE-2723
> URL: https://issues.apache.org/jira/browse/LUCENE-2723
> Project: Lucene - Java
>  Issue Type: Improvement
>  Components: Index
>Reporter: Michael McCandless
>Assignee: Michael McCandless
> Fix For: 4.0
>
> Attachments: LUCENE-2723-termscorer.patch, 
> LUCENE-2723-termscorer.patch, LUCENE-2723-termscorer.patch, 
> LUCENE-2723.patch, LUCENE-2723.patch, LUCENE-2723.patch, LUCENE-2723.patch, 
> LUCENE-2723.patch, LUCENE-2723_bulkvint.patch, LUCENE-2723_facetPerSeg.patch, 
> LUCENE-2723_facetPerSeg.patch, LUCENE-2723_openEnum.patch, 
> LUCENE-2723_termscorer.patch, LUCENE-2723_wastedint.patch
>
>
> Spinoff from LUCENE-1410.
> The flex DocsEnum has a simple bulk-read API that reads the next chunk
> of docs/freqs.  But it's a poor fit for intblock codecs like FOR/PFOR
> (from LUCENE-1410).  This is not unlike sucking coffee through those
> tiny plastic coffee stirrers they hand out on airplanes that,
> surprisingly, also happen to function as a straw.
> As a result we see no perf gain from using FOR/PFOR.
> I had hacked up a fix for this, described in my blog post at
> http://chbits.blogspot.com/2010/08/lucene-performance-with-pfordelta-codec.html
> I'm opening this issue to get that work to a committable point.
> So... I've worked out a new bulk-read API to address this performance
> bottleneck.  It has some big changes over the current bulk-read API:
>   * You can now also bulk-read positions (but not payloads), but, I
>  have yet to cutover positional queries.
>   * The buffer contains doc deltas, not absolute values, for docIDs
> and positions (freqs are absolute).
>   * Deleted docs are not filtered out.
>   * The doc & freq buffers need not be "aligned".  For fixed intblock
> codecs (FOR/PFOR) they will be, but for varint codecs (Simple9/16,
> Group varint, etc.) they won't be.
> It's still a work in progress...
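
To make the delta encoding concrete, here is a minimal sketch of what a consumer of one such chunk would have to do; the array names, parameters, and the printing are illustrative assumptions, not the actual API:

{code:java}
// Sketch only: decode one chunk from the proposed bulk-read API.
// docDeltas/freqs/count describe the buffer layout as summarized above.
static void consumeChunk(int[] docDeltas, int[] freqs, int count, int lastDoc) {
  int doc = lastDoc;
  for (int i = 0; i < count; i++) {
    doc += docDeltas[i];  // docIDs arrive as deltas, so accumulate them
    int freq = freqs[i];  // freqs are absolute per the description above
    // Deleted docs are not filtered out by the enum, so a real consumer
    // would have to check liveness here before scoring doc.
    System.out.println("doc=" + doc + " freq=" + freq);
  }
}
{code}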

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] Resolved: (LUCENE-2694) MTQ rewrite + weight/scorer init should be single pass

2011-01-12 Thread Simon Willnauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-2694?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Simon Willnauer resolved LUCENE-2694.
-

Resolution: Fixed

Committed revision 1058328.


> MTQ rewrite + weight/scorer init should be single pass
> --
>
> Key: LUCENE-2694
> URL: https://issues.apache.org/jira/browse/LUCENE-2694
> Project: Lucene - Java
>  Issue Type: Improvement
>  Components: Search
>Reporter: Michael McCandless
>Assignee: Simon Willnauer
> Fix For: 4.0
>
> Attachments: LUCENE-2694-FTE.patch, LUCENE-2694.patch, 
> LUCENE-2694.patch, LUCENE-2694.patch, LUCENE-2694.patch, LUCENE-2694.patch, 
> LUCENE-2694.patch, LUCENE-2694.patch, LUCENE-2694.patch, LUCENE-2694.patch, 
> LUCENE-2694.patch, LUCENE-2694.patch, LUCENE-2694.patch, 
> LUCENE-2694_hack.patch
>
>
> Spinoff of LUCENE-2690 (see the hacked patch on that issue)...
> Once we fix MTQ rewrite to be per-segment, we should take it further and make 
> weight/scorer init also run in the same single pass as rewrite.
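
A rough sketch of the intent, written against the pre-existing per-segment APIs and assuming a composite reader; illustrative only, not the committed code:

{code:java}
import java.io.IOException;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.search.*;

// Sketch only: rewrite the query and init weight/scorer segment by segment,
// in one pass, instead of a separate whole-index rewrite step.
class SinglePassSketch {
  static void search(IndexReader topReader, IndexSearcher searcher,
                     Query query, Collector collector) throws IOException {
    int docBase = 0;
    for (IndexReader sub : topReader.getSequentialSubReaders()) {
      collector.setNextReader(sub, docBase);
      Query rewritten = query.rewrite(sub);            // per-segment rewrite
      Weight weight = rewritten.weight(searcher);      // weight init, same pass
      Scorer scorer = weight.scorer(sub, true, false); // scorer init, same pass
      if (scorer != null) {
        scorer.score(collector);
      }
      docBase += sub.maxDoc();
    }
  }
}
{code}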

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] Commented: (LUCENE-2694) MTQ rewrite + weight/scorer init should be single pass

2011-01-12 Thread Simon Willnauer (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12980941#action_12980941
 ] 

Simon Willnauer commented on LUCENE-2694:
-

bq. Actually I see PK lookups faster - 23 usec w/ patch vs 33 usec w/ trunk 
(per lookup) for 20K lookups.
I ran that on a 32-bit machine, which is quite slow in general. I will 
investigate this further on 32-bit vs. 64-bit platforms. Note that I only used 
1k lookups, though.

> MTQ rewrite + weight/scorer init should be single pass
> --
>
> Key: LUCENE-2694
> URL: https://issues.apache.org/jira/browse/LUCENE-2694
> Project: Lucene - Java
>  Issue Type: Improvement
>  Components: Search
>Reporter: Michael McCandless
>Assignee: Simon Willnauer
> Fix For: 4.0
>
> Attachments: LUCENE-2694-FTE.patch, LUCENE-2694.patch, 
> LUCENE-2694.patch, LUCENE-2694.patch, LUCENE-2694.patch, LUCENE-2694.patch, 
> LUCENE-2694.patch, LUCENE-2694.patch, LUCENE-2694.patch, LUCENE-2694.patch, 
> LUCENE-2694.patch, LUCENE-2694.patch, LUCENE-2694.patch, 
> LUCENE-2694_hack.patch
>
>
> Spinoff of LUCENE-2690 (see the hacked patch on that issue)...
> Once we fix MTQ rewrite to be per-segment, we should take it further and make 
> weight/scorer init also run in the same single pass as rewrite.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] Updated: (LUCENE-2694) MTQ rewrite + weight/scorer init should be single pass

2011-01-12 Thread Simon Willnauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-2694?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Simon Willnauer updated LUCENE-2694:


Attachment: LUCENE-2694.patch

Here is the final patch. I opened up Terms#getThreadTermsEnum() to reuse the 
TermsEnum in PRTE#build().
PRTE#build() now also accepts a boolean controlling whether the term lookup 
should be cached, which makes sense for the common TermQuery case.

I will commit this shortly - yay!
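
The gist of the caching, as a rough sketch (method names approximate the patch rather than quote it): resolve the term once, remember the seek state, and later jump straight back without a second dictionary lookup.

{code:java}
import java.io.IOException;
import org.apache.lucene.index.*;
import org.apache.lucene.util.BytesRef;

// Sketch only: seek once, cache the state, re-seek cheaply later.
class TermStateSketch {
  static void seekTwice(Terms terms, BytesRef termBytes) throws IOException {
    TermsEnum termsEnum = terms.iterator();
    if (termsEnum.seek(termBytes) == TermsEnum.SeekStatus.FOUND) {
      TermState cached = termsEnum.termState();  // capture the seek state once
      // ... later, e.g. when the scorer is built for the same segment:
      termsEnum.seek(termBytes, cached);         // no second dictionary lookup
    }
  }
}
{code}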

> MTQ rewrite + weight/scorer init should be single pass
> --
>
> Key: LUCENE-2694
> URL: https://issues.apache.org/jira/browse/LUCENE-2694
> Project: Lucene - Java
>  Issue Type: Improvement
>  Components: Search
>Reporter: Michael McCandless
>Assignee: Simon Willnauer
> Fix For: 4.0
>
> Attachments: LUCENE-2694-FTE.patch, LUCENE-2694.patch, 
> LUCENE-2694.patch, LUCENE-2694.patch, LUCENE-2694.patch, LUCENE-2694.patch, 
> LUCENE-2694.patch, LUCENE-2694.patch, LUCENE-2694.patch, LUCENE-2694.patch, 
> LUCENE-2694.patch, LUCENE-2694.patch, LUCENE-2694.patch, 
> LUCENE-2694_hack.patch
>
>
> Spinoff of LUCENE-2690 (see the hacked patch on that issue)...
> Once we fix MTQ rewrite to be per-segment, we should take it further and make 
> weight/scorer init also run in the same single pass as rewrite.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] Commented: (SOLR-2282) Distributed Support for Search Result Clustering

2011-01-12 Thread Dawid Weiss (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12980936#action_12980936
 ] 

Dawid Weiss commented on SOLR-2282:
---

Robert, can you somehow check if it's the input that's causing these errors?

SEVERE: java.lang.Error: Error: could not match input

I don't have any idea when such an error could happen, but it doesn't seem to 
be related to concurrency (at first glance).

> Distributed Support for Search Result Clustering
> 
>
> Key: SOLR-2282
> URL: https://issues.apache.org/jira/browse/SOLR-2282
> Project: Solr
>  Issue Type: New Feature
>  Components: contrib - Clustering
>Affects Versions: 1.4, 1.4.1
>Reporter: Koji Sekiguchi
>Assignee: Koji Sekiguchi
>Priority: Minor
> Fix For: 3.1, 4.0
>
> Attachments: SOLR-2282.patch, SOLR-2282.patch, SOLR-2282.patch, 
> SOLR-2282.patch, SOLR-2282.patch, SOLR-2282_test.patch
>
>
> Brad Giaccio contributed a patch for this in SOLR-769. I'd like to 
> incorporate it.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Lucene-Solr-tests-only-3.x - Build # 3686 - Still Failing

2011-01-12 Thread Apache Hudson Server
Build: https://hudson.apache.org/hudson/job/Lucene-Solr-tests-only-3.x/3686/

1 tests failed.
REGRESSION:  org.apache.lucene.search.TestThreadSafe.testLazyLoadThreadSafety

Error Message:
unable to create new native thread

Stack Trace:
java.lang.OutOfMemoryError: unable to create new native thread
at java.lang.Thread.start0(Native Method)
at java.lang.Thread.start(Thread.java:614)
at 
org.apache.lucene.search.TestThreadSafe.doTest(TestThreadSafe.java:133)
at 
org.apache.lucene.search.TestThreadSafe.testLazyLoadThreadSafety(TestThreadSafe.java:152)
at 
org.apache.lucene.util.LuceneTestCase.runBare(LuceneTestCase.java:255)




Build Log (for compile errors):
[...truncated 8582 lines...]



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] Commented: (LUCENE-2694) MTQ rewrite + weight/scorer init should be single pass

2011-01-12 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12980921#action_12980921
 ] 

Michael McCandless commented on LUCENE-2694:


Actually I see PK lookups faster -- 23 usec w/ patch vs 33 usec w/ trunk (per 
lookup) for 20K lookups.

And good speedups on many-term MTQs when I force BQ rewrite:


||Query||QPS base||QPS termstate||Pct diff||
|+nebraska +state|169.75|154.64|{color:red}-8.9%{color}|
|doctitle:.*[Uu]nited.*|4.26|4.11|{color:red}-3.5%{color}|
|+unit +state|11.40|11.09|{color:red}-2.7%{color}|
|spanFirst(unit, 5)|17.38|16.93|{color:red}-2.6%{color}|
|spanNear([unit, state], 10, true)|4.37|4.32|{color:red}-1.2%{color}|
|"unit state"~3|4.94|4.89|{color:red}-1.0%{color}|
|"unit state"|8.05|8.03|{color:red}-0.2%{color}|
|state|26.58|26.76|{color:green}0.7%{color}|
|unit state|11.24|11.46|{color:green}1.9%{color}|
|united~2.0|3.87|3.98|{color:green}2.8%{color}|
|doctimesecnum:[1 TO 6]|8.26|8.70|{color:green}5.3%{color}|
|unit~2.0|10.04|10.59|{color:green}5.4%{color}|
|united~1.0|16.84|18.13|{color:green}7.7%{color}|
|unit~1.0|10.09|10.99|{color:green}8.9%{color}|
|un*d|11.96|21.63|{color:green}80.8%{color}|
|unit*|7.60|14.23|{color:green}87.3%{color}|
|u*d|2.22|4.17|{color:green}87.8%{color}|
|uni*|1.83|3.53|{color:green}93.7%{color}|

+1 to commit!

> MTQ rewrite + weight/scorer init should be single pass
> --
>
> Key: LUCENE-2694
> URL: https://issues.apache.org/jira/browse/LUCENE-2694
> Project: Lucene - Java
>  Issue Type: Improvement
>  Components: Search
>Reporter: Michael McCandless
>Assignee: Simon Willnauer
> Fix For: 4.0
>
> Attachments: LUCENE-2694-FTE.patch, LUCENE-2694.patch, 
> LUCENE-2694.patch, LUCENE-2694.patch, LUCENE-2694.patch, LUCENE-2694.patch, 
> LUCENE-2694.patch, LUCENE-2694.patch, LUCENE-2694.patch, LUCENE-2694.patch, 
> LUCENE-2694.patch, LUCENE-2694.patch, LUCENE-2694_hack.patch
>
>
> Spinoff of LUCENE-2690 (see the hacked patch on that issue)...
> Once we fix MTQ rewrite to be per-segment, we should take it further and make 
> weight/scorer init also run in the same single pass as rewrite.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Lucene-Solr-tests-only-3.x - Build # 3685 - Failure

2011-01-12 Thread Apache Hudson Server
Build: https://hudson.apache.org/hudson/job/Lucene-Solr-tests-only-3.x/3685/

No tests ran.

Build Log (for compile errors):
[...truncated 4447 lines...]



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: Lucene-3.x - Build # 238 - Failure

2011-01-12 Thread Michael McCandless
But, how can we make this relatively "private"?

I.e., I don't want people searching via Google to stumble on this Hudson
summary page showing how flaky our clover build is.

Or, can we somehow force hudson to "pass" if all tests passed?

Mike

On Wed, Jan 12, 2011 at 11:19 AM, Robert Muir  wrote:
> On Wed, Jan 12, 2011 at 6:09 AM, Michael McCandless
>  wrote:
>> Can we do something about these "false" failures?
>>
>> They are failing because of this Hudson bug:
>>
>>    http://issues.hudson-ci.org/browse/HUDSON-7836
>>
>> But the highish failure rate makes us look bad when people look at our
>> build stability... which is awful.
>>
>
> One idea would be to divorce the clover, etc from the nightly builds,
> and make a clover build.
>
> -
> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
> For additional commands, e-mail: dev-h...@lucene.apache.org
>
>

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: SegmentInfo clone

2011-01-12 Thread Jason Rutherglen
> it is set on DocumentsWriter#flush though

Thanks!  I just skip segmentCodecs if it's null, for now.
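
For the record, the shape of that interim workaround (the call shape below is assumed, not copied from trunk):

{code:java}
// Inside SegmentInfo.files(), sketch only: skip the codec-owned files when
// segmentCodecs was lost in clone(), instead of hitting the NPE.
if (segmentCodecs != null) {
  segmentCodecs.files(dir, this, fileSet);
}
{code}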

On Wed, Jan 12, 2011 at 11:05 AM, Simon Willnauer
 wrote:
> On Wed, Jan 12, 2011 at 8:03 PM, Jason Rutherglen
>  wrote:
>> Sorry, that's incorrect, SegmentInfo.files is NPE'ing on segmentCodecs
>> because it's never set (in trunk).
> it is set on DocumentsWriter#flush though
>
> simon
>>
>> On Wed, Jan 12, 2011 at 10:59 AM, Jason Rutherglen
>>  wrote:
>>> Is it intentional that SegmentInfo.segmentCodecs isn't cloned?  When
>>> SI is cloned, then sizeInBytes fails with an NPE.
>>>
>>
>> -
>> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
>> For additional commands, e-mail: dev-h...@lucene.apache.org
>>
>>
>
> -
> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
> For additional commands, e-mail: dev-h...@lucene.apache.org
>
>

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] Updated: (LUCENE-2856) Create IndexWriter event listener, specifically for merges

2011-01-12 Thread Jason Rutherglen (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-2856?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Rutherglen updated LUCENE-2856:
-

Attachment: LUCENE-2856.patch

The aborted merge event is now generated and tested for.

> Create IndexWriter event listener, specifically for merges
> --
>
> Key: LUCENE-2856
> URL: https://issues.apache.org/jira/browse/LUCENE-2856
> Project: Lucene - Java
>  Issue Type: Improvement
>  Components: Index
>Affects Versions: 4.0
>Reporter: Jason Rutherglen
> Attachments: LUCENE-2856.patch, LUCENE-2856.patch, LUCENE-2856.patch
>
>
> The issue will allow users to monitor merges occurring within IndexWriter 
> using a callback notifier event listener.  This can be used by external 
> applications such as Solr to monitor large segment merges.
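
For readers following along, a sketch of what such a listener could look like; the interface and method names below are assumptions, not the attached patch:

{code:java}
import org.apache.lucene.index.MergePolicy;

// Sketch only: a callback interface in the spirit of this issue.
public interface MergeListener {
  void mergeInit(MergePolicy.OneMerge merge);     // merge registered with the writer
  void mergeStarted(MergePolicy.OneMerge merge);  // merge thread began working
  void mergeFinished(MergePolicy.OneMerge merge); // merged segment committed
  void mergeAborted(MergePolicy.OneMerge merge);  // writer closed or rolled back
}
{code}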

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] Commented: (SOLR-2282) Distributed Support for Search Result Clustering

2011-01-12 Thread Stanislaw Osinski (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12980881#action_12980881
 ] 

Stanislaw Osinski commented on SOLR-2282:
-

Sure, I'll take a look at it tomorrow morning. 

> Distributed Support for Search Result Clustering
> 
>
> Key: SOLR-2282
> URL: https://issues.apache.org/jira/browse/SOLR-2282
> Project: Solr
>  Issue Type: New Feature
>  Components: contrib - Clustering
>Affects Versions: 1.4, 1.4.1
>Reporter: Koji Sekiguchi
>Assignee: Koji Sekiguchi
>Priority: Minor
> Fix For: 3.1, 4.0
>
> Attachments: SOLR-2282.patch, SOLR-2282.patch, SOLR-2282.patch, 
> SOLR-2282.patch, SOLR-2282.patch, SOLR-2282_test.patch
>
>
> Brad Giaccio contributed a patch for this in SOLR-769. I'd like to 
> incorporate it.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] Updated: (LUCENE-2856) Create IndexWriter event listener, specifically for merges

2011-01-12 Thread Jason Rutherglen (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-2856?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Rutherglen updated LUCENE-2856:
-

Attachment: LUCENE-2856.patch

Here's a first cut including workarounds to avoid NPEs and file not found 
exceptions in SegmentInfo (when calling size in bytes).  There's a test case 
for merge init, start, and complete.  I need to add one for abort.

> Create IndexWriter event listener, specifically for merges
> --
>
> Key: LUCENE-2856
> URL: https://issues.apache.org/jira/browse/LUCENE-2856
> Project: Lucene - Java
>  Issue Type: Improvement
>  Components: Index
>Affects Versions: 4.0
>Reporter: Jason Rutherglen
> Attachments: LUCENE-2856.patch, LUCENE-2856.patch
>
>
> The issue will allow users to monitor merges occurring within IndexWriter 
> using a callback notifier event listener.  This can be used by external 
> applications such as Solr to monitor large segment merges.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: SegmentInfo clone

2011-01-12 Thread Simon Willnauer
On Wed, Jan 12, 2011 at 8:03 PM, Jason Rutherglen
 wrote:
> Sorry, that's incorrect, SegmentInfo.files is NPE'ing on segmentCodecs
> because it's never set (in trunk).
it is set on DocumentsWriter#flush though

simon
>
> On Wed, Jan 12, 2011 at 10:59 AM, Jason Rutherglen
>  wrote:
>> Is it intentional that SegmentInfo.segmentCodecs isn't cloned?  When
>> SI is cloned, then sizeInBytes fails with an NPE.
>>
>
> -
> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
> For additional commands, e-mail: dev-h...@lucene.apache.org
>
>

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: SegmentInfo clone

2011-01-12 Thread Jason Rutherglen
Sorry, that's incorrect, SegmentInfo.files is NPE'ing on segmentCodecs
because it's never set (in trunk).

On Wed, Jan 12, 2011 at 10:59 AM, Jason Rutherglen
 wrote:
> Is it intentional that SegmentInfo.segmentCodecs isn't cloned?  When
> SI is cloned, then sizeInBytes fails with an NPE.
>

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: call python from java - what strategy do you use?

2011-01-12 Thread Andi Vajda


On Wed, 12 Jan 2011, Roman Chyla wrote:


And if, in the Python, I do:

import lucene
lucene.initVM(lucene.CLASSPATH)

Will it work in this case, giving access to the Java classes from
inside Python? Or will I have to forget PyLucene and prepare some
extra Java classes? (the JCC-in-reverse trick, as you put it)


Yes, just be sure to JCC-build your eggs with --import lucene so that you 
don't wrap lucene multiple times.



- you say that threads are not managed by the Python VM, does that
mean there is no Python GIL?


No, there is a Python GIL (and that is the Achilles' heel of this setup if
you expect high concurrent servlet performance from your server calling
Python). That Python GIL is connected to the thread state I was mentioning
earlier. Because the thread is not managed by Python, when Python is called
(by way of the code generated by JCC) it doesn't find a thread state for the
thread and creates one. When the call completes, the thread state is
destroyed because its refcount goes to zero. My TerminatingThread class
acquires a Python thread state and keeps it for the life of the thread,
thereby working around this problem.
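
In other words, roughly this; the acquire/release calls on PythonVM are named from memory here and are assumptions, so check PythonVM.java for the real signatures:

{code:java}
import org.apache.jcc.PythonVM;

// Sketch only: hold one Python thread state for the thread's whole life.
public class TerminatingThread extends Thread {
  private final Runnable target;

  public TerminatingThread(Runnable target) {
    this.target = target;
  }

  @Override
  public void run() {
    PythonVM vm = PythonVM.get();
    vm.acquireThreadState();   // create the thread state once, keep it alive
    try {
      target.run();            // every Python call on this thread reuses it
    } finally {
      vm.releaseThreadState(); // tear it down only when the thread ends
    }
  }
}
{code}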


OK, this then looks like normal Python - which somehow makes me
less worried :) I wanted to use multiprocessing inside Python to deal
with the GIL, and I see no reason why it should not work in this case.


I tried that approach originally and gave up. There were too many strange 
lock-ups talking to Python subprocesses managed by multiprocessing. Now, 
this was before multiprocessing became part of Python's distribution, so 
maybe the bugs have been fixed since then.


Andi..


SegmentInfo clone

2011-01-12 Thread Jason Rutherglen
Is it intentional that SegmentInfo.segmentCodecs isn't cloned?  When
SI is cloned, then sizeInBytes fails with an NPE.

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: call python from java - what strategy do you use?

2011-01-12 Thread Andi Vajda


On Wed, 12 Jan 2011, Roman Chyla wrote:


Hi Andi, all,

I tried to implement the PythonVM wrapping on Mac 10.6 with JDK
1.6.22; JCC is freshly built in shared mode, v2.6. The Python is
the standard Python distributed with Mac OS X.

When I try to run the Java program, it throws an error when it gets to:

static {
    System.loadLibrary("jcc");
}

I am getting this error:

Exception in thread "main" java.lang.UnsatisfiedLinkError:
/Library/Python/2.6/site-packages/JCC-2.6-py2.6-macosx-10.6-universal.egg/libjcc.dylib:
Symbol not found: _PyExc_RuntimeError   Referenced from:


That's because Python's shared library wasn't found. The reason is that, by 
default, Python's shared lib is not on JCC's link line, because normally JCC is 
loaded into a Python process and the dynamic linker thus finds the needed 
symbols inside the process.


Here, since you're not starting inside a Python process, you need to add 
'-framework Python' to JCC's LFLAGS in setup.py so that the dynamic linker 
can find the Python VM shared lib and load it.


Andi..


/Library/Python/2.6/site-packages/JCC-2.6-py2.6-macosx-10.6-universal.egg/libjcc.dylib
 Expected in: flat namespace  in
/Library/Python/2.6/site-packages/JCC-2.6-py2.6-macosx-10.6-universal.egg/libjcc.dylib
at java.lang.ClassLoader$NativeLibrary.load(Native Method)
at java.lang.ClassLoader.loadLibrary0(ClassLoader.java:1823)
at java.lang.ClassLoader.loadLibrary(ClassLoader.java:1746)
at java.lang.Runtime.loadLibrary0(Runtime.java:823)
at java.lang.System.loadLibrary(System.java:1045)
at org.apache.jcc.PythonVM.<clinit>(PythonVM.java:23)
at rca.solr.JettyRunnerPythonVM.start(JettyRunnerPythonVM.java:53)
at rca.solr.JettyRunnerPythonVM.main(JettyRunnerPythonVM.java:139)


MacBeth:JCC-2.6-py2.6-macosx-10.6-universal.egg rca$ nm libjcc.dylib | grep Exc
U _PyExc_RuntimeError
U _PyExc_TypeError
U _PyExc_ValueError
3442 T __ZNK6JCCEnv15reportExceptionEv
21f0 T __ZNK6JCCEnv23getPythonExceptionClassEv


Any pointers to what I could be doing wrong? Note, I haven't built any emql.egg
yet; I just run my Java program and try to start PythonVM() to see if
that works.

Thanks,

 roman



On Wed, Jan 12, 2011 at 11:05 AM, Roman Chyla  wrote:

Hi Andi,

I think I will give it a try, if only because I am curious. Please see
one remaining question below.


On Tue, Jan 11, 2011 at 10:37 PM, Andi Vajda  wrote:



On Tue, 11 Jan 2011, Roman Chyla wrote:


Hi Andy,

This is much more than I could have hoped for! Just yesterday I was
looking for ways to embed a Python VM in Jetty, as that would be
more natural, but found only jepp.sourceforge.net, and off-putting was
the necessity to compile it against a newly built Python; I could
not ask that of the people who may need my extension. And I realize
only now that embedding Python in Java is even documented on the
website, but honestly I would not know how to do it without your
detailed examples.

Now to the questions; I apologize, some or all of them must seem very
stupid to you.

- PyLucene is used on many platforms and with JCC has always worked as
expected (I love it!), but is it as reliable in the opposite
direction? PythonVM.java loads the "jcc" library, so I wonder if in
principle there is any difference in the directionality - but I am not
sure. To rephrase my convoluted question: would you expect this
wrapping to be as reliable as wrapping Java inside Python is now?


I've been using this for over two years, in production.
My main worry was memory leaks, because a server process is expected to stay
up and running for weeks at a time, and it's been very stable on that front
too. Of course, when there is a bug somewhere that causes your Python VM to
crash, the entire server crashes, just like when the JVM crashes (which is
normally rare). In other words, this isn't any less reliable than a
standalone Python VM process. It can be tricky, but possible, to run gdb,
pdb and jdb together to step through the three languages involved: Python,
Java and C++. I've had to do this a few times, but not in a long time.


- In the past, I built JCC libraries on one host and distributed them to
various machines. As long as the OS family and the main Python version were the
same, it worked on Win/Lin/Mac just fine. As far as I can tell this does
not change - or will it depend on the Python against which the egg was
built?


Distributing binaries is risky. The same caveats apply. I wouldn't do it,
even in the simple PyLucene case.


Unfortunately, I don't have that many choices left - this is not
some client-software scenario; we are running the jobs on the grid,
and there I cannot compile the binaries. So, if previously the
location of the Python interpreter or the minor Python version did not
cause problems, now perhaps it will be different. But that wasn't for
Solr; wrapping Solr is not meant for the grid.




- now a little tricky issue; when

Lucene-Solr-tests-only-3.x - Build # 3681 - Failure

2011-01-12 Thread Apache Hudson Server
Build: https://hudson.apache.org/hudson/job/Lucene-Solr-tests-only-3.x/3681/

1 tests failed.
REGRESSION:  org.apache.lucene.search.TestThreadSafe.testLazyLoadThreadSafety

Error Message:
unable to create new native thread

Stack Trace:
java.lang.OutOfMemoryError: unable to create new native thread
at java.lang.Thread.start0(Native Method)
at java.lang.Thread.start(Thread.java:614)
at 
org.apache.lucene.search.TestThreadSafe.doTest(TestThreadSafe.java:133)
at 
org.apache.lucene.search.TestThreadSafe.testLazyLoadThreadSafety(TestThreadSafe.java:152)
at 
org.apache.lucene.util.LuceneTestCase.runBare(LuceneTestCase.java:255)




Build Log (for compile errors):
[...truncated 8574 lines...]



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] Commented: (SOLR-1604) Wildcards, ORs etc inside Phrase Queries

2011-01-12 Thread Salman Akram (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12980853#action_12980853
 ] 

Salman Akram commented on SOLR-1604:


I am using Solr 1.4.1 but integrated this patch in early November, so maybe you 
committed the inOrder parameter after that?

When you say "Regarding parentheses inside quotes...": if this works and groups 
the words in the phrase together, won't it work for my case, e.g. "(a b) c"~10?

I guess that if SurroundQuery doesn't use any analyzer, it would be very difficult 
to make the existing queries work (I am using StandardAnalyzer).

> Wildcards, ORs etc inside Phrase Queries
> 
>
> Key: SOLR-1604
> URL: https://issues.apache.org/jira/browse/SOLR-1604
> Project: Solr
>  Issue Type: Improvement
>  Components: search
>Affects Versions: 1.4
>Reporter: Ahmet Arslan
>Priority: Minor
> Fix For: Next
>
> Attachments: ComplexPhrase.zip, ComplexPhrase.zip, ComplexPhrase.zip, 
> ComplexPhrase.zip, ComplexPhraseQueryParser.java, SOLR-1604.patch
>
>
> Solr Plugin for ComplexPhraseQueryParser (LUCENE-1486) which supports 
> wildcards, ORs, ranges, fuzzies inside phrase queries.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] Created: (SOLR-2313) Clear root Entity cache when entity is processed

2011-01-12 Thread Shane (JIRA)
Clear root Entity cache when entity is processed


 Key: SOLR-2313
 URL: https://issues.apache.org/jira/browse/SOLR-2313
 Project: Solr
  Issue Type: Improvement
  Components: contrib - DataImportHandler
Affects Versions: 1.4.1
 Environment: Linux, JDBC, Postgres 8.4.6
Reporter: Shane


The current process clears the entity caches once all root entity elements have 
been imported.  When a config file has dozens of root entities, the result is 
one "idle in transaction" process for each entity processed, effectively eating 
up the databases available connections.  The simple solution would be to clear 
a root entity's cache once that entity has been processed.

The following is a diff that I used in my instance to clear the cache when the 
entity completed:

--- DocBuilder.java 2011-01-12 10:05:58.0 -0700
+++ DocBuilder.java.new 2011-01-12 10:05:31.0 -0700
@@ -435,6 +435,9 @@
         writer.log(SolrWriter.END_ENTITY, null, null);
       }
       entityProcessor.destroy();
+      if (entity.isDocRoot) {
+        entity.clearCache();
+      }
     }
   }


-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] Commented: (LUCENE-2751) add LuceneTestCase.newSearcher()

2011-01-12 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2751?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12980808#action_12980808
 ] 

Robert Muir commented on LUCENE-2751:
-

There is a downside to this whole issue, of course... I think it's going to be 
harder to reproduce test failures, since we will be using more multithreading.

But I think it's a worthwhile tradeoff for being able to detect more 
thread-safety bugs.
If it becomes a huge hassle, we could always disable it by default, or enable 
it only with a flag or something like that.


> add LuceneTestCase.newSearcher()
> 
>
> Key: LUCENE-2751
> URL: https://issues.apache.org/jira/browse/LUCENE-2751
> Project: Lucene - Java
>  Issue Type: Test
>  Components: Build
>Reporter: Robert Muir
> Fix For: 3.1, 4.0
>
> Attachments: LUCENE-2751.patch, LUCENE-2751.patch
>
>
> Most tests in the search package don't care about what kind of searcher they 
> use.
> we should randomly use MultiSearcher or ParallelMultiSearcher sometimes in 
> tests.
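
A sketch of the idea (the committed helper may well differ):

{code:java}
import java.io.IOException;
import java.util.Random;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.search.*;

// Sketch only: randomly vary the searcher implementation handed to tests.
public class SearcherSketch {
  public static Searcher newSearcher(IndexReader r, Random random) throws IOException {
    switch (random.nextInt(3)) {
      case 0:  return new IndexSearcher(r);
      case 1:  return new MultiSearcher(new IndexSearcher(r));
      default: return new ParallelMultiSearcher(new IndexSearcher(r));
    }
  }
}
{code}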

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] Commented: (LUCENE-2694) MTQ rewrite + weight/scorer init should be single pass

2011-01-12 Thread Simon Willnauer (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12980805#action_12980805
 ] 

Simon Willnauer commented on LUCENE-2694:
-

I just discovered that PK lookups are actually slower with this patch: 164 msec 
for 1000 lookups (164 us per lookup) vs. 144 msec for 1000 lookups (144 us per 
lookup) on trunk. I will dig!

> MTQ rewrite + weight/scorer init should be single pass
> --
>
> Key: LUCENE-2694
> URL: https://issues.apache.org/jira/browse/LUCENE-2694
> Project: Lucene - Java
>  Issue Type: Improvement
>  Components: Search
>Reporter: Michael McCandless
>Assignee: Simon Willnauer
> Fix For: 4.0
>
> Attachments: LUCENE-2694-FTE.patch, LUCENE-2694.patch, 
> LUCENE-2694.patch, LUCENE-2694.patch, LUCENE-2694.patch, LUCENE-2694.patch, 
> LUCENE-2694.patch, LUCENE-2694.patch, LUCENE-2694.patch, LUCENE-2694.patch, 
> LUCENE-2694.patch, LUCENE-2694.patch, LUCENE-2694_hack.patch
>
>
> Spinoff of LUCENE-2690 (see the hacked patch on that issue)...
> Once we fix MTQ rewrite to be per-segment, we should take it further and make 
> weight/scorer init also run in the same single pass as rewrite.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] Commented: (LUCENE-2856) Create IndexWriter event listener, specifically for merges

2011-01-12 Thread Jason Rutherglen (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2856?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12980803#action_12980803
 ] 

Jason Rutherglen commented on LUCENE-2856:
--

I'll add events for flush, open, clone, close and the 
CompositeSegmentsListener.

> Create IndexWriter event listener, specifically for merges
> --
>
> Key: LUCENE-2856
> URL: https://issues.apache.org/jira/browse/LUCENE-2856
> Project: Lucene - Java
>  Issue Type: Improvement
>  Components: Index
>Affects Versions: 4.0
>Reporter: Jason Rutherglen
> Attachments: LUCENE-2856.patch
>
>
> The issue will allow users to monitor merges occurring within IndexWriter 
> using a callback notifier event listener.  This can be used by external 
> applications such as Solr to monitor large segment merges.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] Updated: (LUCENE-2694) MTQ rewrite + weight/scorer init should be single pass

2011-01-12 Thread Simon Willnauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-2694?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Simon Willnauer updated LUCENE-2694:


Attachment: LUCENE-2694.patch

Added Changes.txt entry and fixed the remaining JavaDoc on TermState. 

My latest benchmark results with that patch are here:
{code}

                 query   QPS base  QPS patch   pct diff
            unit state       3.81       3.70      -2.9%
      +nebraska +state      41.26      40.61      -1.6%
          +unit +state       3.95       3.90      -1.1%
    spanFirst(unit, 5)       4.55       4.51      -0.9%
                 state      10.11      10.07      -0.3%
        "unit state"~3       0.98       0.98      -0.2%
          "unit state"       1.49       1.49      -0.0%
            united~1.0       3.66       3.72       1.5%
              unit~1.0       2.33       2.37       1.6%
            united~2.0       0.81       0.83       2.7%
              unit~2.0       0.35       0.38      10.1%
                   u*d       0.52       0.67      29.5%
doctitle:.*[Uu]nited.*       0.19       0.25      31.6%
                  un*d       3.59       4.77      33.0%
                  uni*       0.56       0.75      34.9%
                 unit*       2.20       3.15      43.3%
{code}

I think we are ready to go - I will commit later today if nobody objects

> MTQ rewrite + weight/scorer init should be single pass
> --
>
> Key: LUCENE-2694
> URL: https://issues.apache.org/jira/browse/LUCENE-2694
> Project: Lucene - Java
>  Issue Type: Improvement
>  Components: Search
>Reporter: Michael McCandless
>Assignee: Simon Willnauer
> Fix For: 4.0
>
> Attachments: LUCENE-2694-FTE.patch, LUCENE-2694.patch, 
> LUCENE-2694.patch, LUCENE-2694.patch, LUCENE-2694.patch, LUCENE-2694.patch, 
> LUCENE-2694.patch, LUCENE-2694.patch, LUCENE-2694.patch, LUCENE-2694.patch, 
> LUCENE-2694.patch, LUCENE-2694.patch, LUCENE-2694_hack.patch
>
>
> Spinoff of LUCENE-2690 (see the hacked patch on that issue)...
> Once we fix MTQ rewrite to be per-segment, we should take it further and make 
> weight/scorer init also run in the same single pass as rewrite.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: Lucene-3.x - Build # 238 - Failure

2011-01-12 Thread Robert Muir
On Wed, Jan 12, 2011 at 6:09 AM, Michael McCandless
 wrote:
> Can we do something about these "false" failures?
>
> They are failing because of this Hudson bug:
>
>    http://issues.hudson-ci.org/browse/HUDSON-7836
>
> But the highish failure rate makes us look bad when people look at our
> build stability... which is awful.
>

One idea would be to divorce the clover, etc from the nightly builds,
and make a clover build.

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] Updated: (SOLR-2312) CloudSolrServer -- calling add(Collection docs) throws NPE.

2011-01-12 Thread Stan Burnitt (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-2312?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stan Burnitt updated SOLR-2312:
---

Description: 
Cannot index documents using the CloudSolrServer.

Below is a code snippet that reproduces the error.

{code:borderStyle=solid}
@Test
public void jiraTestCase() {
CloudSolrServer solrj = null;
 
try {
solrj = new 
CloudSolrServer("your.zookeeper.localdomain:2181");

// Also tried creating CloudSolrServer using 
alternative constructor below...
// public CloudSolrServer(String zkHost, 
LBHttpSolrServer lbServer)
//
// LBHttpSolrServer lbHttpSolrServer = new 
LBHttpSolrServer("http://solr.localdomain:8983/solr";);
// solrj = new 
CloudSolrServer("your.zookeeper.localdomain:2181", lbHttpSolrServer);
//
// (Same result -- NPE @ line 105 in 
CloudSolrServer.java)

solrj.setDefaultCollection("your-collection");
solrj.setZkClientTimeout(5000);
solrj.setZkConnectTimeout(5000);

final Collection<SolrInputDocument> batch = new 
ArrayList<SolrInputDocument>();
SolrInputDocument doc = new SolrInputDocument();
doc.addField("id", 1L, 1.0f);
doc.addField("title", "Document A");
doc.addField("description", "Test document");
batch.add(doc);

doc = new SolrInputDocument();
doc.addField("id", 2L, 1.0f);
doc.addField("title", "Document B");
doc.addField("description", "Another test 
document");
batch.add(doc);

solrj.add(batch);

} catch (Exception e) {
log.error(e.getMessage(), e);
Assert.fail("java.lang.NullPointerException: 
null \n"
+ " at 
org.apache.solr.client.solrj.impl.CloudSolrServer.request(CloudSolrServer.java:105)
 \n"
+ " Line 105:  NULL request object here 
--> String collection = request.getParams().get(\"collection\", 
defaultCollection);");
} finally {
solrj.close();
}
}
{code} 

  was:
Cannot index documents.
Below is a code snippet that reproduces the error.

{code:borderStyle=solid}
@Test
public void jiraTestCase() {
CloudSolrServer solrj = null;
 
try {
solrj = new 
CloudSolrServer("your.zookeeper.localdomain:2181");

// Also tried creating CloudSolrServer using 
alternative constructor below...
// public CloudSolrServer(String zkHost, 
LBHttpSolrServer lbServer)
//
// LBHttpSolrServer lbHttpSolrServer = new 
LBHttpSolrServer("http://solr.localdomain:8983/solr";);
// solrj = new 
CloudSolrServer("your.zookeeper.localdomain:2181", lbHttpSolrServer);
//
// (Same result -- NPE @ line 105 in 
CloudSolrServer.java)

solrj.setDefaultCollection("your-collection");
solrj.setZkClientTimeout(5000);
solrj.setZkConnectTimeout(5000);

final Collection<SolrInputDocument> batch = new 
ArrayList<SolrInputDocument>();
SolrInputDocument doc = new SolrInputDocument();
doc.addField("id", 1L, 1.0f);
doc.addField("title", "Document A");
doc.addField("description", "Test document");
batch.add(doc);

doc = new SolrInputDocument();
doc.addField("id", 2L, 1.0f);
doc.addField("title", "Document B");
doc.addField("description", "Another test 
document");
batch.add(doc);

solrj.add(batch);

} catch (Exception e) {
log.error(e.getMessage(), e);
Assert.fail("java.lang.NullPointerException: 
null \n"
 

[jira] Commented: (SOLR-2312) CloudSolrServer -- calling add(Collection docs) throws NPE.

2011-01-12 Thread Stan Burnitt (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12980764#action_12980764
 ] 

Stan Burnitt commented on SOLR-2312:


Attempting to add a single document also results in the same NPE at line 105.
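
Not a fix, but the NPE suggests request.getParams() is null for update requests; a guard along these lines would at least fall back to the default collection (a sketch only, not the committed change):

{code:java}
import org.apache.solr.client.solrj.SolrRequest;
import org.apache.solr.common.params.SolrParams;

// Sketch of a guard corresponding to CloudSolrServer.java line 105.
class CollectionResolveSketch {
  static String resolveCollection(SolrRequest request, String defaultCollection) {
    SolrParams params = request.getParams();
    return params == null
        ? defaultCollection                          // no params on the request
        : params.get("collection", defaultCollection);
  }
}
{code}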

> CloudSolrServer -- calling add(Collection docs) throws NPE.
> --
>
> Key: SOLR-2312
> URL: https://issues.apache.org/jira/browse/SOLR-2312
> Project: Solr
>  Issue Type: Bug
>  Components: SolrCloud
>Affects Versions: 4.0
> Environment: Mac OSX  v10.5.8
> java version "1.6.0_22"
> Java(TM) SE Runtime Environment (build 1.6.0_22-b04-307-9M3263)
> Java HotSpot(TM) 64-Bit Server VM (build 17.1-b03-307, mixed mode)
>Reporter: Stan Burnitt
>Priority: Critical
> Fix For: 4.0
>
>
> Cannot index documents.
> Below is a code snippet that reproduces the error.
> {code:borderStyle=solid}
> @Test
> public void jiraTestCase() {
>   CloudSolrServer solrj = null;
>
>   try {
>   solrj = new 
> CloudSolrServer("your.zookeeper.localdomain:2181");
>   // Also tried creating CloudSolrServer using 
> alternative constructor below...
>   // public CloudSolrServer(String zkHost, 
> LBHttpSolrServer lbServer)
>   //
>   // LBHttpSolrServer lbHttpSolrServer = new 
> LBHttpSolrServer("http://solr.localdomain:8983/solr";);
>   // solrj = new 
> CloudSolrServer("your.zookeeper.localdomain:2181", lbHttpSolrServer);
>   //
>   // (Same result -- NPE @ line 105 in 
> CloudSolrServer.java)
>   solrj.setDefaultCollection("your-collection");
>   solrj.setZkClientTimeout(5000);
>   solrj.setZkConnectTimeout(5000);
>   final Collection<SolrInputDocument> batch = new 
> ArrayList<SolrInputDocument>();
>   SolrInputDocument doc = new SolrInputDocument();
>   doc.addField("id", 1L, 1.0f);
>   doc.addField("title", "Document A");
>   doc.addField("description", "Test document");
>   batch.add(doc);
>   doc = new SolrInputDocument();
>   doc.addField("id", 2L, 1.0f);
>   doc.addField("title", "Document B");
>   doc.addField("description", "Another test 
> document");
>   batch.add(doc);
>   solrj.add(batch);
>   } catch (Exception e) {
>   log.error(e.getMessage(), e);
>   Assert.fail("java.lang.NullPointerException: 
> null \n"
>   + " at 
> org.apache.solr.client.solrj.impl.CloudSolrServer.request(CloudSolrServer.java:105)
>  \n"
>   + " Line 105:  NULL request object here 
> --> String collection = request.getParams().get(\"collection\", 
> defaultCollection);");
>   } finally {
>   solrj.close();
>   }
> }
> {code} 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] Updated: (SOLR-2282) Distributed Support for Search Result Clustering

2011-01-12 Thread Robert Muir (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-2282?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Muir updated SOLR-2282:
--

Attachment: SOLR-2282_test.patch

Here's a patch to fix BaseDistributedTestCase so clustering and other 
contribs can set their own home and use it.

This fixes the unknown-field problem, but I'm still seeing the zzBuffer array 
index out of bounds exception... perhaps 
my checkout is somehow out of date... maybe you can test the patch?


> Distributed Support for Search Result Clustering
> 
>
> Key: SOLR-2282
> URL: https://issues.apache.org/jira/browse/SOLR-2282
> Project: Solr
>  Issue Type: New Feature
>  Components: contrib - Clustering
>Affects Versions: 1.4, 1.4.1
>Reporter: Koji Sekiguchi
>Assignee: Koji Sekiguchi
>Priority: Minor
> Fix For: 3.1, 4.0
>
> Attachments: SOLR-2282.patch, SOLR-2282.patch, SOLR-2282.patch, 
> SOLR-2282.patch, SOLR-2282.patch, SOLR-2282_test.patch
>
>
> Brad Giaccio contributed a patch for this in SOLR-769. I'd like to 
> incorporate it.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] Commented: (LUCENE-2751) add LuceneTestCase.newSearcher()

2011-01-12 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2751?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12980758#action_12980758
 ] 

Michael McCandless commented on LUCENE-2751:


bq. I think this would be preferred before we go optimizing synchronization, 
because otherwise how do we know if its correct?

+1

> add LuceneTestCase.newSearcher()
> 
>
> Key: LUCENE-2751
> URL: https://issues.apache.org/jira/browse/LUCENE-2751
> Project: Lucene - Java
>  Issue Type: Test
>  Components: Build
>Reporter: Robert Muir
> Fix For: 3.1, 4.0
>
> Attachments: LUCENE-2751.patch, LUCENE-2751.patch
>
>
> Most tests in the search package don't care about what kind of searcher they 
> use.
> we should randomly use MultiSearcher or ParallelMultiSearcher sometimes in 
> tests.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: Lucene-Solr-tests-only-3.x - Build # 3669 - Failure

2011-01-12 Thread Michael McCandless
I committed one possible fix for this... the backwards test was
missing a cms.sync().  But I'm not sure that's the cause of the
failure...

Mike

On Wed, Jan 12, 2011 at 8:03 AM, Apache Hudson Server
 wrote:
> Build: https://hudson.apache.org/hudson/job/Lucene-Solr-tests-only-3.x/3669/
>
> 1 tests failed.
> REGRESSION:  org.apache.lucene.index.TestCrash.testWriterAfterCrash
>
> Error Message:
> MockRAMDirectory: file "_0.tis" is still open: cannot overwrite
>
> Stack Trace:
> java.io.IOException: MockRAMDirectory: file "_0.tis" is still open: cannot 
> overwrite
>        at 
> org.apache.lucene.store.MockRAMDirectory.createOutput(MockRAMDirectory.java:221)
>        at 
> org.apache.lucene.index.TermInfosWriter.initialize(TermInfosWriter.java:100)
>        at 
> org.apache.lucene.index.TermInfosWriter.<init>(TermInfosWriter.java:85)
>        at 
> org.apache.lucene.index.FormatPostingsFieldsWriter.<init>(FormatPostingsFieldsWriter.java:41)
>        at 
> org.apache.lucene.index.FreqProxTermsWriter.flush(FreqProxTermsWriter.java:84)
>        at org.apache.lucene.index.TermsHash.flush(TermsHash.java:109)
>        at org.apache.lucene.index.DocInverter.flush(DocInverter.java:72)
>        at 
> org.apache.lucene.index.DocFieldProcessor.flush(DocFieldProcessor.java:59)
>        at 
> org.apache.lucene.index.DocumentsWriter.flush(DocumentsWriter.java:589)
>        at org.apache.lucene.index.IndexWriter.doFlush(IndexWriter.java:3299)
>        at org.apache.lucene.index.IndexWriter.flush(IndexWriter.java:3264)
>        at 
> org.apache.lucene.index.IndexWriter.addDocument(IndexWriter.java:2040)
>        at 
> org.apache.lucene.index.IndexWriter.addDocument(IndexWriter.java:2007)
>        at org.apache.lucene.index.TestCrash.initIndex(TestCrash.java:51)
>        at 
> org.apache.lucene.index.TestCrash.testWriterAfterCrash(TestCrash.java:77)
>        at 
> org.apache.lucene.util.LuceneTestCase.runBare(LuceneTestCase.java:255)
>
>
>
>
> Build Log (for compile errors):
> [...truncated 8559 lines...]
>
>
>
> -
> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
> For additional commands, e-mail: dev-h...@lucene.apache.org
>
>

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] Issue Comment Edited: (SOLR-2312) CloudSolrServer -- calling add(Collection docs) throws NPE.

2011-01-12 Thread Stan Burnitt (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12980756#action_12980756
 ] 

Stan Burnitt edited comment on SOLR-2312 at 1/12/11 10:45 AM:
--

The CloudSolrServer is always instantiated with an LBHttpSolrServer, either 
created in the single-argument constructor or passed by the user in the 
alternative constructor.

However, the LBHttpSolrServer's javadoc states:  "LBHttpSolrServer  is a load 
balancing wrapper to CommonsHttpSolrServer. This is useful when you have 
multiple SolrServers and the requests need to be Load Balanced among them. This 
should *NOT* be used for indexing. Also see the wiki page."  (Not sure if this 
is relevant.)

One more note: this problem also occurs when I try to delete an index 
containing 0 documents.

  was (Author: stan-b):
The CloudSolrServer is always instantiated with a LBHttpSolrServer, created 
in the single argument construct, or passed by the user in the alternative 
constructor.

However, the LBHttpSolrServer's javadoc states:  "LBHttpSolrServer  is a load 
balancing wrapper to CommonsHttpSolrServer. This is useful when you have 
multiple SolrServers and the requests need to be Load Balanced among them. This 
should NOT be used for indexing. Also see the wiki page."  (Not sure if 
this is relevant.)

One more note: this problem also occurs when I try to delete an index 
containing 0 documents.
  
> CloudSolrServer -- calling add(Collection docs) throws NPE.
> --
>
> Key: SOLR-2312
> URL: https://issues.apache.org/jira/browse/SOLR-2312
> Project: Solr
>  Issue Type: Bug
>  Components: SolrCloud
>Affects Versions: 4.0
> Environment: Mac OSX  v10.5.8
> java version "1.6.0_22"
> Java(TM) SE Runtime Environment (build 1.6.0_22-b04-307-9M3263)
> Java HotSpot(TM) 64-Bit Server VM (build 17.1-b03-307, mixed mode)
>Reporter: Stan Burnitt
>Priority: Critical
> Fix For: 4.0
>
>
> Cannot index documents.
> Below is a code snippet that reproduces the error.
> Cannot index documents.
> Below is a snippet for reproducing the error.
> {code:borderStyle=solid}
>   @Test
>   public void jiraTestCase() {
>   CloudSolrServer solrj = null;
>
>   try {
>   solrj = new 
> CloudSolrServer("your.zookeeper.localdomain:2181");
>   // Also tried creating CloudSolrServer using 
> alternative constructor below...
>   // public CloudSolrServer(String zkHost, 
> LBHttpSolrServer lbServer)
>   //
>   // LBHttpSolrServer lbHttpSolrServer = new 
> LBHttpSolrServer("http://solr.localdomain:8983/solr";);
>   // solrj = new 
> CloudSolrServer("your.zookeeper.localdomain:2181", lbHttpSolrServer);
>   //
>   // (Same result -- NPE @ line 105 in 
> CloudSolrServer.java)
>   solrj.setDefaultCollection("your-collection");
>   solrj.setZkClientTimeout(5000);
>   solrj.setZkConnectTimeout(5000);
>   final Collection<SolrInputDocument> batch = new 
> ArrayList<SolrInputDocument>();
>   SolrInputDocument doc = new SolrInputDocument();
>   doc.addField("id", 1L, 1.0f);
>   doc.addField("title", "Document A");
>   doc.addField("description", "Test document");
>   batch.add(doc);
>   doc = new SolrInputDocument();
>   doc.addField("id", 2L, 1.0f);
>   doc.addField("title", "Document B");
>   doc.addField("description", "Another test 
> document");
>   batch.add(doc);
>   solrj.add(batch);
>   } catch (Exception e) {
>   log.error(e.getMessage(), e);
>   Assert.fail("java.lang.NullPointerException: 
> null \n"
>   + " at 
> org.apache.solr.client.solrj.impl.CloudSolrServer.request(CloudSolrServer.java:105)
>  \n"
>   + " Line 105:  NULL request object here 
> --> String collection = request.getParams().get(\"collection\", 
> defaultCollection);");
>   } finally {
>   solrj.close();
>   }
>   }
> {code} 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Updated: (SOLR-2312) CloudSolrServer -- calling add(Collection docs) throws NPE.

2011-01-12 Thread Stan Burnitt (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-2312?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stan Burnitt updated SOLR-2312:
---

Description: 
Cannot index documents.
Below is a code snippet that reproduces the error.

{code:borderStyle=solid}
@Test
public void jiraTestCase() {
CloudSolrServer solrj = null;
 
try {
solrj = new 
CloudSolrServer("your.zookeeper.localdomain:2181");

// Also tried creating CloudSolrServer using 
alternative constructor below...
// public CloudSolrServer(String zkHost, 
LBHttpSolrServer lbServer)
//
// LBHttpSolrServer lbHttpSolrServer = new 
LBHttpSolrServer("http://solr.localdomain:8983/solr";);
// solrj = new 
CloudSolrServer("your.zookeeper.localdomain:2181", lbHttpSolrServer);
//
// (Same result -- NPE @ line 105 in 
CloudSolrServer.java)

solrj.setDefaultCollection("your-collection");
solrj.setZkClientTimeout(5000);
solrj.setZkConnectTimeout(5000);

final Collection<SolrInputDocument> batch = new 
ArrayList<SolrInputDocument>();
SolrInputDocument doc = new SolrInputDocument();
doc.addField("id", 1L, 1.0f);
doc.addField("title", "Document A");
doc.addField("description", "Test document");
batch.add(doc);

doc = new SolrInputDocument();
doc.addField("id", 2L, 1.0f);
doc.addField("title", "Document B");
doc.addField("description", "Another test 
document");
batch.add(doc);

solrj.add(batch);

} catch (Exception e) {
log.error(e.getMessage(), e);
Assert.fail("java.lang.NullPointerException: 
null \n"
+ " at 
org.apache.solr.client.solrj.impl.CloudSolrServer.request(CloudSolrServer.java:105)
 \n"
+ " Line 105:  NULL request object here 
--> String collection = request.getParams().get(\"collection\", 
defaultCollection);");
} finally {
solrj.close();
}
}
{code} 

  was:
Cannot index documents.
Below is a code snippet that reproduces the error.

Cannot index documents.
Below is a snippet for reproducing the error.

{code:borderStyle=solid}
@Test
public void jiraTestCase() {
CloudSolrServer solrj = null;
 
try {
solrj = new 
CloudSolrServer("your.zookeeper.localdomain:2181");

// Also tried creating CloudSolrServer using 
alternative constructor below...
// public CloudSolrServer(String zkHost, 
LBHttpSolrServer lbServer)
//
// LBHttpSolrServer lbHttpSolrServer = new 
LBHttpSolrServer("http://solr.localdomain:8983/solr";);
// solrj = new 
CloudSolrServer("your.zookeeper.localdomain:2181", lbHttpSolrServer);
//
// (Same result -- NPE @ line 105 in 
CloudSolrServer.java)

solrj.setDefaultCollection("your-collection");
solrj.setZkClientTimeout(5000);
solrj.setZkConnectTimeout(5000);

final Collection<SolrInputDocument> batch = new 
ArrayList<SolrInputDocument>();
SolrInputDocument doc = new SolrInputDocument();
doc.addField("id", 1L, 1.0f);
doc.addField("title", "Document A");
doc.addField("description", "Test document");
batch.add(doc);

doc = new SolrInputDocument();
doc.addField("id", 2L, 1.0f);
doc.addField("title", "Document B");
doc.addField("description", "Another test 
document");
batch.add(doc);

solrj.add(batch);

} catch (Exception e) {
log.error(e.getMessage(), e);
Asse

[jira] Commented: (SOLR-2312) CloudSolrServer -- calling add(Collection docs) throws NPE.

2011-01-12 Thread Stan Burnitt (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12980756#action_12980756
 ] 

Stan Burnitt commented on SOLR-2312:


The CloudSolrServer is always instantiated with an LBHttpSolrServer, either 
created in the single-argument constructor or passed by the user in the 
alternative constructor.

However, the LBHttpSolrServer's javadoc states:  "LBHttpSolrServer  is a load 
balancing wrapper to CommonsHttpSolrServer. This is useful when you have 
multiple SolrServers and the requests need to be Load Balanced among them. This 
should NOT be used for indexing. Also see the wiki page."  (Not sure if 
this is relevant.)

One more note: this problem also occurs when I try to delete an index 
containing 0 documents.

> CloudSolrServer -- calling add(Collection docs) throws NPE.
> --
>
> Key: SOLR-2312
> URL: https://issues.apache.org/jira/browse/SOLR-2312
> Project: Solr
>  Issue Type: Bug
>  Components: SolrCloud
>Affects Versions: 4.0
> Environment: Mac OSX  v10.5.8
> java version "1.6.0_22"
> Java(TM) SE Runtime Environment (build 1.6.0_22-b04-307-9M3263)
> Java HotSpot(TM) 64-Bit Server VM (build 17.1-b03-307, mixed mode)
>Reporter: Stan Burnitt
>Priority: Critical
> Fix For: 4.0
>
>
> Cannot index documents.
> Below is a code snippet that reproduces the error.
> Cannot index documents.
> Below is a snippet for reproducing the error.
> {code:borderStyle=solid}
>   @Test
>   public void jiraTestCase() {
>   CloudSolrServer solrj = null;
>
>   try {
>   solrj = new 
> CloudSolrServer("your.zookeeper.localdomain:2181");
>   // Also tried creating CloudSolrServer using 
> alternative constructor below...
>   // public CloudSolrServer(String zkHost, 
> LBHttpSolrServer lbServer)
>   //
>   // LBHttpSolrServer lbHttpSolrServer = new 
> LBHttpSolrServer("http://solr.localdomain:8983/solr";);
>   // solrj = new 
> CloudSolrServer("your.zookeeper.localdomain:2181", lbHttpSolrServer);
>   //
>   // (Same result -- NPE @ line 105 in 
> CloudSolrServer.java)
>   solrj.setDefaultCollection("your-collection");
>   solrj.setZkClientTimeout(5000);
>   solrj.setZkConnectTimeout(5000);
>   final Collection<SolrInputDocument> batch = new 
> ArrayList<SolrInputDocument>();
>   SolrInputDocument doc = new SolrInputDocument();
>   doc.addField("id", 1L, 1.0f);
>   doc.addField("title", "Document A");
>   doc.addField("description", "Test document");
>   batch.add(doc);
>   doc = new SolrInputDocument();
>   doc.addField("id", 2L, 1.0f);
>   doc.addField("title", "Document B");
>   doc.addField("description", "Another test 
> document");
>   batch.add(doc);
>   solrj.add(batch);
>   } catch (Exception e) {
>   log.error(e.getMessage(), e);
>   Assert.fail("java.lang.NullPointerException: 
> null \n"
>   + " at 
> org.apache.solr.client.solrj.impl.CloudSolrServer.request(CloudSolrServer.java:105)
>  \n"
>   + " Line 105:  NULL request object here 
> --> String collection = request.getParams().get(\"collection\", 
> defaultCollection);");
>   } finally {
>   solrj.close();
>   }
>   }
> {code} 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] Created: (SOLR-2312) CloudSolrServer -- calling add(Collection docs) throws NPE.

2011-01-12 Thread Stan Burnitt (JIRA)
CloudSolrServer -- calling add(Collection docs) throws NPE.
--

 Key: SOLR-2312
 URL: https://issues.apache.org/jira/browse/SOLR-2312
 Project: Solr
  Issue Type: Bug
  Components: SolrCloud
Affects Versions: 4.0
 Environment: Mac OSX  v10.5.8
java version "1.6.0_22"
Java(TM) SE Runtime Environment (build 1.6.0_22-b04-307-9M3263)
Java HotSpot(TM) 64-Bit Server VM (build 17.1-b03-307, mixed mode)
Reporter: Stan Burnitt
Priority: Critical
 Fix For: 4.0


Cannot index documents.
Below is a code snippet that reproduces the error.

Cannot index documents.
Below is a snippet for reproducing the error.

{code:borderStyle=solid}
@Test
public void jiraTestCase() {
CloudSolrServer solrj = null;
 
try {
solrj = new 
CloudSolrServer("your.zookeeper.localdomain:2181");

// Also tried creating CloudSolrServer using 
alternative contstuctor below...
// public CloudSolrServer(String zkHost, 
LBHttpSolrServer lbServer)
//
// LBHttpSolrServer lbHttpSolrServer = new 
LBHttpSolrServer("http://solr.localdomain:8983/solr";);
// solrj = new 
CloudSolrServer("your.zookeeper.localdomain:2181", lbHttpSolrServer);
//
// (Same result -- NPE @ line 105 in 
CloudSolrServer.java)

solrj.setDefaultCollection("your-collection");
solrj.setZkClientTimeout(5000);
solrj.setZkConnectTimeout(5000);

final Collection batch = new 
ArrayList();
SolrInputDocument doc = new SolrInputDocument();
doc.addField("id", 1L, 1.0f);
doc.addField("title", "Document A");
doc.addField("description", "Test document");
batch.add(doc);

doc = new SolrInputDocument();
doc.addField("id", 2L, 1.0f);
doc.addField("title", "Document B");
doc.addField("description", "Another test 
document");
batch.add(doc);

solrj.add(batch);

} catch (Exception e) {
log.error(e.getMessage(), e);
Assert.fail("java.lang.NullPointerException: 
null \n"
+ " at 
org.apache.solr.client.solrj.impl.CloudSolrServer.request(CloudSolrServer.java:105)
 \n"
+ " Line 105:  NULL request object here 
--> String collection = request.getParams().get(\"collection\", 
defaultCollection);");
} finally {
solrj.close();
}
}
{code} 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] Commented: (SOLR-2282) Distributed Support for Search Result Clustering

2011-01-12 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12980749#action_12980749
 ] 

Robert Muir commented on SOLR-2282:
---

sorry guys, i screwed this up, by not adding logic to the 
BaseDistributedTestCase to make it work for contribs, from resources.

I saw that it extended SolrTestCaseJ4 but I neglected to realize that it doesnt 
use initCore, so i'll take a look at fixing this.

> Distributed Support for Search Result Clustering
> 
>
> Key: SOLR-2282
> URL: https://issues.apache.org/jira/browse/SOLR-2282
> Project: Solr
>  Issue Type: New Feature
>  Components: contrib - Clustering
>Affects Versions: 1.4, 1.4.1
>Reporter: Koji Sekiguchi
>Assignee: Koji Sekiguchi
>Priority: Minor
> Fix For: 3.1, 4.0
>
> Attachments: SOLR-2282.patch, SOLR-2282.patch, SOLR-2282.patch, 
> SOLR-2282.patch, SOLR-2282.patch
>
>
> Brad Giaccio contributed a patch for this in SOLR-769. I'd like to 
> incorporate it.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] Commented: (LUCENE-2694) MTQ rewrite + weight/scorer init should be single pass

2011-01-12 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12980748#action_12980748
 ] 

Robert Muir commented on LUCENE-2694:
-

bq. seems like this time perf rules out purity in the interface.

I know i didn't like this aspect of the patch, but I am ok with it for now as 
long as we keep things experimental and try to keep an eye on improving the 
'purity' of TermsEnum a bit.
we are making a lot of progress on the terms handling with flexible indexing 
and i could easily see more interesting implementations being available other 
than just PrefixCoded...
In some ideal world I guess i'd prefer if TermsEnum was an attributesource with 
seek() and next(), FilteredTermsEnum was like tokenFilter, and TermState was 
just captureState/restoreState...
but I agree we should just lean towards whatever works for now.

definitely like it better now that things such as docFreq() are pulled out of 
termstate and its completely opaque, i think this is the right way to go.

> MTQ rewrite + weight/scorer init should be single pass
> --
>
> Key: LUCENE-2694
> URL: https://issues.apache.org/jira/browse/LUCENE-2694
> Project: Lucene - Java
>  Issue Type: Improvement
>  Components: Search
>Reporter: Michael McCandless
>Assignee: Simon Willnauer
> Fix For: 4.0
>
> Attachments: LUCENE-2694-FTE.patch, LUCENE-2694.patch, 
> LUCENE-2694.patch, LUCENE-2694.patch, LUCENE-2694.patch, LUCENE-2694.patch, 
> LUCENE-2694.patch, LUCENE-2694.patch, LUCENE-2694.patch, LUCENE-2694.patch, 
> LUCENE-2694.patch, LUCENE-2694_hack.patch
>
>
> Spinoff of LUCENE-2690 (see the hacked patch on that issue)...
> Once we fix MTQ rewrite to be per-segment, we should take it further and make 
> weight/scorer init also run in the same single pass as rewrite.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: svn commit: r1058162 - in /lucene/dev/trunk/solr/contrib/clustering/src/test: java/org/apache/solr/handler/clustering/AbstractClusteringTestCase.java resources/solr-clustering/ resources/solr/

2011-01-12 Thread Robert Muir
I don't think we should do this, as it really confuses the build.

Because test-resources are now resources, their contents should not
conflict with each other.
this is why contribs now have solr-XXX directories.

If you want to change this in SolrTestCaseJ4: use initCore(xxx, yyy,
"solr-clustering")
if you want to change this in AbstractSolrTestCase: you override:
  public String getSolrHome()

Can you please rename the directory back? It seems we just need to fix
BaseDistributedTestCase to allow you to override this parameter, i can
help


On Wed, Jan 12, 2011 at 9:58 AM,   wrote:
> Author: koji
> Date: Wed Jan 12 14:58:49 2011
> New Revision: 1058162
>
> URL: http://svn.apache.org/viewvc?rev=1058162&view=rev
> Log:
> SOLR-2282: rename solr-clustering to solr
>
> Added:
>    lucene/dev/trunk/solr/contrib/clustering/src/test/resources/solr/
>      - copied from r1058152, 
> lucene/dev/trunk/solr/contrib/clustering/src/test/resources/solr-clustering/
> Removed:
>    
> lucene/dev/trunk/solr/contrib/clustering/src/test/resources/solr-clustering/
> Modified:
>    
> lucene/dev/trunk/solr/contrib/clustering/src/test/java/org/apache/solr/handler/clustering/AbstractClusteringTestCase.java
>
> Modified: 
> lucene/dev/trunk/solr/contrib/clustering/src/test/java/org/apache/solr/handler/clustering/AbstractClusteringTestCase.java
> URL: 
> http://svn.apache.org/viewvc/lucene/dev/trunk/solr/contrib/clustering/src/test/java/org/apache/solr/handler/clustering/AbstractClusteringTestCase.java?rev=1058162&r1=1058161&r2=1058162&view=diff
> ==
> --- 
> lucene/dev/trunk/solr/contrib/clustering/src/test/java/org/apache/solr/handler/clustering/AbstractClusteringTestCase.java
>  (original)
> +++ 
> lucene/dev/trunk/solr/contrib/clustering/src/test/java/org/apache/solr/handler/clustering/AbstractClusteringTestCase.java
>  Wed Jan 12 14:58:49 2011
> @@ -28,7 +28,7 @@ public abstract class AbstractClustering
>
>   @BeforeClass
>   public static void beforeClass() throws Exception {
> -    initCore("solrconfig.xml", "schema.xml", "solr-clustering");
> +    initCore("solrconfig.xml", "schema.xml", "solr");
>     numberOfDocs = 0;
>     for (String[] doc : DOCUMENTS) {
>       assertNull(h.validateUpdate(adoc("id", Integer.toString(numberOfDocs), 
> "url", doc[0], "title", doc[1], "snippet", doc[2])));
>
>
>

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] Commented: (SOLR-2282) Distributed Support for Search Result Clustering

2011-01-12 Thread Koji Sekiguchi (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12980741#action_12980741
 ] 

Koji Sekiguchi commented on SOLR-2282:
--

I've committed the fix for "unknown field 'url'".

> Distributed Support for Search Result Clustering
> 
>
> Key: SOLR-2282
> URL: https://issues.apache.org/jira/browse/SOLR-2282
> Project: Solr
>  Issue Type: New Feature
>  Components: contrib - Clustering
>Affects Versions: 1.4, 1.4.1
>Reporter: Koji Sekiguchi
>Assignee: Koji Sekiguchi
>Priority: Minor
> Fix For: 3.1, 4.0
>
> Attachments: SOLR-2282.patch, SOLR-2282.patch, SOLR-2282.patch, 
> SOLR-2282.patch, SOLR-2282.patch
>
>
> Brad Giaccio contributed a patch for this in SOLR-769. I'd like to 
> incorporate it.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] Commented: (LUCENE-2793) Directory createOutput and openInput should take an IOContext

2011-01-12 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12980740#action_12980740
 ] 

Robert Muir commented on LUCENE-2793:
-

{quote}
I'm perfectly OK with that approach (having some module FSDir checks). I also 
feel uneasy having JNI in core.
What I don't want to see, is Directory impls that you can't use on their own. 
If you can only use it for merging, then it's not a Directory, it breaks the 
contract! - move the code elsewhere.
{quote}

Right, i think we all agree we want to fix the DirectIOLinuxDirectory into 
being a 'real' directory?

As i said before, from a practical perspective, it could be named 
LinuxDirectory, extend NIOFS, 
and when openInput(IOContext=Merge) it opens its special input. but personally 
i don't care how 
we actually implement it 'becoming a real directory'. this is another issue, 
unrelated to this one really.

this issue is enough and should stand on its own... we should be able to do 
enough nice things
here without dealing with JNI: improving our existing directory impls to use 
larger buffer sizes by 
default when merging, etc (like in your example).

> Directory createOutput and openInput should take an IOContext
> -
>
> Key: LUCENE-2793
> URL: https://issues.apache.org/jira/browse/LUCENE-2793
> Project: Lucene - Java
>  Issue Type: Improvement
>  Components: Store
>Reporter: Michael McCandless
> Attachments: LUCENE-2793.patch
>
>
> Today for merging we pass down a larger readBufferSize than for searching 
> because we get better performance.
> I think we should generalize this to a class (IOContext), which would hold 
> the buffer size, but then could hold other flags like DIRECT (bypass OS's 
> buffer cache), SEQUENTIAL, etc.
> Then, we can make the DirectIOLinuxDirectory fully usable because we would 
> only use DIRECT/SEQUENTIAL during merging.
> This will require fixing how IW pools readers, so that a reader opened for 
> merging is not then used for searching, and vice/versa.  Really, it's only 
> all the open file handles that need to be different -- we could in theory 
> share del docs, norms, etc, if that were somehow possible.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] Commented: (LUCENE-2793) Directory createOutput and openInput should take an IOContext

2011-01-12 Thread Earwin Burrfoot (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12980736#action_12980736
 ] 

Earwin Burrfoot commented on LUCENE-2793:
-

{quote}
As I said before though, i wouldn't mind if we had something more like a 
'modules/native' and FSDirectory checked, if this was available and 
automagically used it...
but I can't see myself thinking that we should put this logic into fsdir 
itself, sorry. 
{quote}
I'm perfectly OK with that approach (having some module FSDir checks). I also 
feel uneasy having JNI in core.
What I don't want to see, is Directory impls that you can't use on their own. 
If you can only use it for merging, then it's not a Directory, it breaks the 
contract! - move the code elsewhere.

> Directory createOutput and openInput should take an IOContext
> -
>
> Key: LUCENE-2793
> URL: https://issues.apache.org/jira/browse/LUCENE-2793
> Project: Lucene - Java
>  Issue Type: Improvement
>  Components: Store
>Reporter: Michael McCandless
> Attachments: LUCENE-2793.patch
>
>
> Today for merging we pass down a larger readBufferSize than for searching 
> because we get better performance.
> I think we should generalize this to a class (IOContext), which would hold 
> the buffer size, but then could hold other flags like DIRECT (bypass OS's 
> buffer cache), SEQUENTIAL, etc.
> Then, we can make the DirectIOLinuxDirectory fully usable because we would 
> only use DIRECT/SEQUENTIAL during merging.
> This will require fixing how IW pools readers, so that a reader opened for 
> merging is not then used for searching, and vice/versa.  Really, it's only 
> all the open file handles that need to be different -- we could in theory 
> share del docs, norms, etc, if that were somehow possible.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] Commented: (LUCENE-2751) add LuceneTestCase.newSearcher()

2011-01-12 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2751?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12980735#action_12980735
 ] 

Robert Muir commented on LUCENE-2751:
-

We should be able to actually implement this issue now right?

Implement LuceneTestCase.newSearcher(), and like the previous patch, sometimes 
use parallel in tests.
I think this would be preferred before we go optimizing synchronization, 
because otherwise how do we know if its correct?


> add LuceneTestCase.newSearcher()
> 
>
> Key: LUCENE-2751
> URL: https://issues.apache.org/jira/browse/LUCENE-2751
> Project: Lucene - Java
>  Issue Type: Test
>  Components: Build
>Reporter: Robert Muir
> Fix For: 3.1, 4.0
>
> Attachments: LUCENE-2751.patch, LUCENE-2751.patch
>
>
> Most tests in the search package don't care about what kind of searcher they 
> use.
> we should randomly use MultiSearcher or ParallelMultiSearcher sometimes in 
> tests.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] Commented: (LUCENE-2793) Directory createOutput and openInput should take an IOContext

2011-01-12 Thread Earwin Burrfoot (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12980732#action_12980732
 ] 

Earwin Burrfoot commented on LUCENE-2793:
-

bq. Because in your example code above, it looks like it's added to Directory 
itself.
bq. My problem with your sample code is that it appears that the .setBufferSize 
method is on Directory itself. 

Ohoho. My fault, sorry. It should look like:
{code}
RAMDirectory ramDir = new RAMDirectory();
ramDir.setBufferSize(whatever) // Compilation error!
ramDir.createIndexInput(name, context);

NIOFSDirectory fsDir = new NIOFSDirectory();
fsDir.setBufferSize(IOContext.NORMAL_READ, 1024);
fsDir.setBufferSize(IOContext.MERGE, 4096);
fsDir.createIndexInput(name, context)
{code}

> Directory createOutput and openInput should take an IOContext
> -
>
> Key: LUCENE-2793
> URL: https://issues.apache.org/jira/browse/LUCENE-2793
> Project: Lucene - Java
>  Issue Type: Improvement
>  Components: Store
>Reporter: Michael McCandless
> Attachments: LUCENE-2793.patch
>
>
> Today for merging we pass down a larger readBufferSize than for searching 
> because we get better performance.
> I think we should generalize this to a class (IOContext), which would hold 
> the buffer size, but then could hold other flags like DIRECT (bypass OS's 
> buffer cache), SEQUENTIAL, etc.
> Then, we can make the DirectIOLinuxDirectory fully usable because we would 
> only use DIRECT/SEQUENTIAL during merging.
> This will require fixing how IW pools readers, so that a reader opened for 
> merging is not then used for searching, and vice/versa.  Really, it's only 
> all the open file handles that need to be different -- we could in theory 
> share del docs, norms, etc, if that were somehow possible.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] Commented: (LUCENE-2793) Directory createOutput and openInput should take an IOContext

2011-01-12 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12980731#action_12980731
 ] 

Robert Muir commented on LUCENE-2793:
-

bq. The proper way is to test out the things, and then move DirectIO code to 
the only place it makes sense in - FSDir? Probably make it switch on/off-able, 
maybe not. 

I'm not sure it should be there... at least not soon. its not even something 
you can implement in pure java? 
we definitely have to keep it still simple and possible for people to use the 
java library in a platform-indepedent way. 
its also a bit dangerous, whenever JNI is involvedeven if its working. 

So I think its craziness, to put this direct-io stuff in fsdirectory itself. 

As I said before though, i wouldn't mind if we had something more like a 
'modules/native' and FSDirectory checked, if this was available and 
automagically used it... 
but I can't see myself thinking that we should put this logic into fsdir 
itself, sorry. 

bq. Sample code 

My problem with your sample code is that it appears that the .setBufferSize 
method is on Directory itself. 
Again i disagree with this because: 
* its useless to certain directories like MMapDirectory 
* its dangerous in the direct-io case (different platforms have strict 
requirements that things be sector-aligned etc, see the mac case where it 
actually 'works' if the buffer isnt, but is just slow). 

I definitely don't like the confusion regarding buffersizes now. A very small % 
of the time its actually meaningful and should be respected, 
but most of the time the value is completely bogus. 

> Directory createOutput and openInput should take an IOContext
> -
>
> Key: LUCENE-2793
> URL: https://issues.apache.org/jira/browse/LUCENE-2793
> Project: Lucene - Java
>  Issue Type: Improvement
>  Components: Store
>Reporter: Michael McCandless
> Attachments: LUCENE-2793.patch
>
>
> Today for merging we pass down a larger readBufferSize than for searching 
> because we get better performance.
> I think we should generalize this to a class (IOContext), which would hold 
> the buffer size, but then could hold other flags like DIRECT (bypass OS's 
> buffer cache), SEQUENTIAL, etc.
> Then, we can make the DirectIOLinuxDirectory fully usable because we would 
> only use DIRECT/SEQUENTIAL during merging.
> This will require fixing how IW pools readers, so that a reader opened for 
> merging is not then used for searching, and vice/versa.  Really, it's only 
> all the open file handles that need to be different -- we could in theory 
> share del docs, norms, etc, if that were somehow possible.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] Resolved: (LUCENE-2860) SegmentInfo.sizeInBytes ignore includeDocStore when caching

2011-01-12 Thread Shai Erera (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-2860?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shai Erera resolved LUCENE-2860.


Resolution: Fixed

Committed revision 1058147 (3x).
Committed revision 1058155 (trunk).

> SegmentInfo.sizeInBytes ignore includeDocStore when caching
> ---
>
> Key: LUCENE-2860
> URL: https://issues.apache.org/jira/browse/LUCENE-2860
> Project: Lucene - Java
>  Issue Type: Bug
>  Components: Index
>Reporter: Shai Erera
>Assignee: Shai Erera
>Priority: Minor
> Fix For: 3.1, 4.0
>
> Attachments: LUCENE-2860.patch
>
>
> I noticed that SegmentInfo's sizeInBytes cache is potentially buggy -- it 
> doesn't take into account 'includeDocStores'. I.e., if you call it once w/ 
> 'false' (sizeInBytes won't include the store files) and then with 'true' (or 
> vice versa), you won't get the right sizeInBytes (it won't re-compute, with 
> the store files).
> I'll fix and add a test case demonstrating the bug.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] Commented: (LUCENE-2793) Directory createOutput and openInput should take an IOContext

2011-01-12 Thread Shai Erera (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12980727#action_12980727
 ] 

Shai Erera commented on LUCENE-2793:


I assume you mean setBufferSize(IOContext, size) should be added to specific 
Directory impls, and not Directory? Because in your example code above, it 
looks like it's added to Directory itself. Though we can add it to Directory as 
well, and do nothing there. It simplifies matters as you don't need to check 
whether the Dir you receive supports setting buffer size (in case you're not 
the one creating it).

At any rate, this looks like it can work too.

> Directory createOutput and openInput should take an IOContext
> -
>
> Key: LUCENE-2793
> URL: https://issues.apache.org/jira/browse/LUCENE-2793
> Project: Lucene - Java
>  Issue Type: Improvement
>  Components: Store
>Reporter: Michael McCandless
> Attachments: LUCENE-2793.patch
>
>
> Today for merging we pass down a larger readBufferSize than for searching 
> because we get better performance.
> I think we should generalize this to a class (IOContext), which would hold 
> the buffer size, but then could hold other flags like DIRECT (bypass OS's 
> buffer cache), SEQUENTIAL, etc.
> Then, we can make the DirectIOLinuxDirectory fully usable because we would 
> only use DIRECT/SEQUENTIAL during merging.
> This will require fixing how IW pools readers, so that a reader opened for 
> merging is not then used for searching, and vice/versa.  Really, it's only 
> all the open file handles that need to be different -- we could in theory 
> share del docs, norms, etc, if that were somehow possible.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] Updated: (LUCENE-2694) MTQ rewrite + weight/scorer init should be single pass

2011-01-12 Thread Simon Willnauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-2694?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Simon Willnauer updated LUCENE-2694:


Attachment: LUCENE-2694.patch

This patch changes TermsEnum#seek(TermState) back to TermsEnum#seek(BytesRef, 
TermState). Yet, TermState is opaque now and TermsEnum has a default impl for 
TermsEnum#seek(BytesRef, TermState). Holding the BytesRef in TermState for our 
PrefixCoded* based codecs seems way too costly though. seems like this time 
perf rules out purity in the interface.

> MTQ rewrite + weight/scorer init should be single pass
> --
>
> Key: LUCENE-2694
> URL: https://issues.apache.org/jira/browse/LUCENE-2694
> Project: Lucene - Java
>  Issue Type: Improvement
>  Components: Search
>Reporter: Michael McCandless
>Assignee: Simon Willnauer
> Fix For: 4.0
>
> Attachments: LUCENE-2694-FTE.patch, LUCENE-2694.patch, 
> LUCENE-2694.patch, LUCENE-2694.patch, LUCENE-2694.patch, LUCENE-2694.patch, 
> LUCENE-2694.patch, LUCENE-2694.patch, LUCENE-2694.patch, LUCENE-2694.patch, 
> LUCENE-2694.patch, LUCENE-2694_hack.patch
>
>
> Spinoff of LUCENE-2690 (see the hacked patch on that issue)...
> Once we fix MTQ rewrite to be per-segment, we should take it further and make 
> weight/scorer init also run in the same single pass as rewrite.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] Commented: (SOLR-1604) Wildcards, ORs etc inside Phrase Queries

2011-01-12 Thread Ahmet Arslan (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12980702#action_12980702
 ] 

Ahmet Arslan commented on SOLR-1604:


Salman, what you are after is nested proximity search which is not currently 
available in Solr. For alternatives see http://search-lucene.com/m/94ONm1KRuAv1/

Regarding parenthesis inside quotes and inOrder, they are covered in TestCases.

What version of solr did you use? 

> Wildcards, ORs etc inside Phrase Queries
> 
>
> Key: SOLR-1604
> URL: https://issues.apache.org/jira/browse/SOLR-1604
> Project: Solr
>  Issue Type: Improvement
>  Components: search
>Affects Versions: 1.4
>Reporter: Ahmet Arslan
>Priority: Minor
> Fix For: Next
>
> Attachments: ComplexPhrase.zip, ComplexPhrase.zip, ComplexPhrase.zip, 
> ComplexPhrase.zip, ComplexPhraseQueryParser.java, SOLR-1604.patch
>
>
> Solr Plugin for ComplexPhraseQueryParser (LUCENE-1486) which supports 
> wildcards, ORs, ranges, fuzzies inside phrase queries.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Lucene-Solr-tests-only-3.x - Build # 3669 - Failure

2011-01-12 Thread Apache Hudson Server
Build: https://hudson.apache.org/hudson/job/Lucene-Solr-tests-only-3.x/3669/

1 tests failed.
REGRESSION:  org.apache.lucene.index.TestCrash.testWriterAfterCrash

Error Message:
MockRAMDirectory: file "_0.tis" is still open: cannot overwrite

Stack Trace:
java.io.IOException: MockRAMDirectory: file "_0.tis" is still open: cannot 
overwrite
at 
org.apache.lucene.store.MockRAMDirectory.createOutput(MockRAMDirectory.java:221)
at 
org.apache.lucene.index.TermInfosWriter.initialize(TermInfosWriter.java:100)
at 
org.apache.lucene.index.TermInfosWriter.(TermInfosWriter.java:85)
at 
org.apache.lucene.index.FormatPostingsFieldsWriter.(FormatPostingsFieldsWriter.java:41)
at 
org.apache.lucene.index.FreqProxTermsWriter.flush(FreqProxTermsWriter.java:84)
at org.apache.lucene.index.TermsHash.flush(TermsHash.java:109)
at org.apache.lucene.index.DocInverter.flush(DocInverter.java:72)
at 
org.apache.lucene.index.DocFieldProcessor.flush(DocFieldProcessor.java:59)
at 
org.apache.lucene.index.DocumentsWriter.flush(DocumentsWriter.java:589)
at org.apache.lucene.index.IndexWriter.doFlush(IndexWriter.java:3299)
at org.apache.lucene.index.IndexWriter.flush(IndexWriter.java:3264)
at 
org.apache.lucene.index.IndexWriter.addDocument(IndexWriter.java:2040)
at 
org.apache.lucene.index.IndexWriter.addDocument(IndexWriter.java:2007)
at org.apache.lucene.index.TestCrash.initIndex(TestCrash.java:51)
at 
org.apache.lucene.index.TestCrash.testWriterAfterCrash(TestCrash.java:77)
at 
org.apache.lucene.util.LuceneTestCase.runBare(LuceneTestCase.java:255)




Build Log (for compile errors):
[...truncated 8559 lines...]



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] Updated: (LUCENE-2862) Track total term freq per term

2011-01-12 Thread Michael McCandless (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-2862?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael McCandless updated LUCENE-2862:
---

Attachment: LUCENE-2862.patch

Patch, adds TermsEnum.totalTermFreq (returns -1 if codec doesn't impl it, or if 
omitTFAP is on) and Terms.getSumTotalTermFreq (= sum across all terms in this 
field).

> Track total term freq per term
> --
>
> Key: LUCENE-2862
> URL: https://issues.apache.org/jira/browse/LUCENE-2862
> Project: Lucene - Java
>  Issue Type: Improvement
>  Components: Index
>Reporter: Michael McCandless
>Assignee: Michael McCandless
> Fix For: 4.0
>
> Attachments: LUCENE-2862.patch
>
>
> Right now we track docFreq for each term (how many docs have the
> term), but the totalTermFreq (total number of occurrences of this
> term, ie sum of freq() for each doc that has the term) is also a
> useful stat (for flex scoring, PulsingCodec, etc.).

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] Created: (LUCENE-2862) Track total term freq per term

2011-01-12 Thread Michael McCandless (JIRA)
Track total term freq per term
--

 Key: LUCENE-2862
 URL: https://issues.apache.org/jira/browse/LUCENE-2862
 Project: Lucene - Java
  Issue Type: Improvement
  Components: Index
Reporter: Michael McCandless
Assignee: Michael McCandless
 Fix For: 4.0


Right now we track docFreq for each term (how many docs have the
term), but the totalTermFreq (total number of occurrences of this
term, ie sum of freq() for each doc that has the term) is also a
useful stat (for flex scoring, PulsingCodec, etc.).


-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] Issue Comment Edited: (SOLR-1604) Wildcards, ORs etc inside Phrase Queries

2011-01-12 Thread Salman Akram (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12980660#action_12980660
 ] 

Salman Akram edited comment on SOLR-1604 at 1/12/11 6:41 AM:
-

I integrated the patch and its working fine however, there were couple of 
issues. 

One is related to un-ordered proximity which seems to be fixed with the inOrder 
parameter but its not working for me (doesn't give any error but its still 
ordered). I will try to get the patch again coz I also merged it in early Nov 
so maybe it was applied after that.

The other issue is that although proximity search works with phrases BUT its 
not very accurate e.g. If I want to search   "a b" within 10 words of "c" the 
query would end up being "a b c"~10 but this will also return cases where "a" 
is not necessarily together with "b". Any scenario where these 3 words are 
within 10 words of each other will match.

Is it possible in SOLR to do what I mentioned above? Any other patch? Something 
like " "a b" c "~10...

Note: I was going through Lucene-1486 and there Ahmet mentioned that 
"Specifically : "(john johathon) smith"~10 " works perfectly. For me it seems 
there is no difference if I put the parenthesis or not.

Thanks!

  was (Author: salman741):
I integrated the patch and its working fine however, there were couple of 
issues. One is already resolved with the above un-ordered proximity parameters.

The issue is that although proximity search works with phrases BUT its not very 
accurate e.g. If I want to search   "a b" within 10 words of "c" the query 
would end up being "a b c"~10 but this will also return cases where "a" is not 
necessarily together with "b". Any scenario where these 3 words are within 10 
words of each other will match.

Is it possible in SOLR to do what I mentioned above? Any other patch? Something 
like " "a b" c "~10...

Note: I was going through Lucene-1486 and there Ahmet mentioned that 
"Specifically : "(john johathon) smith"~10 " works perfectly. For me it seems 
there is no difference if I put the parenthesis or not.

Thanks!
  
> Wildcards, ORs etc inside Phrase Queries
> 
>
> Key: SOLR-1604
> URL: https://issues.apache.org/jira/browse/SOLR-1604
> Project: Solr
>  Issue Type: Improvement
>  Components: search
>Affects Versions: 1.4
>Reporter: Ahmet Arslan
>Priority: Minor
> Fix For: Next
>
> Attachments: ComplexPhrase.zip, ComplexPhrase.zip, ComplexPhrase.zip, 
> ComplexPhrase.zip, ComplexPhraseQueryParser.java, SOLR-1604.patch
>
>
> Solr Plugin for ComplexPhraseQueryParser (LUCENE-1486) which supports 
> wildcards, ORs, ranges, fuzzies inside phrase queries.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] Issue Comment Edited: (SOLR-1604) Wildcards, ORs etc inside Phrase Queries

2011-01-12 Thread Salman Akram (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12980660#action_12980660
 ] 

Salman Akram edited comment on SOLR-1604 at 1/12/11 6:37 AM:
-

I integrated the patch and its working fine however, there were couple of 
issues. One is already resolved with the above un-ordered proximity parameters.

The issue is that although proximity search works with phrases BUT its not very 
accurate e.g. If I want to search   "a b" within 10 words of "c" the query 
would end up being "a b c"~10 but this will also return cases where "a" is not 
necessarily together with "b". Any scenario where these 3 words are within 10 
words of each other will match.

Is it possible in SOLR to do what I mentioned above? Any other patch? Something 
like " "a b" c "~10...

Note: I was going through Lucene-1486 and there Ahmet mentioned that 
"Specifically : "(john johathon) smith"~10 " works perfectly. For me it seems 
there is no difference if I put the parenthesis or not.

Thanks!

  was (Author: salman741):
I integrated the patch and its working fine however, there were couple of 
issues. One is already resolved with the above un-ordered proximity parameters.

The issue is that although proximity search works with phrases BUT its not very 
accurate e.g. If I want to search   "a b" within 10 words of "c" the query 
would end up being "a b c"~10 but this will also return cases where "a" is not 
necessarily together with "b". Any scenario where these 3 words are within 10 
words of each other will match.

Is it possible in SOLR to do what I mentioned above? Any other patch? Something 
like " "a b" c "~10...

Thanks!
  
> Wildcards, ORs etc inside Phrase Queries
> 
>
> Key: SOLR-1604
> URL: https://issues.apache.org/jira/browse/SOLR-1604
> Project: Solr
>  Issue Type: Improvement
>  Components: search
>Affects Versions: 1.4
>Reporter: Ahmet Arslan
>Priority: Minor
> Fix For: Next
>
> Attachments: ComplexPhrase.zip, ComplexPhrase.zip, ComplexPhrase.zip, 
> ComplexPhrase.zip, ComplexPhraseQueryParser.java, SOLR-1604.patch
>
>
> Solr Plugin for ComplexPhraseQueryParser (LUCENE-1486) which supports 
> wildcards, ORs, ranges, fuzzies inside phrase queries.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] Commented: (LUCENE-2831) Revise Weight#scorer & Filter#getDocIdSet API to pass Readers context

2011-01-12 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12980673#action_12980673
 ] 

Michael McCandless commented on LUCENE-2831:


{quote}
bq. Simon - heads up, I'm renaming TermState.copy -> TermState.copyFrom in 
LUCENE-2857.

this comment should be on LUCENE-2694, right?!
{quote}

Duh, right.  But it looks like you got the message anyway ;)

> Revise Weight#scorer & Filter#getDocIdSet API to pass Readers context
> -
>
> Key: LUCENE-2831
> URL: https://issues.apache.org/jira/browse/LUCENE-2831
> Project: Lucene - Java
>  Issue Type: Improvement
>  Components: Search
>Affects Versions: 4.0
>Reporter: Simon Willnauer
>Assignee: Simon Willnauer
> Fix For: 4.0
>
> Attachments: LUCENE-2831-nuke-SolrIndexReader.patch, 
> LUCENE-2831.patch, LUCENE-2831.patch, LUCENE-2831.patch, LUCENE-2831.patch, 
> LUCENE-2831.patch, LUCENE-2831.patch, 
> LUCENE-2831_transition_to_atomicCtx.patch, 
> LUCENE-2831_transition_to_atomicCtx.patch, 
> LUCENE-2831_transition_to_atomicCtx.patch
>
>
> Spinoff from LUCENE-2694 - instead of passing a reader into Weight#scorer(IR, 
> boolean, boolean) we should / could revise the API and pass in a struct that 
> has parent reader, sub reader, ord of that sub. The ord mapping plus the 
> context with its parent would make several issues way easier. See 
> LUCENE-2694, LUCENE-2348 and LUCENE-2829 to name some.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] Commented: (LUCENE-2751) add LuceneTestCase.newSearcher()

2011-01-12 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2751?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12980672#action_12980672
 ] 

Michael McCandless commented on LUCENE-2751:


bq. Hmmm, it looks like the committed patch serializes loading of caches of 
multiple segments (for the same field?)

Ugh, you're right.  I had thought validate was "only" used after initial 
creation (eg, "typically" to add valid bits in), but in fact, create() calls 
validate().

Yonik do you have a patch in mind to fix the root cause correctly?

I have to say... the new FieldCache code is rather hairy.

> add LuceneTestCase.newSearcher()
> 
>
> Key: LUCENE-2751
> URL: https://issues.apache.org/jira/browse/LUCENE-2751
> Project: Lucene - Java
>  Issue Type: Test
>  Components: Build
>Reporter: Robert Muir
> Fix For: 3.1, 4.0
>
> Attachments: LUCENE-2751.patch, LUCENE-2751.patch
>
>
> Most tests in the search package don't care about what kind of searcher they 
> use.
> we should randomly use MultiSearcher or ParallelMultiSearcher sometimes in 
> tests.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] Commented: (LUCENE-2859) Move Multi* and SlowMultiReaderWrapper to contrib

2011-01-12 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12980667#action_12980667
 ] 

Michael McCandless commented on LUCENE-2859:


We could fix that merging uses MultiDocs/AndPositionsEnum.  It's not 
particularly clean because we make Mapping* subclasses to remap the docIDs 
around deletions.  If, instead, we fixed PostingsConsumer.merge to take the 
subs' enums, instead of a single multi enum, then that method could go segment 
by segment.

> Move Multi* and SlowMultiReaderWrapper to contrib
> -
>
> Key: LUCENE-2859
> URL: https://issues.apache.org/jira/browse/LUCENE-2859
> Project: Lucene - Java
>  Issue Type: Task
>  Components: Index
>Reporter: Uwe Schindler
> Fix For: 4.0
>
>
> We should move SlowMultiReaderWrapper and all Multi* classes to contrib as it 
> should not be used anymore.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: Lucene-3.x - Build # 238 - Failure

2011-01-12 Thread Michael McCandless
Can we do something about these "false" failures?

They are failing because of this Hudson bug:

http://issues.hudson-ci.org/browse/HUDSON-7836

But the highish failure rate makes us look bad when people look at our
build stability... which is awful.

Mike

On Mon, Jan 10, 2011 at 6:50 PM, Apache Hudson Server
 wrote:
> Build: https://hudson.apache.org/hudson/job/Lucene-3.x/238/
>
> All tests passed
>
> Build Log (for compile errors):
> [...truncated 21034 lines...]
>
>
>
> -
> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
> For additional commands, e-mail: dev-h...@lucene.apache.org
>
>

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] Commented: (SOLR-1604) Wildcards, ORs etc inside Phrase Queries

2011-01-12 Thread Salman Akram (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12980660#action_12980660
 ] 

Salman Akram commented on SOLR-1604:


I integrated the patch and its working fine however, there were couple of 
issues. One is already resolved with the above un-ordered proximity parameters.

The issue is that although proximity search works with phrases BUT its not very 
accurate e.g. If I want to search   "a b" within 10 words of "c" the query 
would end up being "a b c"~10 but this will also return cases where "a" is not 
necessarily together with "b". Any scenario where these 3 words are within 10 
words of each other will match.

Is it possible in SOLR to do what I mentioned above? Any other patch? Something 
like " "a b" c "~10...

Thanks!

> Wildcards, ORs etc inside Phrase Queries
> 
>
> Key: SOLR-1604
> URL: https://issues.apache.org/jira/browse/SOLR-1604
> Project: Solr
>  Issue Type: Improvement
>  Components: search
>Affects Versions: 1.4
>Reporter: Ahmet Arslan
>Priority: Minor
> Fix For: Next
>
> Attachments: ComplexPhrase.zip, ComplexPhrase.zip, ComplexPhrase.zip, 
> ComplexPhrase.zip, ComplexPhraseQueryParser.java, SOLR-1604.patch
>
>
> Solr Plugin for ComplexPhraseQueryParser (LUCENE-1486) which supports 
> wildcards, ORs, ranges, fuzzies inside phrase queries.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: Lucene-Solr-tests-only-trunk - Build # 3688 - Failure

2011-01-12 Thread Simon Willnauer
This one is my fault... I missed some docBase assignments during the
last LUCENE-2831 commits. I will fix in a second

simon

On Wed, Jan 12, 2011 at 11:05 AM, Apache Hudson Server
 wrote:
> Build: https://hudson.apache.org/hudson/job/Lucene-Solr-tests-only-trunk/3688/
>
> 3 tests failed.
> REGRESSION:  org.apache.lucene.search.TestSubScorerFreqs.testTermQuery
>
> Error Message:
> expected:<186> but was:<170>
>
> Stack Trace:
> junit.framework.AssertionFailedError: expected:<186> but was:<170>
>        at 
> org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:)
>        at 
> org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1049)
>        at 
> org.apache.lucene.search.TestSubScorerFreqs.testTermQuery(TestSubScorerFreqs.java:151)
>
>
> REGRESSION:  org.apache.lucene.search.TestSubScorerFreqs.testBooleanQuery
>
> Error Message:
> expected:<186> but was:<170>
>
> Stack Trace:
> junit.framework.AssertionFailedError: expected:<186> but was:<170>
>        at 
> org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:)
>        at 
> org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1049)
>        at 
> org.apache.lucene.search.TestSubScorerFreqs.testBooleanQuery(TestSubScorerFreqs.java:185)
>
>
> REGRESSION:  org.apache.lucene.search.TestSubScorerFreqs.testPhraseQuery
>
> Error Message:
> expected:<186> but was:<170>
>
> Stack Trace:
> junit.framework.AssertionFailedError: expected:<186> but was:<170>
>        at 
> org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:)
>        at 
> org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1049)
>        at 
> org.apache.lucene.search.TestSubScorerFreqs.testPhraseQuery(TestSubScorerFreqs.java:215)
>
>
>
>
> Build Log (for compile errors):
> [...truncated 2955 lines...]
>
>
>
> -
> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
> For additional commands, e-mail: dev-h...@lucene.apache.org
>
>

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] Created: (LUCENE-2861) Search doesn't return document via query

2011-01-12 Thread Zenoviy Veres (JIRA)
Search doesn't return document via query


 Key: LUCENE-2861
 URL: https://issues.apache.org/jira/browse/LUCENE-2861
 Project: Lucene - Java
  Issue Type: Bug
  Components: Search
Affects Versions: 3.0.3, 2.9.4, 2.9.1
 Environment: Doesn't depend on enviroment
Reporter: Zenoviy Veres


The query doesn't return document that contain all words from query in correct 
order.

The issue might be within mechanism how do SpanQuerys actually match results 
(http://www.lucidimagination.com/blog/2009/07/18/the-spanquery/)

Please refer for details below. The example text wasn't passed through snowball 
analyzer, however the issue exists after analyzing too

Query:
(intend within 3 of message) within 5 of message within 3 of addressed.  

Text within document:
The contents of this e-mail message and
any attachments are intended solely for the
addressee(s) and may contain confidential
and/or legally privileged information. If you
are not the intended recipient of this message
or if this message has been addressed to you
in error, please immediately alert the sender
 by reply e-mail and then delete this message
and any attachments

Result query:

SpanNearQuery spanNear = new SpanNearQuery(new SpanQuery[] {
new SpanTermQuery(new Term(BODY, "intended")),
new SpanTermQuery(new Term(BODY, "message"))},
4,
false);
SpanNearQuery spanNear2 = new SpanNearQuery(new SpanQuery[] {spanNear, 
new SpanTermQuery(new Term(BODY, "message"))}, 5, false);
SpanNearQuery spanNear3 = new SpanNearQuery(new SpanQuery[] {spanNear2, 
new SpanTermQuery(new Term(BODY, "addressed"))}, 3, false);


-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] Commented: (LUCENE-2860) SegmentInfo.sizeInBytes ignore includeDocStore when caching

2011-01-12 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12980656#action_12980656
 ] 

Michael McCandless commented on LUCENE-2860:


Ugh, my bad!  Thanks Shai.  Patch looks good.

> SegmentInfo.sizeInBytes ignore includeDocStore when caching
> ---
>
> Key: LUCENE-2860
> URL: https://issues.apache.org/jira/browse/LUCENE-2860
> Project: Lucene - Java
>  Issue Type: Bug
>  Components: Index
>Reporter: Shai Erera
>Assignee: Shai Erera
>Priority: Minor
> Fix For: 3.1, 4.0
>
> Attachments: LUCENE-2860.patch
>
>
> I noticed that SegmentInfo's sizeInBytes cache is potentially buggy -- it 
> doesn't take into account 'includeDocStores'. I.e., if you call it once w/ 
> 'false' (sizeInBytes won't include the store files) and then with 'true' (or 
> vice versa), you won't get the right sizeInBytes (it won't re-compute, with 
> the store files).
> I'll fix and add a test case demonstrating the bug.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] Updated: (LUCENE-2831) Revise Weight#scorer & Filter#getDocIdSet API to pass Readers context

2011-01-12 Thread Simon Willnauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-2831?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Simon Willnauer updated LUCENE-2831:


Attachment: LUCENE-2831-nuke-SolrIndexReader.patch

this patch cuts over all function query stuff to AtomicReaderContext in solr & 
lucene. It also nukes SolrIndexReader entirely - yay!! :)
I thinks somebody should give this patch a glance though, especially from the 
solr perspective although all tests pass. 

I had to make the IndexSearcher(ReaderContext, AtomicContext...) ctor public 
which is ok I think and I added a new already deprecated method to ValueSource 
in lucene land to make transition easier.

if nobody objects I will commit later today

> Revise Weight#scorer & Filter#getDocIdSet API to pass Readers context
> -
>
> Key: LUCENE-2831
> URL: https://issues.apache.org/jira/browse/LUCENE-2831
> Project: Lucene - Java
>  Issue Type: Improvement
>  Components: Search
>Affects Versions: 4.0
>Reporter: Simon Willnauer
>Assignee: Simon Willnauer
> Fix For: 4.0
>
> Attachments: LUCENE-2831-nuke-SolrIndexReader.patch, 
> LUCENE-2831.patch, LUCENE-2831.patch, LUCENE-2831.patch, LUCENE-2831.patch, 
> LUCENE-2831.patch, LUCENE-2831.patch, 
> LUCENE-2831_transition_to_atomicCtx.patch, 
> LUCENE-2831_transition_to_atomicCtx.patch, 
> LUCENE-2831_transition_to_atomicCtx.patch
>
>
> Spinoff from LUCENE-2694 - instead of passing a reader into Weight#scorer(IR, 
> boolean, boolean) we should / could revise the API and pass in a struct that 
> has parent reader, sub reader, ord of that sub. The ord mapping plus the 
> context with its parent would make several issues way easier. See 
> LUCENE-2694, LUCENE-2348 and LUCENE-2829 to name some.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] Commented: (LUCENE-2793) Directory createOutput and openInput should take an IOContext

2011-01-12 Thread Earwin Burrfoot (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12980649#action_12980649
 ] 

Earwin Burrfoot commented on LUCENE-2793:
-

What's with ongoing crazyness? :)

bq. DirectIOLinuxDirectory
First you introduce a kind of directory that is utterly useless except certain 
special situations. Then, instead of fixing the directory/folding its code 
somewhere normal, you try to workaround by switching between directories. 
What's the point of using abstract classes or interfaces, if you leak their 
implementation's logic all over the place?
Or making DIOLD wrap something. Yeah! Wrap my RAMDir!

bq. bufferSize
This value is only meaningful to a certain subset of Directory implementations. 
So the only logical place we want to see this value set - is these very impls.
Sample code:
{code}
Directory ramDir = new RAMDirectory();
ramDir.createIndexInput(name, context);
// See, ma? No bufferSizes, they are pointless for RAMDir

Directory fsDir = new NIOFSDirectory();
fsDir.setBufferSize(IOContext.NORMAL_READ, 1024);
fsDir.setBufferSize(IOContext.MERGE, 4096);
fsDir.createIndexInput(name, context)
// See, ma? The only one who's really concerned with 'actual' buffer size is 
this concrete Directory impl
// All client code is only concerned with the context.
// It's NIOFSDirectory's business to give meaningful interpretation for 
IOContext and assign the buffer sizes.
{code}

You don't need custom Directory impls to make DIOLD work, you should freakin' 
fix it.
The proper way is to test out the things, and then move DirectIO code to the 
only place it makes sense in - FSDir? Probably make it switch on/off-able, 
maybe not.

You don't need custom Directory impls to set buffer sizes (neither cast to 
BufferedIndexInput!), you should add the setting to these Directories, which 
make sense of it.

> Directory createOutput and openInput should take an IOContext
> -
>
> Key: LUCENE-2793
> URL: https://issues.apache.org/jira/browse/LUCENE-2793
> Project: Lucene - Java
>  Issue Type: Improvement
>  Components: Store
>Reporter: Michael McCandless
> Attachments: LUCENE-2793.patch
>
>
> Today for merging we pass down a larger readBufferSize than for searching 
> because we get better performance.
> I think we should generalize this to a class (IOContext), which would hold 
> the buffer size, but then could hold other flags like DIRECT (bypass OS's 
> buffer cache), SEQUENTIAL, etc.
> Then, we can make the DirectIOLinuxDirectory fully usable because we would 
> only use DIRECT/SEQUENTIAL during merging.
> This will require fixing how IW pools readers, so that a reader opened for 
> merging is not then used for searching, and vice/versa.  Really, it's only 
> all the open file handles that need to be different -- we could in theory 
> share del docs, norms, etc, if that were somehow possible.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Lucene-Solr-tests-only-trunk - Build # 3688 - Failure

2011-01-12 Thread Apache Hudson Server
Build: https://hudson.apache.org/hudson/job/Lucene-Solr-tests-only-trunk/3688/

3 tests failed.
REGRESSION:  org.apache.lucene.search.TestSubScorerFreqs.testTermQuery

Error Message:
expected:<186> but was:<170>

Stack Trace:
junit.framework.AssertionFailedError: expected:<186> but was:<170>
at 
org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:)
at 
org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1049)
at 
org.apache.lucene.search.TestSubScorerFreqs.testTermQuery(TestSubScorerFreqs.java:151)


REGRESSION:  org.apache.lucene.search.TestSubScorerFreqs.testBooleanQuery

Error Message:
expected:<186> but was:<170>

Stack Trace:
junit.framework.AssertionFailedError: expected:<186> but was:<170>
at 
org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:)
at 
org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1049)
at 
org.apache.lucene.search.TestSubScorerFreqs.testBooleanQuery(TestSubScorerFreqs.java:185)


REGRESSION:  org.apache.lucene.search.TestSubScorerFreqs.testPhraseQuery

Error Message:
expected:<186> but was:<170>

Stack Trace:
junit.framework.AssertionFailedError: expected:<186> but was:<170>
at 
org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:)
at 
org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1049)
at 
org.apache.lucene.search.TestSubScorerFreqs.testPhraseQuery(TestSubScorerFreqs.java:215)




Build Log (for compile errors):
[...truncated 2955 lines...]






Re: Question about null request instance in CloudSolrServer

2011-01-12 Thread Mark Miller
Hey Stan -

Looks like this may be a problem - when I get a chance I will take a look - 
but I'd guess, based on the error, that you should probably file a JIRA issue.

The best thing to do in these cases is to email the user list or dev list 
rather than me individually - it is best to have the history of the issue 
public for others who have run into, or will run into, the same problem. I 
don't have the Solr user list available under autocomplete, and as it's late 
and I don't want to forget/lose this email, I'm cc'ing the dev list to make 
the issue visible.

- Mark

On Jan 11, 2011, at 2:36 PM, sburnitt wrote:

> Hello Mr. Miller,
> 
> I hope I am not contacting you the wrong way, but I need some help with 
> SolrCloud (solr-trunk).
> (I've spent two days googling, studying the src code, and asking questions 
> on #solr IRC... Nada.)
> 
> I am trying to use your org.apache.solr.client.solrj.impl.CloudSolrServer 
> implementation, but when I try to add(docs), I get an NPE on line 105:
> 
>   String collection = request.getParams().get("collection", defaultCollection);
> 
> It seems the request instance is null and I cannot figure out why.
> 
> I will skip the details here in case email is not the place to discuss this 
> further.
> Do you have the time to help me out?  Should I add an issue to JIRA first?
> (I just signed up and am not sure I would be authorized.)
> 
> Regards,
> Stan Burnitt
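
For illustration only - this is not the actual CloudSolrServer source, and the 
class and field names below are assumptions - the NPE could be avoided with a 
guard along these lines:

{code:java}
// Hypothetical sketch, not the real CloudSolrServer code. It shows the kind
// of null-guard that would turn the NPE into a clear error, or fall back to
// a configured default collection when the request carries no params.
import org.apache.solr.client.solrj.SolrRequest;

public class CollectionResolver {
  private final String defaultCollection;

  public CollectionResolver(String defaultCollection) {
    this.defaultCollection = defaultCollection;
  }

  public String resolve(SolrRequest request) {
    if (request == null || request.getParams() == null) {
      if (defaultCollection == null) {
        throw new IllegalStateException(
            "Request has no params and no default collection is configured");
      }
      return defaultCollection;
    }
    return request.getParams().get("collection", defaultCollection);
  }
}
{code}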





[jira] Updated: (LUCENE-2860) SegmentInfo.sizeInBytes ignore includeDocStore when caching

2011-01-12 Thread Shai Erera (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-2860?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shai Erera updated LUCENE-2860:
---

Attachment: LUCENE-2860.patch

Patch fixes the bug and adds a test case.

> SegmentInfo.sizeInBytes ignore includeDocStore when caching
> ---
>
> Key: LUCENE-2860
> URL: https://issues.apache.org/jira/browse/LUCENE-2860
> Project: Lucene - Java
>  Issue Type: Bug
>  Components: Index
>Reporter: Shai Erera
>Assignee: Shai Erera
>Priority: Minor
> Fix For: 3.1, 4.0
>
> Attachments: LUCENE-2860.patch
>
>
> I noticed that SegmentInfo's sizeInBytes cache is potentially buggy -- it 
> doesn't take 'includeDocStores' into account. I.e., if you call it once w/ 
> 'false' (sizeInBytes won't include the store files) and then with 'true' (or 
> vice versa), you won't get the right sizeInBytes (it won't recompute with 
> the store files).
> I'll fix it and add a test case demonstrating the bug.
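
The gist of the fix, as a hedged sketch (assumed names, simplified from 
whatever the attached patch actually does): cache the two variants in separate 
slots instead of sharing one:

{code:java}
// Sketch only - assumed names, not the attached patch. A single cached value
// cannot serve both includeDocStores=true and includeDocStores=false, so
// each variant gets its own slot.
public class SizeCache {
  private long sizeWithDocStores = -1;    // -1 means "not computed yet"
  private long sizeWithoutDocStores = -1;

  public long sizeInBytes(boolean includeDocStores) {
    if (includeDocStores) {
      if (sizeWithDocStores == -1) {
        sizeWithDocStores = computeSize(true);
      }
      return sizeWithDocStores;
    }
    if (sizeWithoutDocStores == -1) {
      sizeWithoutDocStores = computeSize(false);
    }
    return sizeWithoutDocStores;
  }

  private long computeSize(boolean includeDocStores) {
    // Placeholder: the real code sums the segment's file lengths, optionally
    // including the doc store files.
    return includeDocStores ? 2048 : 1024;
  }
}
{code}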




[jira] Commented: (LUCENE-2856) Create IndexWriter event listener, specifically for merges

2011-01-12 Thread Shai Erera (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2856?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12980626#action_12980626
 ] 

Shai Erera commented on LUCENE-2856:


I see. I'm ok with both then.

> Create IndexWriter event listener, specifically for merges
> --
>
> Key: LUCENE-2856
> URL: https://issues.apache.org/jira/browse/LUCENE-2856
> Project: Lucene - Java
>  Issue Type: Improvement
>  Components: Index
>Affects Versions: 4.0
>Reporter: Jason Rutherglen
> Attachments: LUCENE-2856.patch
>
>
> This issue will allow users to monitor merges occurring within IndexWriter 
> via a callback event listener. This can be used by external applications 
> such as Solr to monitor large segment merges.
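
A hedged sketch of what such a listener could look like - the interface and 
method names below are assumptions for illustration, not the API from the 
attached patch:

{code:java}
// Assumed interface, for illustration only. IndexWriter would invoke these
// callbacks around each merge so an application such as Solr can monitor
// large segment merges.
public interface MergeListener {
  /** Called when a merge has been selected and is about to start. */
  void mergeStarted(String mergedSegmentName, long estimatedMergeBytes);

  /** Called when the merge finishes, whether it succeeded or not. */
  void mergeFinished(String mergedSegmentName, boolean success);
}
{code}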




[jira] Commented: (LUCENE-2793) Directory createOutput and openInput should take an IOContext

2011-01-12 Thread Shai Erera (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12980625#action_12980625
 ] 

Shai Erera commented on LUCENE-2793:


bq. No you dont... you can also call the .setBufferSize like the skiplist reader does.

I'd still need to impl my own Directory though, right? That's the overkill I'm 
trying to avoid.

But I've been thinking about how merge/search code, which can only tell the 
Directory it's in a SEARCH / MERGE context, would get the buffer size the 
application wanted to use in that context. I don't think it has a way to do so 
without either using a static class, something we try to avoid, or propagating 
those settings down everywhere, which does not make sense either.

So, Robert, I think you're right -- bufferSize should not exist on IOContext. A 
custom Directory impl seems unavoidable. So I think it'd be good if we could 
create a BufferedDirectoryWrapper which wraps a Directory and offers a 
BufferedIndexInput/OutputWrapper that delegates the necessary calls to the 
wrapped Directory. We've found ourselves needing to implement that kind of 
Directory several times already, and it'd be good if Lucene could offer one.

That Directory would then take an IOContext -> bufferSize map/properties/whatever 
and take it into account in openInput/createOutput.

If users are going to need a Directory wrapper whenever they want to control 
buffer sizes, I suggest we make that as painless as possible.
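
Roughly this shape, as a sketch - the class name matches the proposal above, 
but the constructor and method signatures are simplified assumptions:

{code:java}
// Rough sketch of the proposed BufferedDirectoryWrapper idea - simplified,
// assumed signatures, not a committed API. The wrapper resolves a buffer
// size per context and would pass it to BufferedIndexInput/Output wrappers
// around the delegate Directory's files.
import java.util.Map;

public class BufferedDirectoryWrapper {
  private final Map<String, Integer> bufferSizeByContext; // e.g. "MERGE" -> 65536
  private final int defaultBufferSize;

  public BufferedDirectoryWrapper(Map<String, Integer> bufferSizeByContext,
                                  int defaultBufferSize) {
    this.bufferSizeByContext = bufferSizeByContext;
    this.defaultBufferSize = defaultBufferSize;
  }

  /** Resolve the buffer size the application configured for a context. */
  public int bufferSizeFor(String context) {
    Integer size = bufferSizeByContext.get(context);
    return size != null ? size : defaultBufferSize;
  }
}
{code}

The application would hand the wrapper its context-to-size map once, instead 
of re-implementing a Directory each time it wants different buffer sizes.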

> Directory createOutput and openInput should take an IOContext
> -
>
> Key: LUCENE-2793
> URL: https://issues.apache.org/jira/browse/LUCENE-2793
> Project: Lucene - Java
>  Issue Type: Improvement
>  Components: Store
>Reporter: Michael McCandless
> Attachments: LUCENE-2793.patch
>
>
> Today for merging we pass down a larger readBufferSize than for searching 
> because we get better performance.
> I think we should generalize this to a class (IOContext), which would hold 
> the buffer size, but then could hold other flags like DIRECT (bypass OS's 
> buffer cache), SEQUENTIAL, etc.
> Then, we can make the DirectIOLinuxDirectory fully usable because we would 
> only use DIRECT/SEQUENTIAL during merging.
> This will require fixing how IW pools readers, so that a reader opened for 
> merging is not then used for searching, and vice/versa.  Really, it's only 
> all the open file handles that need to be different -- we could in theory 
> share del docs, norms, etc, if that were somehow possible.




[jira] Commented: (SOLR-2282) Distributed Support for Search Result Clustering

2011-01-12 Thread Koji Sekiguchi (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12980622#action_12980622
 ] 

Koji Sekiguchi commented on SOLR-2282:
--

bq. When I now run the test, I'm getting a different exception, which looks 
like some misconfiguration of the test itself: 

Confirmed.

contrib/clustering/build.xml seems to have been changed in SOLR-2299, but I'm 
not sure of the cause of the failure.

> Distributed Support for Search Result Clustering
> 
>
> Key: SOLR-2282
> URL: https://issues.apache.org/jira/browse/SOLR-2282
> Project: Solr
>  Issue Type: New Feature
>  Components: contrib - Clustering
>Affects Versions: 1.4, 1.4.1
>Reporter: Koji Sekiguchi
>Assignee: Koji Sekiguchi
>Priority: Minor
> Fix For: 3.1, 4.0
>
> Attachments: SOLR-2282.patch, SOLR-2282.patch, SOLR-2282.patch, 
> SOLR-2282.patch, SOLR-2282.patch
>
>
> Brad Giaccio contributed a patch for this in SOLR-769. I'd like to 
> incorporate it.
