[jira] Commented: (LUCENE-2863) Updating a document loses its indexed-only fields; NumericField trie terms are also completely lost
[ https://issues.apache.org/jira/browse/LUCENE-2863?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12981160#action_12981160 ] Shai Erera commented on LUCENE-2863:

If you want to update documents, you should store them in their entirety somewhere: either in a Lucene index (as stored fields, all of them), in a DB, or someplace else. This is how updateDocument currently works.

> Updating a document loses its indexed-only fields; NumericField trie terms are also completely lost
> ---
>
> Key: LUCENE-2863
> URL: https://issues.apache.org/jira/browse/LUCENE-2863
> Project: Lucene - Java
> Issue Type: Bug
> Components: Store
> Affects Versions: 3.0.2, 3.0.3
> Environment: Windows XP, Java 1.6.0_20, using a RAMDirectory
> Reporter: Tamas Sandor
> Priority: Blocker
>
> I have a code snippet (see below) that creates a new document with standard (stored, indexed), *not-stored, indexed-only* and some *NumericField* fields, then updates the document by adding a new string field. The result is that all the indexed-only (not stored) fields, and in particular the NumericField trie tokens, are completely lost from the index after an update or a delete/add.
> {code:java}
> Directory ramDir = new RAMDirectory();
> IndexWriter writer = new IndexWriter(ramDir, new WhitespaceAnalyzer(), MaxFieldLength.UNLIMITED);
> Document doc = new Document();
> doc.add(new Field("ID", "HO1234", Store.YES, Index.NOT_ANALYZED_NO_NORMS));
> doc.add(new Field("PATTERN", "HELLO", Store.NO, Index.NOT_ANALYZED_NO_NORMS));
> doc.add(new NumericField("LAT", Store.YES, true).setDoubleValue(51.48826603066d));
> doc.add(new NumericField("LNG", Store.YES, true).setDoubleValue(-0.08913399651646614d));
> writer.addDocument(doc);
> doc = new Document();
> doc.add(new Field("ID", "HO", Store.YES, Index.NOT_ANALYZED_NO_NORMS));
> doc.add(new Field("PATTERN", "BELLO", Store.NO, Index.NOT_ANALYZED_NO_NORMS));
> doc.add(new NumericField("LAT", Store.YES, true).setDoubleValue(101.48826603066d));
> doc.add(new NumericField("LNG", Store.YES, true).setDoubleValue(-100.08913399651646614d));
> writer.addDocument(doc);
> Term t = new Term("ID", "HO1234");
> Query q = new TermQuery(t);
> IndexSearcher searcher = new IndexSearcher(writer.getReader());
> TopDocs hits = searcher.search(q, 1);
> if (hits.scoreDocs.length > 0) {
>   Document ndoc = searcher.doc(hits.scoreDocs[0].doc);
>   ndoc.add(new Field("FINAL", "FINAL", Store.YES, Index.NOT_ANALYZED_NO_NORMS));
>   writer.updateDocument(t, ndoc);
>   // writer.deleteDocuments(q);
>   // writer.addDocument(ndoc);
> } else {
>   LOG.info("Couldn't find the document via the query");
> }
> searcher = new IndexSearcher(writer.getReader());
> hits = searcher.search(new TermQuery(new Term("PATTERN", "HELLO")), 1);
> LOG.info("_hits HELLO:" + hits.totalHits); // should be 1 but it's 0
> writer.close();
> {code}
>
> And I have a bounding-box query based on *NumericRangeQuery*. After the document update it doesn't return any hit.

-- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
- To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
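Shai's point — updateDocument is effectively a delete of the old document followed by an add of exactly what you pass in — can be modeled without Lucene at all. Here is a toy sketch in plain Java (hypothetical classes, not the Lucene API) showing why a replacement document rebuilt only from stored fields loses every indexed-only field:

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of updateDocument semantics: delete the old document, then add
// the new one wholesale. Hypothetical classes, not Lucene code.
class ToyIndex {
    // id -> (field name -> value); models what ends up indexed
    final Map<String, Map<String, String>> docs = new HashMap<>();

    void add(String id, Map<String, String> fields) {
        docs.put(id, fields);
    }

    // update replaces the document with exactly what the caller passes in
    void update(String id, Map<String, String> newFields) {
        docs.remove(id);          // the old doc vanishes, stored or not
        docs.put(id, newFields);  // only the caller-supplied fields remain
    }
}
```

Because searcher.doc() returns only stored fields, a replacement document rebuilt from it can never contain the indexed-only PATTERN field or the NumericField trie terms — which is exactly the behavior reported.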
[jira] Closed: (LUCENE-2863) Updating a document loses its indexed-only fields; NumericField trie terms are also completely lost
[ https://issues.apache.org/jira/browse/LUCENE-2863?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shai Erera closed LUCENE-2863.

Resolution: Not A Problem

This is not the sort of discussion we should be having in JIRA - that's why we have the user list. Closing, as it's neither a bug nor a feature/enhancement proposal.

-- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (LUCENE-2831) Revise Weight#scorer & Filter#getDocIdSet API to pass Readers context
[ https://issues.apache.org/jira/browse/LUCENE-2831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12981156#action_12981156 ] Simon Willnauer commented on LUCENE-2831: - I committed the latest patch in revision 1058431, I think we are done here - yay! > Revise Weight#scorer & Filter#getDocIdSet API to pass Readers context > - > > Key: LUCENE-2831 > URL: https://issues.apache.org/jira/browse/LUCENE-2831 > Project: Lucene - Java > Issue Type: Improvement > Components: Search >Affects Versions: 4.0 >Reporter: Simon Willnauer >Assignee: Simon Willnauer > Fix For: 4.0 > > Attachments: LUCENE-2831-nuke-SolrIndexReader.patch, > LUCENE-2831-nuke-SolrIndexReader.patch, LUCENE-2831.patch, LUCENE-2831.patch, > LUCENE-2831.patch, LUCENE-2831.patch, LUCENE-2831.patch, LUCENE-2831.patch, > LUCENE-2831_transition_to_atomicCtx.patch, > LUCENE-2831_transition_to_atomicCtx.patch, > LUCENE-2831_transition_to_atomicCtx.patch > > > Spinoff from LUCENE-2694 - instead of passing a reader into Weight#scorer(IR, > boolean, boolean) we should / could revise the API and pass in a struct that > has parent reader, sub reader, ord of that sub. The ord mapping plus the > context with its parent would make several issues way easier. See > LUCENE-2694, LUCENE-2348 and LUCENE-2829 to name some. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] Commented: (LUCENE-2863) Updating a document loses its indexed-only fields; NumericField trie terms are also completely lost
[ https://issues.apache.org/jira/browse/LUCENE-2863?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12981154#action_12981154 ] Tamas Sandor commented on LUCENE-2863:

Yeah, but how can I add the indexed fields back (the tries of _LAT_ and _LNG_, and the _PATTERN_ field)? {{document.getFields()}} would give my old fields back in the form of a {{List}}, but the comment says:

{quote}
Note that fields which are not stored are not available in documents retrieved from the index, e.g. Searcher.doc(int) or IndexReader.document(int).
{quote}

So this won't work either:

{code:java}
doc = searcher.doc(hits.scoreDocs[0].doc);
Document ndoc = new Document();
for (Fieldable field : doc.getFields()) {
  ndoc.add(field);
}
ndoc.add(new Field("FINAL", "FINAL", Store.YES, Index.NOT_ANALYZED_NO_NORMS));
writer.updateDocument(t, ndoc);
{code}

-- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
Solr-3.x - Build # 226 - Failure
Build: https://hudson.apache.org/hudson/job/Solr-3.x/226/ All tests passed Build Log (for compile errors): [...truncated 20277 lines...]
[jira] Updated: (SOLR-445) XmlUpdateRequestHandler bad documents mid batch aborts rest of batch
[ https://issues.apache.org/jira/browse/SOLR-445?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Erick Erickson updated SOLR-445:

Attachment: solr-445.xml
            SOLR-445.patch

Here's a cut at an improvement, at least. The attached XML file contains a packet with a number of documents illustrating a number of errors. The XML file can be POSTed to Solr for indexing via the post.jar file so you can see the output. This patch attempts to report back to the user the following for each document that failed:

1> the ordinal position in the file where the error occurred (e.g. the first, second, etc. <doc> tag).
2> the unique key, if available.
3> the error.

The general idea is to accrue the errors in a StringBuilder and eventually re-throw the error after processing as far as possible.

Issues:

1> The reported format in the log file is kind of hard to read. I pipe-delimited the various tags, but they run together in a Windows DOS window. What happens on Unix I'm not quite sure. Suggestions welcome.
2> From the original post, rolling this back will be tricky. Very tricky. The autocommit feature makes it indeterminate what's been committed to the index, so I don't know how to even approach rolling back everything.
3> The intent here is to give the user a clue where to start when figuring out what document(s) failed so they don't have to guess.
4> Tests fail, but I have no clue why. I checked out a new copy of trunk and that fails as well, so I don't think that this patch is the cause of the errors. But let's not commit this until we can be sure.
5> What do you think about limiting the number of docs that fail before quitting? One could imagine some ratio (say 10%) that have to fail before quitting (with some safeguards, like don't bother calculating the ratio until 20 docs had been processed or...). Or an absolute number. Should this be a parameter? Or hard-coded? The assumption here is that if 10 (or 100 or..) docs fail, there's something pretty fundamentally wrong and it's a waste to keep on. I don't have any strong feeling here; I can argue it either way.
6> Sorry, all, but I reflexively hit the reformat keystrokes so the raw patch may be hard to read. But I'm pretty well in the camp that you *have* to reformat as you go or the code will be held hostage to the last person who *didn't* format properly. I'm pretty sure I'm using the right codestyle.xml file, but let me know if not.
7> I doubt that this has any bearing on, say, SolrJ indexing. Should that be another bug (or is there one already)? Anybody got a clue where I'd look for that since I'm in the area anyway?

Erick

> XmlUpdateRequestHandler bad documents mid batch aborts rest of batch
>
> Key: SOLR-445
> URL: https://issues.apache.org/jira/browse/SOLR-445
> Project: Solr
> Issue Type: Bug
> Components: update
> Affects Versions: 1.3
> Reporter: Will Johnson
> Assignee: Erick Erickson
> Fix For: Next
>
> Attachments: SOLR-445.patch, solr-445.xml
>
> Has anyone run into the problem of handling bad documents / failures mid batch. Ie:
>
> <add>
>   <doc><field name="...">1</field></doc>
>   <doc><field name="...">2</field><field name="...">I_AM_A_BAD_DATE</field></doc>
>   <doc><field name="...">3</field></doc>
> </add>
>
> Right now solr adds the first doc and then aborts. It would seem like it should either fail the entire batch or log a message/return a code and then continue on to add doc 3. Option 1 would seem to be much harder to accomplish and possibly require more memory while Option 2 would require more information to come back from the API. I'm about to dig into this but I thought I'd ask to see if anyone had any suggestions, thoughts or comments.

-- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
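The approach Erick describes — accrue per-document errors in a StringBuilder, keep processing, and re-throw after the batch — can be sketched roughly as follows (plain Java; the class, method, and report format are hypothetical, not the actual patch):

```java
import java.util.List;

// Sketch of batch processing that records each bad document instead of
// aborting at the first failure. Names and format are hypothetical.
class BatchErrors {
    /** Processes every doc; throws afterwards with all failures listed. */
    static int processAll(List<String> docs) {
        StringBuilder errors = new StringBuilder();
        int added = 0;
        for (int i = 0; i < docs.size(); i++) {
            try {
                process(docs.get(i));
                added++;
            } catch (RuntimeException e) {
                // record ordinal position, the doc, and the error; continue
                errors.append("doc #").append(i + 1)
                      .append(" | ").append(docs.get(i))
                      .append(" | ").append(e.getMessage())
                      .append('\n');
            }
        }
        if (errors.length() > 0) {
            throw new RuntimeException(errors.toString());
        }
        return added;
    }

    // stand-in for per-document indexing
    static void process(String doc) {
        if (doc.contains("I_AM_A_BAD_DATE")) {
            throw new RuntimeException("bad date value");
        }
    }
}
```

With the example packet from the issue description, doc 2 would be reported by ordinal position while docs 1 and 3 are still added.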
Lucene-trunk - Build # 1424 - Failure
Build: https://hudson.apache.org/hudson/job/Lucene-trunk/1424/ All tests passed Build Log (for compile errors): [...truncated 16732 lines...]
[jira] Assigned: (SOLR-445) XmlUpdateRequestHandler bad documents mid batch aborts rest of batch
[ https://issues.apache.org/jira/browse/SOLR-445?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Erick Erickson reassigned SOLR-445:

Assignee: Erick Erickson

-- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (SOLR-2282) Distributed Support for Search Result Clustering
[ https://issues.apache.org/jira/browse/SOLR-2282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12981100#action_12981100 ] Koji Sekiguchi commented on SOLR-2282: -- Thanks, Robert! I committed the fix. (Still I couldn't reproduce the hudson problem on my mac if I comment out @Ignore in DistributedClusteringComponentTest.java.) > Distributed Support for Search Result Clustering > > > Key: SOLR-2282 > URL: https://issues.apache.org/jira/browse/SOLR-2282 > Project: Solr > Issue Type: New Feature > Components: contrib - Clustering >Affects Versions: 1.4, 1.4.1 >Reporter: Koji Sekiguchi >Assignee: Koji Sekiguchi >Priority: Minor > Fix For: 3.1, 4.0 > > Attachments: SOLR-2282.patch, SOLR-2282.patch, SOLR-2282.patch, > SOLR-2282.patch, SOLR-2282.patch, SOLR-2282_test.patch > > > Brad Giaccio contributed a patch for this in SOLR-769. I'd like to > incorporate it. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] Commented: (LUCENE-2723) Speed up Lucene's low level bulk postings read API
[ https://issues.apache.org/jira/browse/LUCENE-2723?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12981083#action_12981083 ] Robert Muir commented on LUCENE-2723:

I merged us up to yesterday (1052991:1057836), but stopped at the Pulsing codec rewrite :) Mike, can you assist in merging r1057897? Besides requiring a lot of beer, there is a danger of screwing it up, since we have to re-implement its bulk postings enum.

> Speed up Lucene's low level bulk postings read API
> --
>
> Key: LUCENE-2723
> URL: https://issues.apache.org/jira/browse/LUCENE-2723
> Project: Lucene - Java
> Issue Type: Improvement
> Components: Index
> Reporter: Michael McCandless
> Assignee: Michael McCandless
> Fix For: 4.0
>
> Attachments: LUCENE-2723-termscorer.patch, LUCENE-2723-termscorer.patch, LUCENE-2723-termscorer.patch, LUCENE-2723.patch, LUCENE-2723.patch, LUCENE-2723.patch, LUCENE-2723.patch, LUCENE-2723.patch, LUCENE-2723_bulkvint.patch, LUCENE-2723_facetPerSeg.patch, LUCENE-2723_facetPerSeg.patch, LUCENE-2723_openEnum.patch, LUCENE-2723_termscorer.patch, LUCENE-2723_wastedint.patch
>
> Spinoff from LUCENE-1410.
> The flex DocsEnum has a simple bulk-read API that reads the next chunk of docs/freqs. But it's a poor fit for intblock codecs like FOR/PFOR (from LUCENE-1410). This is not unlike sucking coffee through those tiny plastic coffee stirrers they hand out on airplanes that, surprisingly, also happen to function as a straw.
> As a result we see no perf gain from using FOR/PFOR.
> I had hacked up a fix for this, described in my blog post at http://chbits.blogspot.com/2010/08/lucene-performance-with-pfordelta-codec.html
> I'm opening this issue to get that work to a committable point.
> So... I've worked out a new bulk-read API to address this performance bottleneck.
It has some big changes over the current bulk-read API:
> * You can now also bulk-read positions (but not payloads); I have yet to cut over positional queries.
> * The buffer contains doc deltas, not absolute values, for docIDs and positions (freqs are absolute).
> * Deleted docs are not filtered out.
> * The doc & freq buffers need not be "aligned". For fixed intblock codecs (FOR/PFOR) they will be, but for varint codecs (Simple9/16, Group varint, etc.) they won't be.
> It's still a work in progress...

-- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
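The "buffer contains doc deltas, not absolute values" point can be illustrated with a small decoder (a sketch only, not the actual flex bulk-read API):

```java
// Decode a buffer of docID deltas into absolute docIDs, as a bulk postings
// consumer would. Illustration only; not Lucene's bulk-read API.
class DeltaDecode {
    static int[] toAbsolute(int[] deltas) {
        int[] docIDs = new int[deltas.length];
        int doc = 0;
        for (int i = 0; i < deltas.length; i++) {
            doc += deltas[i];   // each entry is the gap from the previous docID
            docIDs[i] = doc;
        }
        return docIDs;
    }
}
```

For example, the delta buffer {3, 2, 5} decodes to the absolute docIDs {3, 5, 10}; keeping gaps small is what makes FOR/PFOR-style compression effective.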
[jira] Commented: (SOLR-2314) replicate/index.jsp UI does not honor java system properties enable.master, enable.slave
[ https://issues.apache.org/jira/browse/SOLR-2314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12981080#action_12981080 ] will milspec commented on SOLR-2314: Correction: Where the bug says "-Dsolr.enable.master" it should have said '-Denable.master' (similarly for slave) > replicate/index.jsp UI does not honor java system properties enable.master, > enable.slave > > > Key: SOLR-2314 > URL: https://issues.apache.org/jira/browse/SOLR-2314 > Project: Solr > Issue Type: Bug > Components: web gui >Affects Versions: 1.4.1 > Environment: jdk 1.6.0.23 ; both jetty and jboss/tomcat. >Reporter: will milspec >Priority: Minor > > Summary: > == > - Admin UI replication/index.jsp checks for master or slave with the > following code: >if ("true".equals(detailsMap.get("isSlave"))) > - if slave, replication/index.jsp displays the "Master" and "Poll > Intervals", etc. sections (everything up to "Cores") > - if false, replication/index.jsp does not display the "Master", "Poll > Intervals" sections > -This "slave check/UI difference" works correctly if the solrconfig.xml has a > "slave" but not "master" section or vice versa > Expected results: > == > Same UI difference would occur in the following scenario: >a) solrconfig.xml has both master and slave entries >b) use java.properties (-Dsolr.enable.master -Dsolr.enable.slave) to set > "master" or "slave" at runtime > *OR* > c) use solrcore.properties to set "master" and "slave" at runtime > Actual results: > == > If solrconfig.xml has both master and slave entries, replication/index.jsp > shows both "master" and "slave" section regardless of system.properties or > solrcore.properties -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
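One way to implement what the report asks for is to let a set system property override the solrconfig.xml-derived details map. A rough sketch (using the corrected property name 'enable.slave'; the helper class is hypothetical, not Solr code):

```java
import java.util.Map;

// Sketch: honor -Denable.slave (and analogously -Denable.master) before
// trusting the solrconfig.xml-derived details map. Hypothetical helper.
class ReplicationFlags {
    static boolean isSlave(Map<String, String> detailsMap) {
        String sys = System.getProperty("enable.slave");
        if (sys != null) {
            return Boolean.parseBoolean(sys);  // runtime override wins
        }
        // fall back to the check replication/index.jsp does today
        return "true".equals(detailsMap.get("isSlave"));
    }
}
```

With this, a config that defines both master and slave sections would still render only the slave UI when started with -Denable.slave=true -Denable.master=false.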
[jira] Created: (SOLR-2314) replicate/index.jsp UI does not honor java system properties enable.master, enable.slave
replicate/index.jsp UI does not honor java system properties enable.master, enable.slave Key: SOLR-2314 URL: https://issues.apache.org/jira/browse/SOLR-2314 Project: Solr Issue Type: Bug Components: web gui Affects Versions: 1.4.1 Environment: jdk 1.6.0.23 ; both jetty and jboss/tomcat. Reporter: will milspec Priority: Minor Summary: == - Admin UI replication/index.jsp checks for master or slave with the following code: if ("true".equals(detailsMap.get("isSlave"))) - if slave, replication/index.jsp displays the "Master" and "Poll Intervals", etc. sections (everything up to "Cores") - if false, replication/index.jsp does not display the "Master", "Poll Intervals" sections -This "slave check/UI difference" works correctly if the solrconfig.xml has a "slave" but not "master" section or vice versa Expected results: == Same UI difference would occur in the following scenario: a) solrconfig.xml has both master and slave entries b) use java.properties (-Dsolr.enable.master -Dsolr.enable.slave) to set "master" or "slave" at runtime *OR* c) use solrcore.properties to set "master" and "slave" at runtime Actual results: == If solrconfig.xml has both master and slave entries, replication/index.jsp shows both "master" and "slave" section regardless of system.properties or solrcore.properties -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] Resolved: (LUCENE-2846) omitTF is viral, but omitNorms is anti-viral.
[ https://issues.apache.org/jira/browse/LUCENE-2846?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir resolved LUCENE-2846. - Resolution: Fixed Committed revision 1058367. I deprecated the dangerous setNorm(float) method in 3.x in revision 1058370, instead pointing at setNorm(byte) and using Similarity.encodeNormValue(), so you can ensure your Similarity is always used (not Similarity.getDefault) > omitTF is viral, but omitNorms is anti-viral. > - > > Key: LUCENE-2846 > URL: https://issues.apache.org/jira/browse/LUCENE-2846 > Project: Lucene - Java > Issue Type: Improvement >Reporter: Robert Muir > Fix For: 4.0 > > Attachments: LUCENE-2846.patch, LUCENE-2846.patch, LUCENE-2846.patch, > LUCENE-2846.patch > > > omitTF is viral. if you add document 1 with field "foo" as omitTF, then > document 2 has field "foo" without omitTF, they are both treated as omitTF. > but omitNorms is the opposite. if you have a million documents with field > "foo" with omitNorms, then you add just one document without omitting norms, > now you suddenly have a million 'real norms'. > I think it would be good for omitNorms to be viral too, just for consistency, > and also to prevent huge byte[]'s. > but another option is to make omitTF anti-viral, which is more "schemaless" i > guess. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
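The viral/anti-viral distinction Robert describes can be stated as a merge rule over per-field flags (a toy sketch, not Lucene's FieldInfos code): a viral flag ORs across documents, so once any document omits, the whole field omits; the anti-viral behavior effectively ANDs, so a single document with norms resurrects real norms for every document in the field.

```java
// Toy merge rules for per-field flags across documents. Not Lucene code.
class FlagMerge {
    // omitTF is viral: once any document omits term freqs, the field omits
    static boolean mergeViral(boolean omitSoFar, boolean omitIncoming) {
        return omitSoFar || omitIncoming;
    }

    // omitNorms (as described) is anti-viral: one document that does NOT
    // omit norms forces real norms for the entire field
    static boolean mergeAntiViral(boolean omitSoFar, boolean omitIncoming) {
        return omitSoFar && omitIncoming;
    }
}
```

The "huge byte[]" hazard follows directly: under the anti-viral rule, a million omitting documents plus one non-omitting one yields omit=false, i.e. a norms byte per document.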
Lucene-3.x - Build # 240 - Failure
Build: https://hudson.apache.org/hudson/job/Lucene-3.x/240/ All tests passed Build Log (for compile errors): [...truncated 21057 lines...]
[jira] Commented: (LUCENE-2856) Create IndexWriter event listener, specifically for merges
[ https://issues.apache.org/jira/browse/LUCENE-2856?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12981019#action_12981019 ] Jason Rutherglen commented on LUCENE-2856: -- I separated out a ReaderListener because it's tied to the ReaderPool which eventually will exist external to IW. > Create IndexWriter event listener, specifically for merges > -- > > Key: LUCENE-2856 > URL: https://issues.apache.org/jira/browse/LUCENE-2856 > Project: Lucene - Java > Issue Type: Improvement > Components: Index >Affects Versions: 4.0 >Reporter: Jason Rutherglen > Attachments: LUCENE-2856.patch, LUCENE-2856.patch, LUCENE-2856.patch > > > The issue will allow users to monitor merges occurring within IndexWriter > using a callback notifier event listener. This can be used by external > applications such as Solr to monitor large segment merges. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[ANNOUNCE] Lucene.Net proposal submitted
All, This is a brief note to announce that the Lucene.Net proposal to move back to the Incubator was submitted today on the gene...@incubator.apache.org mailing list. To follow the discussion, please sign up for that mailing list or view the archives (with slight delay) at: http://mail-archives.apache.org/mod_mbox/incubator-general/ Thanks, Troy
Re: call python from java - what strategy do you use?
On Wed, 12 Jan 2011, Roman Chyla wrote:

Oh, nevermind, I found it out: it is loaded by python, so PYTHONPATH or other ways must be used. Also I had to change exports: inside sorlpie_java/__init__.py I added the line:

from emql import Emql

Then, in java I can do:

PythonVM vm = PythonVM.start("sorlpie_java");
EMQL em = (EMQL) vm.instantiate("solrpie_java", "Emql");

Yep, that's the one !

Andi..

em.javaTestPrint();
em.pythonTestPrint();
System.out.println(em.emql_status());

And I get:

java is printing
some status

funny the pythonTestPrint() never prints anything

Cheers, roman

On Wed, Jan 12, 2011 at 11:20 PM, Roman Chyla wrote:

Hi Andi,

Thanks for the help, now I was able to run the java and loaded PythonVM. I then built the python egg, after a bit of fiddling with parameters, it seems ok. I can import the jcc wrapped python class and call it:

In [1]: from solrpie_java import emql

In [2]: em = emql.Emql()

In [3]: em.javaTestPrint()
java is printing

In [4]: em.pythonTestPrint()
just a test

But I haven't found out how to call the same from java.
The egg is built fine, it is named solrpie_java and contains one python module:

==
from solrpie_java import initVM, CLASSPATH, EMQL
initVM(CLASSPATH)

class Emql(EMQL):
    '''
    classdocs
    '''
    def __init__(self):
        super(Emql, self).__init__()
        print '__init__'

    def init(self, me):
        print self, me
        return 'init'

    def emql_refresh(self, tid, type):
        print self, tid, type
        return 'refresh'

    def emql_status(self):
        return "some status"

    def pythonTestPrint(self):
        print 'just a test'
==

The corresponding java class looks like this:

public class EMQL {
    private long pythonObject;

    public EMQL() {
    }

    public void pythonExtension(long pythonObject) {
        this.pythonObject = pythonObject;
    }

    public long pythonExtension() {
        return this.pythonObject;
    }

    public void finalize() throws Throwable {
        pythonDecRef();
    }

    public void javaTestPrint() {
        System.out.println("java is printing");
    }

    public native void pythonDecRef();

    // the methods implemented in python
    public native String init(EMQL me);
    public native String emql_refresh(String tid, String type);
    public native String emql_status();
    public native void pythonTestPrint();
}
===

I tried running it as:

PythonVM vm = PythonVM.start("sorlpie_java");
EMQL em = new EMQL();
em.javaTestPrint();
em.pythonTestPrint();

I get this:

java is printing
Exception in thread "main" java.lang.UnsatisfiedLinkError: rca.pythonvm.EMQL.pythonTestPrint()V
        at rca.pythonvm.EMQL.pythonTestPrint(Native Method)
        at rca.solr.JettyRunnerPythonVM.start(JettyRunnerPythonVM.java:60)
        at rca.solr.JettyRunnerPythonVM.main(JettyRunnerPythonVM.java:148)

I understand that java cannot find the linked c++ method, but I don't know how to fix that.
If I try:

PythonVM vm = PythonVM.start("sorlpie_java");
Object m = vm.instantiate("emql", "Emql");

I get:

org.apache.jcc.PythonException: No module named emql
ImportError: No module named emql
        at org.apache.jcc.PythonVM.instantiate(Native Method)
        at rca.solr.JettyRunnerPythonVM.start(JettyRunnerPythonVM.java:56)
        at rca.solr.JettyRunnerPythonVM.main(JettyRunnerPythonVM.java:148)

I tried various combinations of instantiation, and setting the classpath or -Djava.library.path, but no success. What am I doing wrong?

Thank you, roman

On Wed, Jan 12, 2011 at 7:55 PM, Andi Vajda wrote:

On Wed, 12 Jan 2011, Roman Chyla wrote:

Hi Andi, all, I tried to implement the PythonVM wrapping on Mac 10.6, with JDK 1.6.22, jcc is freshly built, in shared mode, v. 2.6. The python is the standard Python distributed with MacOSX. When I try to run the java, it throws an error when it gets to:

static {
    System.loadLibrary("jcc");
}

I am getting this error:

Exception in thread "main" java.lang.UnsatisfiedLinkError: /Library/Python/2.6/site-packages/JCC-2.6-py2.6-macosx-10.6-universal.egg/libjcc.dylib: Symbol not found: _PyExc_RuntimeError Referenced from:

That's because Python's shared library wasn't found. The reason is that, by default, Python's shared lib is not on JCC's link line, because normally JCC is loaded into a Python process and the dynamic linker thus finds the symbols needed inside the process. Here, since you're not starting inside a Python process, you need to add '-framework Python' to JCC's LFLAGS in setup.py so that the dynamic linker can find the Python VM shared lib and load it.

Andi..

/Library/Python/2.6/site-packages/JCC-2
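The UnsatisfiedLinkError Roman hits on pythonTestPrint() is the JVM failing to resolve a native method at call time: declaring a method `native` compiles fine, but its implementation must be registered (via System.loadLibrary, or here via JCC's Python-side initVM) before the first call. A minimal standalone demonstration, with no JCC involved:

```java
// Calling a native method whose implementation was never loaded throws
// UnsatisfiedLinkError at call time, not at class-load time.
class NativeLinkDemo {
    public static native void pythonTestPrint(); // no library provides this

    static String tryCall() {
        try {
            pythonTestPrint();
            return "linked";
        } catch (UnsatisfiedLinkError e) {
            // exactly the failure mode in the stack trace above
            return "UnsatisfiedLinkError";
        }
    }
}
```

This matches Andi's diagnosis: the class and its native declarations are fine; it's the library providing the implementation that has to be loadable first.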
Re: call python from java - what strategy do you use?
Hi Roman,

On Wed, 12 Jan 2011, Roman Chyla wrote:

Thanks for the help, now I was able to run the java and loaded PythonVM. I then built the python egg; after a bit of fiddling with parameters, it seems ok. I can import the jcc wrapped python class and call it:

In [1]: from solrpie_java import emql

Why are you calling your class EMQL ? (this name was just an example culled from my code).

In [2]: em = emql.Emql()

In [3]: em.javaTestPrint()
java is printing

In [4]: em.pythonTestPrint()
just a test

But I haven't found out how to call the same from java.

Ah, yes, I forgot to tell you how to pull that in. In Java, you import that 'EMQL' java class and instantiate it by way of the PythonVM instance's instantiate() call:

import org.blah.blah.EMQL;
import org.apache.jcc.PythonVM;
...
PythonVM vm = PythonVM.get();
emql = (EMQL) vm.instantiate("jemql.emql", "emql");
... call method on emql instance just created ...

The instantiate("foo", "bar") method in effect asks Python to run:

from foo import bar
return bar()

Andi..
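Andi's description of instantiate("foo", "bar") — effectively "from foo import bar" followed by "return bar()" — has a rough pure-Java analogue using reflection. The sketch below only illustrates that lookup-then-construct pattern; InstantiateSketch and its method are hypothetical names, not part of the JCC API:

```java
// Rough Java analogue of PythonVM.instantiate("module", "ClassName"):
// resolve a type by its qualified name and return a fresh instance,
// just as JCC asks Python to "from module import ClassName; return ClassName()".
public class InstantiateSketch {
    public static Object instantiate(String module, String name) throws Exception {
        // In Python terms: "from <module> import <name>" ...
        Class<?> cls = Class.forName(module + "." + name);
        // ... then "return <name>()"
        return cls.getDeclaredConstructor().newInstance();
    }

    public static void main(String[] args) throws Exception {
        Object o = instantiate("java.util", "ArrayList");
        System.out.println(o.getClass().getName()); // prints java.util.ArrayList
    }
}
```

The difference, of course, is that JCC performs the import inside the embedded Python interpreter and hands back a Java proxy, rather than constructing a Java object directly.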
Re: call python from java - what strategy do you use?
Oh, nevermind, I found it out: it is loaded by python, so PYTHONPATH or other ways must be used. Also I had to change exports: inside sorlpie_java/__init__.py I added the line:

from emql import Emql

Then, in java I can do:

PythonVM vm = PythonVM.start("sorlpie_java");
EMQL em = (EMQL) vm.instantiate("solrpie_java", "Emql");
em.javaTestPrint();
em.pythonTestPrint();
System.out.println(em.emql_status());

And I get:

java is printing
some status

Funny, the pythonTestPrint() never prints anything.

Cheers,

roman

On Wed, Jan 12, 2011 at 11:20 PM, Roman Chyla wrote:
> Hi Andi,
>
> Thanks for the help, now I was able to run the java and loaded
> PythonVM. I then built the python egg, after a bit of fiddling with
> parameters, it seems ok. I can import the jcc wrapped python class and
> call it:
>
> In [1]: from solrpie_java import emql
>
> In [2]: em = emql.Emql()
>
> In [3]: em.javaTestPrint()
> java is printing
>
> In [4]: em.pythonTestPrint()
> just a test
>
> But I haven't found out how to call the same from java.
[jira] Assigned: (SOLR-2312) CloudSolrServer -- calling add(Collection docs) throws NPE.
[ https://issues.apache.org/jira/browse/SOLR-2312?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mark Miller reassigned SOLR-2312:
---------------------------------

    Assignee: Mark Miller

> CloudSolrServer -- calling add(Collection docs) throws NPE.
>
> Key: SOLR-2312
> URL: https://issues.apache.org/jira/browse/SOLR-2312
> Project: Solr
> Issue Type: Bug
> Components: SolrCloud
> Affects Versions: 4.0
> Environment: Mac OSX v10.5.8
> java version "1.6.0_22"
> Java(TM) SE Runtime Environment (build 1.6.0_22-b04-307-9M3263)
> Java HotSpot(TM) 64-Bit Server VM (build 17.1-b03-307, mixed mode)
> Reporter: Stan Burnitt
> Assignee: Mark Miller
> Priority: Critical
> Fix For: 4.0
>
> Cannot index documents using the CloudSolrServer.
> Below is a code snippet that reproduces the error.
> {code:borderStyle=solid}
> @Test
> public void jiraTestCase() {
>     CloudSolrServer solrj = null;
>     try {
>         solrj = new CloudSolrServer("your.zookeeper.localdomain:2181");
>         // Also tried creating CloudSolrServer using the alternative constructor below...
>         // public CloudSolrServer(String zkHost, LBHttpSolrServer lbServer)
>         //
>         // LBHttpSolrServer lbHttpSolrServer = new LBHttpSolrServer("http://solr.localdomain:8983/solr");
>         // solrj = new CloudSolrServer("your.zookeeper.localdomain:2181", lbHttpSolrServer);
>         //
>         // (Same result -- NPE @ line 105 in CloudSolrServer.java)
>         solrj.setDefaultCollection("your-collection");
>         solrj.setZkClientTimeout(5000);
>         solrj.setZkConnectTimeout(5000);
>         final Collection batch = new ArrayList();
>         SolrInputDocument doc = new SolrInputDocument();
>         doc.addField("id", 1L, 1.0f);
>         doc.addField("title", "Document A");
>         doc.addField("description", "Test document");
>         batch.add(doc);
>         doc = new SolrInputDocument();
>         doc.addField("id", 2L, 1.0f);
>         doc.addField("title", "Document B");
>         doc.addField("description", "Another test document");
>         batch.add(doc);
>         solrj.add(batch);
>     } catch (Exception e) {
>         log.error(e.getMessage(), e);
>         Assert.fail("java.lang.NullPointerException: null \n"
>             + " at org.apache.solr.client.solrj.impl.CloudSolrServer.request(CloudSolrServer.java:105) \n"
>             + " Line 105: NULL request object here --> String collection = request.getParams().get(\"collection\", defaultCollection);");
>     } finally {
>         solrj.close();
>     }
> }
> {code}

-- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
- To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
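The reported NPE comes from request.getParams() returning null at CloudSolrServer.java:105 before it is dereferenced. A defensive version of that lookup might look like the sketch below — a plain-map stand-in for SolrParams, not the actual Solr fix; CollectionLookup and resolveCollection are illustrative names:

```java
import java.util.Map;

// Sketch of a null-safe version of the lookup that NPEs at line 105:
//   String collection = request.getParams().get("collection", defaultCollection);
// Here a Map stands in for SolrParams; a request built without params
// must fall back to the configured default instead of throwing.
public class CollectionLookup {
    public static String resolveCollection(Map<String, String> params, String defaultCollection) {
        if (params == null) {
            // No request params at all: use the default collection
            return defaultCollection;
        }
        return params.getOrDefault("collection", defaultCollection);
    }

    public static void main(String[] args) {
        // Null params no longer throws
        System.out.println(resolveCollection(null, "your-collection")); // prints your-collection
    }
}
```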
[jira] Commented: (LUCENE-2863) Updating a documenting looses its fields that only indexed, also NumericField tries are completely lost
[ https://issues.apache.org/jira/browse/LUCENE-2863?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12980965#action_12980965 ]

Earwin Burrfoot commented on LUCENE-2863:
-----------------------------------------

updateDocument() is an atomic version of deleteDocument() + addDocument(), nothing more, and there is nothing surprising in losing your fields if you delete the doc and don't add them back later.

> Updating a documenting looses its fields that only indexed, also NumericField tries are completely lost
>
> Key: LUCENE-2863
> URL: https://issues.apache.org/jira/browse/LUCENE-2863
> Project: Lucene - Java
> Issue Type: Bug
> Components: Store
> Affects Versions: 3.0.2, 3.0.3
> Environment: WindowsXP, Java 1.6.20, using a RAMDirectory
> Reporter: Tamas Sandor
> Priority: Blocker
>
> I have a code snippet (see below) which creates a new document with standard (stored, indexed), *not-stored, indexed-only* and some *NumericField* fields. Then it updates the document by adding a new string field. The result is that all fields that are indexed-only (not stored), and especially the NumericField trie tokens, are completely lost from the index after the update or delete/add.
> {code:java}
> Directory ramDir = new RAMDirectory();
> IndexWriter writer = new IndexWriter(ramDir, new WhitespaceAnalyzer(), MaxFieldLength.UNLIMITED);
> Document doc = new Document();
> doc.add(new Field("ID", "HO1234", Store.YES, Index.NOT_ANALYZED_NO_NORMS));
> doc.add(new Field("PATTERN", "HELLO", Store.NO, Index.NOT_ANALYZED_NO_NORMS));
> doc.add(new NumericField("LAT", Store.YES, true).setDoubleValue(51.48826603066d));
> doc.add(new NumericField("LNG", Store.YES, true).setDoubleValue(-0.08913399651646614d));
> writer.addDocument(doc);
> doc = new Document();
> doc.add(new Field("ID", "HO", Store.YES, Index.NOT_ANALYZED_NO_NORMS));
> doc.add(new Field("PATTERN", "BELLO", Store.NO, Index.NOT_ANALYZED_NO_NORMS));
> doc.add(new NumericField("LAT", Store.YES, true).setDoubleValue(101.48826603066d));
> doc.add(new NumericField("LNG", Store.YES, true).setDoubleValue(-100.08913399651646614d));
> writer.addDocument(doc);
> Term t = new Term("ID", "HO1234");
> Query q = new TermQuery(t);
> IndexSearcher seacher = new IndexSearcher(writer.getReader());
> TopDocs hits = seacher.search(q, 1);
> if (hits.scoreDocs.length > 0) {
>     Document ndoc = seacher.doc(hits.scoreDocs[0].doc);
>     ndoc.add(new Field("FINAL", "FINAL", Store.YES, Index.NOT_ANALYZED_NO_NORMS));
>     writer.updateDocument(t, ndoc);
>     // writer.deleteDocuments(q);
>     // writer.addDocument(ndoc);
> } else {
>     LOG.info("Couldn't find the document via the query");
> }
> seacher = new IndexSearcher(writer.getReader());
> hits = seacher.search(new TermQuery(new Term("PATTERN", "HELLO")), 1);
> LOG.info("_hits HELLO:" + hits.totalHits); // should be 1 but it's 0
> writer.close();
> {code}
> And I have a boundingbox query based on *NumericRangeQuery*. After the document update it doesn't return any hit.

-- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
- To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
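Earwin's point — that updateDocument() is an atomic deleteDocument() + addDocument() — can be seen without Lucene at all. The toy model below mimics that semantics with plain maps (all names are illustrative, not Lucene APIs): a retrieved document carries only stored fields, so re-adding it silently drops every index-only field.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of updateDocument(): an atomic delete + add. The "index" keeps
// only the fields the *new* document supplies -- nothing is merged from the
// old one. Retrieval (like IndexSearcher.doc()) returns stored fields only,
// so re-adding a retrieved document loses the index-only fields.
public class UpdateSemantics {
    static Map<String, Map<String, String>> index = new HashMap<>();

    static void updateDocument(String id, Map<String, String> doc) {
        index.remove(id);   // delete...
        index.put(id, doc); // ...then add: the old document's fields are gone
    }

    public static void main(String[] args) {
        Map<String, String> original = new HashMap<>();
        original.put("ID", "HO1234");     // stored + indexed
        original.put("PATTERN", "HELLO"); // index-only in the real snippet
        index.put("HO1234", original);

        // doc() only returns stored fields, so PATTERN is absent here
        Map<String, String> retrieved = new HashMap<>();
        retrieved.put("ID", "HO1234");
        retrieved.put("FINAL", "FINAL");
        updateDocument("HO1234", retrieved);

        System.out.println(index.get("HO1234").containsKey("PATTERN")); // prints false
    }
}
```

This is why Shai's advice stands: to update, you must store the complete document somewhere (all fields stored in the index, a DB, etc.) and re-add it in its entirety.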
[jira] Updated: (LUCENE-2863) Updating a documenting looses its fields that only indexed, also NumericField tries are completely lost
[ https://issues.apache.org/jira/browse/LUCENE-2863?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tamas Sandor updated LUCENE-2863:
---------------------------------

    Priority: Blocker (was: Major)

> Updating a documenting looses its fields that only indexed, also NumericField tries are completely lost
>
> Key: LUCENE-2863
> URL: https://issues.apache.org/jira/browse/LUCENE-2863
> Project: Lucene - Java
> Issue Type: Bug
> Components: Store
> Affects Versions: 3.0.2, 3.0.3
> Environment: WindowsXP, Java 1.6.20, using a RAMDirectory
> Reporter: Tamas Sandor
> Priority: Blocker

-- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
- To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] Created: (LUCENE-2863) Updating a documenting looses its fields that only indexed, also NumericField tries are completely lost
Updating a documenting looses its fields that only indexed, also NumericField tries are completely lost
---

Key: LUCENE-2863
URL: https://issues.apache.org/jira/browse/LUCENE-2863
Project: Lucene - Java
Issue Type: Bug
Components: Store
Affects Versions: 3.0.3, 3.0.2
Environment: WindowsXP, Java 1.6.20, using a RAMDirectory
Reporter: Tamas Sandor

I have a code snippet (see below) which creates a new document with standard (stored, indexed), *not-stored, indexed-only* and some *NumericField* fields. Then it updates the document by adding a new string field. The result is that all fields that are indexed-only (not stored), and especially the NumericField trie tokens, are completely lost from the index after the update or delete/add.

{code:java}
Directory ramDir = new RAMDirectory();
IndexWriter writer = new IndexWriter(ramDir, new WhitespaceAnalyzer(), MaxFieldLength.UNLIMITED);
Document doc = new Document();
doc.add(new Field("ID", "HO1234", Store.YES, Index.NOT_ANALYZED_NO_NORMS));
doc.add(new Field("PATTERN", "HELLO", Store.NO, Index.NOT_ANALYZED_NO_NORMS));
doc.add(new NumericField("LAT", Store.YES, true).setDoubleValue(51.48826603066d));
doc.add(new NumericField("LNG", Store.YES, true).setDoubleValue(-0.08913399651646614d));
writer.addDocument(doc);
doc = new Document();
doc.add(new Field("ID", "HO", Store.YES, Index.NOT_ANALYZED_NO_NORMS));
doc.add(new Field("PATTERN", "BELLO", Store.NO, Index.NOT_ANALYZED_NO_NORMS));
doc.add(new NumericField("LAT", Store.YES, true).setDoubleValue(101.48826603066d));
doc.add(new NumericField("LNG", Store.YES, true).setDoubleValue(-100.08913399651646614d));
writer.addDocument(doc);
Term t = new Term("ID", "HO1234");
Query q = new TermQuery(t);
IndexSearcher seacher = new IndexSearcher(writer.getReader());
TopDocs hits = seacher.search(q, 1);
if (hits.scoreDocs.length > 0) {
    Document ndoc = seacher.doc(hits.scoreDocs[0].doc);
    ndoc.add(new Field("FINAL", "FINAL", Store.YES, Index.NOT_ANALYZED_NO_NORMS));
    writer.updateDocument(t, ndoc);
    // writer.deleteDocuments(q);
    // writer.addDocument(ndoc);
} else {
    LOG.info("Couldn't find the document via the query");
}
seacher = new IndexSearcher(writer.getReader());
hits = seacher.search(new TermQuery(new Term("PATTERN", "HELLO")), 1);
LOG.info("_hits HELLO:" + hits.totalHits); // should be 1 but it's 0
writer.close();
{code}

And I have a boundingbox query based on *NumericRangeQuery*. After the document update it doesn't return any hit.

-- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
- To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] Updated: (LUCENE-2831) Revise Weight#scorer & Filter#getDocIdSet API to pass Readers context
[ https://issues.apache.org/jira/browse/LUCENE-2831?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Simon Willnauer updated LUCENE-2831: Attachment: LUCENE-2831-nuke-SolrIndexReader.patch updated to trunk > Revise Weight#scorer & Filter#getDocIdSet API to pass Readers context > - > > Key: LUCENE-2831 > URL: https://issues.apache.org/jira/browse/LUCENE-2831 > Project: Lucene - Java > Issue Type: Improvement > Components: Search >Affects Versions: 4.0 >Reporter: Simon Willnauer >Assignee: Simon Willnauer > Fix For: 4.0 > > Attachments: LUCENE-2831-nuke-SolrIndexReader.patch, > LUCENE-2831-nuke-SolrIndexReader.patch, LUCENE-2831.patch, LUCENE-2831.patch, > LUCENE-2831.patch, LUCENE-2831.patch, LUCENE-2831.patch, LUCENE-2831.patch, > LUCENE-2831_transition_to_atomicCtx.patch, > LUCENE-2831_transition_to_atomicCtx.patch, > LUCENE-2831_transition_to_atomicCtx.patch > > > Spinoff from LUCENE-2694 - instead of passing a reader into Weight#scorer(IR, > boolean, boolean) we should / could revise the API and pass in a struct that > has parent reader, sub reader, ord of that sub. The ord mapping plus the > context with its parent would make several issues way easier. See > LUCENE-2694, LUCENE-2348 and LUCENE-2829 to name some. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] Commented: (LUCENE-2723) Speed up Lucene's low level bulk postings read API
[ https://issues.apache.org/jira/browse/LUCENE-2723?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12980949#action_12980949 ] Simon Willnauer commented on LUCENE-2723: - the blocker has been committed - we should merge though! > Speed up Lucene's low level bulk postings read API > -- > > Key: LUCENE-2723 > URL: https://issues.apache.org/jira/browse/LUCENE-2723 > Project: Lucene - Java > Issue Type: Improvement > Components: Index >Reporter: Michael McCandless >Assignee: Michael McCandless > Fix For: 4.0 > > Attachments: LUCENE-2723-termscorer.patch, > LUCENE-2723-termscorer.patch, LUCENE-2723-termscorer.patch, > LUCENE-2723.patch, LUCENE-2723.patch, LUCENE-2723.patch, LUCENE-2723.patch, > LUCENE-2723.patch, LUCENE-2723_bulkvint.patch, LUCENE-2723_facetPerSeg.patch, > LUCENE-2723_facetPerSeg.patch, LUCENE-2723_openEnum.patch, > LUCENE-2723_termscorer.patch, LUCENE-2723_wastedint.patch > > > Spinoff from LUCENE-1410. > The flex DocsEnum has a simple bulk-read API that reads the next chunk > of docs/freqs. But it's a poor fit for intblock codecs like FOR/PFOR > (from LUCENE-1410). This is not unlike sucking coffee through those > tiny plastic coffee stirrers they hand out airplanes that, > surprisingly, also happen to function as a straw. > As a result we see no perf gain from using FOR/PFOR. > I had hacked up a fix for this, described at in my blog post at > http://chbits.blogspot.com/2010/08/lucene-performance-with-pfordelta-codec.html > I'm opening this issue to get that work to a committable point. > So... I've worked out a new bulk-read API to address performance > bottleneck. It has some big changes over the current bulk-read API: > * You can now also bulk-read positions (but not payloads), but, I > have yet to cutover positional queries. > * The buffer contains doc deltas, not absolute values, for docIDs > and positions (freqs are absolute). > * Deleted docs are not filtered out. 
> * The doc & freq buffers need not be "aligned". For fixed intblock > codecs (FOR/PFOR) they will be, but for varint codecs (Simple9/16, > Group varint, etc.) they won't be. > It's still a work in progress... -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] Resolved: (LUCENE-2694) MTQ rewrite + weight/scorer init should be single pass
[ https://issues.apache.org/jira/browse/LUCENE-2694?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Simon Willnauer resolved LUCENE-2694. - Resolution: Fixed Committed revision 1058328. > MTQ rewrite + weight/scorer init should be single pass > -- > > Key: LUCENE-2694 > URL: https://issues.apache.org/jira/browse/LUCENE-2694 > Project: Lucene - Java > Issue Type: Improvement > Components: Search >Reporter: Michael McCandless >Assignee: Simon Willnauer > Fix For: 4.0 > > Attachments: LUCENE-2694-FTE.patch, LUCENE-2694.patch, > LUCENE-2694.patch, LUCENE-2694.patch, LUCENE-2694.patch, LUCENE-2694.patch, > LUCENE-2694.patch, LUCENE-2694.patch, LUCENE-2694.patch, LUCENE-2694.patch, > LUCENE-2694.patch, LUCENE-2694.patch, LUCENE-2694.patch, > LUCENE-2694_hack.patch > > > Spinoff of LUCENE-2690 (see the hacked patch on that issue)... > Once we fix MTQ rewrite to be per-segment, we should take it further and make > weight/scorer init also run in the same single pass as rewrite. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] Commented: (LUCENE-2694) MTQ rewrite + weight/scorer init should be single pass
[ https://issues.apache.org/jira/browse/LUCENE-2694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12980941#action_12980941 ] Simon Willnauer commented on LUCENE-2694: - bq. Actually I see PK lookups faster - 23 usec w/ patch vs 33 usec w/ trunk (per lookup) for 20K lookups. so I run that on a 32bit machine which is quite slow in general though. I will further investigate that on 32bit platform vs. 64 bit. Yet, I only used 1k lookups though. > MTQ rewrite + weight/scorer init should be single pass > -- > > Key: LUCENE-2694 > URL: https://issues.apache.org/jira/browse/LUCENE-2694 > Project: Lucene - Java > Issue Type: Improvement > Components: Search >Reporter: Michael McCandless >Assignee: Simon Willnauer > Fix For: 4.0 > > Attachments: LUCENE-2694-FTE.patch, LUCENE-2694.patch, > LUCENE-2694.patch, LUCENE-2694.patch, LUCENE-2694.patch, LUCENE-2694.patch, > LUCENE-2694.patch, LUCENE-2694.patch, LUCENE-2694.patch, LUCENE-2694.patch, > LUCENE-2694.patch, LUCENE-2694.patch, LUCENE-2694.patch, > LUCENE-2694_hack.patch > > > Spinoff of LUCENE-2690 (see the hacked patch on that issue)... > Once we fix MTQ rewrite to be per-segment, we should take it further and make > weight/scorer init also run in the same single pass as rewrite. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] Updated: (LUCENE-2694) MTQ rewrite + weight/scorer init should be single pass
[ https://issues.apache.org/jira/browse/LUCENE-2694?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Simon Willnauer updated LUCENE-2694: Attachment: LUCENE-2694.patch Here is a final patch, I opened up Terms#getThreadTermsEnum() to reuse TermsEnum in PRTE#build(). PRTE#build() now also accepts a boolean if the termlookup should be cached or not which makes sense for common TermQuery. I will commit that shortly - yay! > MTQ rewrite + weight/scorer init should be single pass > -- > > Key: LUCENE-2694 > URL: https://issues.apache.org/jira/browse/LUCENE-2694 > Project: Lucene - Java > Issue Type: Improvement > Components: Search >Reporter: Michael McCandless >Assignee: Simon Willnauer > Fix For: 4.0 > > Attachments: LUCENE-2694-FTE.patch, LUCENE-2694.patch, > LUCENE-2694.patch, LUCENE-2694.patch, LUCENE-2694.patch, LUCENE-2694.patch, > LUCENE-2694.patch, LUCENE-2694.patch, LUCENE-2694.patch, LUCENE-2694.patch, > LUCENE-2694.patch, LUCENE-2694.patch, LUCENE-2694.patch, > LUCENE-2694_hack.patch > > > Spinoff of LUCENE-2690 (see the hacked patch on that issue)... > Once we fix MTQ rewrite to be per-segment, we should take it further and make > weight/scorer init also run in the same single pass as rewrite. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] Commented: (SOLR-2282) Distributed Support for Search Result Clustering
[ https://issues.apache.org/jira/browse/SOLR-2282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12980936#action_12980936 ] Dawid Weiss commented on SOLR-2282: --- Robert, can you somehow check if it's the input that's causing these errors? SEVERE: java.lang.Error: Error: could not match input I don't have any idea when such an error could happen, but it doesn't seem to be related to concurrency (at first glance). > Distributed Support for Search Result Clustering > > > Key: SOLR-2282 > URL: https://issues.apache.org/jira/browse/SOLR-2282 > Project: Solr > Issue Type: New Feature > Components: contrib - Clustering >Affects Versions: 1.4, 1.4.1 >Reporter: Koji Sekiguchi >Assignee: Koji Sekiguchi >Priority: Minor > Fix For: 3.1, 4.0 > > Attachments: SOLR-2282.patch, SOLR-2282.patch, SOLR-2282.patch, > SOLR-2282.patch, SOLR-2282.patch, SOLR-2282_test.patch > > > Brad Giaccio contributed a patch for this in SOLR-769. I'd like to > incorporate it. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Lucene-Solr-tests-only-3.x - Build # 3686 - Still Failing
Build: https://hudson.apache.org/hudson/job/Lucene-Solr-tests-only-3.x/3686/ 1 tests failed. REGRESSION: org.apache.lucene.search.TestThreadSafe.testLazyLoadThreadSafety Error Message: unable to create new native thread Stack Trace: java.lang.OutOfMemoryError: unable to create new native thread at java.lang.Thread.start0(Native Method) at java.lang.Thread.start(Thread.java:614) at org.apache.lucene.search.TestThreadSafe.doTest(TestThreadSafe.java:133) at org.apache.lucene.search.TestThreadSafe.testLazyLoadThreadSafety(TestThreadSafe.java:152) at org.apache.lucene.util.LuceneTestCase.runBare(LuceneTestCase.java:255) Build Log (for compile errors): [...truncated 8582 lines...] - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] Commented: (LUCENE-2694) MTQ rewrite + weight/scorer init should be single pass
[ https://issues.apache.org/jira/browse/LUCENE-2694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12980921#action_12980921 ]

Michael McCandless commented on LUCENE-2694:
--------------------------------------------

Actually I see PK lookups faster -- 23 usec w/ patch vs 33 usec w/ trunk (per lookup) for 20K lookups.

And good speedups on many-term MTQs when I force BQ rewrite:

||Query||QPS base||QPS termstate||Pct diff||
|+nebraska +state|169.75|154.64|{color:red}-8.9%{color}|
|doctitle:.*[Uu]nited.*|4.26|4.11|{color:red}-3.5%{color}|
|+unit +state|11.40|11.09|{color:red}-2.7%{color}|
|spanFirst(unit, 5)|17.38|16.93|{color:red}-2.6%{color}|
|spanNear([unit, state], 10, true)|4.37|4.32|{color:red}-1.2%{color}|
|"unit state"~3|4.94|4.89|{color:red}-1.0%{color}|
|"unit state"|8.05|8.03|{color:red}-0.2%{color}|
|state|26.58|26.76|{color:green}0.7%{color}|
|unit state|11.24|11.46|{color:green}1.9%{color}|
|united~2.0|3.87|3.98|{color:green}2.8%{color}|
|doctimesecnum:[1 TO 6]|8.26|8.70|{color:green}5.3%{color}|
|unit~2.0|10.04|10.59|{color:green}5.4%{color}|
|united~1.0|16.84|18.13|{color:green}7.7%{color}|
|unit~1.0|10.09|10.99|{color:green}8.9%{color}|
|un*d|11.96|21.63|{color:green}80.8%{color}|
|unit*|7.60|14.23|{color:green}87.3%{color}|
|u*d|2.22|4.17|{color:green}87.8%{color}|
|uni*|1.83|3.53|{color:green}93.7%{color}|

+1 to commit!

> MTQ rewrite + weight/scorer init should be single pass
>
> Key: LUCENE-2694
> URL: https://issues.apache.org/jira/browse/LUCENE-2694
> Project: Lucene - Java
> Issue Type: Improvement
> Components: Search
> Reporter: Michael McCandless
> Assignee: Simon Willnauer
> Fix For: 4.0
>
> Attachments: LUCENE-2694-FTE.patch, LUCENE-2694.patch, LUCENE-2694.patch, LUCENE-2694.patch, LUCENE-2694.patch, LUCENE-2694.patch, LUCENE-2694.patch, LUCENE-2694.patch, LUCENE-2694.patch, LUCENE-2694.patch, LUCENE-2694.patch, LUCENE-2694.patch, LUCENE-2694_hack.patch
>
> Spinoff of LUCENE-2690 (see the hacked patch on that issue)...
> Once we fix MTQ rewrite to be per-segment, we should take it further and make > weight/scorer init also run in the same single pass as rewrite. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Lucene-Solr-tests-only-3.x - Build # 3685 - Failure
Build: https://hudson.apache.org/hudson/job/Lucene-Solr-tests-only-3.x/3685/ No tests ran. Build Log (for compile errors): [...truncated 4447 lines...] - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: Lucene-3.x - Build # 238 - Failure
But, how can we make this relatively "private"? Ie I don't want people searching via Google to stumble on this hudson summary page showing how flakey our clover build is Or, can we somehow force hudson to "pass" if all tests passed? Mike On Wed, Jan 12, 2011 at 11:19 AM, Robert Muir wrote: > On Wed, Jan 12, 2011 at 6:09 AM, Michael McCandless > wrote: >> Can we do something about these "false" failures? >> >> They are failing because of this Hudson bug: >> >> http://issues.hudson-ci.org/browse/HUDSON-7836 >> >> But the highish failure rate makes us look bad when people look at our >> build stability... which is awful. >> > > One idea would be to divorce the clover, etc from the nightly builds, > and make a clover build. > > - > To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org > For additional commands, e-mail: dev-h...@lucene.apache.org > > - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: SegmentInfo clone
> it is set on DocumentsWriter#flush though Thanks! I just skip segmentCodecs if it's null, for now. On Wed, Jan 12, 2011 at 11:05 AM, Simon Willnauer wrote: > On Wed, Jan 12, 2011 at 8:03 PM, Jason Rutherglen > wrote: >> Sorry, that's incorrect, SegmentInfo.files is NPE'ing on segmentCodecs >> because it's never set (in trunk). > it is set on DocumentsWriter#flush though > > simon >> >> On Wed, Jan 12, 2011 at 10:59 AM, Jason Rutherglen >> wrote: >>> Is it intentional that SegmentInfo.segmentCodecs isn't cloned? When >>> SI is cloned, then sizeInBytes fails with an NPE. >>> >> >> - >> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org >> For additional commands, e-mail: dev-h...@lucene.apache.org >> >> > > - > To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org > For additional commands, e-mail: dev-h...@lucene.apache.org > > - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] Updated: (LUCENE-2856) Create IndexWriter event listener, specifically for merges
[ https://issues.apache.org/jira/browse/LUCENE-2856?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Rutherglen updated LUCENE-2856: - Attachment: LUCENE-2856.patch The aborted merge event is now generated and tested for. > Create IndexWriter event listener, specifically for merges > -- > > Key: LUCENE-2856 > URL: https://issues.apache.org/jira/browse/LUCENE-2856 > Project: Lucene - Java > Issue Type: Improvement > Components: Index >Affects Versions: 4.0 >Reporter: Jason Rutherglen > Attachments: LUCENE-2856.patch, LUCENE-2856.patch, LUCENE-2856.patch > > > The issue will allow users to monitor merges occurring within IndexWriter > using a callback notifier event listener. This can be used by external > applications such as Solr to monitor large segment merges. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] Commented: (SOLR-2282) Distributed Support for Search Result Clustering
[ https://issues.apache.org/jira/browse/SOLR-2282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12980881#action_12980881 ] Stanislaw Osinski commented on SOLR-2282: - Sure, I'll take a look at it tomorrow morning. > Distributed Support for Search Result Clustering > > > Key: SOLR-2282 > URL: https://issues.apache.org/jira/browse/SOLR-2282 > Project: Solr > Issue Type: New Feature > Components: contrib - Clustering >Affects Versions: 1.4, 1.4.1 >Reporter: Koji Sekiguchi >Assignee: Koji Sekiguchi >Priority: Minor > Fix For: 3.1, 4.0 > > Attachments: SOLR-2282.patch, SOLR-2282.patch, SOLR-2282.patch, > SOLR-2282.patch, SOLR-2282.patch, SOLR-2282_test.patch > > > Brad Giaccio contributed a patch for this in SOLR-769. I'd like to > incorporate it. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] Updated: (LUCENE-2856) Create IndexWriter event listener, specifically for merges
[ https://issues.apache.org/jira/browse/LUCENE-2856?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Rutherglen updated LUCENE-2856: - Attachment: LUCENE-2856.patch Here's a first cut including workarounds to avoid NPEs and file not found exceptions in SegmentInfo (when calling size in bytes). There's a test case for merge init, start, and complete. I need to add one for abort. > Create IndexWriter event listener, specifically for merges > -- > > Key: LUCENE-2856 > URL: https://issues.apache.org/jira/browse/LUCENE-2856 > Project: Lucene - Java > Issue Type: Improvement > Components: Index >Affects Versions: 4.0 >Reporter: Jason Rutherglen > Attachments: LUCENE-2856.patch, LUCENE-2856.patch > > > The issue will allow users to monitor merges occurring within IndexWriter > using a callback notifier event listener. This can be used by external > applications such as Solr to monitor large segment merges. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: SegmentInfo clone
On Wed, Jan 12, 2011 at 8:03 PM, Jason Rutherglen wrote: > Sorry, that's incorrect, SegmentInfo.files is NPE'ing on segmentCodecs > because it's never set (in trunk). it is set on DocumentsWriter#flush though simon > > On Wed, Jan 12, 2011 at 10:59 AM, Jason Rutherglen > wrote: >> Is it intentional that SegmentInfo.segmentCodecs isn't cloned? When >> SI is cloned, then sizeInBytes fails with an NPE. >> > > - > To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org > For additional commands, e-mail: dev-h...@lucene.apache.org > > - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: SegmentInfo clone
Sorry, that's incorrect, SegmentInfo.files is NPE'ing on segmentCodecs because it's never set (in trunk). On Wed, Jan 12, 2011 at 10:59 AM, Jason Rutherglen wrote: > Is it intentional that SegmentInfo.segmentCodecs isn't cloned? When > SI is cloned, then sizeInBytes fails with an NPE. > - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: call python from java - what strategy do you use?
On Wed, 12 Jan 2011, Roman Chyla wrote: And if in the python, I will do: import lucene; lucene.initVM(lucene.CLASSPATH) Will it work in this case? Giving access to the java classes from inside python. Or I will have to forget pylucene, and prepare some extra java classes? (the jcc in reverse trick, as you put it) Yes, just be sure to JCC-build your eggs with --import lucene so that you don't wrap lucene multiple times. - you say that threads are not managed by the Python VM, does that mean there is no Python GIL? No, there is a Python GIL (and that is the Achilles' heel of this setup if you expect high concurrent servlet performance from your server calling Python). That Python GIL is connected to this thread state I was mentioning earlier. Because the thread is not managed by Python, when Python is called (by way of the code generated by JCC) it doesn't find a thread state for the thread and creates one. When the call completes, the thread state is destroyed because its refcount goes to zero. My TerminatingThread class acquires a Python thread state and keeps it for the life of the thread, thereby working around this problem. OK, this then looks like a normal Python - which is somehow making me less worried :) I wanted to use multiprocessing inside python to deal with GIL, and I see no reason why it should not work in this case. I tried that approach originally and gave up. There were too many strange lock-ups talking to python subprocesses managed by multiprocessing. Now, this was before it became part of Python's distribution so maybe bugs got fixed since then. Andi..
SegmentInfo clone
Is it intentional that SegmentInfo.segmentCodecs isn't cloned? When SI is cloned, then sizeInBytes fails with an NPE. - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
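The failure mode described in this thread is the classic partial-clone bug: a clone() copies some fields but not all, and a method invoked on the copy dereferences the missing one. A minimal, purely illustrative Java sketch of the pattern (the Info/sizeInBytes names are hypothetical stand-ins, not actual Lucene code):

```java
import java.util.Arrays;
import java.util.List;

// Illustrative only: mimics a clone() that forgets to carry one field over,
// so a method depending on that field NPEs on the clone but not the original.
class Info implements Cloneable {
    List<String> files;   // copied on clone
    Object codecs;        // forgotten on clone -> null on the copy

    Info(List<String> files, Object codecs) {
        this.files = files;
        this.codecs = codecs;
    }

    @Override
    public Info clone() {
        return new Info(files, null); // bug: codecs is not copied
    }

    long sizeInBytes() {
        // Dereferences codecs, so a clone with codecs == null throws NPE here.
        return codecs.hashCode() + files.size();
    }
}

public class CloneNpeDemo {
    public static void main(String[] args) {
        Info original = new Info(Arrays.asList("_0.tis"), new Object());
        original.sizeInBytes(); // fine on the original
        try {
            original.clone().sizeInBytes();
            System.out.println("no NPE");
        } catch (NullPointerException e) {
            System.out.println("NPE on clone, as described in the thread");
        }
    }
}
```

The fix is the same in the sketch as in the real case: clone() must copy (or share, if immutable) every field that later methods depend on.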
Re: call python from java - what strategy do you use?
On Wed, 12 Jan 2011, Roman Chyla wrote: Hi Andi, all, I tried to implement the PythonVM wrapping on Mac 10.6, with JDK 1.6.22, jcc is freshly built, in shared mode, v. 2.6. The python is the standard Python distributed with MacOsX When I try to run the java, it throws an error when it gets to: static { System.loadLibrary("jcc"); } I am getting this error: Exception in thread "main" java.lang.UnsatisfiedLinkError: /Library/Python/2.6/site-packages/JCC-2.6-py2.6-macosx-10.6-universal.egg/libjcc.dylib: Symbol not found: _PyExc_RuntimeError Referenced from: That's because Python's shared library wasn't found. The reason is that, by default, Python's shared lib not on JCC's link line because normally JCC is loaded into a Python process and the dynamic linker thus finds the symbols needed inside the process. Here, since you're not starting inside a Python process, you need to add '-framework Python' to JCC's LFLAGS in setup.py so that the dynamic linker can find the Python VM shared lib and load it. Andi.. /Library/Python/2.6/site-packages/JCC-2.6-py2.6-macosx-10.6-universal.egg/libjcc.dylib Expected in: flat namespace in /Library/Python/2.6/site-packages/JCC-2.6-py2.6-macosx-10.6-universal.egg/libjcc.dylib at java.lang.ClassLoader$NativeLibrary.load(Native Method) at java.lang.ClassLoader.loadLibrary0(ClassLoader.java:1823) at java.lang.ClassLoader.loadLibrary(ClassLoader.java:1746) at java.lang.Runtime.loadLibrary0(Runtime.java:823) at java.lang.System.loadLibrary(System.java:1045) at org.apache.jcc.PythonVM.(PythonVM.java:23) at rca.solr.JettyRunnerPythonVM.start(JettyRunnerPythonVM.java:53) at rca.solr.JettyRunnerPythonVM.main(JettyRunnerPythonVM.java:139) MacBeth:JCC-2.6-py2.6-macosx-10.6-universal.egg rca$ nm libjcc.dylib | grep Exc U _PyExc_RuntimeError U _PyExc_TypeError U _PyExc_ValueError 3442 T __ZNK6JCCEnv15reportExceptionEv 21f0 T __ZNK6JCCEnv23getPythonExceptionClassEv Any pointers what I could do wrong? 
Note, I haven't built any emql.egg yet, I just run my java program and try to start PythonVM() and see if that works. Thanks, roman On Wed, Jan 12, 2011 at 11:05 AM, Roman Chyla wrote: Hi Andi, I think I will give it a try, if only because I am curious. Please see one remaining question below. On Tue, Jan 11, 2011 at 10:37 PM, Andi Vajda wrote: On Tue, 11 Jan 2011, Roman Chyla wrote: Hi Andy, This is much more than I could have hoped! Just yesterday, I was looking for ways how to embed Python VM in Jetty, as that would be more natural, but found only jepp.sourceforge.net and off-putting was the necessity to compile it against the newly built python. I could not want it from the guys who may need my extension. And I realize only now, that embedding Python in Java is even documented on the website, but honestly i would not know how to do it without your detailed examples. Now to the questions, I apologize, some of them or all must seem very stupid to you - pylucene is used on many platforms and with jcc always worked as expected (i love it!), but is it as reliable in the opposite direction? The PythonVM.java loads "jcc" library, so I wonder if in principle there is any difference in the directionality - but I am not sure. To rephrase my convoluted question: would you expect this wrapping be as reliable as wrapping java inside python is now? I've been using this for over two years, in production. My main worry was memory leaks because a server process is expected to stay up and running for weeks at a time and it's been very stable on that front too. Of course, when there is a bug somewhere that causes your Python VM to crash, the entire server crashes. Just like when the JVM crashes (which is normally rare). In other words, this isn't any less reliable than a standalone Python VM process. It can be tricky, but is possible, to run gdb, pdb and jdb together to step through the three languages involved, python, java and C++. 
I've had to do this a few times but not in a long time. - in the past, i built jcc libraries on one host and distributed them on various machines. As long the family OS and the python main version were the same, it worked on Win/Lin/Mac just fine. As far as I can tell, this does not change, or will it be dependent on the python against which the egg was built? Distributing binaries is risky. The same caveats apply. I wouldn't do it, even in the simple PyLucene case. unfortunately, I don't have that many choices left - this is not for some client-software scenario, we are running the jobs on the grid, and there I cannot compile the binaries. So, if previously the location of the python interpreter or python minor version did not cause problems, now perhaps it will be different. But that wasn't for the Solr, wrapping Solr is not meant for the grid. - now a little tricky issue; when
Lucene-Solr-tests-only-3.x - Build # 3681 - Failure
Build: https://hudson.apache.org/hudson/job/Lucene-Solr-tests-only-3.x/3681/ 1 tests failed. REGRESSION: org.apache.lucene.search.TestThreadSafe.testLazyLoadThreadSafety Error Message: unable to create new native thread Stack Trace: java.lang.OutOfMemoryError: unable to create new native thread at java.lang.Thread.start0(Native Method) at java.lang.Thread.start(Thread.java:614) at org.apache.lucene.search.TestThreadSafe.doTest(TestThreadSafe.java:133) at org.apache.lucene.search.TestThreadSafe.testLazyLoadThreadSafety(TestThreadSafe.java:152) at org.apache.lucene.util.LuceneTestCase.runBare(LuceneTestCase.java:255) Build Log (for compile errors): [...truncated 8574 lines...] - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
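The "unable to create new native thread" OutOfMemoryError above is thrown when a test spawns more concurrent threads than the OS or JVM allows. Independent of the test itself, the usual way to cap native thread creation is a bounded pool; a small sketch using plain java.util.concurrent (not the actual TestThreadSafe code):

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;

public class BoundedThreadsDemo {
    public static void main(String[] args) throws Exception {
        // A fixed-size pool caps native thread creation at 4 threads,
        // no matter how many tasks are submitted.
        ExecutorService pool = Executors.newFixedThreadPool(4);
        AtomicInteger done = new AtomicInteger();
        Future<?>[] futures = new Future<?>[1000];
        for (int i = 0; i < futures.length; i++) {
            futures[i] = pool.submit(done::incrementAndGet);
        }
        for (Future<?> f : futures) {
            f.get(); // wait for all tasks to finish
        }
        pool.shutdown();
        System.out.println("tasks completed: " + done.get());
    }
}
```

With one raw `new Thread(...).start()` per task, the same workload can exhaust native threads on a constrained build machine; the pool runs the identical work with four threads.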
[jira] Commented: (SOLR-1604) Wildcards, ORs etc inside Phrase Queries
[ https://issues.apache.org/jira/browse/SOLR-1604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12980853#action_12980853 ] Salman Akram commented on SOLR-1604: I am using SOLR 1.4.1 but integrated this patch in early Nov so maybe you committed the inOrder parameter after that? When you say "Regarding parenthesis inside quotes..." if this works and groups the words in phrase together won't it work for my case e.g. "(a b) c"~10? I guess if SurroundQuery doesn't use any analyzer it would be very difficult to make the existing queries work (I am using Standard Analyzer). > Wildcards, ORs etc inside Phrase Queries > > > Key: SOLR-1604 > URL: https://issues.apache.org/jira/browse/SOLR-1604 > Project: Solr > Issue Type: Improvement > Components: search >Affects Versions: 1.4 >Reporter: Ahmet Arslan >Priority: Minor > Fix For: Next > > Attachments: ComplexPhrase.zip, ComplexPhrase.zip, ComplexPhrase.zip, > ComplexPhrase.zip, ComplexPhraseQueryParser.java, SOLR-1604.patch > > > Solr Plugin for ComplexPhraseQueryParser (LUCENE-1486) which supports > wildcards, ORs, ranges, fuzzies inside phrase queries. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] Created: (SOLR-2313) Clear root Entity cache when entity is processed
Clear root Entity cache when entity is processed Key: SOLR-2313 URL: https://issues.apache.org/jira/browse/SOLR-2313 Project: Solr Issue Type: Improvement Components: contrib - DataImportHandler Affects Versions: 1.4.1 Environment: Linux, JDBC, Postgres 8.4.6 Reporter: Shane The current process clears the entity caches once all root entity elements have been imported. When a config file has dozens of root entities, the result is one "idle in transaction" process for each entity processed, effectively eating up the database's available connections. The simple solution would be to clear a root entity's cache once that entity has been processed. The following is a diff that I used in my instance to clear the cache when the entity completed:
--- DocBuilder.java 2011-01-12 10:05:58.0 -0700
+++ DocBuilder.java.new 2011-01-12 10:05:31.0 -0700
@@ -435,6 +435,9 @@
         writer.log(SolrWriter.END_ENTITY, null, null);
       }
       entityProcessor.destroy();
+      if(entity.isDocRoot) {
+        entity.clearCache();
+      }
     }
   }
-- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. 
[jira] Commented: (LUCENE-2751) add LuceneTestCase.newSearcher()
[ https://issues.apache.org/jira/browse/LUCENE-2751?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12980808#action_12980808 ] Robert Muir commented on LUCENE-2751: - There is a downside to this whole issue of course... i think its going to be harder to reproduce test fails since we will be using more multithreading. But I think its a worthwhile tradeoff in being able to detect more thread-safety bugs. If it becomes a huge hassle, we could always disable it by default or enable only with a flag or something like that. > add LuceneTestCase.newSearcher() > > > Key: LUCENE-2751 > URL: https://issues.apache.org/jira/browse/LUCENE-2751 > Project: Lucene - Java > Issue Type: Test > Components: Build >Reporter: Robert Muir > Fix For: 3.1, 4.0 > > Attachments: LUCENE-2751.patch, LUCENE-2751.patch > > > Most tests in the search package don't care about what kind of searcher they > use. > we should randomly use MultiSearcher or ParallelMultiSearcher sometimes in > tests. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
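The reproducibility concern above hinges on seeding: if the searcher implementation is chosen from a seeded Random and the seed is reported on failure, the same run can be replayed. A hedged sketch of that selection pattern (the Searcher interface and names here are hypothetical stand-ins, not the LuceneTestCase API):

```java
import java.util.Random;

public class RandomSearcherDemo {
    // Hypothetical stand-in for "what kind of searcher a test gets".
    interface Searcher {
        String name();
    }

    // Pick the searcher flavor at random, like a newSearcher() helper might:
    // the choice is fully determined by the Random's seed.
    static Searcher newSearcher(Random random) {
        return random.nextBoolean()
            ? () -> "sequential"
            : () -> "parallel";
    }

    public static void main(String[] args) {
        long seed = 42L; // reporting the seed is what makes failures replayable
        Random random = new Random(seed);
        Searcher s = newSearcher(random);
        System.out.println("seed=" + seed + " searcher=" + s.name());
    }
}
```

Re-running with the printed seed yields the same searcher choice, which is the tradeoff Robert describes: more randomized multithreading finds more bugs, and the seed is the handle for reproducing them.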
[jira] Commented: (LUCENE-2694) MTQ rewrite + weight/scorer init should be single pass
[ https://issues.apache.org/jira/browse/LUCENE-2694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12980805#action_12980805 ] Simon Willnauer commented on LUCENE-2694: - I just figured that PKLookups are actually slower with this patch 164 msec for 1000 lookups (164 us per lookup) vs 144 msec for 1000 lookups (144 us per lookup) on trunk. I will dig! > MTQ rewrite + weight/scorer init should be single pass > -- > > Key: LUCENE-2694 > URL: https://issues.apache.org/jira/browse/LUCENE-2694 > Project: Lucene - Java > Issue Type: Improvement > Components: Search >Reporter: Michael McCandless >Assignee: Simon Willnauer > Fix For: 4.0 > > Attachments: LUCENE-2694-FTE.patch, LUCENE-2694.patch, > LUCENE-2694.patch, LUCENE-2694.patch, LUCENE-2694.patch, LUCENE-2694.patch, > LUCENE-2694.patch, LUCENE-2694.patch, LUCENE-2694.patch, LUCENE-2694.patch, > LUCENE-2694.patch, LUCENE-2694.patch, LUCENE-2694_hack.patch > > > Spinoff of LUCENE-2690 (see the hacked patch on that issue)... > Once we fix MTQ rewrite to be per-segment, we should take it further and make > weight/scorer init also run in the same single pass as rewrite. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] Commented: (LUCENE-2856) Create IndexWriter event listener, specifically for merges
[ https://issues.apache.org/jira/browse/LUCENE-2856?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12980803#action_12980803 ] Jason Rutherglen commented on LUCENE-2856: -- I'll add events for flush, open, clone, close and the CompositeSegmentsListener. > Create IndexWriter event listener, specifically for merges > -- > > Key: LUCENE-2856 > URL: https://issues.apache.org/jira/browse/LUCENE-2856 > Project: Lucene - Java > Issue Type: Improvement > Components: Index >Affects Versions: 4.0 >Reporter: Jason Rutherglen > Attachments: LUCENE-2856.patch > > > The issue will allow users to monitor merges occurring within IndexWriter > using a callback notifier event listener. This can be used by external > applications such as Solr to monitor large segment merges. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. 
[jira] Updated: (LUCENE-2694) MTQ rewrite + weight/scorer init should be single pass
[ https://issues.apache.org/jira/browse/LUCENE-2694?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Simon Willnauer updated LUCENE-2694: Attachment: LUCENE-2694.patch Added Changes.txt entry and fixed the remaining JavaDoc on TermState. My latest benchmark results with that patch are here:
{code}
unit state                3.81    3.70    -2.9%
+nebraska +state         41.26   40.61    -1.6%
+unit +state              3.95    3.90    -1.1%
spanFirst(unit, 5)        4.55    4.51    -0.9%
state                    10.11   10.07    -0.3%
"unit state"~3            0.98    0.98    -0.2%
"unit state"              1.49    1.49    -0.0%
united~1.0                3.66    3.72     1.5%
unit~1.0                  2.33    2.37     1.6%
united~2.0                0.81    0.83     2.7%
unit~2.0                  0.35    0.38    10.1%
u*d                       0.52    0.67    29.5%
doctitle:.*[Uu]nited.*    0.19    0.25    31.6%
un*d                      3.59    4.77    33.0%
uni*                      0.56    0.75    34.9%
unit*                     2.20    3.15    43.3%
{code}
I think we are ready to go - I will commit later today if nobody objects > MTQ rewrite + weight/scorer init should be single pass > -- > > Key: LUCENE-2694 > URL: https://issues.apache.org/jira/browse/LUCENE-2694 > Project: Lucene - Java > Issue Type: Improvement > Components: Search >Reporter: Michael McCandless >Assignee: Simon Willnauer > Fix For: 4.0 > > Attachments: LUCENE-2694-FTE.patch, LUCENE-2694.patch, > LUCENE-2694.patch, LUCENE-2694.patch, LUCENE-2694.patch, LUCENE-2694.patch, > LUCENE-2694.patch, LUCENE-2694.patch, LUCENE-2694.patch, LUCENE-2694.patch, > LUCENE-2694.patch, LUCENE-2694.patch, LUCENE-2694_hack.patch > > > Spinoff of LUCENE-2690 (see the hacked patch on that issue)... > Once we fix MTQ rewrite to be per-segment, we should take it further and make > weight/scorer init also run in the same single pass as rewrite. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. 
Re: Lucene-3.x - Build # 238 - Failure
On Wed, Jan 12, 2011 at 6:09 AM, Michael McCandless wrote: > Can we do something about these "false" failures? > > They are failing because of this Hudson bug: > > http://issues.hudson-ci.org/browse/HUDSON-7836 > > But the highish failure rate makes us look bad when people look at our > build stability... which is awful. > One idea would be to divorce the clover, etc from the nightly builds, and make a clover build. - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] Updated: (SOLR-2312) CloudSolrServer -- calling add(Collection docs) throws NPE.
[ https://issues.apache.org/jira/browse/SOLR-2312?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stan Burnitt updated SOLR-2312: --- Description: Cannot index documents using the CloudSolrServer. Below is a code snippet that reproduces the error.
{code:borderStyle=solid}
@Test
public void jiraTestCase() {
    CloudSolrServer solrj = null;
    try {
        solrj = new CloudSolrServer("your.zookeeper.localdomain:2181");
        // Also tried creating CloudSolrServer using alternative constructor below...
        // public CloudSolrServer(String zkHost, LBHttpSolrServer lbServer)
        //
        // LBHttpSolrServer lbHttpSolrServer = new LBHttpSolrServer("http://solr.localdomain:8983/solr");
        // solrj = new CloudSolrServer("your.zookeeper.localdomain:2181", lbHttpSolrServer);
        //
        // (Same result -- NPE @ line 105 in CloudSolrServer.java)
        solrj.setDefaultCollection("your-collection");
        solrj.setZkClientTimeout(5000);
        solrj.setZkConnectTimeout(5000);
        final Collection<SolrInputDocument> batch = new ArrayList<SolrInputDocument>();
        SolrInputDocument doc = new SolrInputDocument();
        doc.addField("id", 1L, 1.0f);
        doc.addField("title", "Document A");
        doc.addField("description", "Test document");
        batch.add(doc);
        doc = new SolrInputDocument();
        doc.addField("id", 2L, 1.0f);
        doc.addField("title", "Document B");
        doc.addField("description", "Another test document");
        batch.add(doc);
        solrj.add(batch);
    } catch (Exception e) {
        log.error(e.getMessage(), e);
        Assert.fail("java.lang.NullPointerException: null \n"
            + " at org.apache.solr.client.solrj.impl.CloudSolrServer.request(CloudSolrServer.java:105) \n"
            + " Line 105: NULL request object here --> String collection = request.getParams().get(\"collection\", defaultCollection);");
    } finally {
        solrj.close();
    }
}
{code}
was: Cannot index documents. Below is a code snippet that reproduces the error. 
{code:borderStyle=solid} @Test public void jiraTestCase() { CloudSolrServer solrj = null; try { solrj = new CloudSolrServer("your.zookeeper.localdomain:2181"); // Also tried creating CloudSolrServer using alternative contstuctor below... // public CloudSolrServer(String zkHost, LBHttpSolrServer lbServer) // // LBHttpSolrServer lbHttpSolrServer = new LBHttpSolrServer("http://solr.localdomain:8983/solr";); // solrj = new CloudSolrServer("your.zookeeper.localdomain:2181", lbHttpSolrServer); // // (Same result -- NPE @ line 105 in CloudSolrServer.java) solrj.setDefaultCollection("your-collection"); solrj.setZkClientTimeout(5000); solrj.setZkConnectTimeout(5000); final Collection batch = new ArrayList(); SolrInputDocument doc = new SolrInputDocument(); doc.addField("id", 1L, 1.0f); doc.addField("title", "Document A"); doc.addField("description", "Test document"); batch.add(doc); doc = new SolrInputDocument(); doc.addField("id", 2L, 1.0f); doc.addField("title", "Document B"); doc.addField("description", "Another test document"); batch.add(doc); solrj.add(batch); } catch (Exception e) { log.error(e.getMessage(), e); Assert.fail("java.lang.NullPointerException: null \n"
[jira] Commented: (SOLR-2312) CloudSolrServer -- calling add(Collection docs) throws NPE.
[ https://issues.apache.org/jira/browse/SOLR-2312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12980764#action_12980764 ] Stan Burnitt commented on SOLR-2312: Attempting to add a single document also results in the same NPE at line 105. > CloudSolrServer -- calling add(Collection docs) throws NPE. > -- > > Key: SOLR-2312 > URL: https://issues.apache.org/jira/browse/SOLR-2312 > Project: Solr > Issue Type: Bug > Components: SolrCloud >Affects Versions: 4.0 > Environment: Mac OSX v10.5.8 > java version "1.6.0_22" > Java(TM) SE Runtime Environment (build 1.6.0_22-b04-307-9M3263) > Java HotSpot(TM) 64-Bit Server VM (build 17.1-b03-307, mixed mode) >Reporter: Stan Burnitt >Priority: Critical > Fix For: 4.0 > > > Cannot index documents. > Below is a code snippet that reproduces the error. > {code:borderStyle=solid} > @Test > public void jiraTestCase() { > CloudSolrServer solrj = null; > > try { > solrj = new > CloudSolrServer("your.zookeeper.localdomain:2181"); > // Also tried creating CloudSolrServer using > alternative contstuctor below... 
> // public CloudSolrServer(String zkHost, > LBHttpSolrServer lbServer) > // > // LBHttpSolrServer lbHttpSolrServer = new > LBHttpSolrServer("http://solr.localdomain:8983/solr";); > // solrj = new > CloudSolrServer("your.zookeeper.localdomain:2181", lbHttpSolrServer); > // > // (Same result -- NPE @ line 105 in > CloudSolrServer.java) > solrj.setDefaultCollection("your-collection"); > solrj.setZkClientTimeout(5000); > solrj.setZkConnectTimeout(5000); > final Collection batch = new > ArrayList(); > SolrInputDocument doc = new SolrInputDocument(); > doc.addField("id", 1L, 1.0f); > doc.addField("title", "Document A"); > doc.addField("description", "Test document"); > batch.add(doc); > doc = new SolrInputDocument(); > doc.addField("id", 2L, 1.0f); > doc.addField("title", "Document B"); > doc.addField("description", "Another test > document"); > batch.add(doc); > solrj.add(batch); > } catch (Exception e) { > log.error(e.getMessage(), e); > Assert.fail("java.lang.NullPointerException: > null \n" > + " at > org.apache.solr.client.solrj.impl.CloudSolrServer.request(CloudSolrServer.java:105) > \n" > + " Line 105: NULL request object here > --> String collection = request.getParams().get(\"collection\", > defaultCollection);"); > } finally { > solrj.close(); > } > } > {code} -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
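The reported NPE comes from an unguarded chained lookup: `request.getParams().get("collection", defaultCollection)` throws when the params object is null. A generic sketch of the defensive pattern that avoids it (the Request/Params classes below are simplified stand-ins, not the SolrJ classes):

```java
public class ParamsGuardDemo {
    // Simplified stand-ins mirroring the shape of the failing call chain.
    static class Params {
        String get(String key, String def) { return def; }
    }
    static class Request {
        Params params; // may be null, as in the reported NPE at line 105
        Params getParams() { return params; }
    }

    // Guarded version of the lookup: fall back to the default collection
    // when no params were set, instead of dereferencing null.
    static String collectionOf(Request request, String defaultCollection) {
        Params p = request.getParams();
        return (p == null) ? defaultCollection : p.get("collection", defaultCollection);
    }

    public static void main(String[] args) {
        Request r = new Request(); // no params set, matching the bug report
        System.out.println(collectionOf(r, "your-collection"));
    }
}
```

Whether the real fix belongs in CloudSolrServer (guarding the lookup) or in the update path (always populating params before the request reaches line 105) is a design choice for the issue; the sketch only shows the null-guard side.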
[jira] Updated: (SOLR-2282) Distributed Support for Search Result Clustering
[ https://issues.apache.org/jira/browse/SOLR-2282?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir updated SOLR-2282: -- Attachment: SOLR-2282_test.patch here's a patch to fix the BaseDistributedTestCase, so clustering and other contribs can set their own home and use it. this fixes the unknown field problem, but i'm still seeing the zzBuffer array index out of bounds exception... perhaps my checkout is somehow out of date... maybe you can test the patch? > Distributed Support for Search Result Clustering > > > Key: SOLR-2282 > URL: https://issues.apache.org/jira/browse/SOLR-2282 > Project: Solr > Issue Type: New Feature > Components: contrib - Clustering >Affects Versions: 1.4, 1.4.1 >Reporter: Koji Sekiguchi >Assignee: Koji Sekiguchi >Priority: Minor > Fix For: 3.1, 4.0 > > Attachments: SOLR-2282.patch, SOLR-2282.patch, SOLR-2282.patch, > SOLR-2282.patch, SOLR-2282.patch, SOLR-2282_test.patch > > > Brad Giaccio contributed a patch for this in SOLR-769. I'd like to > incorporate it. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] Commented: (LUCENE-2751) add LuceneTestCase.newSearcher()
[ https://issues.apache.org/jira/browse/LUCENE-2751?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12980758#action_12980758 ] Michael McCandless commented on LUCENE-2751: bq. I think this would be preferred before we go optimizing synchronization, because otherwise how do we know if its correct? +1 > add LuceneTestCase.newSearcher() > > > Key: LUCENE-2751 > URL: https://issues.apache.org/jira/browse/LUCENE-2751 > Project: Lucene - Java > Issue Type: Test > Components: Build >Reporter: Robert Muir > Fix For: 3.1, 4.0 > > Attachments: LUCENE-2751.patch, LUCENE-2751.patch > > > Most tests in the search package don't care about what kind of searcher they > use. > we should randomly use MultiSearcher or ParallelMultiSearcher sometimes in > tests. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: Lucene-Solr-tests-only-3.x - Build # 3669 - Failure
I committed one possible fix for this... the backwards test was missing a cms.sync(). But I'm not sure that's the cause of the failure... Mike On Wed, Jan 12, 2011 at 8:03 AM, Apache Hudson Server wrote: > Build: https://hudson.apache.org/hudson/job/Lucene-Solr-tests-only-3.x/3669/ > > 1 tests failed. > REGRESSION: org.apache.lucene.index.TestCrash.testWriterAfterCrash > > Error Message: > MockRAMDirectory: file "_0.tis" is still open: cannot overwrite > > Stack Trace: > java.io.IOException: MockRAMDirectory: file "_0.tis" is still open: cannot > overwrite > at > org.apache.lucene.store.MockRAMDirectory.createOutput(MockRAMDirectory.java:221) > at > org.apache.lucene.index.TermInfosWriter.initialize(TermInfosWriter.java:100) > at > org.apache.lucene.index.TermInfosWriter.(TermInfosWriter.java:85) > at > org.apache.lucene.index.FormatPostingsFieldsWriter.(FormatPostingsFieldsWriter.java:41) > at > org.apache.lucene.index.FreqProxTermsWriter.flush(FreqProxTermsWriter.java:84) > at org.apache.lucene.index.TermsHash.flush(TermsHash.java:109) > at org.apache.lucene.index.DocInverter.flush(DocInverter.java:72) > at > org.apache.lucene.index.DocFieldProcessor.flush(DocFieldProcessor.java:59) > at > org.apache.lucene.index.DocumentsWriter.flush(DocumentsWriter.java:589) > at org.apache.lucene.index.IndexWriter.doFlush(IndexWriter.java:3299) > at org.apache.lucene.index.IndexWriter.flush(IndexWriter.java:3264) > at > org.apache.lucene.index.IndexWriter.addDocument(IndexWriter.java:2040) > at > org.apache.lucene.index.IndexWriter.addDocument(IndexWriter.java:2007) > at org.apache.lucene.index.TestCrash.initIndex(TestCrash.java:51) > at > org.apache.lucene.index.TestCrash.testWriterAfterCrash(TestCrash.java:77) > at > org.apache.lucene.util.LuceneTestCase.runBare(LuceneTestCase.java:255) > > > > > Build Log (for compile errors): > [...truncated 8559 lines...] 
[jira] Issue Comment Edited: (SOLR-2312) CloudSolrServer -- calling add(Collection docs) throws NPE.
[ https://issues.apache.org/jira/browse/SOLR-2312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12980756#action_12980756 ] Stan Burnitt edited comment on SOLR-2312 at 1/12/11 10:45 AM: -- The CloudSolrServer is always instantiated with a LBHttpSolrServer, created in the single argument construct, or passed by the user in the alternative constructor. However, the LBHttpSolrServer's javadoc states: "LBHttpSolrServer is a load balancing wrapper to CommonsHttpSolrServer. This is useful when you have multiple SolrServers and the requests need to be Load Balanced among them. This should *NOT* be used for indexing. Also see the wiki page." (Not sure if this is relevant.) One more note: this problem also occurs when I try to delete an index containing 0 documents. was (Author: stan-b): The CloudSolrServer is always instantiated with a LBHttpSolrServer, created in the single argument construct, or passed by the user in the alternative constructor. However, the LBHttpSolrServer's javadoc states: "LBHttpSolrServer is a load balancing wrapper to CommonsHttpSolrServer. This is useful when you have multiple SolrServers and the requests need to be Load Balanced among them. This should NOT be used for indexing. Also see the wiki page." (Not sure if this is relevant.) One more note: this problem also occurs when I try to delete an index containing 0 documents. > CloudSolrServer -- calling add(Collection docs) throws NPE. > -- > > Key: SOLR-2312 > URL: https://issues.apache.org/jira/browse/SOLR-2312 > Project: Solr > Issue Type: Bug > Components: SolrCloud >Affects Versions: 4.0 > Environment: Mac OSX v10.5.8 > java version "1.6.0_22" > Java(TM) SE Runtime Environment (build 1.6.0_22-b04-307-9M3263) > Java HotSpot(TM) 64-Bit Server VM (build 17.1-b03-307, mixed mode) >Reporter: Stan Burnitt >Priority: Critical > Fix For: 4.0 > > > Cannot index documents. > Below is a code snippet that reproduces the error. > Cannot index documents. 
> Below is a snippet for reproducing the error. > {code:borderStyle=solid} > @Test > public void jiraTestCase() { > CloudSolrServer solrj = null; > > try { > solrj = new > CloudSolrServer("your.zookeeper.localdomain:2181"); > // Also tried creating CloudSolrServer using the > alternative constructor below... > // public CloudSolrServer(String zkHost, > LBHttpSolrServer lbServer) > // > // LBHttpSolrServer lbHttpSolrServer = new > LBHttpSolrServer("http://solr.localdomain:8983/solr";); > // solrj = new > CloudSolrServer("your.zookeeper.localdomain:2181", lbHttpSolrServer); > // > // (Same result -- NPE @ line 105 in > CloudSolrServer.java) > solrj.setDefaultCollection("your-collection"); > solrj.setZkClientTimeout(5000); > solrj.setZkConnectTimeout(5000); > final Collection batch = new > ArrayList(); > SolrInputDocument doc = new SolrInputDocument(); > doc.addField("id", 1L, 1.0f); > doc.addField("title", "Document A"); > doc.addField("description", "Test document"); > batch.add(doc); > doc = new SolrInputDocument(); > doc.addField("id", 2L, 1.0f); > doc.addField("title", "Document B"); > doc.addField("description", "Another test > document"); > batch.add(doc); > solrj.add(batch); > } catch (Exception e) { > log.error(e.getMessage(), e); > Assert.fail("java.lang.NullPointerException: > null \n" > + " at > org.apache.solr.client.solrj.impl.CloudSolrServer.request(CloudSolrServer.java:105) > \n" > + " Line 105: NULL request object here > --> String collection = request.getParams().get(\"collection\", > defaultCollection);"); > } finally { > solrj.close(); > } > } > {code} -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (SOLR-2312) CloudSolrServer -- calling add(Collection docs) throws NPE.
[ https://issues.apache.org/jira/browse/SOLR-2312?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stan Burnitt updated SOLR-2312: --- Description: Cannot index documents. Below is a code snippet that reproduces the error. {code:borderStyle=solid} @Test public void jiraTestCase() { CloudSolrServer solrj = null; try { solrj = new CloudSolrServer("your.zookeeper.localdomain:2181"); // Also tried creating CloudSolrServer using alternative contstuctor below... // public CloudSolrServer(String zkHost, LBHttpSolrServer lbServer) // // LBHttpSolrServer lbHttpSolrServer = new LBHttpSolrServer("http://solr.localdomain:8983/solr";); // solrj = new CloudSolrServer("your.zookeeper.localdomain:2181", lbHttpSolrServer); // // (Same result -- NPE @ line 105 in CloudSolrServer.java) solrj.setDefaultCollection("your-collection"); solrj.setZkClientTimeout(5000); solrj.setZkConnectTimeout(5000); final Collection batch = new ArrayList(); SolrInputDocument doc = new SolrInputDocument(); doc.addField("id", 1L, 1.0f); doc.addField("title", "Document A"); doc.addField("description", "Test document"); batch.add(doc); doc = new SolrInputDocument(); doc.addField("id", 2L, 1.0f); doc.addField("title", "Document B"); doc.addField("description", "Another test document"); batch.add(doc); solrj.add(batch); } catch (Exception e) { log.error(e.getMessage(), e); Assert.fail("java.lang.NullPointerException: null \n" + " at org.apache.solr.client.solrj.impl.CloudSolrServer.request(CloudSolrServer.java:105) \n" + " Line 105: NULL request object here --> String collection = request.getParams().get(\"collection\", defaultCollection);"); } finally { solrj.close(); } } {code} was: Cannot index documents. Below is a code snippet that reproduces the error. Cannot index documents. Below is a snippet for reproducing the error. 
{code:borderStyle=solid} @Test public void jiraTestCase() { CloudSolrServer solrj = null; try { solrj = new CloudSolrServer("your.zookeeper.localdomain:2181"); // Also tried creating CloudSolrServer using the alternative constructor below... // public CloudSolrServer(String zkHost, LBHttpSolrServer lbServer) // // LBHttpSolrServer lbHttpSolrServer = new LBHttpSolrServer("http://solr.localdomain:8983/solr";); // solrj = new CloudSolrServer("your.zookeeper.localdomain:2181", lbHttpSolrServer); // // (Same result -- NPE @ line 105 in CloudSolrServer.java) solrj.setDefaultCollection("your-collection"); solrj.setZkClientTimeout(5000); solrj.setZkConnectTimeout(5000); final Collection batch = new ArrayList(); SolrInputDocument doc = new SolrInputDocument(); doc.addField("id", 1L, 1.0f); doc.addField("title", "Document A"); doc.addField("description", "Test document"); batch.add(doc); doc = new SolrInputDocument(); doc.addField("id", 2L, 1.0f); doc.addField("title", "Document B"); doc.addField("description", "Another test document"); batch.add(doc); solrj.add(batch); } catch (Exception e) { log.error(e.getMessage(), e); Assert.fail("java.lang.NullPointerException: null \n" + " at org.apache.solr.client.solrj.impl.CloudSolrServer.request(CloudSolrServer.java:105) \n" + " Line 105: NULL request object here --> String collection = request.getParams().get(\"collection\", defaultCollection);"); } finally { solrj.close(); } } {code}
[jira] Commented: (SOLR-2312) CloudSolrServer -- calling add(Collection docs) throws NPE.
[ https://issues.apache.org/jira/browse/SOLR-2312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12980756#action_12980756 ] Stan Burnitt commented on SOLR-2312: The CloudSolrServer is always instantiated with a LBHttpSolrServer, created in the single argument construct, or passed by the user in the alternative constructor. However, the LBHttpSolrServer's javadoc states: "LBHttpSolrServer is a load balancing wrapper to CommonsHttpSolrServer. This is useful when you have multiple SolrServers and the requests need to be Load Balanced among them. This should NOT be used for indexing. Also see the wiki page." (Not sure if this is relevant.) One more note: this problem also occurs when I try to delete an index containing 0 documents. > CloudSolrServer -- calling add(Collection docs) throws NPE. > -- > > Key: SOLR-2312 > URL: https://issues.apache.org/jira/browse/SOLR-2312 > Project: Solr > Issue Type: Bug > Components: SolrCloud >Affects Versions: 4.0 > Environment: Mac OSX v10.5.8 > java version "1.6.0_22" > Java(TM) SE Runtime Environment (build 1.6.0_22-b04-307-9M3263) > Java HotSpot(TM) 64-Bit Server VM (build 17.1-b03-307, mixed mode) >Reporter: Stan Burnitt >Priority: Critical > Fix For: 4.0 > > > Cannot index documents. > Below is a code snippet that reproduces the error. > Cannot index documents. > Below is a snippet for reproducing the error. > {code:borderStyle=solid} > @Test > public void jiraTestCase() { > CloudSolrServer solrj = null; > > try { > solrj = new > CloudSolrServer("your.zookeeper.localdomain:2181"); > // Also tried creating CloudSolrServer using > alternative contstuctor below... 
> // public CloudSolrServer(String zkHost, > LBHttpSolrServer lbServer) > // > // LBHttpSolrServer lbHttpSolrServer = new > LBHttpSolrServer("http://solr.localdomain:8983/solr";); > // solrj = new > CloudSolrServer("your.zookeeper.localdomain:2181", lbHttpSolrServer); > // > // (Same result -- NPE @ line 105 in > CloudSolrServer.java) > solrj.setDefaultCollection("your-collection"); > solrj.setZkClientTimeout(5000); > solrj.setZkConnectTimeout(5000); > final Collection batch = new > ArrayList(); > SolrInputDocument doc = new SolrInputDocument(); > doc.addField("id", 1L, 1.0f); > doc.addField("title", "Document A"); > doc.addField("description", "Test document"); > batch.add(doc); > doc = new SolrInputDocument(); > doc.addField("id", 2L, 1.0f); > doc.addField("title", "Document B"); > doc.addField("description", "Another test > document"); > batch.add(doc); > solrj.add(batch); > } catch (Exception e) { > log.error(e.getMessage(), e); > Assert.fail("java.lang.NullPointerException: > null \n" > + " at > org.apache.solr.client.solrj.impl.CloudSolrServer.request(CloudSolrServer.java:105) > \n" > + " Line 105: NULL request object here > --> String collection = request.getParams().get(\"collection\", > defaultCollection);"); > } finally { > solrj.close(); > } > } > {code} -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] Created: (SOLR-2312) CloudSolrServer -- calling add(Collection docs) throws NPE.
CloudSolrServer -- calling add(Collection docs) throws NPE. -- Key: SOLR-2312 URL: https://issues.apache.org/jira/browse/SOLR-2312 Project: Solr Issue Type: Bug Components: SolrCloud Affects Versions: 4.0 Environment: Mac OSX v10.5.8 java version "1.6.0_22" Java(TM) SE Runtime Environment (build 1.6.0_22-b04-307-9M3263) Java HotSpot(TM) 64-Bit Server VM (build 17.1-b03-307, mixed mode) Reporter: Stan Burnitt Priority: Critical Fix For: 4.0 Cannot index documents. Below is a code snippet that reproduces the error. {code:borderStyle=solid} @Test public void jiraTestCase() { CloudSolrServer solrj = null; try { solrj = new CloudSolrServer("your.zookeeper.localdomain:2181"); // Also tried creating CloudSolrServer using the alternative constructor below... // public CloudSolrServer(String zkHost, LBHttpSolrServer lbServer) // // LBHttpSolrServer lbHttpSolrServer = new LBHttpSolrServer("http://solr.localdomain:8983/solr";); // solrj = new CloudSolrServer("your.zookeeper.localdomain:2181", lbHttpSolrServer); // // (Same result -- NPE @ line 105 in CloudSolrServer.java) solrj.setDefaultCollection("your-collection"); solrj.setZkClientTimeout(5000); solrj.setZkConnectTimeout(5000); final Collection batch = new ArrayList(); SolrInputDocument doc = new SolrInputDocument(); doc.addField("id", 1L, 1.0f); doc.addField("title", "Document A"); doc.addField("description", "Test document"); batch.add(doc); doc = new SolrInputDocument(); doc.addField("id", 2L, 1.0f); doc.addField("title", "Document B"); doc.addField("description", "Another test document"); batch.add(doc); solrj.add(batch); } catch (Exception e) { log.error(e.getMessage(), e); Assert.fail("java.lang.NullPointerException: null \n" + " at org.apache.solr.client.solrj.impl.CloudSolrServer.request(CloudSolrServer.java:105) \n" + " Line 105: NULL request object here --> String collection = request.getParams().get(\"collection\", defaultCollection);"); }
finally { solrj.close(); } } {code} -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] Commented: (SOLR-2282) Distributed Support for Search Result Clustering
[ https://issues.apache.org/jira/browse/SOLR-2282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12980749#action_12980749 ] Robert Muir commented on SOLR-2282: --- sorry guys, i screwed this up, by not adding logic to the BaseDistributedTestCase to make it work for contribs, from resources. I saw that it extended SolrTestCaseJ4 but I neglected to realize that it doesnt use initCore, so i'll take a look at fixing this. > Distributed Support for Search Result Clustering > > > Key: SOLR-2282 > URL: https://issues.apache.org/jira/browse/SOLR-2282 > Project: Solr > Issue Type: New Feature > Components: contrib - Clustering >Affects Versions: 1.4, 1.4.1 >Reporter: Koji Sekiguchi >Assignee: Koji Sekiguchi >Priority: Minor > Fix For: 3.1, 4.0 > > Attachments: SOLR-2282.patch, SOLR-2282.patch, SOLR-2282.patch, > SOLR-2282.patch, SOLR-2282.patch > > > Brad Giaccio contributed a patch for this in SOLR-769. I'd like to > incorporate it. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] Commented: (LUCENE-2694) MTQ rewrite + weight/scorer init should be single pass
[ https://issues.apache.org/jira/browse/LUCENE-2694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12980748#action_12980748 ] Robert Muir commented on LUCENE-2694: - bq. seems like this time perf rules out purity in the interface. I know i didn't like this aspect of the patch, but I am ok with it for now as long as we keep things experimental and try to keep an eye on improving the 'purity' of TermsEnum a bit. we are making a lot of progress on the terms handling with flexible indexing and i could easily see more interesting implementations being available other than just PrefixCoded... In some ideal world I guess i'd prefer if TermsEnum was an attributesource with seek() and next(), FilteredTermsEnum was like tokenFilter, and TermState was just captureState/restoreState... but I agree we should just lean towards whatever works for now. definitely like it better now that things such as docFreq() are pulled out of termstate and its completely opaque, i think this is the right way to go. > MTQ rewrite + weight/scorer init should be single pass > -- > > Key: LUCENE-2694 > URL: https://issues.apache.org/jira/browse/LUCENE-2694 > Project: Lucene - Java > Issue Type: Improvement > Components: Search >Reporter: Michael McCandless >Assignee: Simon Willnauer > Fix For: 4.0 > > Attachments: LUCENE-2694-FTE.patch, LUCENE-2694.patch, > LUCENE-2694.patch, LUCENE-2694.patch, LUCENE-2694.patch, LUCENE-2694.patch, > LUCENE-2694.patch, LUCENE-2694.patch, LUCENE-2694.patch, LUCENE-2694.patch, > LUCENE-2694.patch, LUCENE-2694_hack.patch > > > Spinoff of LUCENE-2690 (see the hacked patch on that issue)... > Once we fix MTQ rewrite to be per-segment, we should take it further and make > weight/scorer init also run in the same single pass as rewrite. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. 
Re: svn commit: r1058162 - in /lucene/dev/trunk/solr/contrib/clustering/src/test: java/org/apache/solr/handler/clustering/AbstractClusteringTestCase.java resources/solr-clustering/ resources/solr/
I don't think we should do this, as it really confuses the build. Because test-resources are now resources, their contents should not conflict with each other. this is why contribs now have solr-XXX directories. If you want to change this in SolrTestCaseJ4: use initCore(xxx, yyy, "solr-clustering") if you want to change this in AbstractSolrTestCase: you override: public String getSolrHome() Can you please rename the directory back? It seems we just need to fix BaseDistributedTestCase to allow you to override this parameter, i can help On Wed, Jan 12, 2011 at 9:58 AM, wrote: > Author: koji > Date: Wed Jan 12 14:58:49 2011 > New Revision: 1058162 > > URL: http://svn.apache.org/viewvc?rev=1058162&view=rev > Log: > SOLR-2282: rename solr-clustering to solr > > Added: > lucene/dev/trunk/solr/contrib/clustering/src/test/resources/solr/ > - copied from r1058152, > lucene/dev/trunk/solr/contrib/clustering/src/test/resources/solr-clustering/ > Removed: > > lucene/dev/trunk/solr/contrib/clustering/src/test/resources/solr-clustering/ > Modified: > > lucene/dev/trunk/solr/contrib/clustering/src/test/java/org/apache/solr/handler/clustering/AbstractClusteringTestCase.java > > Modified: > lucene/dev/trunk/solr/contrib/clustering/src/test/java/org/apache/solr/handler/clustering/AbstractClusteringTestCase.java > URL: > http://svn.apache.org/viewvc/lucene/dev/trunk/solr/contrib/clustering/src/test/java/org/apache/solr/handler/clustering/AbstractClusteringTestCase.java?rev=1058162&r1=1058161&r2=1058162&view=diff > == > --- > lucene/dev/trunk/solr/contrib/clustering/src/test/java/org/apache/solr/handler/clustering/AbstractClusteringTestCase.java > (original) > +++ > lucene/dev/trunk/solr/contrib/clustering/src/test/java/org/apache/solr/handler/clustering/AbstractClusteringTestCase.java > Wed Jan 12 14:58:49 2011 > @@ -28,7 +28,7 @@ public abstract class AbstractClustering > > @BeforeClass > public static void beforeClass() throws Exception { > - initCore("solrconfig.xml", 
"schema.xml", "solr-clustering"); > + initCore("solrconfig.xml", "schema.xml", "solr"); > numberOfDocs = 0; > for (String[] doc : DOCUMENTS) { > assertNull(h.validateUpdate(adoc("id", Integer.toString(numberOfDocs), > "url", doc[0], "title", doc[1], "snippet", doc[2]))); > > > - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] Commented: (SOLR-2282) Distributed Support for Search Result Clustering
[ https://issues.apache.org/jira/browse/SOLR-2282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12980741#action_12980741 ] Koji Sekiguchi commented on SOLR-2282: -- I've committed the fix for "unknown field 'url'". > Distributed Support for Search Result Clustering > > > Key: SOLR-2282 > URL: https://issues.apache.org/jira/browse/SOLR-2282 > Project: Solr > Issue Type: New Feature > Components: contrib - Clustering >Affects Versions: 1.4, 1.4.1 >Reporter: Koji Sekiguchi >Assignee: Koji Sekiguchi >Priority: Minor > Fix For: 3.1, 4.0 > > Attachments: SOLR-2282.patch, SOLR-2282.patch, SOLR-2282.patch, > SOLR-2282.patch, SOLR-2282.patch > > > Brad Giaccio contributed a patch for this in SOLR-769. I'd like to > incorporate it. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] Commented: (LUCENE-2793) Directory createOutput and openInput should take an IOContext
[ https://issues.apache.org/jira/browse/LUCENE-2793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12980740#action_12980740 ] Robert Muir commented on LUCENE-2793: - {quote} I'm perfectly OK with that approach (having some module FSDir checks). I also feel uneasy having JNI in core. What I don't want to see, is Directory impls that you can't use on their own. If you can only use it for merging, then it's not a Directory, it breaks the contract! - move the code elsewhere. {quote} Right, i think we all agree we want to fix the DirectIOLinuxDirectory into being a 'real' directory? As i said before, from a practical perspective, it could be named LinuxDirectory, extend NIOFS, and when openInput(IOContext=Merge) it opens its special input. but personally i don't care how we actually implement it 'becoming a real directory'. this is another issue, unrelated to this one really. this issue is enough and should stand on its own... we should be able to do enough nice things here without dealing with JNI: improving our existing directory impls to use larger buffer sizes by default when merging, etc (like in your example). > Directory createOutput and openInput should take an IOContext > - > > Key: LUCENE-2793 > URL: https://issues.apache.org/jira/browse/LUCENE-2793 > Project: Lucene - Java > Issue Type: Improvement > Components: Store >Reporter: Michael McCandless > Attachments: LUCENE-2793.patch > > > Today for merging we pass down a larger readBufferSize than for searching > because we get better performance. > I think we should generalize this to a class (IOContext), which would hold > the buffer size, but then could hold other flags like DIRECT (bypass OS's > buffer cache), SEQUENTIAL, etc. > Then, we can make the DirectIOLinuxDirectory fully usable because we would > only use DIRECT/SEQUENTIAL during merging. 
> This will require fixing how IW pools readers, so that a reader opened for > merging is not then used for searching, and vice/versa. Really, it's only > all the open file handles that need to be different -- we could in theory > share del docs, norms, etc, if that were somehow possible. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
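The IOContext proposed in this issue can be sketched as a small parameter object carrying the buffer size plus access hints, passed to openInput/createOutput instead of a bare readBufferSize. This is an illustrative model only; the class name, fields, and default sizes below are assumptions, not Lucene's eventual API:

```java
// Hypothetical sketch of the proposed IOContext parameter object.
// Names and default buffer sizes are illustrative assumptions.
public class IOContext {
    public enum Usage { READ, MERGE, FLUSH }

    public final Usage usage;
    public final int bufferSize;     // suggested read/write buffer size in bytes
    public final boolean sequential; // hint: access pattern will be sequential
    public final boolean direct;     // hint: bypass the OS buffer cache

    public IOContext(Usage usage, int bufferSize, boolean sequential, boolean direct) {
        this.usage = usage;
        this.bufferSize = bufferSize;
        this.sequential = sequential;
        this.direct = direct;
    }

    // A merge would ask for a large buffer plus sequential, direct I/O,
    // matching the DIRECT/SEQUENTIAL flags discussed in the issue.
    public static IOContext merge() {
        return new IOContext(Usage.MERGE, 4096, true, true);
    }

    // A normal search-time read uses a small buffer and no special hints.
    public static IOContext read() {
        return new IOContext(Usage.READ, 1024, false, false);
    }
}
```

A Directory implementation that cannot honor a hint (e.g. MMapDirectory and buffer sizes) would simply ignore it, which is the "context as a hint, not a contract" reading of the discussion above.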
[jira] Commented: (LUCENE-2793) Directory createOutput and openInput should take an IOContext
[ https://issues.apache.org/jira/browse/LUCENE-2793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12980736#action_12980736 ] Earwin Burrfoot commented on LUCENE-2793: - {quote} As I said before though, i wouldn't mind if we had something more like a 'modules/native' and FSDirectory checked, if this was available and automagically used it... but I can't see myself thinking that we should put this logic into fsdir itself, sorry. {quote} I'm perfectly OK with that approach (having some module FSDir checks). I also feel uneasy having JNI in core. What I don't want to see, is Directory impls that you can't use on their own. If you can only use it for merging, then it's not a Directory, it breaks the contract! - move the code elsewhere. > Directory createOutput and openInput should take an IOContext > - > > Key: LUCENE-2793 > URL: https://issues.apache.org/jira/browse/LUCENE-2793 > Project: Lucene - Java > Issue Type: Improvement > Components: Store >Reporter: Michael McCandless > Attachments: LUCENE-2793.patch > > > Today for merging we pass down a larger readBufferSize than for searching > because we get better performance. > I think we should generalize this to a class (IOContext), which would hold > the buffer size, but then could hold other flags like DIRECT (bypass OS's > buffer cache), SEQUENTIAL, etc. > Then, we can make the DirectIOLinuxDirectory fully usable because we would > only use DIRECT/SEQUENTIAL during merging. > This will require fixing how IW pools readers, so that a reader opened for > merging is not then used for searching, and vice/versa. Really, it's only > all the open file handles that need to be different -- we could in theory > share del docs, norms, etc, if that were somehow possible. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. 
[jira] Commented: (LUCENE-2751) add LuceneTestCase.newSearcher()
[ https://issues.apache.org/jira/browse/LUCENE-2751?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12980735#action_12980735 ] Robert Muir commented on LUCENE-2751: - We should be able to actually implement this issue now right? Implement LuceneTestCase.newSearcher(), and like the previous patch, sometimes use parallel in tests. I think this would be preferred before we go optimizing synchronization, because otherwise how do we know if its correct? > add LuceneTestCase.newSearcher() > > > Key: LUCENE-2751 > URL: https://issues.apache.org/jira/browse/LUCENE-2751 > Project: Lucene - Java > Issue Type: Test > Components: Build >Reporter: Robert Muir > Fix For: 3.1, 4.0 > > Attachments: LUCENE-2751.patch, LUCENE-2751.patch > > > Most tests in the search package don't care about what kind of searcher they > use. > we should randomly use MultiSearcher or ParallelMultiSearcher sometimes in > tests. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
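The randomized-helper pattern behind the proposed LuceneTestCase.newSearcher() can be sketched in plain Java: tests ask for "a searcher" and the framework sometimes hands back a different implementation, so every variant gets exercised across runs. The Searcher types below are stand-ins, not Lucene classes, and the 1-in-3 ratio is an arbitrary assumption; in real tests the Random would come from the framework's reproducible seed.

```java
import java.util.Random;

// Sketch of the randomized test-helper pattern (not actual Lucene code).
public class NewSearcherSketch {
    public interface Searcher { String describe(); }

    public static class SimpleSearcher implements Searcher {
        public String describe() { return "simple"; }
    }

    public static class ParallelSearcher implements Searcher {
        public String describe() { return "parallel"; }
    }

    // Most runs get the plain searcher; some runs get the parallel one,
    // so concurrency bugs in either implementation surface over time.
    public static Searcher newSearcher(Random random) {
        return random.nextInt(3) == 0 ? new ParallelSearcher()
                                      : new SimpleSearcher();
    }
}
```

Because the choice is driven by the test framework's seed, a failure under one implementation is reproducible by re-running with the same seed.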
[jira] Commented: (LUCENE-2793) Directory createOutput and openInput should take an IOContext
[ https://issues.apache.org/jira/browse/LUCENE-2793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12980732#action_12980732 ] Earwin Burrfoot commented on LUCENE-2793: - bq. Because in your example code above, it looks like it's added to Directory itself. bq. My problem with your sample code is that it appears that the .setBufferSize method is on Directory itself. Ohoho. My fault, sorry. It should look like: {code} RAMDirectory ramDir = new RAMDirectory(); ramDir.setBufferSize(whatever) // Compilation error! ramDir.createIndexInput(name, context); NIOFSDirectory fsDir = new NIOFSDirectory(); fsDir.setBufferSize(IOContext.NORMAL_READ, 1024); fsDir.setBufferSize(IOContext.MERGE, 4096); fsDir.createIndexInput(name, context) {code} > Directory createOutput and openInput should take an IOContext > - > > Key: LUCENE-2793 > URL: https://issues.apache.org/jira/browse/LUCENE-2793 > Project: Lucene - Java > Issue Type: Improvement > Components: Store >Reporter: Michael McCandless > Attachments: LUCENE-2793.patch > > > Today for merging we pass down a larger readBufferSize than for searching > because we get better performance. > I think we should generalize this to a class (IOContext), which would hold > the buffer size, but then could hold other flags like DIRECT (bypass OS's > buffer cache), SEQUENTIAL, etc. > Then, we can make the DirectIOLinuxDirectory fully usable because we would > only use DIRECT/SEQUENTIAL during merging. > This will require fixing how IW pools readers, so that a reader opened for > merging is not then used for searching, and vice/versa. Really, it's only > all the open file handles that need to be different -- we could in theory > share del docs, norms, etc, if that were somehow possible. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. 
[jira] Commented: (LUCENE-2793) Directory createOutput and openInput should take an IOContext
[ https://issues.apache.org/jira/browse/LUCENE-2793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12980731#action_12980731 ] Robert Muir commented on LUCENE-2793: - bq. The proper way is to test out the things, and then move DirectIO code to the only place it makes sense in - FSDir? Probably make it switch on/off-able, maybe not. I'm not sure it should be there... at least not soon. its not even something you can implement in pure java? we definitely have to keep it still simple and possible for people to use the java library in a platform-independent way. its also a bit dangerous whenever JNI is involved, even if its working. So I think its craziness, to put this direct-io stuff in fsdirectory itself. As I said before though, i wouldn't mind if we had something more like a 'modules/native' and FSDirectory checked, if this was available and automagically used it... but I can't see myself thinking that we should put this logic into fsdir itself, sorry. bq. Sample code My problem with your sample code is that it appears that the .setBufferSize method is on Directory itself. Again i disagree with this because: * its useless to certain directories like MMapDirectory * its dangerous in the direct-io case (different platforms have strict requirements that things be sector-aligned etc, see the mac case where it actually 'works' if the buffer isnt, but is just slow). I definitely don't like the confusion regarding buffersizes now. A very small % of the time its actually meaningful and should be respected, but most of the time the value is completely bogus.
> Directory createOutput and openInput should take an IOContext > - > > Key: LUCENE-2793 > URL: https://issues.apache.org/jira/browse/LUCENE-2793 > Project: Lucene - Java > Issue Type: Improvement > Components: Store >Reporter: Michael McCandless > Attachments: LUCENE-2793.patch > > > Today for merging we pass down a larger readBufferSize than for searching > because we get better performance. > I think we should generalize this to a class (IOContext), which would hold > the buffer size, but then could hold other flags like DIRECT (bypass OS's > buffer cache), SEQUENTIAL, etc. > Then, we can make the DirectIOLinuxDirectory fully usable because we would > only use DIRECT/SEQUENTIAL during merging. > This will require fixing how IW pools readers, so that a reader opened for > merging is not then used for searching, and vice/versa. Really, it's only > all the open file handles that need to be different -- we could in theory > share del docs, norms, etc, if that were somehow possible. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] Resolved: (LUCENE-2860) SegmentInfo.sizeInBytes ignore includeDocStore when caching
[ https://issues.apache.org/jira/browse/LUCENE-2860?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shai Erera resolved LUCENE-2860. Resolution: Fixed Committed revision 1058147 (3x). Committed revision 1058155 (trunk). > SegmentInfo.sizeInBytes ignore includeDocStore when caching > --- > > Key: LUCENE-2860 > URL: https://issues.apache.org/jira/browse/LUCENE-2860 > Project: Lucene - Java > Issue Type: Bug > Components: Index >Reporter: Shai Erera >Assignee: Shai Erera >Priority: Minor > Fix For: 3.1, 4.0 > > Attachments: LUCENE-2860.patch > > > I noticed that SegmentInfo's sizeInBytes cache is potentially buggy -- it > doesn't take into account 'includeDocStores'. I.e., if you call it once w/ > 'false' (sizeInBytes won't include the store files) and then with 'true' (or > vice versa), you won't get the right sizeInBytes (it won't re-compute, with > the store files). > I'll fix and add a test case demonstrating the bug. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
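The bug fixed above is a general caching pitfall: a memoized value whose input flag is not part of the cache key, so the first caller's answer is served to everyone. A minimal sketch of the corrected pattern follows; field names and byte counts are illustrative assumptions, not SegmentInfo's actual code:

```java
// Sketch of a flag-aware size cache, modeling the LUCENE-2860 fix.
// Constants stand in for summing real file sizes on disk.
public class SizeCache {
    private static final long CORE_BYTES = 1000;      // index files
    private static final long DOC_STORE_BYTES = 500;  // stored-fields files

    private long cachedSize = -1;            // -1 means "not computed yet"
    private boolean cachedIncludesDocStores; // flag the cache was computed with

    public long sizeInBytes(boolean includeDocStores) {
        // Recompute when there is no cached value OR it was computed with a
        // different flag. The original bug skipped the flag comparison, so a
        // call with 'false' followed by 'true' returned the stale answer.
        if (cachedSize == -1 || cachedIncludesDocStores != includeDocStores) {
            cachedSize = CORE_BYTES + (includeDocStores ? DOC_STORE_BYTES : 0);
            cachedIncludesDocStores = includeDocStores;
        }
        return cachedSize;
    }
}
```

An alternative design is to cache both variants under separate fields, trading a little memory for never recomputing on alternating calls.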
[jira] Commented: (LUCENE-2793) Directory createOutput and openInput should take an IOContext
[ https://issues.apache.org/jira/browse/LUCENE-2793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12980727#action_12980727 ] Shai Erera commented on LUCENE-2793: I assume you mean setBufferSize(IOContext, size) should be added to specific Directory impls, and not Directory? Because in your example code above, it looks like it's added to Directory itself. Though we can add it to Directory as well, and do nothing there. It simplifies matters as you don't need to check whether the Dir you receive supports setting buffer size (in case you're not the one creating it). At any rate, this looks like it can work too. > Directory createOutput and openInput should take an IOContext > - > > Key: LUCENE-2793 > URL: https://issues.apache.org/jira/browse/LUCENE-2793 > Project: Lucene - Java > Issue Type: Improvement > Components: Store >Reporter: Michael McCandless > Attachments: LUCENE-2793.patch > > > Today for merging we pass down a larger readBufferSize than for searching > because we get better performance. > I think we should generalize this to a class (IOContext), which would hold > the buffer size, but then could hold other flags like DIRECT (bypass OS's > buffer cache), SEQUENTIAL, etc. > Then, we can make the DirectIOLinuxDirectory fully usable because we would > only use DIRECT/SEQUENTIAL during merging. > This will require fixing how IW pools readers, so that a reader opened for > merging is not then used for searching, and vice/versa. Really, it's only > all the open file handles that need to be different -- we could in theory > share del docs, norms, etc, if that were somehow possible. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
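The shape being discussed - buffer sizes configured on the concrete Directory impl, keyed by context, while client code only ever passes the context - can be sketched as follows. The IOContext constants and method names are assumptions for illustration; the real API was still under discussion on this issue:

```java
import java.util.EnumMap;
import java.util.Map;

// Sketch of the proposed API: the concrete Directory owns the
// IOContext -> buffer size mapping; callers never pass raw sizes.
class BufferSizedDirectory {
    enum IOContext { NORMAL_READ, MERGE }

    private final Map<IOContext, Integer> bufferSizes = new EnumMap<>(IOContext.class);

    BufferSizedDirectory() {
        bufferSizes.put(IOContext.NORMAL_READ, 1024); // search default
        bufferSizes.put(IOContext.MERGE, 4096);       // larger default for merging
    }

    void setBufferSize(IOContext context, int size) {
        bufferSizes.put(context, size);
    }

    // Stand-in for openInput(name, context): only the context travels
    // through client code; this impl decides what it means.
    int bufferSizeFor(IOContext context) {
        return bufferSizes.get(context);
    }
}
```

A RAMDirectory-style impl would simply ignore the context, which is exactly the "add it to Directory as well, and do nothing there" option Shai mentions.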
[jira] Updated: (LUCENE-2694) MTQ rewrite + weight/scorer init should be single pass
[ https://issues.apache.org/jira/browse/LUCENE-2694?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Simon Willnauer updated LUCENE-2694: Attachment: LUCENE-2694.patch This patch changes TermsEnum#seek(TermState) back to TermsEnum#seek(BytesRef, TermState). Yet, TermState is opaque now and TermsEnum has a default impl for TermsEnum#seek(BytesRef, TermState). Holding the BytesRef in TermState for our PrefixCoded* based codecs seems way too costly though. seems like this time perf rules out purity in the interface. > MTQ rewrite + weight/scorer init should be single pass > -- > > Key: LUCENE-2694 > URL: https://issues.apache.org/jira/browse/LUCENE-2694 > Project: Lucene - Java > Issue Type: Improvement > Components: Search >Reporter: Michael McCandless >Assignee: Simon Willnauer > Fix For: 4.0 > > Attachments: LUCENE-2694-FTE.patch, LUCENE-2694.patch, > LUCENE-2694.patch, LUCENE-2694.patch, LUCENE-2694.patch, LUCENE-2694.patch, > LUCENE-2694.patch, LUCENE-2694.patch, LUCENE-2694.patch, LUCENE-2694.patch, > LUCENE-2694.patch, LUCENE-2694_hack.patch > > > Spinoff of LUCENE-2690 (see the hacked patch on that issue)... > Once we fix MTQ rewrite to be per-segment, we should take it further and make > weight/scorer init also run in the same single pass as rewrite. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
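The cost argument above - a TermState that holds its own BytesRef would have to copy the term bytes on every capture - can be illustrated with a toy model in which the opaque state records only a position and cached stats, and the caller re-supplies the term bytes on seek. All names here are illustrative, not the actual TermsEnum code:

```java
// Toy model of seek(BytesRef, TermState): the opaque state stores only
// where the term lives (a file pointer) plus cached stats, never the
// term bytes themselves; the caller passes the bytes again on seek.
class TermStateSketch {
    static class TermState {   // opaque to callers
        long filePointer;
        int docFreq;
    }

    static TermState capture(long filePointer, int docFreq) {
        TermState s = new TermState();
        s.filePointer = filePointer;
        s.docFreq = docFreq;
        return s;
    }

    // seek positions the enum from the state; term bytes come from the caller
    static long seek(byte[] term, TermState state) {
        // a real codec would reposition its input to state.filePointer here
        return state.filePointer;
    }
}
```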
[jira] Commented: (SOLR-1604) Wildcards, ORs etc inside Phrase Queries
[ https://issues.apache.org/jira/browse/SOLR-1604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12980702#action_12980702 ] Ahmet Arslan commented on SOLR-1604: Salman, what you are after is nested proximity search, which is not currently available in Solr. For alternatives, see http://search-lucene.com/m/94ONm1KRuAv1/ Regarding parentheses inside quotes and inOrder, they are covered in the test cases. What version of Solr did you use? > Wildcards, ORs etc inside Phrase Queries > > > Key: SOLR-1604 > URL: https://issues.apache.org/jira/browse/SOLR-1604 > Project: Solr > Issue Type: Improvement > Components: search >Affects Versions: 1.4 >Reporter: Ahmet Arslan >Priority: Minor > Fix For: Next > > Attachments: ComplexPhrase.zip, ComplexPhrase.zip, ComplexPhrase.zip, > ComplexPhrase.zip, ComplexPhraseQueryParser.java, SOLR-1604.patch > > > Solr Plugin for ComplexPhraseQueryParser (LUCENE-1486) which supports > wildcards, ORs, ranges, fuzzies inside phrase queries. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Lucene-Solr-tests-only-3.x - Build # 3669 - Failure
Build: https://hudson.apache.org/hudson/job/Lucene-Solr-tests-only-3.x/3669/ 1 tests failed. REGRESSION: org.apache.lucene.index.TestCrash.testWriterAfterCrash Error Message: MockRAMDirectory: file "_0.tis" is still open: cannot overwrite Stack Trace: java.io.IOException: MockRAMDirectory: file "_0.tis" is still open: cannot overwrite at org.apache.lucene.store.MockRAMDirectory.createOutput(MockRAMDirectory.java:221) at org.apache.lucene.index.TermInfosWriter.initialize(TermInfosWriter.java:100) at org.apache.lucene.index.TermInfosWriter.(TermInfosWriter.java:85) at org.apache.lucene.index.FormatPostingsFieldsWriter.(FormatPostingsFieldsWriter.java:41) at org.apache.lucene.index.FreqProxTermsWriter.flush(FreqProxTermsWriter.java:84) at org.apache.lucene.index.TermsHash.flush(TermsHash.java:109) at org.apache.lucene.index.DocInverter.flush(DocInverter.java:72) at org.apache.lucene.index.DocFieldProcessor.flush(DocFieldProcessor.java:59) at org.apache.lucene.index.DocumentsWriter.flush(DocumentsWriter.java:589) at org.apache.lucene.index.IndexWriter.doFlush(IndexWriter.java:3299) at org.apache.lucene.index.IndexWriter.flush(IndexWriter.java:3264) at org.apache.lucene.index.IndexWriter.addDocument(IndexWriter.java:2040) at org.apache.lucene.index.IndexWriter.addDocument(IndexWriter.java:2007) at org.apache.lucene.index.TestCrash.initIndex(TestCrash.java:51) at org.apache.lucene.index.TestCrash.testWriterAfterCrash(TestCrash.java:77) at org.apache.lucene.util.LuceneTestCase.runBare(LuceneTestCase.java:255) Build Log (for compile errors): [...truncated 8559 lines...] - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] Updated: (LUCENE-2862) Track total term freq per term
[ https://issues.apache.org/jira/browse/LUCENE-2862?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael McCandless updated LUCENE-2862: --- Attachment: LUCENE-2862.patch Patch, adds TermsEnum.totalTermFreq (returns -1 if codec doesn't impl it, or if omitTFAP is on) and Terms.getSumTotalTermFreq (= sum across all terms in this field). > Track total term freq per term > -- > > Key: LUCENE-2862 > URL: https://issues.apache.org/jira/browse/LUCENE-2862 > Project: Lucene - Java > Issue Type: Improvement > Components: Index >Reporter: Michael McCandless >Assignee: Michael McCandless > Fix For: 4.0 > > Attachments: LUCENE-2862.patch > > > Right now we track docFreq for each term (how many docs have the > term), but the totalTermFreq (total number of occurrences of this > term, ie sum of freq() for each doc that has the term) is also a > useful stat (for flex scoring, PulsingCodec, etc.). -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
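The relationship between the two stats the patch tracks can be shown with a small computation over per-document freq() values (an illustrative model, not codec code):

```java
import java.util.Arrays;

// docFreq = number of docs in which the term occurs; totalTermFreq = sum
// of freq() over those docs (the stat LUCENE-2862 adds, returning -1
// when the codec doesn't track it or omitTFAP is on).
class TermStatsSketch {
    static int docFreq(int[] perDocFreqs) {
        return (int) Arrays.stream(perDocFreqs).filter(f -> f > 0).count();
    }

    static long totalTermFreq(int[] perDocFreqs) {
        return Arrays.stream(perDocFreqs).asLongStream().sum();
    }
}
```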
[jira] Created: (LUCENE-2862) Track total term freq per term
Track total term freq per term -- Key: LUCENE-2862 URL: https://issues.apache.org/jira/browse/LUCENE-2862 Project: Lucene - Java Issue Type: Improvement Components: Index Reporter: Michael McCandless Assignee: Michael McCandless Fix For: 4.0 Right now we track docFreq for each term (how many docs have the term), but the totalTermFreq (total number of occurrences of this term, ie sum of freq() for each doc that has the term) is also a useful stat (for flex scoring, PulsingCodec, etc.). -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] Issue Comment Edited: (SOLR-1604) Wildcards, ORs etc inside Phrase Queries
[ https://issues.apache.org/jira/browse/SOLR-1604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12980660#action_12980660 ] Salman Akram edited comment on SOLR-1604 at 1/12/11 6:41 AM: - I integrated the patch and its working fine however, there were couple of issues. One is related to un-ordered proximity which seems to be fixed with the inOrder parameter but its not working for me (doesn't give any error but its still ordered). I will try to get the patch again coz I also merged it in early Nov so maybe it was applied after that. The other issue is that although proximity search works with phrases BUT its not very accurate e.g. If I want to search "a b" within 10 words of "c" the query would end up being "a b c"~10 but this will also return cases where "a" is not necessarily together with "b". Any scenario where these 3 words are within 10 words of each other will match. Is it possible in SOLR to do what I mentioned above? Any other patch? Something like " "a b" c "~10... Note: I was going through Lucene-1486 and there Ahmet mentioned that "Specifically : "(john johathon) smith"~10 " works perfectly. For me it seems there is no difference if I put the parenthesis or not. Thanks! was (Author: salman741): I integrated the patch and its working fine however, there were couple of issues. One is already resolved with the above un-ordered proximity parameters. The issue is that although proximity search works with phrases BUT its not very accurate e.g. If I want to search "a b" within 10 words of "c" the query would end up being "a b c"~10 but this will also return cases where "a" is not necessarily together with "b". Any scenario where these 3 words are within 10 words of each other will match. Is it possible in SOLR to do what I mentioned above? Any other patch? Something like " "a b" c "~10... Note: I was going through Lucene-1486 and there Ahmet mentioned that "Specifically : "(john johathon) smith"~10 " works perfectly. 
For me it seems there is no difference whether I put the parentheses in or not. Thanks! > Wildcards, ORs etc inside Phrase Queries > > > Key: SOLR-1604 > URL: https://issues.apache.org/jira/browse/SOLR-1604 > Project: Solr > Issue Type: Improvement > Components: search >Affects Versions: 1.4 >Reporter: Ahmet Arslan >Priority: Minor > Fix For: Next > > Attachments: ComplexPhrase.zip, ComplexPhrase.zip, ComplexPhrase.zip, > ComplexPhrase.zip, ComplexPhraseQueryParser.java, SOLR-1604.patch > > > Solr Plugin for ComplexPhraseQueryParser (LUCENE-1486) which supports > wildcards, ORs, ranges, fuzzies inside phrase queries. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] Issue Comment Edited: (SOLR-1604) Wildcards, ORs etc inside Phrase Queries
[ https://issues.apache.org/jira/browse/SOLR-1604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12980660#action_12980660 ] Salman Akram edited comment on SOLR-1604 at 1/12/11 6:37 AM: - I integrated the patch and its working fine however, there were couple of issues. One is already resolved with the above un-ordered proximity parameters. The issue is that although proximity search works with phrases BUT its not very accurate e.g. If I want to search "a b" within 10 words of "c" the query would end up being "a b c"~10 but this will also return cases where "a" is not necessarily together with "b". Any scenario where these 3 words are within 10 words of each other will match. Is it possible in SOLR to do what I mentioned above? Any other patch? Something like " "a b" c "~10... Note: I was going through Lucene-1486 and there Ahmet mentioned that "Specifically : "(john johathon) smith"~10 " works perfectly. For me it seems there is no difference if I put the parenthesis or not. Thanks! was (Author: salman741): I integrated the patch and its working fine however, there were couple of issues. One is already resolved with the above un-ordered proximity parameters. The issue is that although proximity search works with phrases BUT its not very accurate e.g. If I want to search "a b" within 10 words of "c" the query would end up being "a b c"~10 but this will also return cases where "a" is not necessarily together with "b". Any scenario where these 3 words are within 10 words of each other will match. Is it possible in SOLR to do what I mentioned above? Any other patch? Something like " "a b" c "~10... Thanks! 
> Wildcards, ORs etc inside Phrase Queries > > > Key: SOLR-1604 > URL: https://issues.apache.org/jira/browse/SOLR-1604 > Project: Solr > Issue Type: Improvement > Components: search >Affects Versions: 1.4 >Reporter: Ahmet Arslan >Priority: Minor > Fix For: Next > > Attachments: ComplexPhrase.zip, ComplexPhrase.zip, ComplexPhrase.zip, > ComplexPhrase.zip, ComplexPhraseQueryParser.java, SOLR-1604.patch > > > Solr Plugin for ComplexPhraseQueryParser (LUCENE-1486) which supports > wildcards, ORs, ranges, fuzzies inside phrase queries. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] Commented: (LUCENE-2831) Revise Weight#scorer & Filter#getDocIdSet API to pass Readers context
[ https://issues.apache.org/jira/browse/LUCENE-2831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12980673#action_12980673 ] Michael McCandless commented on LUCENE-2831: {quote} bq. Simon - heads up, I'm renaming TermState.copy -> TermState.copyFrom in LUCENE-2857. this comment should be on LUCENE-2694, right?! {quote} Duh, right. But it looks like you got the message anyway ;) > Revise Weight#scorer & Filter#getDocIdSet API to pass Readers context > - > > Key: LUCENE-2831 > URL: https://issues.apache.org/jira/browse/LUCENE-2831 > Project: Lucene - Java > Issue Type: Improvement > Components: Search >Affects Versions: 4.0 >Reporter: Simon Willnauer >Assignee: Simon Willnauer > Fix For: 4.0 > > Attachments: LUCENE-2831-nuke-SolrIndexReader.patch, > LUCENE-2831.patch, LUCENE-2831.patch, LUCENE-2831.patch, LUCENE-2831.patch, > LUCENE-2831.patch, LUCENE-2831.patch, > LUCENE-2831_transition_to_atomicCtx.patch, > LUCENE-2831_transition_to_atomicCtx.patch, > LUCENE-2831_transition_to_atomicCtx.patch > > > Spinoff from LUCENE-2694 - instead of passing a reader into Weight#scorer(IR, > boolean, boolean) we should / could revise the API and pass in a struct that > has parent reader, sub reader, ord of that sub. The ord mapping plus the > context with its parent would make several issues way easier. See > LUCENE-2694, LUCENE-2348 and LUCENE-2829 to name some. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] Commented: (LUCENE-2751) add LuceneTestCase.newSearcher()
[ https://issues.apache.org/jira/browse/LUCENE-2751?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12980672#action_12980672 ] Michael McCandless commented on LUCENE-2751: bq. Hmmm, it looks like the committed patch serializes loading of caches of multiple segments (for the same field?) Ugh, you're right. I had thought validate was "only" used after initial creation (eg, "typically" to add valid bits in), but in fact, create() calls validate(). Yonik do you have a patch in mind to fix the root cause correctly? I have to say... the new FieldCache code is rather hairy. > add LuceneTestCase.newSearcher() > > > Key: LUCENE-2751 > URL: https://issues.apache.org/jira/browse/LUCENE-2751 > Project: Lucene - Java > Issue Type: Test > Components: Build >Reporter: Robert Muir > Fix For: 3.1, 4.0 > > Attachments: LUCENE-2751.patch, LUCENE-2751.patch > > > Most tests in the search package don't care about what kind of searcher they > use. > we should randomly use MultiSearcher or ParallelMultiSearcher sometimes in > tests. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] Commented: (LUCENE-2859) Move Multi* and SlowMultiReaderWrapper to contrib
[ https://issues.apache.org/jira/browse/LUCENE-2859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12980667#action_12980667 ] Michael McCandless commented on LUCENE-2859: We could fix that merging uses MultiDocs/AndPositionsEnum. It's not particularly clean because we make Mapping* subclasses to remap the docIDs around deletions. If, instead, we fixed PostingsConsumer.merge to take the subs' enums, instead of a single multi enum, then that method could go segment by segment. > Move Multi* and SlowMultiReaderWrapper to contrib > - > > Key: LUCENE-2859 > URL: https://issues.apache.org/jira/browse/LUCENE-2859 > Project: Lucene - Java > Issue Type: Task > Components: Index >Reporter: Uwe Schindler > Fix For: 4.0 > > > We should move SlowMultiReaderWrapper and all Multi* classes to contrib as it > should not be used anymore. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
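The docID remapping that the Mapping* subclasses perform around deletions is the crux of the comment above; done segment by segment, it reduces to computing a deletion-aware remap plus a per-segment docBase over live docs. A minimal sketch of that idea (illustrative only, not the PostingsConsumer.merge code):

```java
import java.util.ArrayList;
import java.util.List;

// Segment-by-segment postings merge: walk each segment's postings and
// remap docIDs around deletions with a docBase counted over live docs.
class SegmentMergeSketch {
    // docs[i] = docIDs in segment i containing the term;
    // deleted[i][d] marks deleted docs; maxDoc[i] = doc count of segment i
    static List<Integer> mergePostings(int[][] docs, boolean[][] deleted, int[] maxDoc) {
        List<Integer> merged = new ArrayList<>();
        int docBase = 0;
        for (int seg = 0; seg < docs.length; seg++) {
            // deletion-aware remap for this segment: deleted docs map to -1
            int[] newDocID = new int[maxDoc[seg]];
            int next = 0;
            for (int d = 0; d < maxDoc[seg]; d++) {
                newDocID[d] = deleted[seg][d] ? -1 : next++;
            }
            for (int d : docs[seg]) {
                if (newDocID[d] != -1) {
                    merged.add(docBase + newDocID[d]);
                }
            }
            docBase += next; // advance by live docs only
        }
        return merged;
    }
}
```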
Re: Lucene-3.x - Build # 238 - Failure
Can we do something about these "false" failures? They are failing because of this Hudson bug: http://issues.hudson-ci.org/browse/HUDSON-7836 But the highish failure rate makes us look bad when people look at our build stability... which is awful. Mike On Mon, Jan 10, 2011 at 6:50 PM, Apache Hudson Server wrote: > Build: https://hudson.apache.org/hudson/job/Lucene-3.x/238/ > > All tests passed > > Build Log (for compile errors): > [...truncated 21034 lines...] > > > > - > To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org > For additional commands, e-mail: dev-h...@lucene.apache.org > > - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] Commented: (SOLR-1604) Wildcards, ORs etc inside Phrase Queries
[ https://issues.apache.org/jira/browse/SOLR-1604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12980660#action_12980660 ] Salman Akram commented on SOLR-1604: I integrated the patch and it's working fine; however, there were a couple of issues. One is already resolved with the above un-ordered proximity parameters. The issue is that although proximity search works with phrases, it's not very accurate. E.g., if I want to search "a b" within 10 words of "c", the query would end up being "a b c"~10, but this will also return cases where "a" is not necessarily together with "b". Any scenario where these 3 words are within 10 words of each other will match. Is it possible in Solr to do what I mentioned above? Any other patch? Something like " "a b" c "~10... Thanks! > Wildcards, ORs etc inside Phrase Queries > > > Key: SOLR-1604 > URL: https://issues.apache.org/jira/browse/SOLR-1604 > Project: Solr > Issue Type: Improvement > Components: search >Affects Versions: 1.4 >Reporter: Ahmet Arslan >Priority: Minor > Fix For: Next > > Attachments: ComplexPhrase.zip, ComplexPhrase.zip, ComplexPhrase.zip, > ComplexPhrase.zip, ComplexPhraseQueryParser.java, SOLR-1604.patch > > > Solr Plugin for ComplexPhraseQueryParser (LUCENE-1486) which supports > wildcards, ORs, ranges, fuzzies inside phrase queries. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
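The over-matching described above comes from sloppy phrase semantics: "a b c"~10 only requires all three terms to fall inside one window, whereas the intent is the exact phrase "a b" as a unit within 10 positions of "c" - the nesting a SpanNearQuery-style query expresses. A self-contained, position-based model of the difference (not Lucene code; the inputs are token position indexes):

```java
// Why "a b c"~10 over-matches compared to nesting "a b" as a unit.
class ProximitySketch {
    // nested semantics: b immediately follows a ("a b" as an exact
    // phrase), and that unit is within slop positions of c
    static boolean phraseThenNear(int[] aPos, int[] bPos, int[] cPos, int slop) {
        for (int a : aPos)
            for (int b : bPos)
                if (b == a + 1)
                    for (int c : cPos)
                        if (Math.abs(c - b) <= slop)
                            return true;
        return false;
    }

    // sloppy "a b c"~slop semantics: all three terms anywhere in one window
    static boolean allWithinWindow(int[] aPos, int[] bPos, int[] cPos, int slop) {
        for (int a : aPos)
            for (int b : bPos)
                for (int c : cPos) {
                    int min = Math.min(a, Math.min(b, c));
                    int max = Math.max(a, Math.max(b, c));
                    if (max - min <= slop)
                        return true;
                }
        return false;
    }
}
```

With a at position 0, b at 5, and c at 8, the flat query matches (window of 8 ≤ 10) even though "a b" is not a phrase, while the nested check correctly rejects it.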
Re: Lucene-Solr-tests-only-trunk - Build # 3688 - Failure
This one is my fault... I missed some docBase assignments during the last LUCENE-2831 commits. I will fix in a second simon On Wed, Jan 12, 2011 at 11:05 AM, Apache Hudson Server wrote: > Build: https://hudson.apache.org/hudson/job/Lucene-Solr-tests-only-trunk/3688/ > > 3 tests failed. > REGRESSION: org.apache.lucene.search.TestSubScorerFreqs.testTermQuery > > Error Message: > expected:<186> but was:<170> > > Stack Trace: > junit.framework.AssertionFailedError: expected:<186> but was:<170> > at > org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:) > at > org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1049) > at > org.apache.lucene.search.TestSubScorerFreqs.testTermQuery(TestSubScorerFreqs.java:151) > > > REGRESSION: org.apache.lucene.search.TestSubScorerFreqs.testBooleanQuery > > Error Message: > expected:<186> but was:<170> > > Stack Trace: > junit.framework.AssertionFailedError: expected:<186> but was:<170> > at > org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:) > at > org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1049) > at > org.apache.lucene.search.TestSubScorerFreqs.testBooleanQuery(TestSubScorerFreqs.java:185) > > > REGRESSION: org.apache.lucene.search.TestSubScorerFreqs.testPhraseQuery > > Error Message: > expected:<186> but was:<170> > > Stack Trace: > junit.framework.AssertionFailedError: expected:<186> but was:<170> > at > org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:) > at > org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1049) > at > org.apache.lucene.search.TestSubScorerFreqs.testPhraseQuery(TestSubScorerFreqs.java:215) > > > > > Build Log (for compile errors): > [...truncated 2955 lines...] 
> > > > - > To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org > For additional commands, e-mail: dev-h...@lucene.apache.org > > - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] Created: (LUCENE-2861) Search doesn't return document via query
Search doesn't return document via query Key: LUCENE-2861 URL: https://issues.apache.org/jira/browse/LUCENE-2861 Project: Lucene - Java Issue Type: Bug Components: Search Affects Versions: 3.0.3, 2.9.4, 2.9.1 Environment: Doesn't depend on environment Reporter: Zenoviy Veres The query doesn't return a document that contains all words from the query in the correct order. The issue might be in the mechanism of how SpanQueries actually match results (http://www.lucidimagination.com/blog/2009/07/18/the-spanquery/) Please refer to the details below. The example text wasn't passed through the Snowball analyzer; however, the issue exists after analyzing too. Query: (intend within 3 of message) within 5 of message within 3 of addressed. Text within document: The contents of this e-mail message and any attachments are intended solely for the addressee(s) and may contain confidential and/or legally privileged information. If you are not the intended recipient of this message or if this message has been addressed to you in error, please immediately alert the sender by reply e-mail and then delete this message and any attachments Result query: SpanNearQuery spanNear = new SpanNearQuery(new SpanQuery[] { new SpanTermQuery(new Term(BODY, "intended")), new SpanTermQuery(new Term(BODY, "message"))}, 4, false); SpanNearQuery spanNear2 = new SpanNearQuery(new SpanQuery[] {spanNear, new SpanTermQuery(new Term(BODY, "message"))}, 5, false); SpanNearQuery spanNear3 = new SpanNearQuery(new SpanQuery[] {spanNear2, new SpanTermQuery(new Term(BODY, "addressed"))}, 3, false); -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] Commented: (LUCENE-2860) SegmentInfo.sizeInBytes ignore includeDocStore when caching
[ https://issues.apache.org/jira/browse/LUCENE-2860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12980656#action_12980656 ] Michael McCandless commented on LUCENE-2860: Ugh, my bad! Thanks Shai. Patch looks good. > SegmentInfo.sizeInBytes ignore includeDocStore when caching > --- > > Key: LUCENE-2860 > URL: https://issues.apache.org/jira/browse/LUCENE-2860 > Project: Lucene - Java > Issue Type: Bug > Components: Index >Reporter: Shai Erera >Assignee: Shai Erera >Priority: Minor > Fix For: 3.1, 4.0 > > Attachments: LUCENE-2860.patch > > > I noticed that SegmentInfo's sizeInBytes cache is potentially buggy -- it > doesn't take into account 'includeDocStores'. I.e., if you call it once w/ > 'false' (sizeInBytes won't include the store files) and then with 'true' (or > vice versa), you won't get the right sizeInBytes (it won't re-compute, with > the store files). > I'll fix and add a test case demonstrating the bug. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] Updated: (LUCENE-2831) Revise Weight#scorer & Filter#getDocIdSet API to pass Readers context
[ https://issues.apache.org/jira/browse/LUCENE-2831?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Simon Willnauer updated LUCENE-2831: Attachment: LUCENE-2831-nuke-SolrIndexReader.patch This patch cuts over all function query stuff to AtomicReaderContext in Solr & Lucene. It also nukes SolrIndexReader entirely - yay!! :) I think somebody should give this patch a glance though, especially from the Solr perspective, although all tests pass. I had to make the IndexSearcher(ReaderContext, AtomicContext...) ctor public, which is OK I think, and I added a new, already deprecated method to ValueSource in Lucene land to make the transition easier. If nobody objects I will commit later today > Revise Weight#scorer & Filter#getDocIdSet API to pass Readers context > - > > Key: LUCENE-2831 > URL: https://issues.apache.org/jira/browse/LUCENE-2831 > Project: Lucene - Java > Issue Type: Improvement > Components: Search >Affects Versions: 4.0 >Reporter: Simon Willnauer >Assignee: Simon Willnauer > Fix For: 4.0 > > Attachments: LUCENE-2831-nuke-SolrIndexReader.patch, > LUCENE-2831.patch, LUCENE-2831.patch, LUCENE-2831.patch, LUCENE-2831.patch, > LUCENE-2831.patch, LUCENE-2831.patch, > LUCENE-2831_transition_to_atomicCtx.patch, > LUCENE-2831_transition_to_atomicCtx.patch, > LUCENE-2831_transition_to_atomicCtx.patch > > > Spinoff from LUCENE-2694 - instead of passing a reader into Weight#scorer(IR, > boolean, boolean) we should / could revise the API and pass in a struct that > has parent reader, sub reader, ord of that sub. The ord mapping plus the > context with its parent would make several issues way easier. See > LUCENE-2694, LUCENE-2348 and LUCENE-2829 to name some. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] Commented: (LUCENE-2793) Directory createOutput and openInput should take an IOContext
[ https://issues.apache.org/jira/browse/LUCENE-2793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12980649#action_12980649 ] Earwin Burrfoot commented on LUCENE-2793: - What's with the ongoing craziness? :) bq. DirectIOLinuxDirectory First you introduce a kind of directory that is utterly useless except in certain special situations. Then, instead of fixing the directory/folding its code somewhere normal, you try to work around it by switching between directories. What's the point of using abstract classes or interfaces, if you leak their implementation's logic all over the place? Or making DIOLD wrap something. Yeah! Wrap my RAMDir! bq. bufferSize This value is only meaningful to a certain subset of Directory implementations. So the only logical place we want to see this value set - is these very impls. Sample code: {code} Directory ramDir = new RAMDirectory(); ramDir.createIndexInput(name, context); // See, ma? No bufferSizes, they are pointless for RAMDir Directory fsDir = new NIOFSDirectory(); fsDir.setBufferSize(IOContext.NORMAL_READ, 1024); fsDir.setBufferSize(IOContext.MERGE, 4096); fsDir.createIndexInput(name, context); // See, ma? The only one who's really concerned with the 'actual' buffer size is this concrete Directory impl // All client code is only concerned with the context. // It's NIOFSDirectory's business to give a meaningful interpretation for IOContext and assign the buffer sizes. {code} You don't need custom Directory impls to make DIOLD work, you should freakin' fix it. The proper way is to test out the things, and then move the DirectIO code to the only place it makes sense in - FSDir? Probably make it switch on/off-able, maybe not. You don't need custom Directory impls to set buffer sizes (nor cast to BufferedIndexInput!), you should add the setting to these Directories, which make sense of it.
> Directory createOutput and openInput should take an IOContext > - > > Key: LUCENE-2793 > URL: https://issues.apache.org/jira/browse/LUCENE-2793 > Project: Lucene - Java > Issue Type: Improvement > Components: Store >Reporter: Michael McCandless > Attachments: LUCENE-2793.patch > > > Today for merging we pass down a larger readBufferSize than for searching > because we get better performance. > I think we should generalize this to a class (IOContext), which would hold > the buffer size, but then could hold other flags like DIRECT (bypass OS's > buffer cache), SEQUENTIAL, etc. > Then, we can make the DirectIOLinuxDirectory fully usable because we would > only use DIRECT/SEQUENTIAL during merging. > This will require fixing how IW pools readers, so that a reader opened for > merging is not then used for searching, and vice/versa. Really, it's only > all the open file handles that need to be different -- we could in theory > share del docs, norms, etc, if that were somehow possible. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Lucene-Solr-tests-only-trunk - Build # 3688 - Failure
Build: https://hudson.apache.org/hudson/job/Lucene-Solr-tests-only-trunk/3688/ 3 tests failed. REGRESSION: org.apache.lucene.search.TestSubScorerFreqs.testTermQuery Error Message: expected:<186> but was:<170> Stack Trace: junit.framework.AssertionFailedError: expected:<186> but was:<170> at org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:) at org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1049) at org.apache.lucene.search.TestSubScorerFreqs.testTermQuery(TestSubScorerFreqs.java:151) REGRESSION: org.apache.lucene.search.TestSubScorerFreqs.testBooleanQuery Error Message: expected:<186> but was:<170> Stack Trace: junit.framework.AssertionFailedError: expected:<186> but was:<170> at org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:) at org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1049) at org.apache.lucene.search.TestSubScorerFreqs.testBooleanQuery(TestSubScorerFreqs.java:185) REGRESSION: org.apache.lucene.search.TestSubScorerFreqs.testPhraseQuery Error Message: expected:<186> but was:<170> Stack Trace: junit.framework.AssertionFailedError: expected:<186> but was:<170> at org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:) at org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1049) at org.apache.lucene.search.TestSubScorerFreqs.testPhraseQuery(TestSubScorerFreqs.java:215) Build Log (for compile errors): [...truncated 2955 lines...] - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: Question about null request instance in CloudSolrServer
Hey Stan - Looks like this may be a problem - when I get a chance I will take a look - but based on the error, I'd guess you should probably file a JIRA issue. The best thing to do in these cases is to email the user list or dev list rather than me individually - it is best to have the history of this issue public for others that may have hit, or will run into, the same problem. I don't have the Solr user list available under autocomplete, and as it's late and I don't want to forget/lose this email, I'm cc'ing the dev list to make this issue visible.

- Mark

On Jan 11, 2011, at 2:36 PM, sburnitt wrote:

> Hello Mr. Miller,
>
> I hope I am not contacting you the wrong way, but I need some help with SolrCloud (solr-trunk).
> (I've spent two days googling, studying the source code, and asking questions on #solr IRC... Nada.)
>
> I am trying to use your org.apache.solr.client.solrj.impl.CloudSolrServer implementation, but when I try to add(docs), I get an NPE on line 105:
>
> String collection = request.getParams().get("collection", defaultCollection);
>
> It seems the request instance is null and I cannot figure out why.
>
> I will skip the details here in case email is not the place to discuss this further.
> Do you have the time to help me out? Should I add an issue to JIRA first?
> (I just signed up and am not sure I would be authorized.)
>
> Regards,
> Stan Burnitt
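The failure pattern Stan describes can be sketched without any Solr dependency. This is an illustrative sketch only, not CloudSolrServer's actual code: the class name, method name, and the plain `Map` standing in for `SolrParams` are all invented for the example. It shows how resolving a per-request "collection" parameter with a fallback default throws an NPE if the request (here, its parameter map) is null before the fallback is ever consulted, and how a defensive guard would fall back cleanly instead.

```java
import java.util.Map;

public class CollectionResolver {

    // Hypothetical stand-in for request.getParams().get("collection", defaultCollection).
    // Without the null guard, a null requestParams would throw an NPE on the
    // getOrDefault call, exactly like the reported NPE on line 105.
    static String resolveCollection(Map<String, String> requestParams, String defaultCollection) {
        if (requestParams == null) {
            // Guarding against a null request lets the server-wide default apply.
            return defaultCollection;
        }
        return requestParams.getOrDefault("collection", defaultCollection);
    }

    public static void main(String[] args) {
        System.out.println(resolveCollection(null, "collection1"));                      // falls back to default
        System.out.println(resolveCollection(Map.of("collection", "c2"), "collection1")); // explicit value wins
    }
}
```

Whether a guard like this (versus requiring callers to always pass a request) is the right fix is exactly the kind of question a JIRA issue would settle.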
[jira] Updated: (LUCENE-2860) SegmentInfo.sizeInBytes ignores includeDocStores when caching
[ https://issues.apache.org/jira/browse/LUCENE-2860?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Shai Erera updated LUCENE-2860:
---

Attachment: LUCENE-2860.patch

Patch fixes the bug and adds a test case.

> SegmentInfo.sizeInBytes ignores includeDocStores when caching
> ---
>
> Key: LUCENE-2860
> URL: https://issues.apache.org/jira/browse/LUCENE-2860
> Project: Lucene - Java
> Issue Type: Bug
> Components: Index
> Reporter: Shai Erera
> Assignee: Shai Erera
> Priority: Minor
> Fix For: 3.1, 4.0
>
> Attachments: LUCENE-2860.patch
>
> I noticed that SegmentInfo's sizeInBytes cache is potentially buggy -- it doesn't take 'includeDocStores' into account. I.e., if you call it once with 'false' (sizeInBytes won't include the store files) and then with 'true' (or vice versa), you won't get the right sizeInBytes (it won't re-compute with the store files).
> I'll fix it and add a test case demonstrating the bug.
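The bug class described here is easy to picture with a minimal sketch. This is not SegmentInfo's actual code (the class and field names below are invented); it just illustrates the fix direction: instead of one cached value that the first caller "poisons" for every later caller with the other flag value, keep one cached value per `includeDocStores` flag.

```java
public class SizeCache {
    // One cache slot per flag value; -1 means "not yet computed".
    private long sizeNoDocStores = -1;
    private long sizeWithDocStores = -1;

    private final long coreBytes;      // size of the non-store index files
    private final long docStoreBytes;  // size of the doc store files

    public SizeCache(long coreBytes, long docStoreBytes) {
        this.coreBytes = coreBytes;
        this.docStoreBytes = docStoreBytes;
    }

    // Fixed version: the cached value is keyed by the flag, so a prior call
    // with the opposite flag cannot return a stale answer.
    public long sizeInBytes(boolean includeDocStores) {
        if (includeDocStores) {
            if (sizeWithDocStores == -1) {
                sizeWithDocStores = coreBytes + docStoreBytes;
            }
            return sizeWithDocStores;
        }
        if (sizeNoDocStores == -1) {
            sizeNoDocStores = coreBytes;
        }
        return sizeNoDocStores;
    }

    public static void main(String[] args) {
        SizeCache info = new SizeCache(100, 40);
        System.out.println(info.sizeInBytes(false)); // 100
        System.out.println(info.sizeInBytes(true));  // 140, not a stale 100
    }
}
```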
[jira] Commented: (LUCENE-2856) Create IndexWriter event listener, specifically for merges
[ https://issues.apache.org/jira/browse/LUCENE-2856?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12980626#action_12980626 ]

Shai Erera commented on LUCENE-2856:
---

I see. I'm ok with both then.

> Create IndexWriter event listener, specifically for merges
> ---
>
> Key: LUCENE-2856
> URL: https://issues.apache.org/jira/browse/LUCENE-2856
> Project: Lucene - Java
> Issue Type: Improvement
> Components: Index
> Affects Versions: 4.0
> Reporter: Jason Rutherglen
> Attachments: LUCENE-2856.patch
>
> This issue will allow users to monitor merges occurring within IndexWriter using a callback notifier event listener. This can be used by external applications such as Solr to monitor large segment merges.
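The callback style the issue proposes can be sketched as follows. Every name here is hypothetical (this is not the patch's actual API): an application registers a listener with a notifier, and the writer's merge code paths would fire start/finish events through it.

```java
import java.util.ArrayList;
import java.util.List;

public class MergeEvents {

    // Hypothetical listener interface for merge lifecycle events.
    interface MergeListener {
        void mergeStarted(String segmentName);
        void mergeFinished(String segmentName);
    }

    // Hypothetical notifier the IndexWriter would own; it fans each event
    // out to all registered listeners.
    static class Notifier {
        private final List<MergeListener> listeners = new ArrayList<>();

        void addListener(MergeListener l) { listeners.add(l); }

        void fireStarted(String seg)  { listeners.forEach(l -> l.mergeStarted(seg)); }
        void fireFinished(String seg) { listeners.forEach(l -> l.mergeFinished(seg)); }
    }

    public static void main(String[] args) {
        Notifier n = new Notifier();
        // An external application (e.g. Solr) would register something like this
        // to log or surface large segment merges.
        n.addListener(new MergeListener() {
            public void mergeStarted(String s)  { System.out.println("merge started: " + s); }
            public void mergeFinished(String s) { System.out.println("merge finished: " + s); }
        });
        n.fireStarted("_0");
        n.fireFinished("_0");
    }
}
```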
[jira] Commented: (LUCENE-2793) Directory createOutput and openInput should take an IOContext
[ https://issues.apache.org/jira/browse/LUCENE-2793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12980625#action_12980625 ]

Shai Erera commented on LUCENE-2793:
---

bq. No you don't... you can also call .setBufferSize like the skiplist reader does.

I'd still need to implement my own Directory though, right? That's the overkill I'm trying to avoid.

But I've been thinking about how merge/search code, which can only tell the Directory it's in a SEARCH / MERGE context, would get the buffer size the application wanted to use in that context. I don't think it has a way to do so without either using a static class, something we try to avoid, or propagating those settings down everywhere, which does not make sense either. So, Robert, I think you're right -- bufferSize should not exist on IOContext. A custom Directory impl seems unavoidable.

So I think it'd be good if we could create a BufferedDirectoryWrapper which wraps a Directory and offers a BufferedIndexInput/OutputWrapper that delegates the necessary calls to the wrapped Directory. We've found ourselves needing to implement that kind of Directory several times already, and it'd be good if Lucene offered one. That Directory would then take an IOContext -> bufferSize map/properties/whatever and take it into account in openInput/createOutput. If users need to implement Directory wrapping when they want to control buffer sizes, I suggest we make that as painless as possible.

> Directory createOutput and openInput should take an IOContext
> ---
>
> Key: LUCENE-2793
> URL: https://issues.apache.org/jira/browse/LUCENE-2793
> Project: Lucene - Java
> Issue Type: Improvement
> Components: Store
> Reporter: Michael McCandless
> Attachments: LUCENE-2793.patch
>
> Today for merging we pass down a larger readBufferSize than for searching, because we get better performance.
> I think we should generalize this to a class (IOContext), which would hold the buffer size but could then hold other flags like DIRECT (bypass the OS's buffer cache), SEQUENTIAL, etc.
> Then we can make DirectIOLinuxDirectory fully usable, because we would only use DIRECT/SEQUENTIAL during merging.
> This will require fixing how IndexWriter pools readers, so that a reader opened for merging is not then used for searching, and vice versa. Really, it's only the open file handles that need to be different -- we could in theory share deleted docs, norms, etc., if that were somehow possible.
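The "IOContext -> bufferSize map" idea floated in the comment can be sketched in miniature. All names below are illustrative only (this is not a Lucene API, and the patch under discussion may look nothing like it): a policy object maps a context to a buffer size with a fallback default, which is what a wrapping Directory would consult in its openInput/createOutput.

```java
import java.util.EnumMap;
import java.util.Map;

public class ContextBuffers {

    // Hypothetical context enum; the real proposal may carry more flags
    // (DIRECT, SEQUENTIAL, ...) than just these two.
    enum IOContext { SEARCH, MERGE }

    static class BufferSizePolicy {
        private final Map<IOContext, Integer> sizes = new EnumMap<>(IOContext.class);
        private final int defaultSize;

        BufferSizePolicy(int defaultSize) { this.defaultSize = defaultSize; }

        BufferSizePolicy set(IOContext ctx, int bytes) {
            sizes.put(ctx, bytes);
            return this;  // fluent, so policies read as one expression
        }

        // What a wrapping Directory would call per openInput/createOutput.
        int bufferSizeFor(IOContext ctx) {
            return sizes.getOrDefault(ctx, defaultSize);
        }
    }

    public static void main(String[] args) {
        BufferSizePolicy policy = new BufferSizePolicy(1024)
            .set(IOContext.MERGE, 4096);  // larger buffers while merging
        System.out.println(policy.bufferSizeFor(IOContext.SEARCH)); // default: 1024
        System.out.println(policy.bufferSizeFor(IOContext.MERGE));  // 4096
    }
}
```

Keeping the sizes in the application-supplied policy, rather than on IOContext itself, matches the comment's conclusion that bufferSize should not live on the context object.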
[jira] Commented: (SOLR-2282) Distributed Support for Search Result Clustering
[ https://issues.apache.org/jira/browse/SOLR-2282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12980622#action_12980622 ]

Koji Sekiguchi commented on SOLR-2282:
---

bq. When I now run the test, I'm getting a different exception, which looks like some misconfiguration of the test itself:

Confirmed. contrib/clustering/build.xml seems to have been changed in SOLR-2299, but I'm not sure of the cause of the failure.

> Distributed Support for Search Result Clustering
> ---
>
> Key: SOLR-2282
> URL: https://issues.apache.org/jira/browse/SOLR-2282
> Project: Solr
> Issue Type: New Feature
> Components: contrib - Clustering
> Affects Versions: 1.4, 1.4.1
> Reporter: Koji Sekiguchi
> Assignee: Koji Sekiguchi
> Priority: Minor
> Fix For: 3.1, 4.0
>
> Attachments: SOLR-2282.patch, SOLR-2282.patch, SOLR-2282.patch, SOLR-2282.patch, SOLR-2282.patch
>
> Brad Giaccio contributed a patch for this in SOLR-769. I'd like to incorporate it.