Re: Solr Performance and Scalability

2010-02-11 Thread Andrzej Bialecki
On 2010-02-11 23:11, Robert Muir wrote: Tom, this is really completely unrelated, but given that you have such huge documents and I see you have exceeded term count limits in Lucene, I can't help but wonder if you have ever considered Andrzej's index pruning patch? (it is simply a tool you can ru

[jira] Commented: (SOLR-1365) Add configurable Sweetspot Similarity factory

2010-02-11 Thread Kevin Osborn (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-1365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12832711#action_12832711 ] Kevin Osborn commented on SOLR-1365: I am finally getting back around to this. And I am

[jira] Commented: (SOLR-236) Field collapsing

2010-02-11 Thread Gerald DeConto (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12832703#action_12832703 ] Gerald DeConto commented on SOLR-236: - I have been able to apply and use the solr-236 pat

Re: Solr Performance and Scalability

2010-02-11 Thread Robert Muir
Tom, this is really completely unrelated, but given that you have such huge documents and I see you have exceeded term count limits in Lucene, I can't help but wonder if you have ever considered Andrzej's index pruning patch? (it is simply a tool you can run on your index) depending upon requireme

[jira] Updated: (SOLR-1769) Solr 1.4 Replication - Repeater throwing NullPointerException

2010-02-11 Thread Deepak (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-1769?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Deepak updated SOLR-1769: - Attachment: solrconfig.xml Attached the correct file now > Solr 1.4 Replication - Repeater throwing NullPointerEx

[jira] Updated: (SOLR-1769) Solr 1.4 Replication - Repeater throwing NullPointerException

2010-02-11 Thread Deepak (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-1769?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Deepak updated SOLR-1769: - Attachment: (was: solrconfig.xml) > Solr 1.4 Replication - Repeater throwing NullPointerException > --

[jira] Updated: (SOLR-1769) Solr 1.4 Replication - Repeater throwing NullPointerException

2010-02-11 Thread Deepak (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-1769?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Deepak updated SOLR-1769: - Attachment: solrconfig.xml Hi Paul We have not defined any deletionPolicy in the solrconfig.xml file. We have thi

Re: Solr Performance and Scalability

2010-02-11 Thread Tom Burton-West
The HathiTrust Large Search indexes the OCR from 5 million volumes, with an average of 200-300 pages per volume. So the total number of pages indexed would be over 1 billion. However, we are not using pages as Solr documents; we are using the entire book, so we only have 5 million rather than 1 bi

[jira] Updated: (SOLR-1770) move default example core config/data into a collection1 folder

2010-02-11 Thread Mark Miller (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-1770?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Miller updated SOLR-1770: -- Attachment: SOLR-1770.patch initial patch (won't do it from a patch to handle the moves though) > move

Re: Solr Performance and Scalability

2010-02-11 Thread Eric Pugh
http://people.apache.org/~hossman/#solr-dev I would look at the HathiTrust project analysis: http://wiki.apache.org/solr/SolrPerformanceData Your question is pretty broad, but I don't see any reason why Solr wouldn't work for your problem assuming the project is appropriately resourced!

Solr Performance and Scalability

2010-02-11 Thread Wick2804
We are thinking of creating a Lucene Solr project to store 50 million full-text OCRed A4 pages. Is there anyone out there who could provide some kind of guidance on the size of index we are likely to generate, and are there any gotchas in the standard analysis engines for load and query that will c

[jira] Commented: (SOLR-1770) move default example core config/data into a collection1 folder

2010-02-11 Thread Yonik Seeley (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-1770?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12832597#action_12832597 ] Yonik Seeley commented on SOLR-1770: +1 If the normal example is multi-core capable, an

[jira] Commented: (SOLR-1395) Integrate Katta

2010-02-11 Thread Jason Rutherglen (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-1395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12832587#action_12832587 ] Jason Rutherglen commented on SOLR-1395: shyjuThomas, It'd be good to update this p

[jira] Created: (SOLR-1770) move default example core config/data into a collection1 folder

2010-02-11 Thread Mark Miller (JIRA)
move default example core config/data into a collection1 folder --- Key: SOLR-1770 URL: https://issues.apache.org/jira/browse/SOLR-1770 Project: Solr Issue Type: Improvement Aff

[jira] Commented: (SOLR-1536) Support for TokenFilters that may modify input documents

2010-02-11 Thread Mike Perham (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-1536?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12832523#action_12832523 ] Mike Perham commented on SOLR-1536: --- Another developer just mentioned that I might be able

[jira] Commented: (SOLR-1536) Support for TokenFilters that may modify input documents

2010-02-11 Thread Jan Høydahl (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-1536?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12832518#action_12832518 ] Jan Høydahl commented on SOLR-1536: --- In my head document-level modifications belong in Upd

[jira] Commented: (SOLR-1536) Support for TokenFilters that may modify input documents

2010-02-11 Thread Mike Perham (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-1536?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12832493#action_12832493 ] Mike Perham commented on SOLR-1536: --- This would be hugely useful for us in implementing a

[jira] Commented: (SOLR-1395) Integrate Katta

2010-02-11 Thread shyjuThomas (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-1395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12832465#action_12832465 ] shyjuThomas commented on SOLR-1395: --- Now Katta 0.6 version has been released, and there ar

Build failed in Hudson: Solr-trunk #1056

2010-02-11 Thread Apache Hudson Server
See -- [...truncated 2366 lines...] [junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 18.906 sec [junit] Running org.apache.solr.client.solrj.embedded.LargeVolumeEmbeddedTest