Re: SOLR Performance benchmarking

2014-07-13 Thread Siegfried Goeschl
Hi Rashi, abnormal behaviour depends on your data, system and workload - I have seen abnormal behaviour at customer sites and it turned out to be a miracle that the customer had no serious problems before :-) * running out of sockets - you might need to check if you have enough sockets
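
A rough way to check the socket point above on a Unix host, where sockets count against the per-process file descriptor limit. This is only a sketch: the function names and the reserve of 256 descriptors are assumptions for illustration, not something from the message.

```python
import resource  # Unix-only standard library module


def descriptor_limits():
    """Return the (soft, hard) open-file limits for the current process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def has_socket_headroom(soft_limit, expected_sockets, reserve=256):
    """True if expected_sockets plus a reserve fits under the soft limit.

    The reserve (an assumed value) leaves room for index files, logs
    and other descriptors the process keeps open.
    """
    if soft_limit == resource.RLIM_INFINITY:
        return True
    return expected_sockets + reserve <= soft_limit


soft, hard = descriptor_limits()
enough_for_500 = has_socket_headroom(soft, expected_sockets=500)
```

Run it as the same user and in the same environment as the servlet container, since limits are per process.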

Is there any data importer for cassandra in solr?

2014-07-13 Thread Shuai Zhang
Hi all, We currently use Cassandra as our DB, and I have to rebuild all the indices for Solr, but I cannot find any data importer for Cassandra. What should I do in this situation? Can anyone give me some advice? Thanks very much. Regards, -- Gabriel Zhang

Re: Is there any data importer for cassandra in solr?

2014-07-13 Thread Alexandre Rafalovitch
Have you looked at DSE (www.datastax.com) and what they offer? I think they had some open-source and/or free content as well. And they specialize in Cassandra. I think there used to be Solandra or some such, but it's been abandoned in favour of DSE work. Regards, Alex.

Re: Is there any data importer for cassandra in solr?

2014-07-13 Thread Jack Krupansky
Simple csv files are the easiest way to go: http://www.datastax.com/dev/blog/simple-data-importing-and-exporting-with-cassandra The Solr Data Import Handler can be used to import from RDBMS databases to DataStax Enterprise with its Solr integration:
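
A minimal sketch of the CSV route, assuming the Cassandra data has already been exported to a CSV file with a header row. It turns rows into batches of JSON documents of the kind Solr's update handler accepts. The column names, the batch size and the function name are made up for illustration; actually posting the batches to Solr is left out.

```python
import csv
import io
import json


def csv_to_solr_batches(lines, batch_size=1000):
    """Yield JSON arrays of documents, one string per batch.

    `lines` is any iterable of CSV text lines whose first line is a header.
    """
    batch = []
    for row in csv.DictReader(lines):
        # Skip empty or missing cells so they are not sent as empty values.
        doc = {k: v for k, v in row.items() if k is not None and v not in ("", None)}
        batch.append(doc)
        if len(batch) >= batch_size:
            yield json.dumps(batch)
            batch = []
    if batch:
        yield json.dumps(batch)


# Hypothetical export with invented column names.
sample = io.StringIO(
    "id,subject,sender\n"
    "1,Hello,a@example.com\n"
    "2,Re: Hello,\n"
    "3,Bye,b@example.com\n"
)
batches = list(csv_to_solr_batches(sample, batch_size=2))
```

Each string in `batches` can then be sent to the update handler as one request, so a failed batch can be retried without restarting the whole import.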

Join and non-Join query give different results

2014-07-13 Thread atawfik
Hi everyone, I am trying to link two types of documents in my Solr index. The parent is named house and the child is named available. So, I want to return a list of houses that have available documents with some filtering. However, the following query gives me around 18 documents, which is wrong.
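
For readers unfamiliar with the join syntax, here is a sketch of how such a parent/child query is usually written with Solr's `{!join from=... to=...}` query parser. The original query is cut off in this preview, so the field names (`house_id`, `id`, `type`, `price`) are assumptions for illustration and not the poster's actual schema.

```python
from urllib.parse import urlencode


def join_query(from_field, to_field, child_query):
    """Build a Solr join: match child docs, then map them to parents."""
    return "{!join from=%s to=%s}%s" % (from_field, to_field, child_query)


# Hypothetical schema: "available" docs carry the parent's id in house_id.
q = join_query("house_id", "id", "type:available AND price:[* TO 100]")

# fq restricts the joined result to parent documents only.
params = urlencode({"q": q, "fq": "type:house", "wt": "json"})
```

The join returns the documents on the `to` side, so filters meant for the child documents belong inside the join query, and filters meant for the parents belong outside it.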

Re: Is there any data importer for cassandra in solr?

2014-07-13 Thread Shuai Zhang
Hi Alexandre and Jack, Thanks for your advice. But I still cannot find a better solution for my requirement. Our Cassandra currently holds a very large amount of data, and the Solr cluster's indices total more than 120GB, so it must be a very slow process when I rebuild all the indices with the Netflix API to fetch all

Re: Is there any data importer for cassandra in solr?

2014-07-13 Thread Alexandre Rafalovitch
So you've tried all the things above? Not clear what the exact problem is that you are trying to solve. Regards, Alex

Merge two collections in SolrCloud

2014-07-13 Thread vidit.asthana
What is the best way to merge 2 collections into one? -- View this message in context: http://lucene.472066.n3.nabble.com/Merge-two-collections-in-SolrCloud-tp4146930.html Sent from the Solr - User mailing list archive at Nabble.com.

Re: Is there any data importer for cassandra in solr?

2014-07-13 Thread Shuai Zhang
Hi Alexandre, Do you mean the things that you mentioned or the ones Jack mentioned? I searched for information about DSE, but I could not find what I need. Maybe I need to search more... Thanks again! Regards, -- Gabriel Zhang

Re: Is there any data importer for cassandra in solr?

2014-07-13 Thread Jack Krupansky
Make sure your per-node Solr index data for DSE fits completely in the operating system memory that is available for file system caching (just like we try to do for OSS Solr!), and limit each node to about 50 million documents or so. Anything bigger than a 32GB memory node is probably a waste for a
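
The two rules of thumb above can be turned into a quick back-of-the-envelope check. This is a sketch only: the 50 million figure comes from the message, while the function names and the sample numbers (document count, heap size, RAM) are assumptions for illustration.

```python
DOCS_PER_NODE = 50_000_000  # rule of thumb quoted in the message above
GIB = 1024 ** 3


def nodes_needed(total_docs, docs_per_node=DOCS_PER_NODE):
    """Smallest node count that keeps every node at or under docs_per_node."""
    if total_docs < 0 or docs_per_node <= 0:
        raise ValueError("total_docs must be >= 0 and docs_per_node > 0")
    return max(1, -(-total_docs // docs_per_node))  # integer ceiling


def index_fits_in_cache(index_bytes, ram_bytes, jvm_heap_bytes, other_bytes=0):
    """True if the per-node index fits in the RAM left for the OS file cache."""
    return index_bytes <= ram_bytes - jvm_heap_bytes - other_bytes


# Hypothetical cluster: 180 million documents, 32 GiB nodes, 8 GiB JVM heap.
nodes = nodes_needed(180_000_000)
fits = index_fits_in_cache(20 * GIB, 32 * GIB, 8 * GIB)
```

The memory check subtracts the JVM heap because heap memory is not available to the operating system for caching index files.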

Re: SOLR Performance benchmarking

2014-07-13 Thread Umesh Prasad
Hi Rashi, Also, check out http://searchhub.org/2010/01/21/the-seven-deadly-sins-of-solr/. It would help if you can share your solrconfig.xml and schema.xml; some problems are evident from those files alone. From our experience we have found 1. JVM heap size (check for young gen size and

Re: Group only top 50 results not All results.

2014-07-13 Thread Umesh Prasad
Another way is to extend the existing Facets component. FacetsComponent uses SimpleFacets to compute facets, passing the matching docset (rb.getResults().docSet) as a constructor argument. Instead you can pass it the ranked docList (rb.getResults().docList). Basically 3 steps

Re: Changing default behavior of solr for overwrite the whole document on uniquekey duplication

2014-07-13 Thread Umesh Prasad
I must mention here: this atomic update will only work if all your fields are stored. It eases the work on your part, but the stored fields will bloat the index. On 12 July 2014 22:06, Erick Erickson erickerick...@gmail.com wrote: bq: But does performance remain same in this situation
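
For context, an atomic update sends only the unique key plus a modifier such as `set` for each changed field, and Solr rebuilds the rest of the document from its stored fields, which is why they all need to be stored. A minimal sketch of building such a payload follows; the field names and the document id are made up for illustration.

```python
import json


def atomic_update(doc_id, changes, id_field="id"):
    """Build a JSON atomic-update payload that sets the given fields.

    Only the unique key and the changed fields are sent; every other
    field is left for Solr to carry over from the stored document.
    """
    doc = {id_field: doc_id}
    for field, value in changes.items():
        doc[field] = {"set": value}
    return json.dumps([doc])


# Hypothetical document id and field name.
payload = atomic_update("mail-42", {"subject": "Updated subject"})
```

Posting the same document without the `set` wrapper would replace the whole document, which is the default overwrite behaviour discussed in this thread.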

Re: SOLR-6143 Bad facet counts from CollapsingQParserPlugin

2014-07-13 Thread Umesh Prasad
Hi Joel, Actually I have also seen this. The counts given by group.truncate and CollapsingQParserPlugin differ. We have a golden query framework for our product APIs, and there we have seen differences in the facet counts returned. One request uses group.truncate and another collapsingQParser

Re: Is there any data importer for cassandra in solr?

2014-07-13 Thread Shuai Zhang
Hi Jack, Sorry to confuse you, and thanks for your reply! Because I changed the Solr document structure, I have to rebuild all the indices. Our system (a mail system) uses Cassandra as its DB, so if I want to rebuild the indices for all mails, I need to use the Thrift API to read data from

Re: Add a new replica to SolrCloud

2014-07-13 Thread rulinma
can do this.

Re: SOLR in Production

2014-07-13 Thread rulinma
AA machine is ok. Maybe SolrCloud is also a good choice for this.