Make sure your per-node Solr index data for DSE fits completely in the OS
memory that is available for file system caching (just as we try to do for
OSS Solr!), and limit each node to about 50 million documents.
A node with more than 32GB of memory is probably a waste for a DSE Solr
node. A 16GB machine for each DSE Solr node is probably okay, but then you
may have to stay somewhat under that 50 million document limit on each node.
Proper provisioning of the cluster with enough nodes and enough memory per
node and not too many documents per node is essential.
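
As a rough illustration of that sizing rule, here is a minimal sketch that estimates a node count from two constraints: the roughly 50 million documents per node mentioned above, and the index fitting in the memory left over for file system caching. The function name, its parameters, and the 20GB cache figure are assumptions made for the example, not DSE settings, and replication is ignored.

```python
import math


def nodes_needed(total_docs, total_index_gb,
                 docs_per_node=50_000_000, cache_gb_per_node=16.0):
    """Estimate how many Solr nodes satisfy both sizing constraints.

    docs_per_node     -- rule-of-thumb cap on documents per node
    cache_gb_per_node -- assumed RAM left for the OS file system cache
                         on each node (after the JVM heap etc.)
    """
    # Constraint 1: stay at or under the per-node document cap.
    by_docs = math.ceil(total_docs / docs_per_node)
    # Constraint 2: each node's share of the index must fit in its cache.
    by_memory = math.ceil(total_index_gb / cache_gb_per_node)
    # The stricter constraint wins; always provision at least one node.
    return max(by_docs, by_memory, 1)


if __name__ == "__main__":
    # Hypothetical cluster: 400 million docs, 120GB of index,
    # 32GB machines with about 20GB assumed free for caching.
    print(nodes_needed(400_000_000, 120, cache_gb_per_node=20))
```

With those hypothetical numbers the document cap is the binding constraint (8 nodes), while memory alone would only require 6.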
But... none of that has anything to do with your subject question of "data
importer", so... what is the real question here?
-- Jack Krupansky
-----Original Message-----
From: Shuai Zhang
Sent: Sunday, July 13, 2014 11:06 AM
To: solr-user@lucene.apache.org
Subject: Re: Is there any data importer for cassandra in solr?
Hi Alexandre and Jack,
Thanks for your advice, but I still cannot find a good solution for my
requirement.
Our Cassandra cluster holds a huge amount of data, and the Solr cluster's
indices total more than 120GB, so rebuilding all the indices by fetching all
the data from Cassandra with the Netflix API would be very slow (this process
would take more than 5 months!!! Tooooo slow!!!).
I suspect this is not the best way, so I hope I can find a better way to
solve it.
--
Gabriel Zhang
On Sunday, July 13, 2014 8:11 PM, Jack Krupansky <j...@basetechnology.com>
wrote:
Simple csv files are the easiest way to go:
http://www.datastax.com/dev/blog/simple-data-importing-and-exporting-with-cassandra
The Solr Data Import Handler can be used to import from RDBMS databases to
DataStax Enterprise with its Solr integration:
http://www.datastax.com/documentation/datastax_enterprise/4.0/datastax_enterprise/srch/srchConfDataHand.html
You can also load CSV flat files exported by a typical RDBMS into DSE/Solr,
just as with regular OSS Solr.
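
For the CSV route, a minimal sketch of producing such a flat file from rows already fetched from a database might look like the following. The helper name `rows_to_csv` and the field names are hypothetical; the resulting text is the kind of file that would then be loaded with the CSV import tools described in the links above.

```python
import csv
import io


def rows_to_csv(rows, fieldnames):
    """Serialize a list of dict rows to CSV text with a header line.

    Fields containing commas or quotes are quoted automatically by the
    csv module, so free-text columns survive the round trip.
    """
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


if __name__ == "__main__":
    # Hypothetical rows standing in for data read from the source database.
    sample = [
        {"id": "1", "title": "Cassandra, meet Solr"},
        {"id": "2", "title": "Plain title"},
    ]
    print(rows_to_csv(sample, ["id", "title"]), end="")
```

Writing the header line first keeps the column-to-field mapping explicit in the file itself.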
DataStax Enterprise also supports Hadoop/Sqoop for importing from RDBMS
databases:
http://www.datastax.com/2012/03/how-to-move-data-from-relational-databases-to-datastax-enterprise-cassandra-using-sqoop
There are also ETL tools from Talend, Pentaho, and JasperSoft that can be
used to import from RDBMS databases into DataStax Enterprise:
http://www.datastax.com/dev/blog/ways-to-move-data-tofrom-datastax-enterprise-and-cassandra
If those approaches are not sufficient for your needs, maybe you could
elaborate on any special needs you have.
-- Jack Krupansky
-----Original Message-----
From: Shuai Zhang
Sent: Sunday, July 13, 2014 7:38 AM
To: solr-user@lucene.apache.org
Subject: Is there any data importer for cassandra in solr?
Hi all,
We currently use Cassandra as our DB, and I have to rebuild all the indices
for Solr, but I cannot find any data importer for Cassandra.
What should I do in this situation?
Can anyone give me some advice?
Thanks very much~~
Regards,
--
Gabriel Zhang