>> The problem I am facing is how to read those data from hard disks which are not HDFS
If you are planning to use a Map-Reduce job to do the indexing, then the source data will definitely have to be on HDFS. The Map function can transform the source data into Solr documents and send them to Solr (e.g. via the CloudSolrServer Java API) for indexing.

-- James

-----Original Message-----
From: engy.morsy [mailto:engy.mo...@bibalex.org]
Sent: Tuesday, June 25, 2013 3:14 AM
To: solr-user@lucene.apache.org
Subject: Solr indexer and Hadoop

Hi All,

I have TBs of data that need to be indexed, and I am trying to use Hadoop to index them. I am still a newbie. I thought the Map function would read the data from the hard disks and the Reduce function would index it. The problem I am facing is how to read the data from hard disks which are not part of HDFS. I understand that the data to be indexed must be on HDFS, doesn't it? Or am I missing something here? I cannot convert the nodes on which the data resides to HDFS nodes. Can anyone please help? I would also appreciate it if you could point me to a good tutorial on Solr indexing with Hadoop; I have googled a lot but did not find a sufficient one.

Thanks

--
View this message in context: http://lucene.472066.n3.nabble.com/Solr-indexer-and-Hadoop-tp4072951.html
Sent from the Solr - User mailing list archive at Nabble.com.
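James's suggestion (Map reads records from HDFS, builds Solr documents, and pushes them to SolrCloud via SolrJ's CloudSolrServer) could be sketched roughly as below. This is a minimal, hedged illustration, not code from the thread: the ZooKeeper address, collection name, field names, and the tab-separated input format are all assumptions; a real job would also batch the `add()` calls and leave commits to Solr's autoCommit settings rather than committing from every mapper.

```java
import java.io.IOException;

import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.solr.client.solrj.SolrServerException;
import org.apache.solr.client.solrj.impl.CloudSolrServer;
import org.apache.solr.common.SolrInputDocument;

// Map-only indexing job: each mapper transforms its input split into
// SolrInputDocuments and sends them to SolrCloud. No reducer is needed.
public class SolrIndexMapper
    extends Mapper<LongWritable, Text, NullWritable, NullWritable> {

  private CloudSolrServer solr;

  @Override
  protected void setup(Context context) {
    // Assumed ZooKeeper ensemble and target collection (illustrative values).
    solr = new CloudSolrServer("zk1:2181,zk2:2181,zk3:2181");
    solr.setDefaultCollection("mycollection");
  }

  @Override
  protected void map(LongWritable key, Text value, Context context)
      throws IOException, InterruptedException {
    // Assumed input format: id<TAB>title<TAB>body, one record per line.
    String[] fields = value.toString().split("\t", 3);
    if (fields.length < 3) {
      return; // skip malformed records
    }
    SolrInputDocument doc = new SolrInputDocument();
    doc.addField("id", fields[0]);
    doc.addField("title", fields[1]);
    doc.addField("body", fields[2]);
    try {
      solr.add(doc); // route the document to the right SolrCloud shard
    } catch (SolrServerException e) {
      throw new IOException(e);
    }
  }

  @Override
  protected void cleanup(Context context) throws IOException {
    try {
      solr.commit();   // make this mapper's documents searchable
    } catch (SolrServerException e) {
      throw new IOException(e);
    } finally {
      solr.shutdown(); // release the SolrJ client's resources
    }
  }
}
```

Since the indexing happens entirely in the map phase, the driver would set `job.setNumReduceTasks(0)`. This also illustrates the point about HDFS: the mapper only sees whatever `Text` records the job's InputFormat hands it, which is why the source files have to be copied into HDFS (e.g. with `hadoop fs -put`) before the job can read them.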