>> The problem I am facing is how to read data from hard disks that are 
>> not on HDFS

If you are planning to use a MapReduce job to do the indexing, then the source 
data will definitely have to be on HDFS.
The Map function can transform the source data into Solr documents and send them 
to Solr (e.g., via the CloudSolrServer Java API) for indexing.
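To make that concrete, here is a minimal sketch of a map-only job along those lines. The class name, field names, ZooKeeper address, and collection name are all illustrative assumptions, not something from this thread; this is not a tested implementation.

```java
// Hypothetical sketch: each input line becomes a Solr document that is
// sent to a SolrCloud cluster from within the Mapper. Assumes the
// solr-solrj and hadoop-mapreduce-client jars are on the classpath.
import java.io.IOException;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.solr.client.solrj.impl.CloudSolrServer;
import org.apache.solr.common.SolrInputDocument;

public class SolrIndexMapper
        extends Mapper<LongWritable, Text, NullWritable, NullWritable> {

    private CloudSolrServer solr;

    @Override
    protected void setup(Context context) throws IOException {
        // Connect to SolrCloud via ZooKeeper (hosts/collection are assumptions).
        solr = new CloudSolrServer("zkhost1:2181,zkhost2:2181");
        solr.setDefaultCollection("collection1");
    }

    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        try {
            // Transform one input record into a Solr document.
            SolrInputDocument doc = new SolrInputDocument();
            doc.addField("id", key.get());           // byte offset as a stand-in id
            doc.addField("text", value.toString());  // whole line as the field value
            solr.add(doc);  // CloudSolrServer routes the doc to the right shard
        } catch (Exception e) {
            throw new IOException(e);
        }
    }

    @Override
    protected void cleanup(Context context) throws IOException {
        try {
            solr.commit();  // make the batch visible once the task finishes
        } catch (Exception e) {
            throw new IOException(e);
        } finally {
            solr.shutdown();
        }
    }
}
```

Since all the work happens in the Mapper, the job can set the number of reduce tasks to zero; batching documents before calling add() would cut down on round trips.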

-- James

-----Original Message-----
From: engy.morsy [mailto:engy.mo...@bibalex.org] 
Sent: Tuesday, June 25, 2013 3:14 AM
To: solr-user@lucene.apache.org
Subject: Solr indexer and Hadoop

Hi All, 

I have terabytes of data that need to be indexed, and I am trying to use Hadoop 
to index them. I am still a newbie. 
I thought that the Map function would read data from the hard disks and the 
Reduce function would index it. The problem I am facing is how to read data 
from hard disks that are not on HDFS. 

I understand that the data to be indexed must be on HDFS, mustn't it? Or am I 
missing something here? 

I can't convert the nodes on which the data resides to HDFS. Can anyone please 
help?

I would also appreciate it if you could point me to a good tutorial on Solr 
indexing with Hadoop. I googled a lot but did not find a sufficient one. 
 
Thanks



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Solr-indexer-and-Hadoop-tp4072951.html
Sent from the Solr - User mailing list archive at Nabble.com.