Hi,

I want to creating a POC to search INTRANET along with documents uploaded
on intranet. Documents(PDF, excel, word document, text files, images,
videos) are also exists on SHAREPOINT. sharepoint has Authentication access
at module level(folder level).

My interanet website is http://myintranet/ <http://sparsh/> . and
Sharepoint url is different. Documents also exist in file folders.

I have below queries:
A) Which crawler framework do I use along with Solr for this POC, "Nutch"
or "Apache ManifoldCF"?

B) Is it possible to crawl Sharepoint documents usiing Nutch? If yes, only
configuration level change would make this possible? or I have to write
code to parse and send to solr?

C) Which version of Solr+nutch+MCF should be used? because nutch version
has dependency on solr version. wold nutch 1.7 works properly with solr
4.6.0?
-- 
Rashmi
Be the change that you want to see in this world!




-- 
Rashmi
Be the change that you want to see in this world!
www.minnal.zor.org
disha.resolve.at
www.artofliving.org

Reply via email to