Hi, I want to create a POC to search our intranet along with the documents uploaded to it. Documents (PDF, Excel, Word documents, text files, images, videos) also exist on SharePoint. SharePoint has authentication at the module level (folder level).
My intranet website is http://myintranet/ <http://sparsh/>, and the SharePoint URL is different. Documents also exist in file folders. I have the following queries:

A) Which crawler framework should I use along with Solr for this POC, Nutch or Apache ManifoldCF?
B) Is it possible to crawl SharePoint documents using Nutch? If yes, would configuration-level changes alone make this possible, or do I have to write code to parse the documents and send them to Solr?
C) Which versions of Solr, Nutch, and MCF should be used, given that the Nutch version depends on the Solr version? Would Nutch 1.7 work properly with Solr 4.6.0?

--
Rashmi
Be the change that you want to see in this world!
www.minnal.zor.org
disha.resolve.at
www.artofliving.org