Re: Solr and hadoop

2014-09-25 Thread Joel Bernstein
Hi Tom, I am not aware of a Solr InputFormat implementation yet. The /export handler, which outputs entire sorted result sets, was designed to support these types of bulk export operations efficiently. I think a Solr InputFormat would be an excellent project to begin working on. Also SOLR-6526 is u
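
For readers unfamiliar with the /export handler mentioned above, here is a minimal sketch of how a bulk export request might be put together. The host, collection name, and field names (`localhost:8983`, `collection1`, `id`, `price_i`) are assumptions for illustration only and do not come from this thread. The handler expects `q`, `sort`, and `fl` parameters, and the sort and export fields generally need docValues enabled.

```python
from urllib.parse import urlencode


def build_export_url(base_url, collection, query, sort, fields):
    """Build a URL for Solr's /export handler.

    base_url, collection, and the field names passed in are hypothetical
    placeholders; substitute the values for your own cluster.
    """
    params = urlencode({
        "q": query,               # which documents to export
        "sort": sort,             # /export streams results in sorted order
        "fl": ",".join(fields),   # fields to include in each exported record
    })
    return f"{base_url}/{collection}/export?{params}"


# Assumed local Solr instance and collection, for illustration only.
url = build_export_url(
    "http://localhost:8983/solr",
    "collection1",
    "*:*",
    "id asc",
    ["id", "price_i"],
)
print(url)
```

A Hadoop-side reader could, in principle, issue one such request per shard and stream the sorted results into map tasks, though that design is only a guess at what a Solr InputFormat might do.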

Re: Solr and hadoop

2014-09-25 Thread Tom Chen
I'm aware of the MapReduceIndexerTool (MRIT). That may solve the indexing part -- the OutputFormat part. But what I asked about is making Solr index data available to Hadoop MapReduce -- using Solr as a data store in the way HDFS can be used. With a Solr InputFormat, we can make t

Re: Solr and hadoop

2014-09-25 Thread Michael Della Bitta
Yes, there's SolrInputDocumentWritable and MapReduceIndexerTool, plus the Morphline stuff (check out https://github.com/markrmiller/solr-map-reduce-example).

Solr and hadoop

2014-09-25 Thread Tom Chen
I wonder if Solr has an InputFormat and OutputFormat like the EsInputFormat and EsOutputFormat that are provided by Elasticsearch for Hadoop (es-hadoop). Is it possible for Solr to provide such integration with Hadoop? Best, Tom