Re: Indexing files Solr cell and Amazon S3

2011-05-30 Thread Jan Høydahl
Hi,

You can use the stream.file parameter to tell Solr to read the file from its 
local disk instead of streaming it across the network:
http://lucene.472066.n3.nabble.com/Example-of-using-quot-stream-file-quot-to-post-a-binary-file-to-solr-td781172.html
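
As a rough sketch of what such a request might look like, the snippet below 
builds a Solr Cell URL that passes stream.file. It assumes the default 
/update/extract handler and that remote streaming is enabled in solrconfig.xml 
(enableRemoteStreaming="true" on requestParsers). The host, file path, 
document id, and the helper name build_extract_url are made up for 
illustration. The path must be readable by the Solr server itself, so files 
held in S3 would first have to be copied or mounted onto that machine.

```python
from urllib.parse import urlencode


def build_extract_url(solr_base, file_path, doc_id, commit=True):
    """Build a Solr Cell request URL that reads a file from the server's disk.

    build_extract_url is a hypothetical helper, not part of Solr or SolrJ.
    """
    params = {
        # Path as seen by the Solr server, not by the client.
        "stream.file": file_path,
        # Unique key for the extracted document (assumes the key field is "id").
        "literal.id": doc_id,
        "commit": "true" if commit else "false",
    }
    return solr_base.rstrip("/") + "/update/extract?" + urlencode(params)


# Hypothetical host and path, for illustration only.
url = build_extract_url(
    "http://localhost:8983/solr", "/data/docs/report.pdf", "report-1"
)
print(url)
```

Requesting that URL (for example with curl) asks Solr to open the file 
locally, so only the small HTTP request crosses the network, not the document.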

--
Jan Høydahl, search solution architect
Cominvent AS - www.cominvent.com

On 30. mai 2011, at 22.46, Greg Georges wrote:

> Hello everyone,
> 
> We have our infrastructure on Amazon cloud servers, and we use the S3 file 
> system. We need to index files using Solr Cell. From what I have read, we 
> need to stream files to Solr in order for it to extract the metadata into the 
> index. If we stream data through a public URL, there will be costs associated 
> with the transfer on the Amazon cloud. We plan to keep the files in a 
> directory. Is it possible to tell Solr to add documents from a specific 
> folder location, or must we stream them into Solr? In SolrJ I see that the 
> only option is streaming. Thank you very much.
> 
> Greg



Indexing files Solr cell and Amazon S3

2011-05-30 Thread Greg Georges
Hello everyone,

We have our infrastructure on Amazon cloud servers, and we use the S3 file 
system. We need to index files using Solr Cell. From what I have read, we need 
to stream files to Solr in order for it to extract the metadata into the index. 
If we stream data through a public URL, there will be costs associated with the 
transfer on the Amazon cloud. We plan to keep the files in a directory. Is it 
possible to tell Solr to add documents from a specific folder location, or 
must we stream them into Solr? In SolrJ I see that the only option is 
streaming. Thank you very much.

Greg