Sorry, I did not really parse the question before replying (twice!).
But where does the content-type for the local file specified by the URL go?

  /solr/update/csv?stream.url=file://myfile.csv

Do we need a stream.content-type or charset, or am I missing something?
I just investigated a bit and was disappointed with the results. I hoped that URLConnection would fill in the content type for files, but it doesn't (at least not reliably):

  http://localhost:8983/solr/debug/dump?stream.url=file:///C:/mmm.xls
  <str name="contentType">content/unknown</str>

  http://localhost:8983/solr/debug/dump?stream.url=file:///C:/xxx.jpg
  <str name="contentType">image/jpeg</str>

It sometimes gets the contentType correct, but it never adds charset info. So we have to do something. We could:

a. explicitly set the stream.url.contentType with another param
b. return a FileReader directly (it takes care of charset for you)

(a) may be useful to override a remote content type that is incorrect, but is kind of a pain for someone specifying a local file (who probably does not know what content type it is).

(b) requires adding getReader() to ContentStream. This would be useful for directly posted content (the most common case) and for the local file case. Other cases would construct the reader from the content type and stream.

I vote for (b) and perhaps also (a), but if we have (b), (a) is only really useful for URLs where the content type is incorrect.
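
To make option (b) concrete, here is a rough sketch of what "construct the reader from the content type and stream" could look like. It is not Solr's actual ContentStream: the names SimpleContentStream, charsetFrom, of and readAll are made up for illustration, and the UTF-8 fallback is an assumption, since the discussion above does not settle on a default. A local-file implementation could override getReader() to return a FileReader instead; note that FileReader decodes with the JVM's default charset rather than detecting the file's encoding.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Locale;

// Hypothetical sketch; none of these names are Solr's actual API.
class ContentStreamSketch {

    /** Simplified stand-in for ContentStream, extended with getReader() as in (b). */
    interface SimpleContentStream {
        String getContentType();                    // may be null or carry no charset
        InputStream getStream() throws IOException;

        /** Default behaviour: build the reader from the content type and the stream. */
        default Reader getReader() throws IOException {
            // Assumption: fall back to UTF-8 when no charset is declared.
            Charset cs = charsetFrom(getContentType(), StandardCharsets.UTF_8);
            return new InputStreamReader(getStream(), cs);
        }
    }

    /** Extracts the charset parameter from a content type, or returns the fallback. */
    public static Charset charsetFrom(String contentType, Charset fallback) {
        if (contentType == null) {
            return fallback;
        }
        for (String part : contentType.split(";")) {
            String p = part.trim();
            if (p.toLowerCase(Locale.ROOT).startsWith("charset=")) {
                String name = p.substring("charset=".length()).trim().replace("\"", "");
                try {
                    return Charset.forName(name);
                } catch (IllegalArgumentException e) {
                    return fallback;                // malformed or unsupported charset name
                }
            }
        }
        return fallback;
    }

    /** In-memory stream, standing in for directly posted content. */
    public static SimpleContentStream of(final String contentType, final byte[] data) {
        return new SimpleContentStream() {
            public String getContentType() { return contentType; }
            public InputStream getStream() { return new ByteArrayInputStream(data); }
        };
    }

    /** Reads the whole stream through getReader(). */
    public static String readAll(SimpleContentStream cs) throws IOException {
        StringBuilder sb = new StringBuilder();
        try (Reader r = cs.getReader()) {
            int c;
            while ((c = r.read()) != -1) {
                sb.append((char) c);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        byte[] latin1 = "caf\u00e9".getBytes(StandardCharsets.ISO_8859_1);
        // Charset taken from the content type, so the Latin-1 byte decodes correctly.
        System.out.println(readAll(of("text/csv; charset=ISO-8859-1", latin1)));
        // No charset in the content type (as with content/unknown above): fallback is used.
        System.out.println(charsetFrom("content/unknown", StandardCharsets.UTF_8));
    }
}
```

With something like this, callers would only ever ask for a Reader, and an override param such as (a) would only need to feed a corrected content type into the same charset lookup.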
