: I was looking at the "Content Streams" page in the ref-guide recently,
: and reviewing some of the warnings about how dangerous enabling
: "remote-streaming" can be. [1]
:
: I've seen those warnings a good bit in the past, but it dawned on me
: this time that I don't even really know what "remote streaming" and
: its associated parameters (e.g. "stream.body", "stream.url",
: "stream.file") are actually used for.

stream.body isn't very dangerous. It was originally added to make things
slightly easier on client applications whose HTTP library didn't make it
easy to POST raw data (or multi-part).

I think the only reason stream.body requires an ENV var to enable dates
from back when some people might expose Solr to any public GET request
(blocking external POST requests via firewall) and didn't want an
external client to send 'stream.body=<delete>...' ... we have much better
ways to do that now with authn/authz plugins and filtering on UPDATE vs
SEARCH.

stream.file was really just a "nice to have" back in the day when you
might want to have some ETL tool dump a big data file onto your Solr
node's local disk and then index it w/o any network overhead ... but a
security headache for sure.

stream.url was likewise a "nice to have" way of making Solr fetch data
directly from some repository ... but definitely sketchy from a security
standpoint.

: Are there really critical use-cases that these params enable? How are
: folks using them? (Or are they perhaps not used all that much
: anymore?)

I don't know that there were ever "critical" use-cases for any of them,
certainly not for the "remote" streams.

stream.body is probably not used much anymore -- I doubt there are many
Solr users that can't figure out how to send a POST request.

-Hoss
http://www.lucidworks.com/
