: I was looking at the "Content Streams" page in the ref-guide recently,
: and reviewing some of the warnings about how dangerous enabling
: "remote-streaming" can be. [1]
: 
: I've seen those warnings a good bit in the past, but it dawned on me
: this time that I don't even really know what "remote streaming" and
: its associated parameters (e.g. "stream.body", "stream.url",
: "stream.file") are actually used for.

stream.body isn't very dangerous; it was originally added to make things 
slightly easier for client applications whose HTTP library didn't make 
it easy to POST raw data (or multi-part).
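
To make that concrete, here is a rough sketch (not taken from the
ref-guide) of the same update command sent both ways.  The host and the
"mycollection" name are placeholders I made up for illustration, and the
code only builds the requests; it doesn't send anything.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Hypothetical Solr location and collection name, for illustration only.
SOLR_UPDATE_URL = "http://localhost:8983/solr/mycollection/update"


def stream_body_url(command: str) -> str:
    """Build a URL that carries the update command in the stream.body param."""
    query = urlencode({"stream.body": command, "commit": "true"})
    return f"{SOLR_UPDATE_URL}?{query}"


def post_request_parts(command: str):
    """The equivalent plain POST: the same command sent as the raw body."""
    headers = {"Content-Type": "text/xml; charset=utf-8"}
    return f"{SOLR_UPDATE_URL}?commit=true", headers, command.encode("utf-8")


doc = '<add><doc><field name="id">1</field></doc></add>'
url = stream_body_url(doc)
post_url, post_headers, post_body = post_request_parts(doc)
print(url)
```

Either form hands Solr the same payload; stream.body just moves it from
the request body into a request parameter.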

I think the only reason stream.body requires an ENV var to enable dates 
back to when some people would expose Solr to any public GET request 
(blocking external POST requests via firewall) and didn't want an 
external client to send 'stream.body=<delete>...' ... we have much better 
ways to prevent that now, with authn/authz plugins and filtering on 
UPDATE vs SEARCH.
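
For anyone who hasn't seen that old problem, this is roughly what it
looked like.  The sketch below is not how Solr or its authz plugins
work; it's just a hypothetical check that a front-end proxy might apply
to spot a GET request smuggling an update in via a stream.* param.  The
URLs and collection name are made up.

```python
from urllib.parse import urlsplit, parse_qsl


def has_stream_params(url: str) -> bool:
    """Return True if the query string carries any stream.* parameter."""
    query = urlsplit(url).query
    pairs = parse_qsl(query, keep_blank_values=True)
    return any(name.startswith("stream.") for name, _ in pairs)


# An ordinary search request (hypothetical host and collection).
search = "http://localhost:8983/solr/mycollection/select?q=*:*"

# A GET request that carries a delete-everything command in stream.body.
sneaky = (
    "http://localhost:8983/solr/mycollection/update"
    "?stream.body=%3Cdelete%3E%3Cquery%3E*:*%3C/query%3E%3C/delete%3E"
    "&commit=true"
)

for candidate in (search, sneaky):
    print(has_stream_params(candidate), candidate)
```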

stream.file was really just a "nice to have" back in the day when you 
might want to have some ETL tool dump a big data file onto your Solr 
node's local disk and then index it w/o any network overhead ... but a 
security headache for sure.

stream.url was likewise a "nice to have" way of making Solr fetch data 
directly from some repository ... but definitely sketchy from a security 
standpoint.
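
As a rough illustration of the two "remote" params, the sketch below
builds the kind of request each one implies.  The file path, repository
URL, host and collection name are all invented, and it assumes the
content type is passed via stream.contentType; nothing is actually sent.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Hypothetical Solr location and collection name, for illustration only.
SOLR_UPDATE_URL = "http://localhost:8983/solr/mycollection/update"


def remote_stream_url(param: str, location: str, content_type: str) -> str:
    """Build an update URL asking Solr to read its input from `location`.

    `param` is "stream.file" (a path on the Solr node's own disk) or
    "stream.url" (a URL that Solr itself fetches).
    """
    if param not in ("stream.file", "stream.url"):
        raise ValueError(f"not a remote stream parameter: {param}")
    query = urlencode({
        param: location,
        "stream.contentType": content_type,
        "commit": "true",
    })
    return f"{SOLR_UPDATE_URL}?{query}"


# Made-up locations: a file dumped by an ETL job, and a remote repository.
file_request = remote_stream_url(
    "stream.file", "/data/export/products.csv", "text/csv;charset=utf-8")
url_request = remote_stream_url(
    "stream.url", "http://repo.example.com/products.csv", "text/csv;charset=utf-8")
print(file_request)
print(url_request)
```

In both cases whoever can send the request gets to pick what Solr reads,
from its own disk or from the network, which is where the security
headache comes from.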


: Are there really critical use-cases that these params enable?  How are
: folks using them?  (Or are they perhaps not used all that much
: anymore?)

I don't know that there were ever "critical" use-cases for any of them,
certainly not for the "remote" streams.  stream.body is probably not used 
much anymore -- I doubt there are many Solr users that can't figure out 
how to send a POST request.





-Hoss
http://www.lucidworks.com/
