tmgstevens commented on PR #351:
URL: https://github.com/apache/flume/pull/351#issuecomment-1300061047

   So a couple of points in here:
   
   > It packages all the artifacts whether they are required or not.
   
   True - packaging is something we need to think about going forward. For 
people who want a lightweight deployment, we could offer different profiles. 
The flip side is that if you're combining two components from different modules 
(e.g. syslog and Kafka, or HTTP and HDFS), do you actually get any benefit from 
the modularity, or does it just ramp up your complexity? (Complexity is an 
adoption blocker in my mind.)
   
   > Unless I am mistaken, it is getting the distribution tar and using the 
configuration located within that. I fail to see how useful that will be as I 
would expect most users would have a custom configuration.
   
   It does bundle the default conf directory, but the expectation is that a 
user would re-map that directory or pass config in via environment variables 
(which could then include secrets). Both approaches are designed to work in 
Docker and Kubernetes. There's an example of doing that here: 
https://github.com/apache/flume/blob/d2bd7812dbacd86459726c0fd3dc774272ce0222/flume-ng-tests/src/test/java/org/apache/flume/test/util/DockerInstall.java#L137-L153
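
   For illustration only, here is a rough sketch of the two options (re-mapping 
the conf directory, or passing values as environment variables) expressed as a 
`docker run` command line. The image name, the container conf path and the 
variable name are all hypothetical placeholders, not what the PR actually uses; 
only the `-v`, `-e` and `--rm` flags are standard Docker CLI options.

```python
import shlex


def docker_run_command(image, conf_dir=None,
                       container_conf="/opt/flume/conf", env=None):
    """Build a `docker run` argv list.

    `image` and `container_conf` are hypothetical placeholders; check the
    real image for the path where it expects its conf directory.
    """
    argv = ["docker", "run", "--rm"]
    if conf_dir is not None:
        # Re-map the bundled conf directory with a read-only bind mount.
        argv += ["-v", f"{conf_dir}:{container_conf}:ro"]
    for key, value in sorted((env or {}).items()):
        # Pass configuration (possibly secrets) as environment variables.
        argv += ["-e", f"{key}={value}"]
    argv.append(image)
    return argv


cmd = docker_run_command(
    "example/flume:latest",              # hypothetical image name
    conf_dir="/etc/my-flume/conf",       # hypothetical host directory
    env={"KAFKA_PASSWORD": "changeme"},  # hypothetical variable name
)
print(shlex.join(cmd))
```

   In Kubernetes the same two ideas would typically map to a mounted ConfigMap 
and to env entries sourced from a Secret.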
   
   > I think starting from "everything goes in the bucket" to match our 
historic deployment shape is fine. I agree that eventually we need to get to a 
more modular approach that provides easy examples for folks building just the 
parts they need.
   
   +1
   
   > There's some tooling around for easier container image building based on 
spring boot applications, e.g. Jib. Maybe we should take an approach that 
leverages that?
   
   Personally I'd rather not rewrite the Docker deployment right now, given 
that what's there works pretty well. We could look to move away from the 
Spotify plugin to something else, but I don't want to re-architect the whole 
packaging of Flume at the moment.
   
   > To be clear, I use the FileChannel, which pretty much requires fast disk 
(i.e. SSDs or equivalent) to perform well. It also requires a dedicated "disk" 
so that data isn't lost on restart. This doesn't really work well with 
Docker/Kubernetes so we don't use it for Flume.
   
   I actually think this would be fine: you can have persistent disks in 
Kubernetes assigned to pods, and the same applies to Docker, where you can 
mount a volume. Would I necessarily recommend that you rewrite your deployment 
model to use containers? No. But in @busbey's world, for example, where he 
might be moving from a previously managed deployment to something that needs 
additional orchestration, using Docker or Kubernetes and deploying agents 
across many nodes could make things much easier to manage and maintain.
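
   As a rough sketch of the persistent-disk idea, assuming a FileChannel whose 
directories live on a mounted volume: the snippet below generates a Kubernetes 
PersistentVolumeClaim (as JSON, which kubectl accepts) and the matching 
FileChannel properties. The claim name, storage class, size, mount path and the 
agent/channel names are hypothetical; `checkpointDir` and `dataDirs` are the 
standard FileChannel settings.

```python
import json


def file_channel_pvc(name, size="10Gi", storage_class=None):
    """Return a PersistentVolumeClaim manifest as a plain dict."""
    spec = {
        "accessModes": ["ReadWriteOnce"],
        "resources": {"requests": {"storage": size}},
    }
    if storage_class is not None:
        # Hypothetical class name; pick one backed by fast (SSD) storage.
        spec["storageClassName"] = storage_class
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name},
        "spec": spec,
    }


def file_channel_properties(agent, channel, mount_path):
    """Return Flume properties that keep FileChannel state under mount_path."""
    prefix = f"{agent}.channels.{channel}"
    return [
        f"{prefix}.type = file",
        f"{prefix}.checkpointDir = {mount_path}/checkpoint",
        f"{prefix}.dataDirs = {mount_path}/data",
    ]


pvc = file_channel_pvc("flume-file-channel", storage_class="fast-ssd")
props = file_channel_properties("a1", "c1", "/var/lib/flume")
print(json.dumps(pvc, indent=2))
print("\n".join(props))
```

   The pod (or `docker run -v ...`) would then mount that volume at the same 
path, so the channel's checkpoint and data survive a container restart.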
   