sebbegg commented on PR #1351:
URL: https://github.com/apache/datafusion-ballista/pull/1351#issuecomment-3746283087

   > thanks @sebbegg,
   > 
   > just to clarify, we can have three configuration options:
   > 
   > * proxy not configured: the client needs to fetch data from the executors
   > * proxy configured, no IP address or port provided: the scheduler needs to start the proxy on the same port (within its own process)
   > * proxy configured, IP/port provided: the scheduler treats this as an external process running the proxy. It only puts that value in the response and does not start a proxy itself; the client uses that IP/port combination to connect to that process
   
   If I get this right, the last variant would mean we don't need this block, 
right?
   
   
https://github.com/sebbegg/datafusion-ballista/blob/5022263904c37d660bc77e3f5c065206b6720d20/ballista/scheduler/src/scheduler_process.rs#L202-L212
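   
   To make sure we mean the same thing, here is a rough sketch of how I read the three variants. All names (`FlightProxyMode`, `from_settings`, `scheduler_starts_proxy`, `advertised_endpoint`) are made up for illustration and are not existing Ballista code:

```rust
/// Hypothetical model of the three variants; not part of Ballista's API.
#[derive(Debug, Clone, PartialEq, Eq)]
enum FlightProxyMode {
    /// No proxy: clients fetch results directly from the executors.
    Disabled,
    /// Proxy runs inside the scheduler process, on the scheduler's own port.
    Embedded,
    /// Proxy runs as a separate process; the scheduler only advertises it.
    External { host: String, port: u16 },
}

impl FlightProxyMode {
    /// Build the mode from two assumed settings: whether the proxy is
    /// configured at all, and an optional externally managed address.
    fn from_settings(enabled: bool, external: Option<(String, u16)>) -> Self {
        match (enabled, external) {
            (false, _) => FlightProxyMode::Disabled,
            (true, None) => FlightProxyMode::Embedded,
            (true, Some((host, port))) => FlightProxyMode::External { host, port },
        }
    }

    /// Only the embedded variant requires the scheduler to start a proxy.
    fn scheduler_starts_proxy(&self) -> bool {
        matches!(self, FlightProxyMode::Embedded)
    }

    /// Address to put into the response, given the scheduler's own address.
    /// `None` means the client connects to the executors directly.
    fn advertised_endpoint(&self, scheduler_host: &str, scheduler_port: u16) -> Option<String> {
        match self {
            FlightProxyMode::Disabled => None,
            FlightProxyMode::Embedded => Some(format!("{scheduler_host}:{scheduler_port}")),
            FlightProxyMode::External { host, port } => Some(format!("{host}:{port}")),
        }
    }
}

fn main() {
    let mode = FlightProxyMode::from_settings(true, Some(("localhost".to_string(), 50040)));
    println!("{:?} -> {:?}", mode, mode.advertised_endpoint("localhost", 50050));
}
```

   In this reading, only the `Embedded` case would still need the start-up block linked above.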
   
   How would you then start this external process?
   I guess we could add another crate/binary at `ballista/flight-proxy`?
   
   Starting a cluster could then look like:
   
   * `./ballista-flight-proxy --bind-host localhost --bind-port 50040`
   * `./ballista-scheduler --advertise-flight-sql-endpoint localhost:50040`
   * `./ballista-executor --scheduler-host localhost --scheduler-port 50050`
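   
   The `--advertise-flight-sql-endpoint` flag above is only a proposal. If it took a single `host:port` value, the scheduler would have to split it somewhere; a minimal sketch, with `parse_endpoint` as a hypothetical helper name:

```rust
/// Split a `host:port` string as it might be passed to the proposed
/// (not yet existing) `--advertise-flight-sql-endpoint` flag.
fn parse_endpoint(value: &str) -> Result<(String, u16), String> {
    // Use the last ':' as the separator between host and port.
    let (host, port) = value
        .rsplit_once(':')
        .ok_or_else(|| format!("expected host:port, got '{value}'"))?;
    if host.is_empty() {
        return Err(format!("missing host in '{value}'"));
    }
    let port: u16 = port
        .parse()
        .map_err(|e| format!("invalid port '{port}': {e}"))?;
    Ok((host.to_string(), port))
}

fn main() {
    match parse_endpoint("localhost:50040") {
        Ok((host, port)) => println!("advertising {host}:{port}"),
        Err(err) => eprintln!("{err}"),
    }
}
```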
   
   I guess that's smart, because this way all services can run independently.
   
   As far as I can tell, all the scheduler state is in memory, right?
   So in this setup the external proxy could not, for example, check whether the requested data / executor host is actually alive and belongs to the cluster.
   On the other hand, it would make the proxy stateless, which is probably a good thing.
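   
   To illustrate the trade-off: an embedded proxy could consult the scheduler's in-memory view before forwarding, while a stateless external proxy has nothing to consult. A sketch under that assumption, where `ClusterView` and `may_forward` are hypothetical names and not Ballista types:

```rust
use std::collections::HashSet;

/// Hypothetical stand-in for the scheduler's in-memory executor registry.
struct ClusterView {
    executors: HashSet<(String, u16)>,
}

impl ClusterView {
    /// True if the host/port pair is registered with the cluster.
    fn is_known_executor(&self, host: &str, port: u16) -> bool {
        self.executors.contains(&(host.to_string(), port))
    }
}

/// Decide whether a fetch request may be forwarded to `host:port`.
/// With `Some(view)` (embedded proxy) the target must be a known executor;
/// with `None` (stateless external proxy) no check is possible.
fn may_forward(view: Option<&ClusterView>, host: &str, port: u16) -> bool {
    match view {
        Some(v) => v.is_known_executor(host, port),
        None => true,
    }
}

fn main() {
    let mut executors = HashSet::new();
    executors.insert(("executor-1".to_string(), 50051));
    let view = ClusterView { executors };

    println!("embedded: {}", may_forward(Some(&view), "unknown-host", 50051));
    println!("external: {}", may_forward(None, "unknown-host", 50051));
}
```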
   
   I wonder, though, whether it's worthwhile to add the possibility (and hence the complexity in the CLI and protobuf) of running the flight-proxy "embedded" in the scheduler?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
