alamb commented on a change in pull request #1924:
URL: https://github.com/apache/arrow-datafusion/pull/1924#discussion_r821100225



##########
File path: datafusion/src/execution/runtime_env.rs
##########
@@ -25,47 +25,110 @@ use crate::{
         memory_manager::{MemoryConsumerId, MemoryManager, MemoryManagerConfig},
     },
 };
-
+use std::fmt;
 use std::fmt::{Debug, Formatter};
+
+use crate::datasource::object_store::{ObjectStore, ObjectStoreRegistry};
+use crate::execution::context::{
+    SessionContextRegistry, TaskContext, TaskContextRegistry, BATCH_SIZE,
+    PARQUET_PRUNING, REPARTITION_AGGREGATIONS, REPARTITION_JOINS, 
REPARTITION_WINDOWS,
+    TARGET_PARTITIONS,
+};
+use crate::prelude::{SessionConfig, SessionContext};
+use datafusion_common::DataFusionError;
+use once_cell::sync::OnceCell;
+use parking_lot::Mutex;
+use std::path::PathBuf;
 use std::sync::Arc;
 
-#[derive(Clone)]
-/// Execution runtime environment. This structure is passed to the
-/// physical plans when they are run.
+/// Global singleton RuntimeEnv
+pub static RUNTIME_ENV: OnceCell<Arc<RuntimeEnv>> = OnceCell::new();

Review comment:
       Yes, I agree the alternative is having to pass through some sort of 
context (like the `ExecutionContext`) explicitly everywhere, which may make the 
code harder to read. However the upside of explicitly passing through any 
shared state is that there are no surprising behaviors
   
   > I think even we just use DataFusion as a lib, it is still a valid 
requirement to make the ExecutionPlan configuration aware. We might add more 
and more configuration options to enable/disable some feature during plan time 
and execution runtime(System like SparkSQL/Presto they have more than hundreds 
of configuration options).
   
   Yes I agree having `ExecutionPlan` be configuration aware is important. 
   
   The idea up to this point has been to explicitly copy / plumb through this 
state (via  `Arc<ExecutionContextState>`)
   
https://github.com/apache/arrow-datafusion/blob/7a2cbd5500c1e9447c7d71599eeccfdd5833cd4e/datafusion%2Fsrc%2Fexecution%2Fcontext.rs#L148
   
   And  the `RuntimeEnv` on the call to `ExecutionPlan::execute`
   
   
https://github.com/apache/arrow-datafusion/blob/33b9357139ad918da0f45a92db37f00ffa64b0ba/datafusion/src/physical_plan/mod.rs#L226-L230
   
   
   What would you think about adding configuration options to RuntimeEnv (or 
something else more general)?
   
https://github.com/apache/arrow-datafusion/blob/641338f726549c10c5bafee34537dc1e56cdec04/datafusion/src/execution/runtime_env.rs#L35-L42




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


Reply via email to