Re: [PR] feat: create builder for disk manager [datafusion]
xudong963 merged PR #16191:
URL: https://github.com/apache/datafusion/pull/16191
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
Re: [PR] feat: create builder for disk manager [datafusion]
zhuqi-lucas commented on code in PR #16191:
URL: https://github.com/apache/datafusion/pull/16191#discussion_r2113090658
##
datafusion/execution/src/disk_manager.rs:
##
@@ -32,7 +32,95 @@ use crate::memory_pool::human_readable_size;
const DEFAULT_MAX_TEMP_DIRECTORY_SIZE: u64 = 100 * 1024 * 1024 * 1024; // 100GB
+/// Builder pattern for the [DiskManager] structure
+#[derive(Clone, Debug)]
+pub struct DiskManagerBuilder {
+/// The storage mode of the disk manager
+mode: DiskManagerMode,
+/// The maximum amount of data (in bytes) stored inside the temporary directories.
+/// Default to 100GB
+max_temp_directory_size: u64,
+}
+
+impl Default for DiskManagerBuilder {
+fn default() -> Self {
+Self {
+mode: DiskManagerMode::OsTmpDirectory,
+max_temp_directory_size: DEFAULT_MAX_TEMP_DIRECTORY_SIZE,
+}
+}
+}
+
+impl DiskManagerBuilder {
+pub fn set_mode(&mut self, mode: DiskManagerMode) {
+self.mode = mode;
+}
+
+pub fn with_mode(mut self, mode: DiskManagerMode) -> Self {
+self.set_mode(mode);
+self
+}
+
+pub fn set_max_temp_directory_size(&mut self, value: u64) {
+self.max_temp_directory_size = value;
+}
+
+pub fn with_max_temp_directory_size(mut self, value: u64) -> Self {
+self.set_max_temp_directory_size(value);
+self
+}
+
+/// Create a DiskManager given the builder
+pub fn build(self) -> Result<DiskManager> {
+match self.mode {
+DiskManagerMode::OsTmpDirectory => Ok(DiskManager {
+local_dirs: Mutex::new(Some(vec![])),
+max_temp_directory_size: self.max_temp_directory_size,
+used_disk_space: Arc::new(AtomicU64::new(0)),
+}),
+DiskManagerMode::Directories(conf_dirs) => {
+let local_dirs = create_local_dirs(conf_dirs)?;
+debug!(
+"Created local dirs {local_dirs:?} as DataFusion working directory"
+);
+Ok(DiskManager {
+local_dirs: Mutex::new(Some(local_dirs)),
Review Comment:
Thanks, got it!
Re: [PR] feat: create builder for disk manager [datafusion]
alamb commented on code in PR #16191:
URL: https://github.com/apache/datafusion/pull/16191#discussion_r2112173381
##
datafusion/execution/src/disk_manager.rs:
##
@@ -91,6 +177,11 @@ pub struct DiskManager {
}
impl DiskManager {
+/// Creates a builder for [DiskManager]
+pub fn builder() -> DiskManagerBuilder {
+DiskManagerBuilder::default()
+}
+
/// Create a DiskManager given the configuration
pub fn try_new(config: DiskManagerConfig) -> Result<Arc<Self>> {
Review Comment:
thank you
Re: [PR] feat: create builder for disk manager [datafusion]
jdrouet commented on code in PR #16191:
URL: https://github.com/apache/datafusion/pull/16191#discussion_r2111797754
##
datafusion/execution/src/disk_manager.rs:
##
@@ -32,7 +32,95 @@ use crate::memory_pool::human_readable_size;
const DEFAULT_MAX_TEMP_DIRECTORY_SIZE: u64 = 100 * 1024 * 1024 * 1024; // 100GB
+/// Builder pattern for the [DiskManager] structure
+#[derive(Clone, Debug)]
+pub struct DiskManagerBuilder {
+/// The storage mode of the disk manager
+mode: DiskManagerMode,
+/// The maximum amount of data (in bytes) stored inside the temporary directories.
+/// Default to 100GB
+max_temp_directory_size: u64,
+}
+
+impl Default for DiskManagerBuilder {
+fn default() -> Self {
+Self {
+mode: DiskManagerMode::OsTmpDirectory,
+max_temp_directory_size: DEFAULT_MAX_TEMP_DIRECTORY_SIZE,
+}
+}
+}
+
+impl DiskManagerBuilder {
+pub fn set_mode(&mut self, mode: DiskManagerMode) {
+self.mode = mode;
+}
+
+pub fn with_mode(mut self, mode: DiskManagerMode) -> Self {
+self.set_mode(mode);
+self
+}
+
+pub fn set_max_temp_directory_size(&mut self, value: u64) {
+self.max_temp_directory_size = value;
+}
+
+pub fn with_max_temp_directory_size(mut self, value: u64) -> Self {
+self.set_max_temp_directory_size(value);
+self
+}
+
+/// Create a DiskManager given the builder
+pub fn build(self) -> Result<DiskManager> {
+match self.mode {
+DiskManagerMode::OsTmpDirectory => Ok(DiskManager {
+local_dirs: Mutex::new(Some(vec![])),
+max_temp_directory_size: self.max_temp_directory_size,
+used_disk_space: Arc::new(AtomicU64::new(0)),
+}),
+DiskManagerMode::Directories(conf_dirs) => {
+let local_dirs = create_local_dirs(conf_dirs)?;
+debug!(
+"Created local dirs {local_dirs:?} as DataFusion working directory"
+);
+Ok(DiskManager {
+local_dirs: Mutex::new(Some(local_dirs)),
Review Comment:
Actually, I just moved the existing logic into a builder. So right now we have the
same behavior as before, meaning that each dir has the same limit.
For more details, you'd have to go back to the `current_file_disk_usage`
computation.
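For readers following along: the diff shows a single `max_temp_directory_size` and a single `used_disk_space` counter on `DiskManager`, so one possible reading of "the same limit" is a budget shared by every directory. The sketch below assumes that reading. `SpillTracker` and `try_grow` are hypothetical names, and this is not the `current_file_disk_usage` computation, which is not shown in this thread.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::Arc;

/// Hypothetical tracker: one limit and one counter, regardless of how many
/// directories are configured. Not DataFusion's actual implementation.
pub struct SpillTracker {
    max_temp_directory_size: u64,
    used_disk_space: Arc<AtomicU64>,
}

impl SpillTracker {
    pub fn new(max_temp_directory_size: u64) -> Self {
        Self {
            max_temp_directory_size,
            used_disk_space: Arc::new(AtomicU64::new(0)),
        }
    }

    /// Records `bytes` of new temp-file data, or fails if the shared limit
    /// would be exceeded. The counter is left unchanged on failure.
    pub fn try_grow(&self, bytes: u64) -> Result<(), String> {
        let mut current = self.used_disk_space.load(Ordering::Relaxed);
        loop {
            let new_total = current
                .checked_add(bytes)
                .ok_or_else(|| "disk usage counter overflow".to_string())?;
            if new_total > self.max_temp_directory_size {
                return Err(format!(
                    "limit of {} bytes exceeded (requested total: {})",
                    self.max_temp_directory_size, new_total
                ));
            }
            // Retry if another thread updated the counter in the meantime.
            match self.used_disk_space.compare_exchange_weak(
                current,
                new_total,
                Ordering::Relaxed,
                Ordering::Relaxed,
            ) {
                Ok(_) => return Ok(()),
                Err(actual) => current = actual,
            }
        }
    }

    /// Bytes currently accounted for.
    pub fn used(&self) -> u64 {
        self.used_disk_space.load(Ordering::Relaxed)
    }
}
```

Under this assumption, a file written to any directory draws from the same budget; a per-directory limit would need one counter per directory instead.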
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
Re: [PR] feat: create builder for disk manager [datafusion]
jdrouet commented on code in PR #16191:
URL: https://github.com/apache/datafusion/pull/16191#discussion_r2111306335
##
datafusion/execution/src/disk_manager.rs:
##
@@ -32,6 +32,92 @@ use crate::memory_pool::human_readable_size;
const DEFAULT_MAX_TEMP_DIRECTORY_SIZE: u64 = 100 * 1024 * 1024 * 1024; // 100GB
+/// Builder pattern for the [DiskManager] structure
+#[derive(Clone, Debug)]
+pub struct DiskManagerBuilder {
+/// The storage mode of the disk manager
+mode: DiskManagerMode,
+/// The maximum amount of data (in bytes) stored inside the temporary directories.
+/// Default to 100GB
+max_temp_directory_size: u64,
+}
+
+impl Default for DiskManagerBuilder {
+fn default() -> Self {
+Self {
+mode: DiskManagerMode::OsTmpDirectory,
+max_temp_directory_size: DEFAULT_MAX_TEMP_DIRECTORY_SIZE,
+}
+}
+}
+
+impl DiskManagerBuilder {
+pub fn set_mode(&mut self, mode: DiskManagerMode) {
+self.mode = mode;
+}
+
+pub fn with_mode(mut self, mode: DiskManagerMode) -> Self {
+self.set_mode(mode);
+self
+}
+
+pub fn set_max_temp_directory_size(&mut self, value: u64) {
+self.max_temp_directory_size = value;
+}
+
+pub fn with_max_temp_directory_size(mut self, value: u64) -> Self {
+self.set_max_temp_directory_size(value);
+self
+}
+
+/// Create a DiskManager given the builder
+pub fn build(self) -> Result<DiskManager> {
+match self.mode {
+DiskManagerMode::OsTmpDirectory => Ok(DiskManager {
+local_dirs: Mutex::new(Some(vec![])),
+max_temp_directory_size: self.max_temp_directory_size,
+used_disk_space: Arc::new(AtomicU64::new(0)),
+}),
+DiskManagerMode::Directories(conf_dirs) => {
+let local_dirs = create_local_dirs(conf_dirs)?;
+debug!(
+"Created local dirs {local_dirs:?} as DataFusion working directory"
+);
+Ok(DiskManager {
+local_dirs: Mutex::new(Some(local_dirs)),
+max_temp_directory_size: self.max_temp_directory_size,
+used_disk_space: Arc::new(AtomicU64::new(0)),
+})
+}
+DiskManagerMode::Disabled => Ok(DiskManager {
+local_dirs: Mutex::new(None),
+max_temp_directory_size: self.max_temp_directory_size,
+used_disk_space: Arc::new(AtomicU64::new(0)),
+}),
+}
+}
+}
+
+#[derive(Clone, Debug)]
+pub enum DiskManagerMode {
+/// Create a new [DiskManager] that creates temporary files within
+/// a temporary directory chosen by the OS
+OsTmpDirectory,
+
+/// Create a new [DiskManager] that creates temporary files within
+/// the specified directories
Review Comment:
Addressed in
https://github.com/apache/datafusion/pull/16191/commits/fa4552a010c71526115e5e08dd1165da3d400351
Re: [PR] feat: create builder for disk manager [datafusion]
zhuqi-lucas commented on code in PR #16191:
URL: https://github.com/apache/datafusion/pull/16191#discussion_r2111412349
##
datafusion/execution/src/disk_manager.rs:
##
@@ -32,7 +32,95 @@ use crate::memory_pool::human_readable_size;
const DEFAULT_MAX_TEMP_DIRECTORY_SIZE: u64 = 100 * 1024 * 1024 * 1024; // 100GB
+/// Builder pattern for the [DiskManager] structure
+#[derive(Clone, Debug)]
+pub struct DiskManagerBuilder {
+/// The storage mode of the disk manager
+mode: DiskManagerMode,
+/// The maximum amount of data (in bytes) stored inside the temporary directories.
+/// Default to 100GB
+max_temp_directory_size: u64,
+}
+
+impl Default for DiskManagerBuilder {
+fn default() -> Self {
+Self {
+mode: DiskManagerMode::OsTmpDirectory,
+max_temp_directory_size: DEFAULT_MAX_TEMP_DIRECTORY_SIZE,
+}
+}
+}
+
+impl DiskManagerBuilder {
+pub fn set_mode(&mut self, mode: DiskManagerMode) {
+self.mode = mode;
+}
+
+pub fn with_mode(mut self, mode: DiskManagerMode) -> Self {
+self.set_mode(mode);
+self
+}
+
+pub fn set_max_temp_directory_size(&mut self, value: u64) {
+self.max_temp_directory_size = value;
+}
+
+pub fn with_max_temp_directory_size(mut self, value: u64) -> Self {
+self.set_max_temp_directory_size(value);
+self
+}
+
+/// Create a DiskManager given the builder
+pub fn build(self) -> Result<DiskManager> {
+match self.mode {
+DiskManagerMode::OsTmpDirectory => Ok(DiskManager {
+local_dirs: Mutex::new(Some(vec![])),
+max_temp_directory_size: self.max_temp_directory_size,
+used_disk_space: Arc::new(AtomicU64::new(0)),
+}),
+DiskManagerMode::Directories(conf_dirs) => {
+let local_dirs = create_local_dirs(conf_dirs)?;
+debug!(
+"Created local dirs {local_dirs:?} as DataFusion working directory"
+);
+Ok(DiskManager {
+local_dirs: Mutex::new(Some(local_dirs)),
Review Comment:
Thank you @jdrouet for the work, LGTM. One minor question: do we have a
per-directory max limit config when multiple directories are configured?
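To make the question concrete, here is a minimal, self-contained sketch of configuring several directories through the builder shown in the diff. The types are simplified stand-ins rather than the real DataFusion ones: `Vec<PathBuf>` as the payload of `Directories` is an assumption, the paths are hypothetical, and `build()` is left out so that no directories are created.

```rust
use std::path::PathBuf;

/// Simplified stand-in for the `DiskManagerMode` enum in the diff.
#[derive(Clone, Debug, PartialEq)]
pub enum DiskManagerMode {
    OsTmpDirectory,
    Directories(Vec<PathBuf>), // payload type is an assumption
    Disabled,
}

/// Simplified stand-in for `DiskManagerBuilder`.
#[derive(Clone, Debug)]
pub struct DiskManagerBuilder {
    mode: DiskManagerMode,
    max_temp_directory_size: u64,
}

impl Default for DiskManagerBuilder {
    fn default() -> Self {
        Self {
            mode: DiskManagerMode::OsTmpDirectory,
            max_temp_directory_size: 100 * 1024 * 1024 * 1024, // 100GB
        }
    }
}

impl DiskManagerBuilder {
    pub fn with_mode(mut self, mode: DiskManagerMode) -> Self {
        self.mode = mode;
        self
    }

    pub fn with_max_temp_directory_size(mut self, value: u64) -> Self {
        self.max_temp_directory_size = value;
        self
    }
}

/// Two spill directories, one size setting: the builder exposes a single
/// `max_temp_directory_size`, not one value per directory.
pub fn two_dir_builder() -> DiskManagerBuilder {
    DiskManagerBuilder::default()
        .with_mode(DiskManagerMode::Directories(vec![
            PathBuf::from("/tmp/spill_a"), // hypothetical path
            PathBuf::from("/tmp/spill_b"), // hypothetical path
        ]))
        .with_max_temp_directory_size(10 * 1024 * 1024 * 1024)
}
```

As the sketch suggests, the builder surface in this PR takes one size value no matter how many directories are passed.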
Re: [PR] feat: create builder for disk manager [datafusion]
jdrouet commented on code in PR #16191:
URL: https://github.com/apache/datafusion/pull/16191#discussion_r2111307101
##
datafusion/execution/src/disk_manager.rs:
##
@@ -91,6 +177,11 @@ pub struct DiskManager {
}
impl DiskManager {
+/// Creates a builder for [DiskManager]
+pub fn builder() -> DiskManagerBuilder {
+DiskManagerBuilder::default()
+}
+
/// Create a DiskManager given the configuration
pub fn try_new(config: DiskManagerConfig) -> Result<Arc<Self>> {
Review Comment:
Addressed in
https://github.com/apache/datafusion/pull/16191/commits/9baa5c2e8cbd8dbdc903a258716034104ba23d33
Re: [PR] feat: create builder for disk manager [datafusion]
2010YOUY01 commented on code in PR #16191:
URL: https://github.com/apache/datafusion/pull/16191#discussion_r2110920091
##
datafusion/execution/src/disk_manager.rs:
##
@@ -32,6 +32,92 @@ use crate::memory_pool::human_readable_size;
const DEFAULT_MAX_TEMP_DIRECTORY_SIZE: u64 = 100 * 1024 * 1024 * 1024; // 100GB
+/// Builder pattern for the [DiskManager] structure
+#[derive(Clone, Debug)]
+pub struct DiskManagerBuilder {
+/// The storage mode of the disk manager
+mode: DiskManagerMode,
+/// The maximum amount of data (in bytes) stored inside the temporary directories.
+/// Default to 100GB
+max_temp_directory_size: u64,
+}
+
+impl Default for DiskManagerBuilder {
+fn default() -> Self {
+Self {
+mode: DiskManagerMode::OsTmpDirectory,
+max_temp_directory_size: DEFAULT_MAX_TEMP_DIRECTORY_SIZE,
+}
+}
+}
+
+impl DiskManagerBuilder {
+pub fn set_mode(&mut self, mode: DiskManagerMode) {
+self.mode = mode;
+}
+
+pub fn with_mode(mut self, mode: DiskManagerMode) -> Self {
+self.set_mode(mode);
+self
+}
+
+pub fn set_max_temp_directory_size(&mut self, value: u64) {
+self.max_temp_directory_size = value;
+}
+
+pub fn with_max_temp_directory_size(mut self, value: u64) -> Self {
+self.set_max_temp_directory_size(value);
+self
+}
+
+/// Create a DiskManager given the builder
+pub fn build(self) -> Result<DiskManager> {
+match self.mode {
+DiskManagerMode::OsTmpDirectory => Ok(DiskManager {
+local_dirs: Mutex::new(Some(vec![])),
+max_temp_directory_size: self.max_temp_directory_size,
+used_disk_space: Arc::new(AtomicU64::new(0)),
+}),
+DiskManagerMode::Directories(conf_dirs) => {
+let local_dirs = create_local_dirs(conf_dirs)?;
+debug!(
+"Created local dirs {local_dirs:?} as DataFusion working directory"
+);
+Ok(DiskManager {
+local_dirs: Mutex::new(Some(local_dirs)),
+max_temp_directory_size: self.max_temp_directory_size,
+used_disk_space: Arc::new(AtomicU64::new(0)),
+})
+}
+DiskManagerMode::Disabled => Ok(DiskManager {
+local_dirs: Mutex::new(None),
+max_temp_directory_size: self.max_temp_directory_size,
+used_disk_space: Arc::new(AtomicU64::new(0)),
+}),
+}
+}
+}
+
+#[derive(Clone, Debug)]
+pub enum DiskManagerMode {
+/// Create a new [DiskManager] that creates temporary files within
+/// a temporary directory chosen by the OS
+OsTmpDirectory,
+
+/// Create a new [DiskManager] that creates temporary files within
+/// the specified directories
Review Comment:
```suggestion
/// the specified directories. One of the directories will be chosen
/// at random for each temporary file created.
```
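A rough illustration of the behavior the suggested doc comment describes, picking one configured directory at random for each new temporary file. `pick_dir` is a hypothetical helper, not a DataFusion function, and the standard library's `RandomState` is used as the randomness source only to keep the sketch free of external crates.

```rust
use std::collections::hash_map::RandomState;
use std::hash::{BuildHasher, Hasher};
use std::path::{Path, PathBuf};

/// Picks one directory pseudo-randomly; returns `None` for an empty list.
/// Hypothetical helper for illustration only.
pub fn pick_dir(dirs: &[PathBuf]) -> Option<&Path> {
    if dirs.is_empty() {
        return None;
    }
    // Each `RandomState` carries randomly seeded keys, so the hash of an
    // empty input differs from one call to the next.
    let seed = RandomState::new().build_hasher().finish();
    let idx = (seed % dirs.len() as u64) as usize;
    Some(dirs[idx].as_path())
}
```

Calling `pick_dir` once per temporary file spreads files across the configured directories without keeping any state between calls.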
Re: [PR] feat: create builder for disk manager [datafusion]
alamb commented on code in PR #16191:
URL: https://github.com/apache/datafusion/pull/16191#discussion_r2109919623
##
datafusion/execution/src/disk_manager.rs:
##
@@ -91,6 +177,11 @@ pub struct DiskManager {
}
impl DiskManager {
+/// Creates a builder for [DiskManager]
+pub fn builder() -> DiskManagerBuilder {
+DiskManagerBuilder::default()
+}
+
/// Create a DiskManager given the configuration
pub fn try_new(config: DiskManagerConfig) -> Result<Arc<Self>> {
Review Comment:
What do you think about deprecating `DiskManagerConfig` and
`DiskManager::try_new` so we have a path to a single way to configure
DiskManagers?
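One way such a deprecation could look, sketched with Rust's built-in `#[deprecated]` attribute. The types are stripped-down stand-ins, the `try_new` signature is simplified to take a size instead of a `DiskManagerConfig`, and the `since` version and note text are placeholders rather than values from this PR.

```rust
/// Stand-in for the real `DiskManager`; most fields are omitted.
#[derive(Debug, Default, PartialEq)]
pub struct DiskManager {
    max_temp_directory_size: u64,
}

/// Stand-in for the builder introduced in this PR.
#[derive(Debug, Default)]
pub struct DiskManagerBuilder {
    max_temp_directory_size: u64,
}

impl DiskManagerBuilder {
    pub fn with_max_temp_directory_size(mut self, value: u64) -> Self {
        self.max_temp_directory_size = value;
        self
    }

    pub fn build(self) -> Result<DiskManager, String> {
        Ok(DiskManager {
            max_temp_directory_size: self.max_temp_directory_size,
        })
    }
}

impl DiskManager {
    pub fn builder() -> DiskManagerBuilder {
        DiskManagerBuilder::default()
    }

    /// Old entry point, kept as a thin wrapper over the builder.
    // The version string and note are placeholders, not taken from the PR.
    #[deprecated(since = "0.0.0", note = "use `DiskManager::builder()` instead")]
    pub fn try_new(max_temp_directory_size: u64) -> Result<DiskManager, String> {
        Self::builder()
            .with_max_temp_directory_size(max_temp_directory_size)
            .build()
    }
}

/// Callers that still need the old path can silence the warning locally.
#[allow(deprecated)]
pub fn legacy_construct(size: u64) -> Result<DiskManager, String> {
    DiskManager::try_new(size)
}
```

Delegating the old function to the builder keeps a single construction path while existing callers get a compiler warning instead of a breaking change.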
Re: [PR] feat: create builder for disk manager [datafusion]
jdrouet commented on PR #16191:
URL: https://github.com/apache/datafusion/pull/16191#issuecomment-2909600032
Correct me if I'm wrong, but the failing test doesn't seem related.
Re: [PR] feat: create builder for disk manager [datafusion]
jdrouet closed pull request #15975: feat: create builder for disk manager
URL: https://github.com/apache/datafusion/pull/15975
