Re: [PR] Add TracingStore wrapper implementation [arrow-rs-object-store]
tustvold closed pull request #432: Add TracingStore wrapper implementation URL: https://github.com/apache/arrow-rs-object-store/pull/432 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
Re: [PR] Add TracingStore wrapper implementation [arrow-rs-object-store]
asubiotto commented on PR #432: URL: https://github.com/apache/arrow-rs-object-store/pull/432#issuecomment-3612395250 @m09526, just want to say that I appreciate the work you put into this PR. However, I still think the right way to achieve the goals this PR wants is to move `instrumented-object-store` into `object_store` since this is used in production by multiple users already (Polar Signals and Datadog AFAIK), so I propose to close this PR and open a new one implementing this move. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
Re: [PR] Add TracingStore wrapper implementation [arrow-rs-object-store]
alamb commented on PR #432: URL: https://github.com/apache/arrow-rs-object-store/pull/432#issuecomment-3611962244 What shall we do with this PR? I am getting ready to make an object store 0.13.0 release -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
Re: [PR] Add TracingStore wrapper implementation [arrow-rs-object-store]
geoffreyclaude commented on PR #432: URL: https://github.com/apache/arrow-rs-object-store/pull/432#issuecomment-3180215176 > the plain versions of `get` and `put` are just wrappers around those functions. Not necessarily: you can very well have a particular implementation that overrides the Trait's `get` and `put` defaults without calling `get_opts` or `put_opts`. In which case it won't get traced? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
Re: [PR] Add TracingStore wrapper implementation [arrow-rs-object-store]
m09526 commented on PR #432: URL: https://github.com/apache/arrow-rs-object-store/pull/432#issuecomment-3178878858 The original implementation of this code which was based on the `log` crate has been tested in our code for nearly a year now, so we're confident it works as expected. > Moreover, from a quick look at this PR it seems to be missing functionality of `instrumented-object-store`, e.g. tracing of operation results, and the `get` and `put` calls? We deliberately only instrumented the non-provided trait functions since the plain versions of `get` and `put` are just wrappers around those functions. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
Re: [PR] Add TracingStore wrapper implementation [arrow-rs-object-store]
geoffreyclaude commented on PR #432: URL: https://github.com/apache/arrow-rs-object-store/pull/432#issuecomment-3160279122 > I think this looks good to me -- thank you @m09526 > > > @geoffreyclaude and @asubiotto are you happy with this implementation as well? @alamb I agree with @asubiotto that moving `instrumented-object-store` as is here would probably be simpler, if only because it is already validated in (our) production setup. Then if there are changes to do for better user experience or performance, we can iterate on the code. Moreover, from a quick look at this PR it seems to be missing functionality of `instrumented-object-store`, e.g. tracing of operation results, and the `get` and `put` calls? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
Re: [PR] Add TracingStore wrapper implementation [arrow-rs-object-store]
asubiotto commented on PR #432: URL: https://github.com/apache/arrow-rs-object-store/pull/432#issuecomment-3133759594 AFAIU this PR reimplements functionality that already exists in `instrumented-object-store` (minus some missing span tags and hardcoding the debug level). I personally would prefer to see `instrumented-object-store` move into `object_store` and out of `datafusion-contrib` rather than have two implementations of the same thing in my dependencies but I don't care that much. Happy with whatever everyone else is happy with. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
Re: [PR] Add TracingStore wrapper implementation [arrow-rs-object-store]
m09526 commented on PR #432: URL: https://github.com/apache/arrow-rs-object-store/pull/432#issuecomment-3083830965 How does that version look? I've made some modifications to avoid the use of single events and moved to using spans. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
Re: [PR] Add TracingStore wrapper implementation [arrow-rs-object-store]
m09526 commented on code in PR #432:
URL:
https://github.com/apache/arrow-rs-object-store/pull/432#discussion_r2213162540
##
src/trace.rs:
##
@@ -0,0 +1,578 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements. See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership. The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License. You may obtain a copy of the License at
+//
+// http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied. See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+//! An object store that traces calls to the wrapped implementation.
+use crate::{
+path::Path, GetOptions, GetRange, GetResult, ListResult, MultipartUpload,
ObjectMeta,
+ObjectStore, PutMultipartOptions, PutOptions, PutPayload, PutResult,
Result, UploadPart,
+};
+use async_trait::async_trait;
+use futures::stream::BoxStream;
+use tracing::debug;
+
+/// An [`ObjectStore`] wrapper that traces operations made to the wrapped
store.
+#[derive(Debug)]
+pub struct TracingStore {
+store: T,
+prefix: String,
+path_prefix: String,
+}
+
+impl TracingStore {
+/// Create a new tracing store by wrapping an inner store.
+#[must_use]
+pub fn new(inner: T, prefix: impl Into, path_prefix: impl
Into) -> Self {
+Self {
+store: inner,
+prefix: prefix.into(),
+path_prefix: path_prefix.into(),
+}
+}
+}
+
+impl std::fmt::Display for TracingStore {
+fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
+write!(
+f,
+"TracingStore \"{}\" path prefix: \"{}\" ({})",
+self.prefix, self.path_prefix, self.store
+)
+}
+}
+
+#[async_trait]
+impl ObjectStore for TracingStore {
+async fn get_opts(&self, location: &Path, options: GetOptions) ->
Result {
+if !options.head {
+match &options.range {
+Some(GetRange::Bounded(get_range)) => {
+let len = get_range
+.end
+.checked_sub(get_range.start)
+.expect("Get range length is negative");
+debug!(
+"{} get request for {}/{} byte range {} to {} = {}
bytes",
+self.prefix,
+self.path_prefix,
+location,
+get_range.start,
+get_range.end,
+len,
+);
+}
+Some(GetRange::Offset(start_pos)) => {
+debug!(
+"{} get request for {}/{} for byte {} to EOF",
+self.prefix, self.path_prefix, location, start_pos,
+);
+}
+Some(GetRange::Suffix(pos)) => {
+debug!(
+"{} get request for {}/{} for last {} bytes of object",
+self.prefix, self.path_prefix, location, pos,
+);
+}
+None => {
+debug!(
+"{} get request for {}/{} for complete file range",
+self.prefix, self.path_prefix, location
+);
+}
+}
+}
+self.store.get_opts(location, options).await
+}
+
+async fn head(&self, location: &Path) -> Result {
+debug!(
+"{} head request for {}/{}",
+self.prefix, self.path_prefix, location
+);
Review Comment:
I've made some changes to use spans instead.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
Re: [PR] Add TracingStore wrapper implementation [arrow-rs-object-store]
geoffreyclaude commented on PR #432: URL: https://github.com/apache/arrow-rs-object-store/pull/432#issuecomment-3061579918 > I'm happy with anything, but I feel like tracing is a better fit for instrumenting these kinds of requests and it's nice to have a canonical version. I would probably pull `instrumented-object-store` into this repo since it doesn't have anything to do with datafusion (I think?) and make it a feature of the object_store crate. Absolutely, that makes a lot of sense as well! Feel free to copy `instrumented-object-store` over here if that's what you prefer, and I'll stop publishing it once `datafusion` depends on an object-store version that incorporates it. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
Re: [PR] Add TracingStore wrapper implementation [arrow-rs-object-store]
asubiotto commented on PR #432: URL: https://github.com/apache/arrow-rs-object-store/pull/432#issuecomment-3061414713 I'm happy with anything, but I feel like tracing is a better for for instrumenting these kinds of requests and it's nice to have a canonical version. I would probably pull `instrumented-object-store` into this repo since it doesn't have anything to do with datafusion (I think) and make it a feature of the object_store crate. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
Re: [PR] Add TracingStore wrapper implementation [arrow-rs-object-store]
geoffreyclaude commented on PR #432: URL: https://github.com/apache/arrow-rs-object-store/pull/432#issuecomment-3061336967 > FWIW, https://github.com/datafusion-contrib/datafusion-tracing/tree/main/instrumented-object-store exists and seems to work well with our use of datafusion. Just wanted to point it out to reduce duplication of efforts. cc @geoffreyclaude 👋 Hi! `instrumented-object-store` uses the `tracing` crate to wrap oject-store operations in spans, and is published as a standalone crate so as not to pull in yet another dependency into `arrow-rs-object-store` or `datafusion`. If you want just basic debug logging, I'd say this PR is complimentary. All these wrappers are short and simple single file code anyways, so it's probably fine to have multiple different versions! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
Re: [PR] Add TracingStore wrapper implementation [arrow-rs-object-store]
asubiotto commented on PR #432: URL: https://github.com/apache/arrow-rs-object-store/pull/432#issuecomment-3061212713 FWIW, https://github.com/datafusion-contrib/datafusion-tracing/tree/main/instrumented-object-store exists and seems to work well with our use of datafusion. Just wanted to point it out to reduce duplication of efforts. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
Re: [PR] Add TracingStore wrapper implementation [arrow-rs-object-store]
mbrobbel commented on code in PR #432:
URL:
https://github.com/apache/arrow-rs-object-store/pull/432#discussion_r2198592833
##
src/trace.rs:
##
@@ -0,0 +1,578 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements. See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership. The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License. You may obtain a copy of the License at
+//
+// http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied. See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+//! An object store that traces calls to the wrapped implementation.
+use crate::{
+path::Path, GetOptions, GetRange, GetResult, ListResult, MultipartUpload,
ObjectMeta,
+ObjectStore, PutMultipartOptions, PutOptions, PutPayload, PutResult,
Result, UploadPart,
+};
+use async_trait::async_trait;
+use futures::stream::BoxStream;
+use tracing::debug;
+
+/// An [`ObjectStore`] wrapper that traces operations made to the wrapped
store.
+#[derive(Debug)]
+pub struct TracingStore {
+store: T,
+prefix: String,
+path_prefix: String,
+}
+
+impl TracingStore {
+/// Create a new tracing store by wrapping an inner store.
+#[must_use]
+pub fn new(inner: T, prefix: impl Into, path_prefix: impl
Into) -> Self {
+Self {
+store: inner,
+prefix: prefix.into(),
+path_prefix: path_prefix.into(),
+}
+}
+}
+
+impl std::fmt::Display for TracingStore {
+fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
+write!(
+f,
+"TracingStore \"{}\" path prefix: \"{}\" ({})",
+self.prefix, self.path_prefix, self.store
+)
+}
+}
+
+#[async_trait]
+impl ObjectStore for TracingStore {
+async fn get_opts(&self, location: &Path, options: GetOptions) ->
Result {
+if !options.head {
+match &options.range {
+Some(GetRange::Bounded(get_range)) => {
+let len = get_range
+.end
+.checked_sub(get_range.start)
+.expect("Get range length is negative");
+debug!(
+"{} get request for {}/{} byte range {} to {} = {}
bytes",
+self.prefix,
+self.path_prefix,
+location,
+get_range.start,
+get_range.end,
+len,
+);
+}
+Some(GetRange::Offset(start_pos)) => {
+debug!(
+"{} get request for {}/{} for byte {} to EOF",
+self.prefix, self.path_prefix, location, start_pos,
+);
+}
+Some(GetRange::Suffix(pos)) => {
+debug!(
+"{} get request for {}/{} for last {} bytes of object",
+self.prefix, self.path_prefix, location, pos,
+);
+}
+None => {
+debug!(
+"{} get request for {}/{} for complete file range",
+self.prefix, self.path_prefix, location
+);
+}
+}
+}
+self.store.get_opts(location, options).await
+}
+
+async fn head(&self, location: &Path) -> Result {
+debug!(
+"{} head request for {}/{}",
+self.prefix, self.path_prefix, location
+);
Review Comment:
Instead of debug level events, maybe we should use spans?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
Re: [PR] Add TracingStore wrapper implementation [arrow-rs-object-store]
m09526 commented on code in PR #432: URL: https://github.com/apache/arrow-rs-object-store/pull/432#discussion_r2197143903 ## Cargo.toml: ## @@ -37,6 +37,7 @@ futures = "0.3" http = "1.2.0" humantime = "2.1" itertools = "0.14.0" +log = "0.4.27" Review Comment: Thanks for the suggestion! The PR has been updated to reflect this. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
Re: [PR] Add TracingStore wrapper implementation [arrow-rs-object-store]
m09526 commented on code in PR #432: URL: https://github.com/apache/arrow-rs-object-store/pull/432#discussion_r2197143903 ## Cargo.toml: ## @@ -37,6 +37,7 @@ futures = "0.3" http = "1.2.0" humantime = "2.1" itertools = "0.14.0" +log = "0.4.27" Review Comment: Thanks for the suggestion, The PR has been updated to reflect this. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
