Re: [PR] Make Parquet SBBF serialize/deserialize helpers public for external reuse [arrow-rs]
alamb merged PR #8762: URL: https://github.com/apache/arrow-rs/pull/8762 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
Re: [PR] Make Parquet SBBF serialize/deserialize helpers public for external reuse [arrow-rs]
alamb commented on PR #8762: URL: https://github.com/apache/arrow-rs/pull/8762#issuecomment-3667006113 since @etseidl has already approved this too I'll just merge it in Thanks again!
Re: [PR] Make Parquet SBBF serialize/deserialize helpers public for external reuse [arrow-rs]
ODukhno commented on code in PR #8762:
URL: https://github.com/apache/arrow-rs/pull/8762#discussion_r2621241936
##
parquet/src/bloom_filter/mod.rs:
##
@@ -541,4 +599,49 @@ mod tests {
assert_eq!(*num_bits, num_of_bits_from_ndv_fpp(*ndv, *fpp) as u64);
}
}
+
+    #[test]
+    fn test_sbbf_write_round_trip() {
+        // Create a bloom filter with a 32-byte bitset (minimum size)
+        let bitset_bytes = vec![0u8; 32];
+        let mut original = Sbbf::new(&bitset_bytes);
+
+        // Insert some test values
+        let test_values = ["hello", "world", "rust", "parquet", "bloom", "filter"];
+        for value in &test_values {
+            original.insert(value);
+        }
+
+        // Serialize to bytes
+        let mut output = Vec::new();
+        original.write(&mut output).unwrap();
+
+        // Validate header was written correctly
+        let mut protocol = ThriftSliceInputProtocol::new(&output);
+        let header = BloomFilterHeader::read_thrift(&mut protocol).unwrap();
+        assert_eq!(header.num_bytes, bitset_bytes.len() as i32);
+        assert_eq!(header.algorithm, BloomFilterAlgorithm::BLOCK);
+        assert_eq!(header.hash, BloomFilterHash::XXHASH);
+        assert_eq!(header.compression, BloomFilterCompression::UNCOMPRESSED);
+
+        // Deserialize using from_bytes
+        let reconstructed = Sbbf::from_bytes(&output).unwrap();
+
+        // Most importantly: verify the bloom filter WORKS correctly after round-trip
+        for value in &test_values {
+            assert!(
+                reconstructed.check(value),
+                "Value '{}' should be present after round-trip",
+                value
+            );
+        }
+
+        // Verify false negative check (values not inserted should not be found)
+        let missing_values = ["missing", "absent", "nothere"];
+        for value in &missing_values {
+            // Note: bloom filters can have false positives, but should never have false negatives
+            // So we can't assert !check(), but we should verify inserted values are found
+            let _ = reconstructed.check(value); // Just exercise the code path
Review Comment:
Just pushed another change where I applied this one along with all other
suggestions above.
Thanks!
Re: [PR] Make Parquet SBBF serialize/deserialize helpers public for external reuse [arrow-rs]
etseidl commented on code in PR #8762:
URL: https://github.com/apache/arrow-rs/pull/8762#discussion_r2620481961
##
parquet/src/bloom_filter/mod.rs:
##
@@ -417,6 +427,54 @@ impl Sbbf {
pub(crate) fn estimated_memory_size(&self) -> usize {
    self.0.capacity() * std::mem::size_of::<Block>()
}
+
+/// reads a Sbff from thrift encoded bytes
Review Comment:
```suggestion
/// Reads a Sbff from Thrift encoded bytes
```
##
parquet/src/bloom_filter/mod.rs:
##
@@ -283,7 +292,8 @@ impl Sbbf {
/// Write the bloom filter data (header and then bitset) to the output. This doesn't
/// flush the writer in order to boost performance of bulk writing all blocks. Caller
/// must remember to flush the writer.
-pub(crate) fn write<W: Write>(&self, mut writer: W) -> Result<(), ParquetError> {
+/// This method usually is used in conjunction with from_bytes for serialization/deserialization.
Review Comment:
```suggestion
/// This method usually is used in conjunction with [`Self::from_bytes`] for serialization/deserialization.
```
##
parquet/src/bloom_filter/mod.rs:
##
@@ -215,9 +215,18 @@ pub(crate) fn chunk_read_bloom_filter_header_and_offset(
#[inline]
pub(crate) fn read_bloom_filter_header_and_length(
    buffer: Bytes,
+) -> Result<(BloomFilterHeader, u64), ParquetError> {
+    read_bloom_filter_header_and_length_from_bytes(buffer.as_ref())
+}
+
+/// given a byte slice, try to read out a bloom filter header and return both the header and
Review Comment:
```suggestion
/// Given a byte slice, try to read out a bloom filter header and return both the header and
```
##
parquet/src/bloom_filter/mod.rs:
##
@@ -541,4 +599,49 @@ mod tests {
assert_eq!(*num_bits, num_of_bits_from_ndv_fpp(*ndv, *fpp) as u64);
}
}
+
+    #[test]
+    fn test_sbbf_write_round_trip() {
+        // Create a bloom filter with a 32-byte bitset (minimum size)
+        let bitset_bytes = vec![0u8; 32];
+        let mut original = Sbbf::new(&bitset_bytes);
+
+        // Insert some test values
+        let test_values = ["hello", "world", "rust", "parquet", "bloom", "filter"];
+        for value in &test_values {
+            original.insert(value);
+        }
+
+        // Serialize to bytes
+        let mut output = Vec::new();
+        original.write(&mut output).unwrap();
+
+        // Validate header was written correctly
+        let mut protocol = ThriftSliceInputProtocol::new(&output);
+        let header = BloomFilterHeader::read_thrift(&mut protocol).unwrap();
+        assert_eq!(header.num_bytes, bitset_bytes.len() as i32);
+        assert_eq!(header.algorithm, BloomFilterAlgorithm::BLOCK);
+        assert_eq!(header.hash, BloomFilterHash::XXHASH);
+        assert_eq!(header.compression, BloomFilterCompression::UNCOMPRESSED);
+
+        // Deserialize using from_bytes
+        let reconstructed = Sbbf::from_bytes(&output).unwrap();
+
+        // Most importantly: verify the bloom filter WORKS correctly after round-trip
+        for value in &test_values {
+            assert!(
+                reconstructed.check(value),
+                "Value '{}' should be present after round-trip",
+                value
+            );
+        }
+
+        // Verify false negative check (values not inserted should not be found)
+        let missing_values = ["missing", "absent", "nothere"];
+        for value in &missing_values {
+            // Note: bloom filters can have false positives, but should never have false negatives
+            // So we can't assert !check(), but we should verify inserted values are found
+            let _ = reconstructed.check(value); // Just exercise the code path
Review Comment:
Since we can't verify the negative test, why not just move these comments up
and skip this loop?
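The asymmetry behind this suggestion is the one guarantee a bloom filter makes: an inserted value always checks positive (no false negatives), while an absent value may still collide and check positive. A minimal single-hash sketch illustrates why only the positive direction is assertable; the function names and the 64-slot mask here are invented for illustration and are not the parquet crate's split-block SBBF:

```rust
// Minimal single-hash filter over a 64-bit mask. Illustrative only:
// the real SBBF uses XXH64 over split blocks, not this layout.
fn slot(value: &str) -> u64 {
    // FNV-1a folded into one of 64 slots
    let mut h: u64 = 0xcbf29ce484222325;
    for b in value.bytes() {
        h = (h ^ b as u64).wrapping_mul(0x100000001b3);
    }
    1u64 << (h % 64)
}

fn insert(mask: &mut u64, value: &str) {
    *mask |= slot(value); // setting a bit is irreversible, so...
}

fn check(mask: u64, value: &str) -> bool {
    // ...an inserted value's bit is always still set (no false negatives),
    // but a hash collision can set an absent value's bit too (false positive).
    mask & slot(value) != 0
}
```

This is why the quoted test can assert `check` for every inserted value but, for the missing ones, can only exercise the call: with enough insertions, any absent value may legitimately return true.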
Re: [PR] Make Parquet SBBF serialize/deserialize helpers public for external reuse [arrow-rs]
ODukhno commented on code in PR #8762:
URL: https://github.com/apache/arrow-rs/pull/8762#discussion_r2611731221
##
parquet/src/bloom_filter/mod.rs:
##
@@ -283,7 +300,7 @@ impl Sbbf {
/// Write the bloom filter data (header and then bitset) to the output. This doesn't
/// flush the writer in order to boost performance of bulk writing all blocks. Caller
/// must remember to flush the writer.
-pub(crate) fn write<W: Write>(&self, mut writer: W) -> Result<(), ParquetError> {
+pub fn write<W: Write>(&self, mut writer: W) -> Result<(), ParquetError> {
Review Comment:
@alamb , please take another look. I took over this work as Rose will not be
able to work on this task anymore.
Thank you!
Re: [PR] Make Parquet SBBF serialize/deserialize helpers public for external reuse [arrow-rs]
ODukhno commented on code in PR #8762:
URL: https://github.com/apache/arrow-rs/pull/8762#discussion_r2611494866
##
parquet/src/bloom_filter/mod.rs:
##
@@ -283,7 +300,7 @@ impl Sbbf {
/// Write the bloom filter data (header and then bitset) to the output. This doesn't
/// flush the writer in order to boost performance of bulk writing all blocks. Caller
/// must remember to flush the writer.
-pub(crate) fn write<W: Write>(&self, mut writer: W) -> Result<(), ParquetError> {
+pub fn write<W: Write>(&self, mut writer: W) -> Result<(), ParquetError> {
Review Comment:
@alamb , are you ok if I initiate new pull request? Rose will not be able to
work on this change anymore.
Re: [PR] Make Parquet SBBF serialize/deserialize helpers public for external reuse [arrow-rs]
alamb commented on PR #8762: URL: https://github.com/apache/arrow-rs/pull/8762#issuecomment-3618701438 > @alamb Checking to see if you have any more comments/questions on this PR? Hi @RoseZhang123 -- I had a few suggestions on the API here: https://github.com/apache/arrow-rs/pull/8762#pullrequestreview-3537207043 If that doesn't make sense, let me know and I can try and propose some changes
Re: [PR] Make Parquet SBBF serialize/deserialize helpers public for external reuse [arrow-rs]
RoseZhang123 commented on PR #8762: URL: https://github.com/apache/arrow-rs/pull/8762#issuecomment-3618246955 @alamb Checking to see if you have any more comments/questions on this PR?
Re: [PR] Make Parquet SBBF serialize/deserialize helpers public for external reuse [arrow-rs]
alamb commented on code in PR #8762:
URL: https://github.com/apache/arrow-rs/pull/8762#discussion_r2586829463
##
parquet/src/bloom_filter/mod.rs:
##
@@ -283,7 +300,7 @@ impl Sbbf {
/// Write the bloom filter data (header and then bitset) to the output. This doesn't
/// flush the writer in order to boost performance of bulk writing all blocks. Caller
/// must remember to flush the writer.
-pub(crate) fn write<W: Write>(&self, mut writer: W) -> Result<(), ParquetError> {
+pub fn write<W: Write>(&self, mut writer: W) -> Result<(), ParquetError> {
Review Comment:
This method writes out the bytes using thrift encoding, which is basically
```text
(header)
(bitset)
```
We already have the code to read a `Sbbf` back from the thrift encoding here
https://github.com/apache/arrow-rs/blob/b65d20767c7510570cff0ab0154a268c29a3f712/parquet/src/bloom_filter/mod.rs#L409-L408
Rather than exposing `new()` and relying on the user having to pass in the
right bits (and relying on the bloom filter code not to change), what I think
we should do:
1. Introduce a `read()` method that is the inverse of `write()` --
specifically, something like
```rust
/// Reads a Sbbf from thrift encoded bytes
pub fn from_bytes(bytes: &[u8]) -> Result<Self> {
    ...
}
```
And then a round trip serialization would look like:
```rust
let mut serialized = Vec::new();
original.write(&mut serialized)?;
// read the serialized bytes back
let reconstructed = Sbbf::from_bytes(&serialized)?;
```
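The proposed write/from_bytes pair can be sketched end to end with a self-contained toy. Everything below (`ToyFilter`, the FNV-1a hash, the 4-byte little-endian length header) is an invented stand-in for illustration; the real `Sbbf` uses XXH64 over split blocks and a Thrift-encoded `BloomFilterHeader`:

```rust
// Toy bloom filter illustrating the write/from_bytes round trip.
// NOT the parquet SBBF: hash, bitset layout, and header are simplified stand-ins.
struct ToyFilter {
    bits: Vec<u8>,
}

impl ToyFilter {
    fn new(num_bytes: usize) -> Self {
        Self { bits: vec![0u8; num_bytes] }
    }

    // FNV-1a, a simple stand-in for the real XXH64 hash.
    fn hash(value: &str) -> u64 {
        let mut h: u64 = 0xcbf29ce484222325;
        for b in value.bytes() {
            h = (h ^ b as u64).wrapping_mul(0x100000001b3);
        }
        h
    }

    fn insert(&mut self, value: &str) {
        let bit = (Self::hash(value) as usize) % (self.bits.len() * 8);
        self.bits[bit / 8] |= 1 << (bit % 8);
    }

    fn check(&self, value: &str) -> bool {
        let bit = (Self::hash(value) as usize) % (self.bits.len() * 8);
        self.bits[bit / 8] & (1 << (bit % 8)) != 0
    }

    // "write": a 4-byte little-endian length header, then the bitset.
    fn write(&self, out: &mut Vec<u8>) {
        out.extend_from_slice(&(self.bits.len() as u32).to_le_bytes());
        out.extend_from_slice(&self.bits);
    }

    // "from_bytes": parse the header, then reconstruct the filter.
    fn from_bytes(bytes: &[u8]) -> Option<Self> {
        let len = u32::from_le_bytes(bytes.get(..4)?.try_into().ok()?) as usize;
        let bits = bytes.get(4..4 + len)?.to_vec();
        Some(Self { bits })
    }
}
```

A round trip then mirrors the snippet above: write into a `Vec<u8>`, reconstruct with `from_bytes`, and every inserted value still checks positive.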
##
parquet/src/bloom_filter/mod.rs:
##
@@ -292,6 +309,44 @@ impl Sbbf {
Ok(())
}
+/// Returns the raw bitset bytes encoded in little-endian order.
+pub fn as_slice(&self) -> Vec<u8> {
Review Comment:
typically methods named starting with `as_*` would not copy data
Re: [PR] Make Parquet SBBF serialize/deserialize helpers public for external reuse [arrow-rs]
RoseZhang123 commented on code in PR #8762:
URL: https://github.com/apache/arrow-rs/pull/8762#discussion_r2550843912
##
parquet/src/bloom_filter/mod.rs:
##
@@ -244,29 +244,8 @@ fn num_of_bits_from_ndv_fpp(ndv: u64, fpp: f64) -> usize {
}
impl Sbbf {
-    /// Create a new [Sbbf] with given number of distinct values and false positive probability.
-    /// Will return an error if `fpp` is greater than or equal to 1.0 or less than 0.0.
-    pub(crate) fn new_with_ndv_fpp(ndv: u64, fpp: f64) -> Result<Self> {
-        if !(0.0..1.0).contains(&fpp) {
-            return Err(ParquetError::General(format!(
-                "False positive probability must be between 0.0 and 1.0, got {fpp}"
-            )));
-        }
-        let num_bits = num_of_bits_from_ndv_fpp(ndv, fpp);
-        Ok(Self::new_with_num_of_bytes(num_bits / 8))
-    }
-
-    /// Create a new [Sbbf] with given number of bytes, the exact number of bytes will be adjusted
-    /// to the next power of two bounded by [BITSET_MIN_LENGTH] and [BITSET_MAX_LENGTH].
-    pub(crate) fn new_with_num_of_bytes(num_bytes: usize) -> Self {
-        let num_bytes = optimal_num_of_bytes(num_bytes);
-        assert_eq!(num_bytes % size_of::<Block>(), 0);
-        let num_blocks = num_bytes / size_of::<Block>();
-        let bitset = vec![Block::ZERO; num_blocks];
-        Self(bitset)
-    }
-
-    pub(crate) fn new(bitset: &[u8]) -> Self {
+    /// Create a new [Sbbf] from raw bitset bytes.
Review Comment:
Sounds good. Decided to add one more function to make things easier. Now
bitset_len() returns the size of the bloom filter’s bitset in bytes, and
as_slice() returns the bitset.
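For context on the sizing done by the removed `new_with_ndv_fpp` quoted above: the textbook bloom-filter bound gives m = -n * ln(p) / (ln 2)^2 bits for n distinct values at false positive probability p. The sketch below uses that formula plus the power-of-two rounding described in `new_with_num_of_bytes`; the crate's actual `num_of_bits_from_ndv_fpp` uses a split-block-specific derivation, so treat these constants and function names as approximate stand-ins:

```rust
// Textbook bloom-filter sizing: m = -n * ln(p) / (ln 2)^2 bits.
// The parquet SBBF derives its size from split-block behavior, so this
// is an approximation, not the crate's num_of_bits_from_ndv_fpp.
fn approx_num_bits(ndv: u64, fpp: f64) -> usize {
    assert!((0.0..1.0).contains(&fpp), "fpp must be in [0.0, 1.0)");
    let ln2 = std::f64::consts::LN_2;
    (-(ndv as f64) * fpp.ln() / (ln2 * ln2)).ceil() as usize
}

// Round bytes up to the next power of two, mirroring the quoted
// new_with_num_of_bytes behavior (min/max bounds omitted here).
fn next_power_of_two_bytes(num_bits: usize) -> usize {
    (num_bits / 8).max(1).next_power_of_two()
}
```

For example, 1000 distinct values at a 1% false positive rate needs roughly 9.6k bits, which rounds up to a 2048-byte bitset.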
Re: [PR] Make Parquet SBBF serialize/deserialize helpers public for external reuse [arrow-rs]
etseidl commented on code in PR #8762:
URL: https://github.com/apache/arrow-rs/pull/8762#discussion_r2527904585
##
parquet/src/bloom_filter/mod.rs:
##
@@ -244,29 +244,8 @@ fn num_of_bits_from_ndv_fpp(ndv: u64, fpp: f64) -> usize {
}
impl Sbbf {
-    /// Create a new [Sbbf] with given number of distinct values and false positive probability.
-    /// Will return an error if `fpp` is greater than or equal to 1.0 or less than 0.0.
-    pub(crate) fn new_with_ndv_fpp(ndv: u64, fpp: f64) -> Result<Self> {
-        if !(0.0..1.0).contains(&fpp) {
-            return Err(ParquetError::General(format!(
-                "False positive probability must be between 0.0 and 1.0, got {fpp}"
-            )));
-        }
-        let num_bits = num_of_bits_from_ndv_fpp(ndv, fpp);
-        Ok(Self::new_with_num_of_bytes(num_bits / 8))
-    }
-
-    /// Create a new [Sbbf] with given number of bytes, the exact number of bytes will be adjusted
-    /// to the next power of two bounded by [BITSET_MIN_LENGTH] and [BITSET_MAX_LENGTH].
-    pub(crate) fn new_with_num_of_bytes(num_bytes: usize) -> Self {
-        let num_bytes = optimal_num_of_bytes(num_bytes);
-        assert_eq!(num_bytes % size_of::<Block>(), 0);
-        let num_blocks = num_bytes / size_of::<Block>();
-        let bitset = vec![Block::ZERO; num_blocks];
-        Self(bitset)
-    }
-
-    pub(crate) fn new(bitset: &[u8]) -> Self {
+    /// Create a new [Sbbf] from raw bitset bytes.
Review Comment:
Would it be better to add a public function `as_slice` to return the bitset?
Reading the proposed example I'm not sure how one would get the header length
from the serialized `Sbbf`.
Re: [PR] Make Parquet SBBF serialize/deserialize helpers public for external reuse [arrow-rs]
RoseZhang123 commented on code in PR #8762:
URL: https://github.com/apache/arrow-rs/pull/8762#discussion_r2501261532
##
parquet/src/bloom_filter/mod.rs:
##
@@ -244,29 +244,8 @@ fn num_of_bits_from_ndv_fpp(ndv: u64, fpp: f64) -> usize {
}
impl Sbbf {
-    /// Create a new [Sbbf] with given number of distinct values and false positive probability.
-    /// Will return an error if `fpp` is greater than or equal to 1.0 or less than 0.0.
-    pub(crate) fn new_with_ndv_fpp(ndv: u64, fpp: f64) -> Result<Self> {
-        if !(0.0..1.0).contains(&fpp) {
-            return Err(ParquetError::General(format!(
-                "False positive probability must be between 0.0 and 1.0, got {fpp}"
-            )));
-        }
-        let num_bits = num_of_bits_from_ndv_fpp(ndv, fpp);
-        Ok(Self::new_with_num_of_bytes(num_bits / 8))
-    }
-
-    /// Create a new [Sbbf] with given number of bytes, the exact number of bytes will be adjusted
-    /// to the next power of two bounded by [BITSET_MIN_LENGTH] and [BITSET_MAX_LENGTH].
-    pub(crate) fn new_with_num_of_bytes(num_bytes: usize) -> Self {
-        let num_bytes = optimal_num_of_bytes(num_bytes);
-        assert_eq!(num_bytes % size_of::<Block>(), 0);
-        let num_blocks = num_bytes / size_of::<Block>();
-        let bitset = vec![Block::ZERO; num_blocks];
-        Self(bitset)
-    }
-
-    pub(crate) fn new(bitset: &[u8]) -> Self {
+    /// Create a new [Sbbf] from raw bitset bytes.
Review Comment:
Thanks! that makes sense. I just added a doc example, let me know if you
think anything else needs to be added/changed.
Re: [PR] Make Parquet SBBF serialize/deserialize helpers public for external reuse [arrow-rs]
alamb commented on code in PR #8762:
URL: https://github.com/apache/arrow-rs/pull/8762#discussion_r2482705647
##
parquet/src/bloom_filter/mod.rs:
##
@@ -244,29 +244,8 @@ fn num_of_bits_from_ndv_fpp(ndv: u64, fpp: f64) -> usize {
}
impl Sbbf {
-    /// Create a new [Sbbf] with given number of distinct values and false positive probability.
-    /// Will return an error if `fpp` is greater than or equal to 1.0 or less than 0.0.
-    pub(crate) fn new_with_ndv_fpp(ndv: u64, fpp: f64) -> Result<Self> {
-        if !(0.0..1.0).contains(&fpp) {
-            return Err(ParquetError::General(format!(
-                "False positive probability must be between 0.0 and 1.0, got {fpp}"
-            )));
-        }
-        let num_bits = num_of_bits_from_ndv_fpp(ndv, fpp);
-        Ok(Self::new_with_num_of_bytes(num_bits / 8))
-    }
-
-    /// Create a new [Sbbf] with given number of bytes, the exact number of bytes will be adjusted
-    /// to the next power of two bounded by [BITSET_MIN_LENGTH] and [BITSET_MAX_LENGTH].
-    pub(crate) fn new_with_num_of_bytes(num_bytes: usize) -> Self {
-        let num_bytes = optimal_num_of_bytes(num_bytes);
-        assert_eq!(num_bytes % size_of::<Block>(), 0);
-        let num_blocks = num_bytes / size_of::<Block>();
-        let bitset = vec![Block::ZERO; num_blocks];
-        Self(bitset)
-    }
-
-    pub(crate) fn new(bitset: &[u8]) -> Self {
+    /// Create a new [Sbbf] from raw bitset bytes.
Review Comment:
Can you please add a doc example here showing how to use this API with your
intended usecase?
That will both help the documentation and ensure we have exposed enough of
the API to be useful
For example, if the idea is to save a SBBF using `write` and then re-create
it again, an example that:
1. made a Sbbf (ideally read it from a file)
2. wrote it to a `Vec<>`
3. Created a new Sbbf from that vec
4. Show that it is the same as the original
it would be nice to mention that a correct argument for creating a Sbbf can
be created using `Sbbf::write`
