lukecwik commented on a change in pull request #12089:
URL: https://github.com/apache/beam/pull/12089#discussion_r450541065
##########
File path:
sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigtable/BigtableIO.java
##########
@@ -410,13 +417,22 @@ public Read withKeyRange(ByteKeyRange keyRange) {
*
* <p>Does not modify this object.
*/
- public Read withKeyRanges(List<ByteKeyRange> keyRanges) {
+ public Read withKeyRanges(ValueProvider<List<ByteKeyRange>> keyRanges) {
checkArgument(keyRanges != null, "keyRanges can not be null");
- checkArgument(!keyRanges.isEmpty(), "keyRanges can not be empty");
- for (ByteKeyRange range : keyRanges) {
- checkArgument(range != null, "keyRanges cannot hold null range");
- }
- return toBuilder().setKeyRanges(keyRanges).build();
+ BigtableReadOptions bigtableReadOptions = getBigtableReadOptions();
+ return toBuilder()
+
.setBigtableReadOptions(bigtableReadOptions.toBuilder().setKeyRanges(keyRanges).build())
+ .build();
+ }
+
+ /**
+ * Returns a new {@link BigtableIO.Read} that will read only rows in the
specified ranges.
+ * Ranges must not overlap.
Review comment:
nit: currently it seems like overlapping ranges lead to duplicate reads
for any keys in the overlapping area, it would be good if either:
1) overlapping ranges lead to an error
2) overlapping ranges are correctly fixed up to not produce duplicates
##########
File path: CHANGES.md
##########
@@ -63,6 +63,8 @@
is experimental. It reads data from BigQuery by exporting data to Avro
files, and reading those files. It also supports
reading data by exporting to JSON files. This has small differences in
behavior for Time and Date-related fields. See
Pydoc for more information.
+* New overloads for BigtableIO.Read.withKeyRange() and
BigtableIO.Read.withRowFilter()
Review comment:
This will need to go into 2.24 as 2.23 release branch has been cut. Just
add a section like 2.23 with empty sections.
##########
File path:
sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigtable/BigtableReadOptions.java
##########
@@ -0,0 +1,73 @@
+package org.apache.beam.sdk.io.gcp.bigtable;
+
+import static
org.apache.beam.vendor.guava.v26_0_jre.com.google.common.base.Preconditions.checkArgument;
+
+import com.google.auto.value.AutoValue;
+import com.google.bigtable.v2.RowFilter;
+import java.io.Serializable;
+import java.util.Collections;
+import java.util.List;
+import javax.annotation.Nullable;
+import org.apache.beam.sdk.io.range.ByteKeyRange;
+import org.apache.beam.sdk.options.ValueProvider;
+import org.apache.beam.sdk.transforms.display.DisplayData;
+
+/** Configuration for which values to read from Bigtable. */
+@AutoValue
+abstract class BigtableReadOptions implements Serializable {
+
+ /** Returns the Row filter to use. */
Review comment:
```suggestion
/** Returns the row filter to use. */
```
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]