[jira] [Assigned] (BEAM-4543) Remove dependency on googledatastore in favor of google-cloud-datastore.
[ https://issues.apache.org/jira/browse/BEAM-4543?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chamikara Jayalath reassigned BEAM-4543: Assignee: Udi Meiri (was: Valentyn Tymofieiev) > Remove dependency on googledatastore in favor of google-cloud-datastore. > > > Key: BEAM-4543 > URL: https://issues.apache.org/jira/browse/BEAM-4543 > Project: Beam > Issue Type: Improvement > Components: sdk-py-core >Reporter: Valentyn Tymofieiev >Assignee: Udi Meiri >Priority: Minor > > The apache-beam[gcp] package depends [1] on the googledatastore package [2]. We > should replace this dependency with google-cloud-datastore [3], which is > officially supported, has a better release cadence, and has Python 3 > support. > [1] > https://github.com/apache/beam/blob/fad655462f8fadfdfaab0b7a09cab538f076f94e/sdks/python/setup.py#L126 > [2] [https://pypi.org/project/googledatastore/] > [3] [https://pypi.org/project/google-cloud-datastore/] > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Comment Edited] (BEAM-5514) BigQueryIO doesn't handle quotaExceeded errors properly
[ https://issues.apache.org/jira/browse/BEAM-5514?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16644101#comment-16644101 ] Chamikara Jayalath edited comment on BEAM-5514 at 10/9/18 9:35 PM: --- Thanks. I believe HTTP 403 issues in general are considered non-retriable. So it makes sense for Dataflow not to retry requests at the client. In fact, the BigQuery support page provides the following instructions regarding HTTP 403 quotaExceeded errors. https://cloud.google.com/bigquery/troubleshooting-errors "View the {{message}} property of the error object for more information about which quota was exceeded. To reset or raise a BigQuery quota, [contact support|https://cloud.google.com/support]. To modify a custom quota, submit a request from the [Google Cloud Platform Console|https://console.cloud.google.com/iam-admin/quotas] page." So basically this is asking users to fix the issue (request a quota increase) before retrying. The issue is that, due to the architecture of Dataflow streaming jobs, even though we do not retry at the client, we do in fact retry all work items indefinitely. So we end up sending a large number of requests to BigQuery whenever a user hits quota errors. was (Author: chamikara): Kevin, Sounds like > BigQueryIO doesn't handle quotaExceeded errors properly > --- > > Key: BEAM-5514 > URL: https://issues.apache.org/jira/browse/BEAM-5514 > Project: Beam > Issue Type: Bug > Components: io-java-gcp >Reporter: Kevin Peterson >Assignee: Reuven Lax >Priority: Major > > When exceeding a streaming quota for BigQuery insertAll requests, BigQuery > returns a 403 with reason "quotaExceeded". > The current implementation of BigQueryIO does not consider this to be a rate > limited exception, and therefore does not perform exponential backoff > properly, leading to repeated calls to BQ. 
> The actual error is in the > [ApiErrorExtractor|https://github.com/GoogleCloudPlatform/bigdata-interop/blob/master/util/src/main/java/com/google/cloud/hadoop/util/ApiErrorExtractor.java#L263] > class, which is called from > [BigQueryServicesImpl|https://github.com/apache/beam/blob/master/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryServicesImpl.java#L739] > to determine how to retry the failure. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
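The thread above distinguishes 403 quotaExceeded (fix the quota first, then retry) from transient errors that deserve client-side exponential backoff, and notes that indefinite work-item retries flood BigQuery instead. The sketch below illustrates that distinction only; the class and method names (QuotaBackoff, nextBackoffMillis, isRetriable) are invented for illustration and this is not Beam's or Dataflow's actual retry code.

```java
// Illustrative sketch, not Beam's actual implementation: classify BigQuery
// insertAll failures and compute a capped exponential backoff for the
// retriable ones.
class QuotaBackoff {
    static final long INITIAL_BACKOFF_MILLIS = 1000;
    static final long MAX_BACKOFF_MILLIS = 60_000;

    /** Capped exponential delay for the given retry attempt (0-based). */
    static long nextBackoffMillis(int attempt) {
        long delay = INITIAL_BACKOFF_MILLIS << Math.min(attempt, 30); // clamp shift to avoid overflow
        return Math.min(delay, MAX_BACKOFF_MILLIS);
    }

    /**
     * A 403 is treated as retriable here only when its reason is
     * "rateLimitExceeded"; "quotaExceeded" needs a quota fix first, per the
     * BigQuery troubleshooting page quoted above. 500/503 are transient.
     */
    static boolean isRetriable(int httpCode, String reason) {
        if (httpCode == 403) {
            return "rateLimitExceeded".equals(reason);
        }
        return httpCode == 500 || httpCode == 503;
    }
}
```

With these rules a quotaExceeded response surfaces as a permanent failure (prompting a quota increase), while rateLimitExceeded backs off 1s, 2s, 4s, and so on, capped at 60s.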
[jira] [Commented] (BEAM-5514) BigQueryIO doesn't handle quotaExceeded errors properly
[ https://issues.apache.org/jira/browse/BEAM-5514?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16644101#comment-16644101 ] Chamikara Jayalath commented on BEAM-5514: -- Kevin, Sounds like > BigQueryIO doesn't handle quotaExceeded errors properly > --- > > Key: BEAM-5514 > URL: https://issues.apache.org/jira/browse/BEAM-5514 > Project: Beam > Issue Type: Bug > Components: io-java-gcp >Reporter: Kevin Peterson >Assignee: Reuven Lax >Priority: Major > > When exceeding a streaming quota for BigQuery insertAll requests, BigQuery > returns a 403 with reason "quotaExceeded". > The current implementation of BigQueryIO does not consider this to be a rate > limited exception, and therefore does not perform exponential backoff > properly, leading to repeated calls to BQ. > The actual error is in the > [ApiErrorExtractor|https://github.com/apache/beam/blob/master/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryServicesImpl.java#L739] > class, which is called from > [BigQueryServicesImpl|https://github.com/GoogleCloudPlatform/bigdata-interop/blob/master/util/src/main/java/com/google/cloud/hadoop/util/ApiErrorExtractor.java#L263] > to determine how to retry the failure. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (BEAM-5456) Update google-api-client libraries to 1.25
[ https://issues.apache.org/jira/browse/BEAM-5456?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chamikara Jayalath updated BEAM-5456: - Fix Version/s: (was: 2.8.0) 2.9.0 > Update google-api-client libraries to 1.25 > -- > > Key: BEAM-5456 > URL: https://issues.apache.org/jira/browse/BEAM-5456 > Project: Beam > Issue Type: Improvement > Components: io-java-gcp >Reporter: Chamikara Jayalath >Assignee: Chamikara Jayalath >Priority: Blocker > Fix For: 2.9.0 > > > This version updates authentication URLs > ([https://github.com/googleapis/google-api-java-client/releases)] that is > needed for certain features. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (BEAM-5456) Update google-api-client libraries to 1.25
[ https://issues.apache.org/jira/browse/BEAM-5456?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16643879#comment-16643879 ] Chamikara Jayalath commented on BEAM-5456: -- Moved to 2.9.0. > Update google-api-client libraries to 1.25 > -- > > Key: BEAM-5456 > URL: https://issues.apache.org/jira/browse/BEAM-5456 > Project: Beam > Issue Type: Improvement > Components: io-java-gcp >Reporter: Chamikara Jayalath >Assignee: Chamikara Jayalath >Priority: Blocker > Fix For: 2.9.0 > > > This version updates authentication URLs > ([https://github.com/googleapis/google-api-java-client/releases)] that is > needed for certain features. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Comment Edited] (BEAM-5514) BigQueryIO doesn't handle quotaExceeded errors properly
[ https://issues.apache.org/jira/browse/BEAM-5514?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16643834#comment-16643834 ] Chamikara Jayalath edited comment on BEAM-5514 at 10/9/18 6:06 PM: --- I'm trying to determine the priority at which this should be addressed. [~reuvenlax] any reason why we rely on work-item retries instead of retrying BQ streaming write requests with exponential backoff? was (Author: chamikara): I'm trying to determine the priority at which this should be addressed. [~reuvenlax] any reason why do rely on workitems retries instead of retrying BQ streaming write requests with exponential backoff ? > BigQueryIO doesn't handle quotaExceeded errors properly > --- > > Key: BEAM-5514 > URL: https://issues.apache.org/jira/browse/BEAM-5514 > Project: Beam > Issue Type: Bug > Components: io-java-gcp >Reporter: Kevin Peterson >Assignee: Reuven Lax >Priority: Major > > When exceeding a streaming quota for BigQuery insertAll requests, BigQuery > returns a 403 with reason "quotaExceeded". > The current implementation of BigQueryIO does not consider this to be a rate > limited exception, and therefore does not perform exponential backoff > properly, leading to repeated calls to BQ. > The actual error is in the > [ApiErrorExtractor|https://github.com/apache/beam/blob/master/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryServicesImpl.java#L739] > class, which is called from > [BigQueryServicesImpl|https://github.com/GoogleCloudPlatform/bigdata-interop/blob/master/util/src/main/java/com/google/cloud/hadoop/util/ApiErrorExtractor.java#L263] > to determine how to retry the failure. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (BEAM-5514) BigQueryIO doesn't handle quotaExceeded errors properly
[ https://issues.apache.org/jira/browse/BEAM-5514?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chamikara Jayalath reassigned BEAM-5514: Assignee: Reuven Lax (was: Chamikara Jayalath) > BigQueryIO doesn't handle quotaExceeded errors properly > --- > > Key: BEAM-5514 > URL: https://issues.apache.org/jira/browse/BEAM-5514 > Project: Beam > Issue Type: Bug > Components: io-java-gcp >Reporter: Kevin Peterson >Assignee: Reuven Lax >Priority: Major > > When exceeding a streaming quota for BigQuery insertAll requests, BigQuery > returns a 403 with reason "quotaExceeded". > The current implementation of BigQueryIO does not consider this to be a rate > limited exception, and therefore does not perform exponential backoff > properly, leading to repeated calls to BQ. > The actual error is in the > [ApiErrorExtractor|https://github.com/apache/beam/blob/master/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryServicesImpl.java#L739] > class, which is called from > [BigQueryServicesImpl|https://github.com/GoogleCloudPlatform/bigdata-interop/blob/master/util/src/main/java/com/google/cloud/hadoop/util/ApiErrorExtractor.java#L263] > to determine how to retry the failure. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (BEAM-5514) BigQueryIO doesn't handle quotaExceeded errors properly
[ https://issues.apache.org/jira/browse/BEAM-5514?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16643834#comment-16643834 ] Chamikara Jayalath commented on BEAM-5514: -- I'm trying to determine the priority at which this should be addressed. [~reuvenlax] any reason why we rely on work-item retries instead of retrying BQ streaming write requests with exponential backoff? > BigQueryIO doesn't handle quotaExceeded errors properly > --- > > Key: BEAM-5514 > URL: https://issues.apache.org/jira/browse/BEAM-5514 > Project: Beam > Issue Type: Bug > Components: io-java-gcp >Reporter: Kevin Peterson >Assignee: Chamikara Jayalath >Priority: Major > > When exceeding a streaming quota for BigQuery insertAll requests, BigQuery > returns a 403 with reason "quotaExceeded". > The current implementation of BigQueryIO does not consider this to be a rate > limited exception, and therefore does not perform exponential backoff > properly, leading to repeated calls to BQ. > The actual error is in the > [ApiErrorExtractor|https://github.com/apache/beam/blob/master/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryServicesImpl.java#L739] > class, which is called from > [BigQueryServicesImpl|https://github.com/GoogleCloudPlatform/bigdata-interop/blob/master/util/src/main/java/com/google/cloud/hadoop/util/ApiErrorExtractor.java#L263] > to determine how to retry the failure. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (BEAM-5670) Add more integration tests for BigQueryIO
Chamikara Jayalath created BEAM-5670: Summary: Add more integration tests for BigQueryIO Key: BEAM-5670 URL: https://issues.apache.org/jira/browse/BEAM-5670 Project: Beam Issue Type: Test Components: io-java-gcp, testing Reporter: Chamikara Jayalath Assignee: Pablo Estrada Seems like we currently only have a single test that directly reads using a query. [https://github.com/apache/beam/blob/328129bf033bc6be16bc8e09af905f37b7516412/examples/java/src/test/java/org/apache/beam/examples/cookbook/BigQueryTornadoesIT.java] We should consider adding more integration tests. For example: (1) Read directly from a given table and a dataset. (2) Read from a federated table. (3) Read using BQ legacy SQL. (4) Read from a table with nested/repeated fields. (5) Read from non-standard BQ regions (for example, Japan). Also, we should consider adding tests for BQ streaming writes once we have framework support for that. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (BEAM-5036) Optimize FileBasedSink's WriteOperation.moveToOutput()
[ https://issues.apache.org/jira/browse/BEAM-5036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16638584#comment-16638584 ] Chamikara Jayalath commented on BEAM-5036: -- Should this be marked as a blocker for 2.8.0? The PR is still in review. > Optimize FileBasedSink's WriteOperation.moveToOutput() > -- > > Key: BEAM-5036 > URL: https://issues.apache.org/jira/browse/BEAM-5036 > Project: Beam > Issue Type: Improvement > Components: io-java-files >Affects Versions: 2.5.0 >Reporter: Jozef Vilcek >Assignee: Tim Robertson >Priority: Major > Fix For: 2.8.0 > > Time Spent: 9h 40m > Remaining Estimate: 0h > > moveToOutput() methods in FileBasedSink.WriteOperation implements move by > copy+delete. It would be better to use a rename() which can be much more > effective for some filesystems. > Filesystem must support cross-directory rename. BEAM-4861 is related to this > for the case of HDFS filesystem. > Feature was discussed here: > http://mail-archives.apache.org/mod_mbox/beam-dev/201807.mbox/%3CCAF9t7_4Mp54pQ+vRrJrBh9Vx0=uaknupzd_qdh_qdm9vxll...@mail.gmail.com%3E -- This message was sent by Atlassian JIRA (v7.6.3#76005)
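The issue above proposes replacing moveToOutput()'s copy+delete with a rename. A minimal sketch of the two strategies using java.nio; the class and method names are illustrative stand-ins, not FileBasedSink's actual code.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Illustrative sketch of the two "move" strategies discussed in BEAM-5036.
class MoveSketch {
    /** Current behavior (simplified): copy every byte, then delete the source. */
    static void copyThenDelete(Path src, Path dst) throws IOException {
        Files.copy(src, dst, StandardCopyOption.REPLACE_EXISTING);
        Files.delete(src);
    }

    /** Proposed behavior: a single rename, which many filesystems perform in O(1). */
    static void rename(Path src, Path dst) throws IOException {
        Files.move(src, dst, StandardCopyOption.REPLACE_EXISTING);
    }
}
```

On a local POSIX filesystem, Files.move within one volume is a single rename syscall, while copy+delete rewrites the whole file; as the description notes, the rename variant requires the filesystem to support cross-directory rename.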
[jira] [Resolved] (BEAM-5342) Migrate google-api-client libraries to 1.24.1
[ https://issues.apache.org/jira/browse/BEAM-5342?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chamikara Jayalath resolved BEAM-5342. -- Resolution: Fixed Fix Version/s: 2.8.0 > Migrate google-api-client libraries to 1.24.1 > - > > Key: BEAM-5342 > URL: https://issues.apache.org/jira/browse/BEAM-5342 > Project: Beam > Issue Type: Improvement > Components: io-java-gcp, runner-dataflow >Reporter: Chamikara Jayalath >Assignee: Chamikara Jayalath >Priority: Major > Fix For: 2.8.0 > > Time Spent: 1h 20m > Remaining Estimate: 0h > > We currently use the 1.23 libraries, which are about a year old. We should migrate > to the more recent 1.24.1, which fixes several known issues. > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (BEAM-5517) Update Python BigQuery source to use fastavro module to read exported data
Chamikara Jayalath created BEAM-5517: Summary: Update Python BigQuery source to use fastavro module to read exported data Key: BEAM-5517 URL: https://issues.apache.org/jira/browse/BEAM-5517 Project: Beam Issue Type: Improvement Components: sdk-py-core Reporter: Chamikara Jayalath Assignee: Chamikara Jayalath Currently we use the avro module for reading data exported by BigQuery. Moving to fastavro should result in a significant performance boost. Creating this Jira for tracking, but most of the changes should be in the runner (DataflowRunner) since the current BigQuery source is a native source. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (BEAM-5408) (Java) Using Compression.GZIP with TFRecordIO
[ https://issues.apache.org/jira/browse/BEAM-5408?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chamikara Jayalath resolved BEAM-5408. -- Resolution: Fixed Fix Version/s: 2.8.0 > (Java) Using Compression.GZIP with TFRecordIO > - > > Key: BEAM-5408 > URL: https://issues.apache.org/jira/browse/BEAM-5408 > Project: Beam > Issue Type: Bug > Components: io-java-gcp >Affects Versions: 2.4.0 >Reporter: haden lee >Assignee: Chamikara Jayalath >Priority: Major > Fix For: 2.8.0 > > > In short, `TFRecordIO.read()` does not seem to work if the entry being read is > longer than 8,192 (in terms of byte[] length). `TFRecordIO.write()` seems to > be OK with this though (based on some experiments). Perhaps there is some > hard-coded value for this specific length somewhere in the SDK, and I'm > wondering if it can be increased or parameterized. > [I've posted this on > StackOverflow|https://stackoverflow.com/questions/52284639/beam-java-sdk-with-tfrecord-and-compression-gzip], > but I was advised to report it here. > Here are the details: > We're using Beam Java SDK (and Google Cloud Dataflow to run batch jobs) a > lot, and we noticed something weird (possibly a bug?) when we tried to use > `TFRecordIO` with `Compression.GZIP`. We were able to come up with some > sample code that can reproduce the errors we face. > To be clear, we are using Beam Java SDK 2.4. > Suppose we have `PCollection` which can be a PC of proto messages, > for instance, in byte[] format. > We usually write this to GCS (Google Cloud Storage) using Base64 encoding > (newline delimited Strings) or using TFRecordIO (without compression). We > have had no issue reading the data from GCS in this manner for a very long > time (2.5+ years for the former and ~1.5 years for the latter). > Recently, we tried `TFRecordIO` with `Compression.GZIP` option, and > *sometimes* we get an exception as the data is seen as invalid (while being > read). 
The data itself (the gzip files) is not corrupted, and we've tested > various things, and reached the following conclusion. > When a `byte[]` that is being compressed under `TFRecordIO` is above certain > threshold (I'd say when at or above 8192), then > `TFRecordIO.read().withCompression(Compression.GZIP)` would not work. > Specifically, it will throw the following exception: > > {code:java} > // code placeholder > Exception in thread "main" java.lang.IllegalStateException: Invalid data > at > org.apache.beam.sdk.repackaged.com.google.common.base.Preconditions.checkState(Preconditions.java:444) > at org.apache.beam.sdk.io.TFRecordIO$TFRecordCodec.read(TFRecordIO.java:642) > at > org.apache.beam.sdk.io.TFRecordIO$TFRecordSource$TFRecordReader.readNextRecord(TFRecordIO.java:526) > at > org.apache.beam.sdk.io.CompressedSource$CompressedReader.readNextRecord(CompressedSource.java:426) > at > org.apache.beam.sdk.io.FileBasedSource$FileBasedReader.advanceImpl(FileBasedSource.java:473) > at > org.apache.beam.sdk.io.FileBasedSource$FileBasedReader.startImpl(FileBasedSource.java:468) > at > org.apache.beam.sdk.io.OffsetBasedSource$OffsetBasedReader.start(OffsetBasedSource.java:261) > at > org.apache.beam.runners.direct.BoundedReadEvaluatorFactory$BoundedReadEvaluator.processElement(BoundedReadEvaluatorFactory.java:141) > at > org.apache.beam.runners.direct.DirectTransformExecutor.processElements(DirectTransformExecutor.java:161) > at > org.apache.beam.runners.direct.DirectTransformExecutor.run(DirectTransformExecutor.java:125) > at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > at java.lang.Thread.run(Thread.java:748) > {code} > > This can be reproduced easily, so you can refer to the code at the end. 
You > will also see comments about the byte array length (as I tested with various > sizes, I concluded that 8192 is the magic number). > So I'm wondering if this is a bug or known issue – I couldn't find anything > close to this on Apache Beam's Issue Tracker [here][1] but if there is > another forum/site I need to check, please let me know! > If this is indeed a bug, what would be the right channel to report this? > — > The following code can reproduce the error we have. > A successful run (with parameters 1, 39, 100) would show the following > message at the end: > {code:java} > // code placeholder > counter metrics from CountDoFn > [counter] plain_base64_proto_array_len: 8126 > [counter] plain_base64_proto_in: 1 > [counter] plain_base64_proto_val_cnt: 39 > [counter] tfrecord_gz_proto_
[jira] [Resolved] (BEAM-5412) TFRecordIO fails with records larger than 8K
[ https://issues.apache.org/jira/browse/BEAM-5412?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chamikara Jayalath resolved BEAM-5412. -- Resolution: Fixed Fix Version/s: 2.8.0 > TFRecordIO fails with records larger than 8K > > > Key: BEAM-5412 > URL: https://issues.apache.org/jira/browse/BEAM-5412 > Project: Beam > Issue Type: Bug > Components: io-java-text >Affects Versions: 2.4.0 >Reporter: Raghu Angadi >Assignee: Chamikara Jayalath >Priority: Major > Fix For: 2.8.0 > > Time Spent: 2h 50m > Remaining Estimate: 0h > > This was reported on > [Stackoverflow|https://stackoverflow.com/questions/52284639/beam-java-sdk-with-tfrecord-and-compression-gzip]. > TFRecordIO reader assumes a single call to {{channel.read()}} returns as > much as can fit in the input buffer. {{read()}} can return fewer bytes than > requested. Assert failure : > https://github.com/apache/beam/blob/release-2.4.0/sdks/java/core/src/main/java/org/apache/beam/sdk/io/TFRecordIO.java#L642 -- This message was sent by Atlassian JIRA (v7.6.3#76005)
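The root cause identified above, a single channel.read() assumed to fill the input buffer, has a standard fix: loop until the buffer is full, treating EOF in the middle of a record as corruption. The sketch below illustrates that pattern only (with an invented helper, chunked, that simulates short reads); it is not the actual TFRecordIO patch.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.Channels;
import java.nio.channels.ReadableByteChannel;

// Illustrative sketch of the partial-read fix: read() may return fewer bytes
// than requested, so loop until the buffer is full. Assumes a blocking channel
// (a non-blocking channel could return 0 and spin).
class ReadFully {
    /**
     * Fills {@code buf}, looping over short reads. Returns false on a clean EOF
     * before any byte was read; throws if EOF happens mid-record.
     */
    static boolean readFully(ReadableByteChannel in, ByteBuffer buf) throws IOException {
        while (buf.hasRemaining()) {
            int n = in.read(buf);
            if (n < 0) {
                if (buf.position() == 0) return false;     // clean EOF at a record boundary
                throw new IOException("truncated record"); // EOF mid-record is corruption
            }
        }
        return true;
    }

    /** Test helper: a channel that returns at most {@code max} bytes per read(). */
    static ReadableByteChannel chunked(byte[] data, int max) {
        ReadableByteChannel inner = Channels.newChannel(new ByteArrayInputStream(data));
        return new ReadableByteChannel() {
            public int read(ByteBuffer dst) throws IOException {
                int keep = dst.limit();
                dst.limit(Math.min(dst.limit(), dst.position() + max)); // cap this read
                int n = inner.read(dst);
                dst.limit(keep);
                return n;
            }
            public boolean isOpen() { return inner.isOpen(); }
            public void close() throws IOException { inner.close(); }
        };
    }
}
```

This also explains the reported 8K threshold: records smaller than the reader's buffer happened to arrive in one read() call, while larger ones exposed the missing loop.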
[jira] [Created] (BEAM-5456) Update google-api-client libraries to 1.25
Chamikara Jayalath created BEAM-5456: Summary: Update google-api-client libraries to 1.25 Key: BEAM-5456 URL: https://issues.apache.org/jira/browse/BEAM-5456 Project: Beam Issue Type: Improvement Components: io-java-gcp Reporter: Chamikara Jayalath Assignee: Chamikara Jayalath Fix For: 2.8.0 This version updates authentication URLs ([https://github.com/googleapis/google-api-java-client/releases]), which is needed for certain features. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (BEAM-5445) Update SpannerIO to support unbounded writes
Chamikara Jayalath created BEAM-5445: Summary: Update SpannerIO to support unbounded writes Key: BEAM-5445 URL: https://issues.apache.org/jira/browse/BEAM-5445 Project: Beam Issue Type: Improvement Components: io-java-gcp Reporter: Chamikara Jayalath Currently, due to a known issue, streaming pipelines that use SpannerIO.Write do not actually write to Spanner. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (BEAM-5432) beam-runners-direct-java fails to build due to "cannot find symbol ... symbol: method create(JobInfo)"
[ https://issues.apache.org/jira/browse/BEAM-5432?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chamikara Jayalath resolved BEAM-5432. -- Resolution: Fixed Fix Version/s: Not applicable > beam-runners-direct-java fails to build due to "cannot find symbol ... > symbol: method create(JobInfo)" > > > Key: BEAM-5432 > URL: https://issues.apache.org/jira/browse/BEAM-5432 > Project: Beam > Issue Type: Bug > Components: runner-direct >Reporter: Chamikara Jayalath >Assignee: Daniel Oliveira >Priority: Major > Fix For: Not applicable > > > Seems to be due to > [https://github.com/apache/beam/pull/6151.|https://github.com/apache/beam/pull/6151] > ./gradlew :beam-runners-direct-java:build passes without about PR but fails > with following error with it. > > > Task :beam-runners-direct-java:compileJava FAILED > > /usr/local/google/home/chamikara/testing/beam_test_09_19_2018/beam/runners/direct-java/src/main/java/org/apache/beam/runners/direct/portable/ReferenceRunner.java:268: > error: cannot find symbol > return DockerJobBundleFactory.create(jobInfo); -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (BEAM-5432) beam-runners-direct-java fails to build due to "cannot find symbol ... symbol: method create(JobInfo)"
[ https://issues.apache.org/jira/browse/BEAM-5432?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16622402#comment-16622402 ] Chamikara Jayalath commented on BEAM-5432: -- Thanks. Closing this. > beam-runners-direct-java fails to build due to "cannot find symbol ... > symbol: method create(JobInfo)" > > > Key: BEAM-5432 > URL: https://issues.apache.org/jira/browse/BEAM-5432 > Project: Beam > Issue Type: Bug > Components: runner-direct >Reporter: Chamikara Jayalath >Assignee: Daniel Oliveira >Priority: Major > Fix For: Not applicable > > > Seems to be due to > [https://github.com/apache/beam/pull/6151.|https://github.com/apache/beam/pull/6151] > ./gradlew :beam-runners-direct-java:build passes without about PR but fails > with following error with it. > > > Task :beam-runners-direct-java:compileJava FAILED > > /usr/local/google/home/chamikara/testing/beam_test_09_19_2018/beam/runners/direct-java/src/main/java/org/apache/beam/runners/direct/portable/ReferenceRunner.java:268: > error: cannot find symbol > return DockerJobBundleFactory.create(jobInfo); -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (BEAM-5432) beam-runners-direct-java fails to build due to "cannot find symbol ... symbol: method create(JobInfo)"
[ https://issues.apache.org/jira/browse/BEAM-5432?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chamikara Jayalath updated BEAM-5432: - Description: Seems to be due to [https://github.com/apache/beam/pull/6151.|https://github.com/apache/beam/pull/6151] ./gradlew :beam-runners-direct-java:build passes without about PR but fails with following error with it. > Task :beam-runners-direct-java:compileJava FAILED /usr/local/google/home/chamikara/testing/beam_test_09_19_2018/beam/runners/direct-java/src/main/java/org/apache/beam/runners/direct/portable/ReferenceRunner.java:268: error: cannot find symbol return DockerJobBundleFactory.create(jobInfo); was: Seems to be due to [https://github.com/apache/beam/pull/6151.] ./gradlew :beam-runners-direct-java:build passes without about PR but fails with following error with it. > Task :beam-runners-direct-java:compileJava FAILED /usr/local/google/home/chamikara/testing/beam_test_09_19_2018/beam/runners/direct-java/src/main/java/org/apache/beam/runners/direct/portable/ReferenceRunner.java:268: error: cannot find symbol return DockerJobBundleFactory.create(jobInfo); > beam-runners-direct-java fails to build due to "cannot find symbol ... > symbol: method create(JobInfo)" > > > Key: BEAM-5432 > URL: https://issues.apache.org/jira/browse/BEAM-5432 > Project: Beam > Issue Type: Bug > Components: runner-direct >Reporter: Chamikara Jayalath >Assignee: Daniel Oliveira >Priority: Major > > Seems to be due to > [https://github.com/apache/beam/pull/6151.|https://github.com/apache/beam/pull/6151] > ./gradlew :beam-runners-direct-java:build passes without about PR but fails > with following error with it. 
> > > Task :beam-runners-direct-java:compileJava FAILED > > /usr/local/google/home/chamikara/testing/beam_test_09_19_2018/beam/runners/direct-java/src/main/java/org/apache/beam/runners/direct/portable/ReferenceRunner.java:268: > error: cannot find symbol > return DockerJobBundleFactory.create(jobInfo); -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (BEAM-5432) beam-runners-direct-java fails to build due to "cannot find symbol ... symbol: method create(JobInfo)"
Chamikara Jayalath created BEAM-5432: Summary: beam-runners-direct-java fails to build due to "cannot find symbol ... symbol: method create(JobInfo)" Key: BEAM-5432 URL: https://issues.apache.org/jira/browse/BEAM-5432 Project: Beam Issue Type: Bug Components: runner-direct Reporter: Chamikara Jayalath Assignee: Daniel Oliveira Seems to be due to [https://github.com/apache/beam/pull/6151]. ./gradlew :beam-runners-direct-java:build passes without the above PR but fails with the following error with it. > Task :beam-runners-direct-java:compileJava FAILED /usr/local/google/home/chamikara/testing/beam_test_09_19_2018/beam/runners/direct-java/src/main/java/org/apache/beam/runners/direct/portable/ReferenceRunner.java:268: error: cannot find symbol return DockerJobBundleFactory.create(jobInfo); -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (BEAM-5426) Use both destination and TableDestination for BQ load job IDs
[ https://issues.apache.org/jira/browse/BEAM-5426?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16619823#comment-16619823 ] Chamikara Jayalath commented on BEAM-5426: -- In that case, how about keeping track of load jobs for different destinations, and failing the job if we detect two load jobs for the same destination? We should find a way to actively fail in this case, since currently this ends up as silent data loss. > Use both destination and TableDestination for BQ load job IDs > - > > Key: BEAM-5426 > URL: https://issues.apache.org/jira/browse/BEAM-5426 > Project: Beam > Issue Type: Improvement > Components: io-java-gcp >Reporter: Chamikara Jayalath >Priority: Major > > Currently we use TableDestination when creating a unique load job ID for a > destination: > [https://github.com/apache/beam/blob/master/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryHelpers.java#L359] > > This can result in a data loss issue if a user returns the same > TableDestination for different destination IDs. I think we can prevent this > if we include both IDs in the BQ load job ID. > > CC: [~reuvenlax] -- This message was sent by Atlassian JIRA (v7.6.3#76005)
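The comment above suggests tracking load jobs per destination and failing actively when two distinct destinations claim the same table, instead of silently losing data. A minimal sketch of such a check; the class and method names (DestinationCheck, register) are hypothetical, not Beam's actual code.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of the duplicate-destination check proposed in the
// comment: remember which destination ID first produced each table, and fail
// fast when a second, different destination maps to the same table.
class DestinationCheck {
    private final Map<String, String> tableToDestination = new HashMap<>();

    /**
     * Registers a (destinationId, table) pair; throws if the table was already
     * claimed by a different destination, turning silent data loss into a
     * visible pipeline failure.
     */
    void register(String destinationId, String table) {
        String prev = tableToDestination.putIfAbsent(table, destinationId);
        if (prev != null && !prev.equals(destinationId)) {
            throw new IllegalStateException(
                "Table " + table + " is produced by two destinations: "
                    + prev + " and " + destinationId);
        }
    }
}
```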
[jira] [Assigned] (BEAM-5426) Use both destination and TableDestination for BQ load job IDs
[ https://issues.apache.org/jira/browse/BEAM-5426?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chamikara Jayalath reassigned BEAM-5426: Assignee: (was: Chamikara Jayalath) > Use both destination and TableDestination for BQ load job IDs > - > > Key: BEAM-5426 > URL: https://issues.apache.org/jira/browse/BEAM-5426 > Project: Beam > Issue Type: Improvement > Components: io-java-gcp >Reporter: Chamikara Jayalath >Priority: Major > > Currently we use TableDestination when creating a unique load job ID for a > destination: > [https://github.com/apache/beam/blob/master/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryHelpers.java#L359] > > This can result in a data loss issue if a user returns the same > TableDestination for different destination IDs. I think we can prevent this > if we include both IDs in the BQ load job ID. > > CC: [~reuvenlax] -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (BEAM-5426) Use both destination and TableDestination for BQ load job IDs
Chamikara Jayalath created BEAM-5426: Summary: Use both destination and TableDestination for BQ load job IDs Key: BEAM-5426 URL: https://issues.apache.org/jira/browse/BEAM-5426 Project: Beam Issue Type: Improvement Components: io-java-gcp Reporter: Chamikara Jayalath Assignee: Chamikara Jayalath Currently we use TableDestination when creating a unique load job ID for a destination: [https://github.com/apache/beam/blob/master/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryHelpers.java#L359] This can result in a data loss issue if a user returns the same TableDestination for different destination IDs. I think we can prevent this if we include both IDs in the BQ load job ID. CC: [~reuvenlax] -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (BEAM-5422) Update BigQueryIO DynamicDestinations documentation to clarify usage of getDestination() and getTable()
Chamikara Jayalath created BEAM-5422: Summary: Update BigQueryIO DynamicDestinations documentation to clarify usage of getDestination() and getTable() Key: BEAM-5422 URL: https://issues.apache.org/jira/browse/BEAM-5422 Project: Beam Issue Type: Improvement Components: io-java-gcp Reporter: Chamikara Jayalath Assignee: Chamikara Jayalath Currently, there are some details related to these methods that should be further clarified. For example, getTable() is expected to return a unique value for each destination. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (BEAM-5410) Fail SpannerIO early for unsupported streaming mode
Chamikara Jayalath created BEAM-5410: Summary: Fail SpannerIO early for unsupported streaming mode Key: BEAM-5410 URL: https://issues.apache.org/jira/browse/BEAM-5410 Project: Beam Issue Type: Improvement Components: io-java-gcp Affects Versions: 2.8.0 Reporter: Chamikara Jayalath Assignee: Chamikara Jayalath Currently SpannerIO does not support streaming mode. We should fail with a clear error until this is fixed, and also update the documentation. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (BEAM-5364) BigtableIO source tries to validate table ID even though validation is turned off
[ https://issues.apache.org/jira/browse/BEAM-5364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16611444#comment-16611444 ] Chamikara Jayalath commented on BEAM-5364: -- Kevin, is this a regression from 2.6.0? If not, this should probably not be a release blocker. Nevertheless, I agree that we should fix this soon. > BigtableIO source tries to validate table ID even though validation is > turned off > -- > > Key: BEAM-5364 > URL: https://issues.apache.org/jira/browse/BEAM-5364 > Project: Beam > Issue Type: Bug > Components: io-java-gcp >Reporter: Kevin Si >Assignee: Chamikara Jayalath >Priority: Blocker > > [https://github.com/apache/beam/blob/master/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigtable/BigtableIO.java#L1084|https://www.google.com/url?q=https://github.com/apache/beam/blob/master/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigtable/BigtableIO.java%23L1084&sa=D&usg=AFQjCNEfHprTOvnwAwFSrXwUuLvc__JBWg] > The validation can be turned off with the following: > BigtableIO.read() > .withoutValidation() // skip validation when constructing the > pipeline. > A Dataflow template cannot be constructed due to this validation failure. 
> > Error log when trying to construct a template: > Exception in thread "main" java.lang.IllegalArgumentException: tableId was > not supplied > at > com.google.common.base.Preconditions.checkArgument(Preconditions.java:122) > at > org.apache.beam.sdk.io.gcp.bigtable.BigtableIO$BigtableSource.validate(BigtableIO.java:1084) > at org.apache.beam.sdk.io.Read$Bounded.expand(Read.java:95) > at org.apache.beam.sdk.io.Read$Bounded.expand(Read.java:85) > at org.apache.beam.sdk.Pipeline.applyInternal(Pipeline.java:537) > at org.apache.beam.sdk.Pipeline.applyTransform(Pipeline.java:471) > at org.apache.beam.sdk.values.PBegin.apply(PBegin.java:44) > at org.apache.beam.sdk.Pipeline.apply(Pipeline.java:167) > at > org.apache.beam.sdk.io.gcp.bigtable.BigtableIO$Read.expand(BigtableIO.java:423) > at > org.apache.beam.sdk.io.gcp.bigtable.BigtableIO$Read.expand(BigtableIO.java:179) > at org.apache.beam.sdk.Pipeline.applyInternal(Pipeline.java:537) > at org.apache.beam.sdk.Pipeline.applyTransform(Pipeline.java:488) > at org.apache.beam.sdk.values.PBegin.apply(PBegin.java:56) > at org.apache.beam.sdk.Pipeline.apply(Pipeline.java:182) > at > com.google.cloud.teleport.bigtable.BigtableToAvro.main(BigtableToAvro.java:89) -- This message was sent by Atlassian JIRA (v7.6.3#76005)
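The BEAM-5364 failure above is that the source's validate() runs unconditionally even when .withoutValidation() was requested. A minimal sketch of the intended guard, in Python for brevity (class and attribute names are hypothetical, not the actual BigtableIO implementation):

```python
class BigtableSourceSketch:
    """Illustrates gating precondition checks on a validation flag,
    which is what withoutValidation() is supposed to achieve."""

    def __init__(self, table_id=None, validate=True):
        self.table_id = table_id
        self.validate_enabled = validate

    def validate(self):
        # Only enforce preconditions when validation is enabled; building a
        # Dataflow template with withoutValidation() should skip this check,
        # since the table ID may be supplied at template execution time.
        if self.validate_enabled and not self.table_id:
            raise ValueError("tableId was not supplied")
```

The reported bug corresponds to the flag check being missing, so the precondition fires during template construction regardless.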
[jira] [Created] (BEAM-5342) Migrate google-api-client libraries to 1.24.1
Chamikara Jayalath created BEAM-5342: Summary: Migrate google-api-client libraries to 1.24.1 Key: BEAM-5342 URL: https://issues.apache.org/jira/browse/BEAM-5342 Project: Beam Issue Type: Improvement Components: io-java-gcp, runner-dataflow Reporter: Chamikara Jayalath Assignee: Chamikara Jayalath We currently use the 1.23 libraries, which are about a year old. We should migrate to the more recent 1.24.1, which fixes several known issues. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (BEAM-4417) BigqueryIO Numeric datatype Support
[ https://issues.apache.org/jira/browse/BEAM-4417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16606518#comment-16606518 ] Chamikara Jayalath commented on BEAM-4417: -- Pablo is looking into this. > BigqueryIO Numeric datatype Support > --- > > Key: BEAM-4417 > URL: https://issues.apache.org/jira/browse/BEAM-4417 > Project: Beam > Issue Type: Improvement > Components: io-java-gcp >Affects Versions: 2.4.0 >Reporter: Kishan Kumar >Assignee: Pablo Estrada >Priority: Critical > Labels: newbie, patch > Fix For: 2.8.0 > > Time Spent: 7h 20m > Remaining Estimate: 0h > > The BigQueryIO.read fails while parsing the data from the avro file generated > while reading the data from the table which has columns with *Numeric* > datatypes. > We have gone through the source code at Git-Hub and noticed that *Numeric > data type is not yet supported.* > > Caused by: com.google.common.base.VerifyException: Unsupported BigQuery type: > NUMERIC > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (BEAM-4417) BigqueryIO Numeric datatype Support
[ https://issues.apache.org/jira/browse/BEAM-4417?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chamikara Jayalath reassigned BEAM-4417: Assignee: Pablo Estrada (was: Chamikara Jayalath) > BigqueryIO Numeric datatype Support > --- > > Key: BEAM-4417 > URL: https://issues.apache.org/jira/browse/BEAM-4417 > Project: Beam > Issue Type: Improvement > Components: io-java-gcp >Affects Versions: 2.4.0 >Reporter: Kishan Kumar >Assignee: Pablo Estrada >Priority: Critical > Labels: newbie, patch > Fix For: 2.8.0 > > Time Spent: 7h 20m > Remaining Estimate: 0h > > The BigQueryIO.read fails while parsing the data from the avro file generated > while reading the data from the table which has columns with *Numeric* > datatypes. > We have gone through the source code at Git-Hub and noticed that *Numeric > data type is not yet supported.* > > Caused by: com.google.common.base.VerifyException: Unsupported BigQuery type: > NUMERIC > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (BEAM-3519) GCP IO exposes netty on its API surface, causing conflicts with runners
[ https://issues.apache.org/jira/browse/BEAM-3519?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chamikara Jayalath reassigned BEAM-3519: Assignee: Ismaël Mejía (was: Chamikara Jayalath) > GCP IO exposes netty on its API surface, causing conflicts with runners > --- > > Key: BEAM-3519 > URL: https://issues.apache.org/jira/browse/BEAM-3519 > Project: Beam > Issue Type: Bug > Components: io-java-gcp >Reporter: Ismaël Mejía >Assignee: Ismaël Mejía >Priority: Critical > Time Spent: 4h 40m > Remaining Estimate: 0h > > Google Cloud Platform IOs module leaks netty this causes conflicts in > particular with execution systems that use conflicting versions of such > modules. > For the case there is a dependency conflict with the Spark Runner version of > netty, see: BEAM-3492 -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (BEAM-3519) GCP IO exposes netty on its API surface, causing conflicts with runners
[ https://issues.apache.org/jira/browse/BEAM-3519?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16603172#comment-16603172 ] Chamikara Jayalath commented on BEAM-3519: -- Netty/gRPC/protobuf dependencies of Beam (including google-cloud-platform) were upgraded recently, so I suspect this is not an issue anymore. [~iemejia], can you confirm? > GCP IO exposes netty on its API surface, causing conflicts with runners > --- > > Key: BEAM-3519 > URL: https://issues.apache.org/jira/browse/BEAM-3519 > Project: Beam > Issue Type: Bug > Components: io-java-gcp >Reporter: Ismaël Mejía >Assignee: Chamikara Jayalath >Priority: Critical > Time Spent: 4h 40m > Remaining Estimate: 0h > > Google Cloud Platform IOs module leaks netty this causes conflicts in > particular with execution systems that use conflicting versions of such > modules. > For the case there is a dependency conflict with the Spark Runner version of > netty, see: BEAM-3492 -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (BEAM-4977) Beam Dependency Update Request: io.dropwizard.metrics:metrics-core 4.1.0-rc2
[ https://issues.apache.org/jira/browse/BEAM-4977?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chamikara Jayalath reassigned BEAM-4977: Assignee: Chamikara Jayalath (was: Scott Wegner) > Beam Dependency Update Request: io.dropwizard.metrics:metrics-core 4.1.0-rc2 > > > Key: BEAM-4977 > URL: https://issues.apache.org/jira/browse/BEAM-4977 > Project: Beam > Issue Type: Sub-task > Components: dependencies >Reporter: Beam JIRA Bot >Assignee: Chamikara Jayalath >Priority: Major > > 2018-07-25 20:31:12.560286 > Please review and upgrade the io.dropwizard.metrics:metrics-core to > the latest version 4.1.0-rc2 > > cc: > 2018-08-06 12:13:49.600351 > Please review and upgrade the io.dropwizard.metrics:metrics-core to > the latest version 4.1.0-rc2 > > cc: > 2018-08-13 12:15:16.600478 > Please review and upgrade the io.dropwizard.metrics:metrics-core to > the latest version 4.1.0-rc2 > > cc: > 2018-08-20 12:15:28.768620 > Please review and upgrade the io.dropwizard.metrics:metrics-core to > the latest version 4.1.0-rc2 > > cc: > 2018-08-27 12:16:05.660353 > Please review and upgrade the io.dropwizard.metrics:metrics-core to > the latest version 4.1.0-rc2 > > cc: -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (BEAM-4952) Beam Dependency Update Request: org.apache.hbase:hbase-hadoop-compat 2.1.0
[ https://issues.apache.org/jira/browse/BEAM-4952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16594257#comment-16594257 ] Chamikara Jayalath commented on BEAM-4952: -- Tim, a kind reminder about this and other dependency upgrade JIRAs assigned to you by the tool. Feel free to unassign if you don't have cycles to look into these. > Beam Dependency Update Request: org.apache.hbase:hbase-hadoop-compat 2.1.0 > -- > > Key: BEAM-4952 > URL: https://issues.apache.org/jira/browse/BEAM-4952 > Project: Beam > Issue Type: Sub-task > Components: dependencies >Reporter: Beam JIRA Bot >Assignee: Tim Robertson >Priority: Major > > 2018-07-25 20:28:24.987897 > Please review and upgrade the org.apache.hbase:hbase-hadoop-compat to > the latest version 2.1.0 > > cc: > 2018-08-06 12:11:58.406173 > Please review and upgrade the org.apache.hbase:hbase-hadoop-compat to > the latest version 2.1.0 > > cc: > 2018-08-13 12:13:31.045787 > Please review and upgrade the org.apache.hbase:hbase-hadoop-compat to > the latest version 2.1.0 > > cc: > 2018-08-20 12:14:04.735400 > Please review and upgrade the org.apache.hbase:hbase-hadoop-compat to > the latest version 2.1.0 > > cc: > 2018-08-27 12:15:07.483727 > Please review and upgrade the org.apache.hbase:hbase-hadoop-compat to > the latest version 2.1.0 > > cc: -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (BEAM-5229) Beam Dependency Update Request: com.commercehub.gradle.plugin:gradle-avro-plugin 0.15.0
[ https://issues.apache.org/jira/browse/BEAM-5229?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chamikara Jayalath reassigned BEAM-5229: Assignee: Chamikara Jayalath (was: Scott Wegner) > Beam Dependency Update Request: > com.commercehub.gradle.plugin:gradle-avro-plugin 0.15.0 > --- > > Key: BEAM-5229 > URL: https://issues.apache.org/jira/browse/BEAM-5229 > Project: Beam > Issue Type: Sub-task > Components: dependencies >Reporter: Beam JIRA Bot >Assignee: Chamikara Jayalath >Priority: Major > > 2018-08-27 12:14:32.746008 > Please review and upgrade the > com.commercehub.gradle.plugin:gradle-avro-plugin to the latest version 0.15.0 > > cc: -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (BEAM-4905) Beam Dependency Update Request: de.flapdoodle.embed:de.flapdoodle.embed.process 2.0.5
[ https://issues.apache.org/jira/browse/BEAM-4905?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16594250#comment-16594250 ] Chamikara Jayalath commented on BEAM-4905: -- [https://github.com/apache/beam/pull/6281] in review. > Beam Dependency Update Request: > de.flapdoodle.embed:de.flapdoodle.embed.process 2.0.5 > - > > Key: BEAM-4905 > URL: https://issues.apache.org/jira/browse/BEAM-4905 > Project: Beam > Issue Type: Sub-task > Components: dependencies >Reporter: Beam JIRA Bot >Assignee: Chamikara Jayalath >Priority: Major > > 2018-07-25 20:23:58.022170 > Please review and upgrade the > de.flapdoodle.embed:de.flapdoodle.embed.process to the latest version 2.0.5 > > cc: > 2018-08-06 12:09:39.047955 > Please review and upgrade the > de.flapdoodle.embed:de.flapdoodle.embed.process to the latest version 2.0.5 > > cc: > 2018-08-13 12:09:56.445368 > Please review and upgrade the > de.flapdoodle.embed:de.flapdoodle.embed.process to the latest version 2.0.5 > > cc: > 2018-08-20 12:12:37.728222 > Please review and upgrade the > de.flapdoodle.embed:de.flapdoodle.embed.process to the latest version 2.0.5 > > cc: > 2018-08-27 12:14:06.408790 > Please review and upgrade the > de.flapdoodle.embed:de.flapdoodle.embed.process to the latest version 2.0.5 > > cc: -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (BEAM-5224) Beam Dependency Update Request: com.gradle:build-scan-plugin 1.16
[ https://issues.apache.org/jira/browse/BEAM-5224?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chamikara Jayalath reassigned BEAM-5224: Assignee: Chamikara Jayalath (was: Scott Wegner) > Beam Dependency Update Request: com.gradle:build-scan-plugin 1.16 > - > > Key: BEAM-5224 > URL: https://issues.apache.org/jira/browse/BEAM-5224 > Project: Beam > Issue Type: Sub-task > Components: dependencies >Reporter: Beam JIRA Bot >Assignee: Chamikara Jayalath >Priority: Major > > 2018-08-27 12:13:32.215540 > Please review and upgrade the com.gradle:build-scan-plugin to the > latest version 1.16 > > cc: -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (BEAM-4874) Beam Dependency Update Request: com.google.auto.service:auto-service 1.0-rc4
[ https://issues.apache.org/jira/browse/BEAM-4874?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chamikara Jayalath reassigned BEAM-4874: Assignee: Chamikara Jayalath (was: Scott Wegner) > Beam Dependency Update Request: com.google.auto.service:auto-service 1.0-rc4 > > > Key: BEAM-4874 > URL: https://issues.apache.org/jira/browse/BEAM-4874 > Project: Beam > Issue Type: Sub-task > Components: dependencies >Reporter: Beam JIRA Bot >Assignee: Chamikara Jayalath >Priority: Major > > 2018-07-25 20:21:08.597317 > Please review and upgrade the com.google.auto.service:auto-service to > the latest version 1.0-rc4 > > cc: > 2018-08-06 12:08:14.189802 > Please review and upgrade the com.google.auto.service:auto-service to > the latest version 1.0-rc4 > > cc: > 2018-08-13 12:08:37.495569 > Please review and upgrade the com.google.auto.service:auto-service to > the latest version 1.0-rc4 > > cc: > 2018-08-20 12:11:49.688170 > Please review and upgrade the com.google.auto.service:auto-service to > the latest version 1.0-rc4 > > cc: > 2018-08-27 12:13:11.448132 > Please review and upgrade the com.google.auto.service:auto-service to > the latest version 1.0-rc4 > > cc: -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (BEAM-5182) Beam Dependency Update Request: org.assertj:assertj-core 3.11.0
[ https://issues.apache.org/jira/browse/BEAM-5182?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chamikara Jayalath reassigned BEAM-5182: Assignee: Chamikara Jayalath (was: Scott Wegner) > Beam Dependency Update Request: org.assertj:assertj-core 3.11.0 > --- > > Key: BEAM-5182 > URL: https://issues.apache.org/jira/browse/BEAM-5182 > Project: Beam > Issue Type: Sub-task > Components: dependencies >Reporter: Beam JIRA Bot >Assignee: Chamikara Jayalath >Priority: Major > > 2018-08-20 12:11:47.500445 > Please review and upgrade the org.assertj:assertj-core to the latest > version 3.11.0 > > cc: > 2018-08-27 12:13:06.123086 > Please review and upgrade the org.assertj:assertj-core to the latest > version 3.11.0 > > cc: -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Closed] (BEAM-4982) Beam Dependency Update Request: io.netty:netty-transport-native-epoll 5.0.0.Alpha2
[ https://issues.apache.org/jira/browse/BEAM-4982?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chamikara Jayalath closed BEAM-4982. Resolution: Won't Fix Fix Version/s: Not applicable > Beam Dependency Update Request: io.netty:netty-transport-native-epoll > 5.0.0.Alpha2 > -- > > Key: BEAM-4982 > URL: https://issues.apache.org/jira/browse/BEAM-4982 > Project: Beam > Issue Type: Sub-task > Components: dependencies >Reporter: Beam JIRA Bot >Assignee: Chamikara Jayalath >Priority: Major > Fix For: Not applicable > > > 2018-07-25 20:31:42.182471 > Please review and upgrade the io.netty:netty-transport-native-epoll > to the latest version 5.0.0.Alpha2 > > cc: > 2018-08-06 12:14:13.141909 > Please review and upgrade the io.netty:netty-transport-native-epoll > to the latest version 5.0.0.Alpha2 > > cc: > 2018-08-13 12:15:42.691508 > Please review and upgrade the io.netty:netty-transport-native-epoll > to the latest version 5.0.0.Alpha2 > > cc: > 2018-08-20 12:15:45.103065 > Please review and upgrade the io.netty:netty-transport-native-epoll > to the latest version 5.0.0.Alpha2 > > cc: > 2018-08-27 12:16:18.187792 > Please review and upgrade the io.netty:netty-transport-native-epoll > to the latest version 5.0.0.Alpha2 > > cc: -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (BEAM-4982) Beam Dependency Update Request: io.netty:netty-transport-native-epoll 5.0.0.Alpha2
[ https://issues.apache.org/jira/browse/BEAM-4982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16594052#comment-16594052 ] Chamikara Jayalath commented on BEAM-4982: -- I don't think this can be upgraded independently. We have to depend on a matching version of this dependency when we upgrade netty/gRPC. > Beam Dependency Update Request: io.netty:netty-transport-native-epoll > 5.0.0.Alpha2 > -- > > Key: BEAM-4982 > URL: https://issues.apache.org/jira/browse/BEAM-4982 > Project: Beam > Issue Type: Sub-task > Components: dependencies >Reporter: Beam JIRA Bot >Assignee: Chamikara Jayalath >Priority: Major > > 2018-07-25 20:31:42.182471 > Please review and upgrade the io.netty:netty-transport-native-epoll > to the latest version 5.0.0.Alpha2 > > cc: > 2018-08-06 12:14:13.141909 > Please review and upgrade the io.netty:netty-transport-native-epoll > to the latest version 5.0.0.Alpha2 > > cc: > 2018-08-13 12:15:42.691508 > Please review and upgrade the io.netty:netty-transport-native-epoll > to the latest version 5.0.0.Alpha2 > > cc: > 2018-08-20 12:15:45.103065 > Please review and upgrade the io.netty:netty-transport-native-epoll > to the latest version 5.0.0.Alpha2 > > cc: > 2018-08-27 12:16:18.187792 > Please review and upgrade the io.netty:netty-transport-native-epoll > to the latest version 5.0.0.Alpha2 > > cc: -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (BEAM-5216) BigQueryIO multi-partitioned write doesn't work for streaming writes
Chamikara Jayalath created BEAM-5216: Summary: BigQueryIO multi-partitioned write doesn't work for streaming writes Key: BEAM-5216 URL: https://issues.apache.org/jira/browse/BEAM-5216 Project: Beam Issue Type: Bug Components: io-java-gcp Reporter: Chamikara Jayalath BigQueryIO performs a multi-partitioned write (the MultiPartitionsWriteTables step) when the data to be written to a single BQ table exceeds BigQuery's per-load-job quota (10k files or 11TB of data). When writing using load jobs in streaming mode (with a triggering frequency), we hit the following location, where we set CREATE_DISPOSITION to CREATE_NEVER for all panes other than the first one. This is fine when we are writing a single partition (all panes of a window should write to the same table), but when there are multiple partitions it is incorrect, since we need to create temp tables for all panes. [https://github.com/apache/beam/blob/master/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/WriteTables.java#L165] -- This message was sent by Atlassian JIRA (v7.6.3#76005)
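The disposition logic described in BEAM-5216 can be sketched as follows: CREATE_NEVER is only safe for non-first panes when writing a single partition, whereas multi-partition writes need creatable temp tables on every pane. The constant names mirror BigQuery's create dispositions, but the function itself is a hypothetical illustration, not the WriteTables code:

```python
CREATE_IF_NEEDED = "CREATE_IF_NEEDED"
CREATE_NEVER = "CREATE_NEVER"

def create_disposition_for_pane(is_first_pane, is_multi_partition):
    # Multi-partition writes go through fresh temp tables on every pane,
    # so tables must be creatable regardless of pane index (BEAM-5216).
    if is_multi_partition:
        return CREATE_IF_NEEDED
    # Single-partition writes: all panes of a window target the same table,
    # which the first pane has already created.
    return CREATE_IF_NEEDED if is_first_pane else CREATE_NEVER
```

The bug report corresponds to the single-partition branch being applied unconditionally.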
[jira] [Commented] (BEAM-5148) Implement MongoDB IO for Python SDK
[ https://issues.apache.org/jira/browse/BEAM-5148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16589895#comment-16589895 ] Chamikara Jayalath commented on BEAM-5148: -- Check out existing source/sink tests (for example, textio_test) for the set of tests that you should consider developing. > Implement MongoDB IO for Python SDK > --- > > Key: BEAM-5148 > URL: https://issues.apache.org/jira/browse/BEAM-5148 > Project: Beam > Issue Type: New Feature > Components: sdk-py-core >Affects Versions: 3.0.0 >Reporter: Pascal Gula >Assignee: Chamikara Jayalath >Priority: Major > Fix For: Not applicable > > > Currently Java SDK has MongoDB support but Python SDK does not. With current > portability efforts other runners may soon be able to use Python SDK. Having > mongoDB support will allow these runners to execute large scale jobs using it. > Since we need this IO components @ Peat, we started working on a PyPi package > available at this repository: [https://github.com/PEAT-AI/beam-extended] -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Issue Comment Deleted] (BEAM-5148) Implement MongoDB IO for Python SDK
[ https://issues.apache.org/jira/browse/BEAM-5148?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chamikara Jayalath updated BEAM-5148: - Comment: was deleted (was: Checkout existing source/sink tests (for example, textio_test) for the set of tests that you should consider developing.) > Implement MongoDB IO for Python SDK > --- > > Key: BEAM-5148 > URL: https://issues.apache.org/jira/browse/BEAM-5148 > Project: Beam > Issue Type: New Feature > Components: sdk-py-core >Affects Versions: 3.0.0 >Reporter: Pascal Gula >Assignee: Chamikara Jayalath >Priority: Major > Fix For: Not applicable > > > Currently Java SDK has MongoDB support but Python SDK does not. With current > portability efforts other runners may soon be able to use Python SDK. Having > mongoDB support will allow these runners to execute large scale jobs using it. > Since we need this IO components @ Peat, we started working on a PyPi package > available at this repository: [https://github.com/PEAT-AI/beam-extended] -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (BEAM-5148) Implement MongoDB IO for Python SDK
[ https://issues.apache.org/jira/browse/BEAM-5148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16589894#comment-16589894 ] Chamikara Jayalath commented on BEAM-5148: -- Check out existing source/sink tests (for example, textio_test) for the set of tests that you should consider developing. > Implement MongoDB IO for Python SDK > --- > > Key: BEAM-5148 > URL: https://issues.apache.org/jira/browse/BEAM-5148 > Project: Beam > Issue Type: New Feature > Components: sdk-py-core >Affects Versions: 3.0.0 >Reporter: Pascal Gula >Assignee: Chamikara Jayalath >Priority: Major > Fix For: Not applicable > > > Currently Java SDK has MongoDB support but Python SDK does not. With current > portability efforts other runners may soon be able to use Python SDK. Having > mongoDB support will allow these runners to execute large scale jobs using it. > Since we need this IO components @ Peat, we started working on a PyPi package > available at this repository: [https://github.com/PEAT-AI/beam-extended] -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (BEAM-5148) Implement MongoDB IO for Python SDK
[ https://issues.apache.org/jira/browse/BEAM-5148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16589202#comment-16589202 ] Chamikara Jayalath commented on BEAM-5148: -- Thanks. Looks good in general. Could you send this for review in the form of a Beam pull request so that I can provide detailed comments? Also, I tried to assign this Jira to you, but it seems like you currently don't have the Beam contributor role assigned to your Jira account. Could you send a request through the dev list or Slack for this? (A PMC member can add you.) > Implement MongoDB IO for Python SDK > --- > > Key: BEAM-5148 > URL: https://issues.apache.org/jira/browse/BEAM-5148 > Project: Beam > Issue Type: New Feature > Components: sdk-py-core >Affects Versions: 3.0.0 >Reporter: Pascal Gula >Assignee: Chamikara Jayalath >Priority: Major > Fix For: Not applicable > > > Currently Java SDK has MongoDB support but Python SDK does not. With current > portability efforts other runners may soon be able to use Python SDK. Having > mongoDB support will allow these runners to execute large scale jobs using it. > Since we need this IO components @ Peat, we started working on a PyPi package > available at this repository: [https://github.com/PEAT-AI/beam-extended] -- This message was sent by Atlassian JIRA (v7.6.3#76005)
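The testing advice in the BEAM-5148 thread (look at textio_test and similar) generally comes down to a write-then-read round-trip pattern. A dependency-free sketch of that pattern — the read_lines helper stands in for a source's read path and is not part of any Beam module:

```python
import os
import tempfile

def read_lines(path):
    """Stand-in for a source's read(): yields one record per line."""
    with open(path) as f:
        for line in f:
            yield line.rstrip("\n")

def roundtrip_test():
    # Pattern used by source/sink tests: write known data to a file,
    # read it back through the source, and assert the records match.
    expected = ["a", "b", "c"]
    with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as f:
        f.write("\n".join(expected))
        path = f.name
    try:
        assert list(read_lines(path)) == expected
    finally:
        os.remove(path)
```

A real Beam source test would additionally cover splitting, reading sub-ranges, and empty inputs, following the existing textio_test cases.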
[jira] [Commented] (BEAM-4884) Beam Dependency Update Request: biz.aQute:bndlib 2.0.0.20130123-133441
[ https://issues.apache.org/jira/browse/BEAM-4884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16588028#comment-16588028 ] Chamikara Jayalath commented on BEAM-4884: -- Passing to Sergey who wrote TikaIO which is the only user of this dependency. > Beam Dependency Update Request: biz.aQute:bndlib 2.0.0.20130123-133441 > -- > > Key: BEAM-4884 > URL: https://issues.apache.org/jira/browse/BEAM-4884 > Project: Beam > Issue Type: Sub-task > Components: dependencies >Reporter: Beam JIRA Bot >Assignee: Sergey Beryozkin >Priority: Major > > 2018-07-25 20:22:10.667692 > Please review and upgrade the biz.aQute:bndlib to the latest version > 2.0.0.20130123-133441 > > cc: > 2018-08-02 11:42:45.184687 > Please review and upgrade the biz.aQute:bndlib to the latest version > 2.0.0.20130123-133441 > cc: > 2018-08-06 12:08:44.804597 > Please review and upgrade the biz.aQute:bndlib to the latest version > 2.0.0.20130123-133441 > > cc: > 2018-08-13 12:09:05.718866 > Please review and upgrade the biz.aQute:bndlib to the latest version > 2.0.0.20130123-133441 > > cc: > 2018-08-20 12:11:56.128242 > Please review and upgrade the biz.aQute:bndlib to the latest version > 2.0.0.20130123-133441 > > cc: -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (BEAM-4884) Beam Dependency Update Request: biz.aQute:bndlib 2.0.0.20130123-133441
[ https://issues.apache.org/jira/browse/BEAM-4884?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chamikara Jayalath reassigned BEAM-4884: Assignee: Sergey Beryozkin (was: Chamikara Jayalath) > Beam Dependency Update Request: biz.aQute:bndlib 2.0.0.20130123-133441 > -- > > Key: BEAM-4884 > URL: https://issues.apache.org/jira/browse/BEAM-4884 > Project: Beam > Issue Type: Sub-task > Components: dependencies >Reporter: Beam JIRA Bot >Assignee: Sergey Beryozkin >Priority: Major > > 2018-07-25 20:22:10.667692 > Please review and upgrade the biz.aQute:bndlib to the latest version > 2.0.0.20130123-133441 > > cc: > 2018-08-02 11:42:45.184687 > Please review and upgrade the biz.aQute:bndlib to the latest version > 2.0.0.20130123-133441 > cc: > 2018-08-06 12:08:44.804597 > Please review and upgrade the biz.aQute:bndlib to the latest version > 2.0.0.20130123-133441 > > cc: > 2018-08-13 12:09:05.718866 > Please review and upgrade the biz.aQute:bndlib to the latest version > 2.0.0.20130123-133441 > > cc: > 2018-08-20 12:11:56.128242 > Please review and upgrade the biz.aQute:bndlib to the latest version > 2.0.0.20130123-133441 > > cc: -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (BEAM-5087) Beam Dependency Update Request: org.apache.kudu:kudu-client 1.7.1
[ https://issues.apache.org/jira/browse/BEAM-5087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16588025#comment-16588025 ] Chamikara Jayalath commented on BEAM-5087: -- Passing to Tim who wrote Kudu IO. > Beam Dependency Update Request: org.apache.kudu:kudu-client 1.7.1 > - > > Key: BEAM-5087 > URL: https://issues.apache.org/jira/browse/BEAM-5087 > Project: Beam > Issue Type: Sub-task > Components: dependencies >Reporter: Beam JIRA Bot >Assignee: Tim Robertson >Priority: Major > > 2018-08-06 12:13:44.769883 > Please review and upgrade the org.apache.kudu:kudu-client to the > latest version 1.7.1 > > cc: > 2018-08-13 12:15:08.713667 > Please review and upgrade the org.apache.kudu:kudu-client to the > latest version 1.7.1 > > cc: > 2018-08-20 12:15:23.382955 > Please review and upgrade the org.apache.kudu:kudu-client to the > latest version 1.7.1 > > cc: -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (BEAM-5087) Beam Dependency Update Request: org.apache.kudu:kudu-client 1.7.1
[ https://issues.apache.org/jira/browse/BEAM-5087?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chamikara Jayalath reassigned BEAM-5087: Assignee: Tim Robertson (was: Chamikara Jayalath) > Beam Dependency Update Request: org.apache.kudu:kudu-client 1.7.1 > - > > Key: BEAM-5087 > URL: https://issues.apache.org/jira/browse/BEAM-5087 > Project: Beam > Issue Type: Sub-task > Components: dependencies >Reporter: Beam JIRA Bot >Assignee: Tim Robertson >Priority: Major > > 2018-08-06 12:13:44.769883 > Please review and upgrade the org.apache.kudu:kudu-client to the > latest version 1.7.1 > > cc: > 2018-08-13 12:15:08.713667 > Please review and upgrade the org.apache.kudu:kudu-client to the > latest version 1.7.1 > > cc: > 2018-08-20 12:15:23.382955 > Please review and upgrade the org.apache.kudu:kudu-client to the > latest version 1.7.1 > > cc: -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (BEAM-4922) Beam Dependency Update Request: org.freemarker:freemarker 2.3.28
[ https://issues.apache.org/jira/browse/BEAM-4922?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chamikara Jayalath resolved BEAM-4922. -- Resolution: Fixed Fix Version/s: 2.7.0 > Beam Dependency Update Request: org.freemarker:freemarker 2.3.28 > > > Key: BEAM-4922 > URL: https://issues.apache.org/jira/browse/BEAM-4922 > Project: Beam > Issue Type: Sub-task > Components: dependencies >Reporter: Beam JIRA Bot >Assignee: Chamikara Jayalath >Priority: Major > Fix For: 2.7.0 > > Time Spent: 50m > Remaining Estimate: 0h > > 2018-07-25 20:25:32.011218 > Please review and upgrade the org.freemarker:freemarker to the latest > version 2.3.28 > > cc: > 2018-08-06 12:10:28.860355 > Please review and upgrade the org.freemarker:freemarker to the latest > version 2.3.28 > > cc: > 2018-08-13 12:11:55.293042 > Please review and upgrade the org.freemarker:freemarker to the latest > version 2.3.28 > > cc: -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (BEAM-4999) Beam Dependency Update Request: com.carrotsearch.randomizedtesting:randomizedtesting-runner 2.6.3
[ https://issues.apache.org/jira/browse/BEAM-4999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16581855#comment-16581855 ] Chamikara Jayalath commented on BEAM-4999: -- I'm getting the following exception when trying to build SolrIO with this dependency upgrade. java.lang.NoClassDefFoundError: com/carrotsearch/randomizedtesting/generators/RandomInts at __randomizedtesting.SeedInfo.seed([7F6D2BEC126F6560]:0) at org.apache.lucene.util.TestUtil.nextInt(TestUtil.java:409) at org.apache.lucene.index.RandomCodec.<init>(RandomCodec.java:120) at org.apache.lucene.util.TestRuleSetupAndRestoreClassEnv.before(TestRuleSetupAndRestoreClassEnv.java:189) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:44) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:41) at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40) at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:53) at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47) at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64) at org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:54) at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:368) at java.lang.Thread.run(Thread.java:745) Caused by: java.lang.ClassNotFoundException: com.carrotsearch.randomizedtesting.generators.RandomInts at java.net.URLClassLoader.findClass(URLClassLoader.java:381) at java.lang.ClassLoader.loadClass(ClassLoader.java:424) at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331) at java.lang.ClassLoader.loadClass(ClassLoader.java:357) ... 18 more Cao, will you be able to look into this, since you added SolrIO, which uses this dependency? > Beam Dependency Update Request: > com.carrotsearch.randomizedtesting:randomizedtesting-runner 2.6.3 > - > > Key: BEAM-4999 > URL: https://issues.apache.org/jira/browse/BEAM-4999 > Project: Beam > Issue Type: Sub-task > Components: dependencies >Reporter: Beam JIRA Bot >Assignee: Chamikara Jayalath >Priority: Major > > 2018-07-25 20:33:34.278692 > Please review and upgrade the > com.carrotsearch.randomizedtesting:randomizedtesting-runner to the latest > version 2.6.3 > > cc: > 2018-08-06 12:15:09.509698 > Please review and upgrade the > com.carrotsearch.randomizedtesting:randomizedtesting-runner to the latest > version 2.6.3 > > cc: > 2018-08-13 12:16:46.379403 > Please review and upgrade the > com.carrotsearch.randomizedtesting:randomizedtesting-runner to the latest > version 2.6.3 > > cc: -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (BEAM-4999) Beam Dependency Update Request: com.carrotsearch.randomizedtesting:randomizedtesting-runner 2.6.3
[ https://issues.apache.org/jira/browse/BEAM-4999?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chamikara Jayalath reassigned BEAM-4999: Assignee: Cao Manh Dat (was: Chamikara Jayalath) > Beam Dependency Update Request: > com.carrotsearch.randomizedtesting:randomizedtesting-runner 2.6.3 > - > > Key: BEAM-4999 > URL: https://issues.apache.org/jira/browse/BEAM-4999 > Project: Beam > Issue Type: Sub-task > Components: dependencies >Reporter: Beam JIRA Bot >Assignee: Cao Manh Dat >Priority: Major > > 2018-07-25 20:33:34.278692 > Please review and upgrade the > com.carrotsearch.randomizedtesting:randomizedtesting-runner to the latest > version 2.6.3 > > cc: > 2018-08-06 12:15:09.509698 > Please review and upgrade the > com.carrotsearch.randomizedtesting:randomizedtesting-runner to the latest > version 2.6.3 > > cc: > 2018-08-13 12:16:46.379403 > Please review and upgrade the > com.carrotsearch.randomizedtesting:randomizedtesting-runner to the latest > version 2.6.3 > > cc: -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (BEAM-5105) Move load job poll to finishBundle() method to better parallelize execution
[ https://issues.apache.org/jira/browse/BEAM-5105?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chamikara Jayalath reassigned BEAM-5105: Assignee: (was: Reuven Lax) > Move load job poll to finishBundle() method to better parallelize execution > --- > > Key: BEAM-5105 > URL: https://issues.apache.org/jira/browse/BEAM-5105 > Project: Beam > Issue Type: Improvement > Components: io-java-gcp >Reporter: Chamikara Jayalath >Priority: Major > > It appears that when we write to BigQuery using WriteTablesDoFn we start a > load job and wait for that job to finish. > [https://github.com/apache/beam/blob/master/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/WriteTables.java#L318] > > In cases where we are trying to write a PCollection of tables (for example, > when users use the dynamic destinations feature) this relies on dynamic work > rebalancing to parallelize execution of load jobs. If the runner does not > support dynamic work rebalancing or does not execute dynamic work rebalancing > for some reason, this could have significant performance drawbacks. For > example, scheduling times for load jobs will add up. > > A better approach might be to start load jobs in the process() method but wait > for all load jobs to finish in the finishBundle() method. This will parallelize > any overheads as well as job execution (assuming more than one job is > scheduled by BQ). > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (BEAM-5105) Move load job poll to finishBundle() method to better parallelize execution
[ https://issues.apache.org/jira/browse/BEAM-5105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16573562#comment-16573562 ] Chamikara Jayalath commented on BEAM-5105: -- Thanks. Unassigning from you. > Move load job poll to finishBundle() method to better parallelize execution > --- > > Key: BEAM-5105 > URL: https://issues.apache.org/jira/browse/BEAM-5105 > Project: Beam > Issue Type: Improvement > Components: io-java-gcp >Reporter: Chamikara Jayalath >Assignee: Reuven Lax >Priority: Major > > It appears that when we write to BigQuery using WriteTablesDoFn we start a > load job and wait for that job to finish. > [https://github.com/apache/beam/blob/master/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/WriteTables.java#L318] > > In cases where we are trying to write a PCollection of tables (for example, > when users use the dynamic destinations feature) this relies on dynamic work > rebalancing to parallelize execution of load jobs. If the runner does not > support dynamic work rebalancing or does not execute dynamic work rebalancing > for some reason, this could have significant performance drawbacks. For > example, scheduling times for load jobs will add up. > > A better approach might be to start load jobs in the process() method but wait > for all load jobs to finish in the finishBundle() method. This will parallelize > any overheads as well as job execution (assuming more than one job is > scheduled by BQ). > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (BEAM-5105) Move load job poll to finishBundle() method to better parallelize execution
[ https://issues.apache.org/jira/browse/BEAM-5105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16572535#comment-16572535 ] Chamikara Jayalath commented on BEAM-5105: -- Reuven, I might be missing drawbacks of this approach. Could you comment? > Move load job poll to finishBundle() method to better parallelize execution > --- > > Key: BEAM-5105 > URL: https://issues.apache.org/jira/browse/BEAM-5105 > Project: Beam > Issue Type: Improvement > Components: io-java-gcp >Reporter: Chamikara Jayalath >Priority: Major > > It appears that when we write to BigQuery using WriteTablesDoFn we start a > load job and wait for that job to finish. > [https://github.com/apache/beam/blob/master/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/WriteTables.java#L318] > > In cases where we are trying to write a PCollection of tables (for example, > when users use the dynamic destinations feature) this relies on dynamic work > rebalancing to parallelize execution of load jobs. If the runner does not > support dynamic work rebalancing or does not execute dynamic work rebalancing > for some reason, this could have significant performance drawbacks. For > example, scheduling times for load jobs will add up. > > A better approach might be to start load jobs in the process() method but wait > for all load jobs to finish in the finishBundle() method. This will parallelize > any overheads as well as job execution (assuming more than one job is > scheduled by BQ). > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (BEAM-5105) Move load job poll to finishBundle() method to better parallelize execution
Chamikara Jayalath created BEAM-5105: Summary: Move load job poll to finishBundle() method to better parallelize execution Key: BEAM-5105 URL: https://issues.apache.org/jira/browse/BEAM-5105 Project: Beam Issue Type: Improvement Components: io-java-gcp Reporter: Chamikara Jayalath It appears that when we write to BigQuery using WriteTablesDoFn we start a load job and wait for that job to finish. [https://github.com/apache/beam/blob/master/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/WriteTables.java#L318] In cases where we are trying to write a PCollection of tables (for example, when users use the dynamic destinations feature) this relies on dynamic work rebalancing to parallelize execution of load jobs. If the runner does not support dynamic work rebalancing or does not execute dynamic work rebalancing for some reason, this could have significant performance drawbacks. For example, scheduling times for load jobs will add up. A better approach might be to start load jobs in the process() method but wait for all load jobs to finish in the finishBundle() method. This will parallelize any overheads as well as job execution (assuming more than one job is scheduled by BQ). -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (BEAM-5105) Move load job poll to finishBundle() method to better parallelize execution
[ https://issues.apache.org/jira/browse/BEAM-5105?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chamikara Jayalath reassigned BEAM-5105: Assignee: Reuven Lax > Move load job poll to finishBundle() method to better parallelize execution > --- > > Key: BEAM-5105 > URL: https://issues.apache.org/jira/browse/BEAM-5105 > Project: Beam > Issue Type: Improvement > Components: io-java-gcp >Reporter: Chamikara Jayalath >Assignee: Reuven Lax >Priority: Major > > It appears that when we write to BigQuery using WriteTablesDoFn we start a > load job and wait for that job to finish. > [https://github.com/apache/beam/blob/master/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/WriteTables.java#L318] > > In cases where we are trying to write a PCollection of tables (for example, > when users use the dynamic destinations feature) this relies on dynamic work > rebalancing to parallelize execution of load jobs. If the runner does not > support dynamic work rebalancing or does not execute dynamic work rebalancing > for some reason, this could have significant performance drawbacks. For > example, scheduling times for load jobs will add up. > > A better approach might be to start load jobs in the process() method but wait > for all load jobs to finish in the finishBundle() method. This will parallelize > any overheads as well as job execution (assuming more than one job is > scheduled by BQ). > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
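The process()/finishBundle() split proposed in BEAM-5105 can be sketched generically: start each load job without blocking in process(), and block exactly once per bundle in finishBundle(). The sketch below uses plain Python threads as a stand-in for asynchronous job submission; the class and callback names are illustrative, not Beam's actual WriteTables implementation.

```python
from concurrent.futures import ThreadPoolExecutor


class ParallelLoadDoFn:
    """Sketch of the proposed pattern: kick off each (possibly slow)
    load job in process() and only wait for completion once, in
    finish_bundle(). Illustrative names, not the Beam API."""

    def __init__(self, start_load_job, max_workers=8):
        # start_load_job(element) performs one load job and returns its result.
        self._start_load_job = start_load_job
        self._executor = ThreadPoolExecutor(max_workers=max_workers)
        self._pending = []

    def process(self, element):
        # Submit without waiting: scheduling overheads for all elements
        # in the bundle now overlap instead of adding up serially.
        self._pending.append(self._executor.submit(self._start_load_job, element))

    def finish_bundle(self):
        # Wait for every job started by this bundle exactly once.
        results = [f.result() for f in self._pending]
        self._pending = []
        return results
```

Compared with waiting inside process(), the total bundle latency here is roughly the slowest single job rather than the sum of all jobs, without relying on dynamic work rebalancing.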
[jira] [Assigned] (BEAM-5087) Beam Dependency Update Request: org.apache.kudu:kudu-client 1.7.1
[ https://issues.apache.org/jira/browse/BEAM-5087?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chamikara Jayalath reassigned BEAM-5087: Assignee: Chamikara Jayalath > Beam Dependency Update Request: org.apache.kudu:kudu-client 1.7.1 > - > > Key: BEAM-5087 > URL: https://issues.apache.org/jira/browse/BEAM-5087 > Project: Beam > Issue Type: Sub-task > Components: dependencies >Reporter: Beam JIRA Bot >Assignee: Chamikara Jayalath >Priority: Major > > 2018-08-06 12:13:44.769883 > Please review and upgrade the org.apache.kudu:kudu-client to the > latest version 1.7.1 > > cc: -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (BEAM-4999) Beam Dependency Update Request: com.carrotsearch.randomizedtesting:randomizedtesting-runner 2.6.3
[ https://issues.apache.org/jira/browse/BEAM-4999?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chamikara Jayalath reassigned BEAM-4999: Assignee: Chamikara Jayalath > Beam Dependency Update Request: > com.carrotsearch.randomizedtesting:randomizedtesting-runner 2.6.3 > - > > Key: BEAM-4999 > URL: https://issues.apache.org/jira/browse/BEAM-4999 > Project: Beam > Issue Type: Sub-task > Components: dependencies >Reporter: Beam JIRA Bot >Assignee: Chamikara Jayalath >Priority: Major > > 2018-07-25 20:33:34.278692 > Please review and upgrade the > com.carrotsearch.randomizedtesting:randomizedtesting-runner to the latest > version 2.6.3 > > cc: > 2018-08-06 12:15:09.509698 > Please review and upgrade the > com.carrotsearch.randomizedtesting:randomizedtesting-runner to the latest > version 2.6.3 > > cc: -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (BEAM-4884) Beam Dependency Update Request: biz.aQute:bndlib 2.0.0.20130123-133441
[ https://issues.apache.org/jira/browse/BEAM-4884?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chamikara Jayalath reassigned BEAM-4884: Assignee: Chamikara Jayalath > Beam Dependency Update Request: biz.aQute:bndlib 2.0.0.20130123-133441 > -- > > Key: BEAM-4884 > URL: https://issues.apache.org/jira/browse/BEAM-4884 > Project: Beam > Issue Type: Sub-task > Components: dependencies >Reporter: Beam JIRA Bot >Assignee: Chamikara Jayalath >Priority: Major > > 2018-07-25 20:22:10.667692 > Please review and upgrade the biz.aQute:bndlib to the latest version > 2.0.0.20130123-133441 > > cc: > 2018-08-02 11:42:45.184687 > Please review and upgrade the biz.aQute:bndlib to the latest version > 2.0.0.20130123-133441 > cc: > 2018-08-06 12:08:44.804597 > Please review and upgrade the biz.aQute:bndlib to the latest version > 2.0.0.20130123-133441 > > cc: -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (BEAM-4949) Beam Dependency Update Request: com.google.guava:guava 25.1-jre
[ https://issues.apache.org/jira/browse/BEAM-4949?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chamikara Jayalath reassigned BEAM-4949: Assignee: Chamikara Jayalath > Beam Dependency Update Request: com.google.guava:guava 25.1-jre > --- > > Key: BEAM-4949 > URL: https://issues.apache.org/jira/browse/BEAM-4949 > Project: Beam > Issue Type: Sub-task > Components: dependencies >Reporter: Beam JIRA Bot >Assignee: Chamikara Jayalath >Priority: Major > > 2018-07-25 20:28:04.662092 > Please review and upgrade the com.google.guava:guava to the latest > version 25.1-jre > > cc: -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (BEAM-4922) Beam Dependency Update Request: org.freemarker:freemarker 2.3.28
[ https://issues.apache.org/jira/browse/BEAM-4922?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chamikara Jayalath reassigned BEAM-4922: Assignee: Chamikara Jayalath > Beam Dependency Update Request: org.freemarker:freemarker 2.3.28 > > > Key: BEAM-4922 > URL: https://issues.apache.org/jira/browse/BEAM-4922 > Project: Beam > Issue Type: Sub-task > Components: dependencies >Reporter: Beam JIRA Bot >Assignee: Chamikara Jayalath >Priority: Major > > 2018-07-25 20:25:32.011218 > Please review and upgrade the org.freemarker:freemarker to the latest > version 2.3.28 > > cc: > 2018-08-06 12:10:28.860355 > Please review and upgrade the org.freemarker:freemarker to the latest > version 2.3.28 > > cc: -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (BEAM-4904) Beam Dependency Update Request: de.flapdoodle.embed:de.flapdoodle.embed.mongo 2.1.1
[ https://issues.apache.org/jira/browse/BEAM-4904?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chamikara Jayalath reassigned BEAM-4904: Assignee: Chamikara Jayalath > Beam Dependency Update Request: de.flapdoodle.embed:de.flapdoodle.embed.mongo > 2.1.1 > --- > > Key: BEAM-4904 > URL: https://issues.apache.org/jira/browse/BEAM-4904 > Project: Beam > Issue Type: Sub-task > Components: dependencies >Reporter: Beam JIRA Bot >Assignee: Chamikara Jayalath >Priority: Major > > 2018-07-25 20:23:49.911490 > Please review and upgrade the > de.flapdoodle.embed:de.flapdoodle.embed.mongo to the latest version 2.1.1 > > cc: > 2018-08-06 12:09:30.976479 > Please review and upgrade the > de.flapdoodle.embed:de.flapdoodle.embed.mongo to the latest version 2.1.1 > > cc: -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (BEAM-4905) Beam Dependency Update Request: de.flapdoodle.embed:de.flapdoodle.embed.process 2.0.5
[ https://issues.apache.org/jira/browse/BEAM-4905?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chamikara Jayalath reassigned BEAM-4905: Assignee: Chamikara Jayalath > Beam Dependency Update Request: > de.flapdoodle.embed:de.flapdoodle.embed.process 2.0.5 > - > > Key: BEAM-4905 > URL: https://issues.apache.org/jira/browse/BEAM-4905 > Project: Beam > Issue Type: Sub-task > Components: dependencies >Reporter: Beam JIRA Bot >Assignee: Chamikara Jayalath >Priority: Major > > 2018-07-25 20:23:58.022170 > Please review and upgrade the > de.flapdoodle.embed:de.flapdoodle.embed.process to the latest version 2.0.5 > > cc: > 2018-08-06 12:09:39.047955 > Please review and upgrade the > de.flapdoodle.embed:de.flapdoodle.embed.process to the latest version 2.0.5 > > cc: -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (BEAM-5057) beam_Release_Gradle_NightlySnapshot failing due to a Javadoc error
Chamikara Jayalath created BEAM-5057: Summary: beam_Release_Gradle_NightlySnapshot failing due to a Javadoc error Key: BEAM-5057 URL: https://issues.apache.org/jira/browse/BEAM-5057 Project: Beam Issue Type: Bug Components: testing Reporter: Chamikara Jayalath Assignee: Chamikara Jayalath [https://builds.apache.org/job/beam_Release_Gradle_NightlySnapshot/127/console] [https://builds.apache.org/job/beam_Release_Gradle_NightlySnapshot/125/console] * What went wrong: Execution failed for task ':beam-sdks-java-core:javadoc'. > Javadoc generation failed. Generated Javadoc options file (useful for > troubleshooting): > '/home/jenkins/jenkins-slave/workspace/beam_Release_Gradle_NightlySnapshot/src/sdks/java/core/build/tmp/javadoc/javadoc.options' -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (BEAM-4685) Allow writing to time partitioned tables when CreateDisposition==CREATE_NEVER
Chamikara Jayalath created BEAM-4685: Summary: Allow writing to time partitioned tables when CreateDisposition==CREATE_NEVER Key: BEAM-4685 URL: https://issues.apache.org/jira/browse/BEAM-4685 Project: Beam Issue Type: Improvement Components: io-java-gcp Reporter: Chamikara Jayalath Writing to a time-partitioned table fails when CreateDisposition==CREATE_NEVER with the error "Table with field based partitioning must have a schema". This seems to be due to BigQueryIO not setting the schema in this case. [https://github.com/apache/beam/blob/master/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/WriteTables.java#L114] -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (BEAM-4675) Reduce the size of pretty string of BQ load jobs
[ https://issues.apache.org/jira/browse/BEAM-4675?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chamikara Jayalath resolved BEAM-4675. -- Resolution: Fixed > Reduce the size of pretty string of BQ load jobs > > > Key: BEAM-4675 > URL: https://issues.apache.org/jira/browse/BEAM-4675 > Project: Beam > Issue Type: Improvement > Components: io-java-gcp >Reporter: Chamikara Jayalath >Assignee: Chamikara Jayalath >Priority: Major > Fix For: 2.6.0 > > Time Spent: 40m > Remaining Estimate: 0h > > Some of the error logs that contain BQ load jobs can be extremely large, > causing the actual error message to be dropped. Usually this happens due to > 'schema' and/or 'sourceUris' of the job configuration of load jobs being very > large. I think these properties are not that useful for debugging, so we > should consider dropping them from error messages. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (BEAM-4675) Reduce the size of pretty string of BQ load jobs
Chamikara Jayalath created BEAM-4675: Summary: Reduce the size of pretty string of BQ load jobs Key: BEAM-4675 URL: https://issues.apache.org/jira/browse/BEAM-4675 Project: Beam Issue Type: Improvement Components: io-java-gcp Reporter: Chamikara Jayalath Assignee: Chamikara Jayalath Fix For: 2.6.0 Some of the error logs that contain BQ load jobs can be extremely large, causing the actual error message to be dropped. Usually this happens due to 'schema' and/or 'sourceUris' of the job configuration of load jobs being very large. I think these properties are not that useful for debugging, so we should consider dropping them from error messages. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
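The fix described in BEAM-4675 amounts to stripping the bulky fields from the load-job configuration before it is pretty-printed into an error message. A minimal sketch, where the key names are assumed to mirror the BigQuery load-job configuration and the helper itself is illustrative, not Beam's implementation:

```python
def compact_job_description(job_config, drop_keys=("schema", "sourceUris")):
    """Return a copy of a load-job configuration dict with the bulky
    fields removed, so error logs keep room for the actual error message.
    'schema' and 'sourceUris' are the fields the issue identifies as
    dominating the pretty-printed size."""
    return {k: v for k, v in job_config.items() if k not in drop_keys}
```

Logging `compact_job_description(config)` instead of the full configuration keeps the retained fields (dispositions, destination table, and so on) that are actually useful when debugging a failed job.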
[jira] [Created] (BEAM-4650) Add retry policy to Python BQ streaming sink
Chamikara Jayalath created BEAM-4650: Summary: Add retry policy to Python BQ streaming sink Key: BEAM-4650 URL: https://issues.apache.org/jira/browse/BEAM-4650 Project: Beam Issue Type: New Feature Components: sdk-py-core Reporter: Chamikara Jayalath Java supports specifying a retry policy when performing streaming writes to BQ: [https://github.com/apache/beam/blob/master/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/InsertRetryPolicy.java] We should update Python BQ streaming sink to support this as well. https://github.com/apache/beam/blob/master/sdks/python/apache_beam/io/gcp/bigquery.py#L1430 -- This message was sent by Atlassian JIRA (v7.6.3#76005)
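A retry policy of the kind Java's InsertRetryPolicy provides boils down to a predicate over the error reasons attached to a failed insert. A rough Python sketch of the "retry transient errors" behavior — the set of persistent-error reasons is an assumption based on the Java implementation, and the function is illustrative, not the eventual Python sink API:

```python
# Error reasons treated as permanent failures (assumed set, mirroring
# the Java InsertRetryPolicy): retrying these cannot succeed.
PERSISTENT_ERRORS = {"invalid", "invalidQuery", "notImplemented"}


def retry_transient_errors(error_reasons):
    """Decide whether a failed streaming insert should be retried.

    Retry if at least one of the attached error reasons is not a known
    persistent failure; give up only when every reason is persistent.
    """
    return any(reason not in PERSISTENT_ERRORS for reason in error_reasons)
```

The sink would consult such a predicate for each failed row and either re-enqueue the row or emit it to a dead-letter output.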
[jira] [Resolved] (BEAM-4617) Add a dependencies guide
[ https://issues.apache.org/jira/browse/BEAM-4617?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chamikara Jayalath resolved BEAM-4617. -- Resolution: Fixed Fix Version/s: Not applicable > Add a dependencies guide > > > Key: BEAM-4617 > URL: https://issues.apache.org/jira/browse/BEAM-4617 > Project: Beam > Issue Type: Improvement > Components: website >Reporter: Chamikara Jayalath >Assignee: Chamikara Jayalath >Priority: Major > Fix For: Not applicable > > Time Spent: 1h 10m > Remaining Estimate: 0h > > Initial discussion: > https://lists.apache.org/thread.html/8738c13ad7e576bc2fef158d2cc6f809e1c238ab8d5164c78484bf54@%3Cdev.beam.apache.org%3E > Vote: > https://lists.apache.org/thread.html/8b9b3768adfc40d3527d1ce5e8a51d90e5782a348a3abfb9e5dc85ef@%3Cdev.beam.apache.org%3E > Doc: > https://docs.google.com/document/d/15m1MziZ5TNd9rh_XN0YYBJfYkt0Oj-Ou9g0KFDPL2aA/edit -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (BEAM-4617) Add a dependencies guide
Chamikara Jayalath created BEAM-4617: Summary: Add a dependencies guide Key: BEAM-4617 URL: https://issues.apache.org/jira/browse/BEAM-4617 Project: Beam Issue Type: Improvement Components: website Reporter: Chamikara Jayalath Assignee: Chamikara Jayalath Initial discussion: https://lists.apache.org/thread.html/8738c13ad7e576bc2fef158d2cc6f809e1c238ab8d5164c78484bf54@%3Cdev.beam.apache.org%3E Vote: https://lists.apache.org/thread.html/8b9b3768adfc40d3527d1ce5e8a51d90e5782a348a3abfb9e5dc85ef@%3Cdev.beam.apache.org%3E Doc: https://docs.google.com/document/d/15m1MziZ5TNd9rh_XN0YYBJfYkt0Oj-Ou9g0KFDPL2aA/edit -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (BEAM-4512) Move DataflowRunner off of Maven build files
[ https://issues.apache.org/jira/browse/BEAM-4512?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16516584#comment-16516584 ] Chamikara Jayalath commented on BEAM-4512: -- Assigning to Mark, who's looking into this. > Move DataflowRunner off of Maven build files > > > Key: BEAM-4512 > URL: https://issues.apache.org/jira/browse/BEAM-4512 > Project: Beam > Issue Type: Sub-task > Components: runner-dataflow >Reporter: Chamikara Jayalath >Assignee: Mark Liu >Priority: Major > > Currently DataflowRunner (internally at Google) depends on Beam's Maven build > files. We have to move some internal build targets to use Gradle so that > Maven files can be deleted from Beam. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (BEAM-4512) Move DataflowRunner off of Maven build files
[ https://issues.apache.org/jira/browse/BEAM-4512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chamikara Jayalath reassigned BEAM-4512: Assignee: Mark Liu (was: Chamikara Jayalath) > Move DataflowRunner off of Maven build files > > > Key: BEAM-4512 > URL: https://issues.apache.org/jira/browse/BEAM-4512 > Project: Beam > Issue Type: Sub-task > Components: runner-dataflow >Reporter: Chamikara Jayalath >Assignee: Mark Liu >Priority: Major > > Currently DataflowRunner (internally at Google) depends on Beam's Maven build > files. We have to move some internal build targets to use Gradle so that > Maven files can be deleted from Beam. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (BEAM-4535) Python tests are failing for Windows
Chamikara Jayalath created BEAM-4535: Summary: Python tests are failing for Windows Key: BEAM-4535 URL: https://issues.apache.org/jira/browse/BEAM-4535 Project: Beam Issue Type: Bug Components: sdk-py-core Reporter: Chamikara Jayalath Assignee: Udi Meiri Error is: Traceback (most recent call last): File "C:\Users\deft-testing-integra\python_sdk_download\apache_beam\io\filebasedsource_test.py", line 532, in test_read_auto_pattern compression_type=CompressionTypes.AUTO)) File "C:\Users\deft-testing-integra\python_sdk_download\apache_beam\io\filebasedsource.py", line 119, in __init__ self._validate() File "C:\Users\deft-testing-integra\python_sdk_download\apache_beam\options\value_provider.py", line 133, in _f return fnc(self, *args, **kwargs) File "C:\Users\deft-testing-integra\python_sdk_download\apache_beam\io\filebasedsource.py", line 179, in _validate 'No files found based on the file pattern %s' % pattern) IOError: No files found based on the file pattern c:\windows\temp\tmpwon5_g\mytemp* -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (BEAM-4512) Move DataflowRunner off of Maven build files
Chamikara Jayalath created BEAM-4512: Summary: Move DataflowRunner off of Maven build files Key: BEAM-4512 URL: https://issues.apache.org/jira/browse/BEAM-4512 Project: Beam Issue Type: Sub-task Components: runner-dataflow Reporter: Chamikara Jayalath Assignee: Chamikara Jayalath Currently DataflowRunner (internally at Google) depends on Beam's Maven build files. We have to move some internal build targets to use Gradle so that Maven files can be deleted from Beam. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (BEAM-3788) Implement a Kafka IO for Python SDK
[ https://issues.apache.org/jira/browse/BEAM-3788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16502169#comment-16502169 ] Chamikara Jayalath commented on BEAM-3788: -- Email thread regarding this: https://lists.apache.org/thread.html/b806fcc2079fbec7bd7ae1dc619c3ff71e33c0aa51555e60f4081013@%3Cdev.beam.apache.org%3E Doc: https://docs.google.com/document/d/1ogRS-e-HYYTHsXi_l2zDUUOnvfzEbub3BFkPrYIOawU/edit > Implement a Kafka IO for Python SDK > --- > > Key: BEAM-3788 > URL: https://issues.apache.org/jira/browse/BEAM-3788 > Project: Beam > Issue Type: New Feature > Components: sdk-py-core >Reporter: Chamikara Jayalath >Assignee: Chamikara Jayalath >Priority: Major > > This will be implemented using the Splittable DoFn framework. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (BEAM-4444) Parquet IO for Python SDK
[ https://issues.apache.org/jira/browse/BEAM-4444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16500858#comment-16500858 ] Chamikara Jayalath commented on BEAM-4444: -- Hi Bruce, is this something you hope to add? Just curious. > Parquet IO for Python SDK > - > > Key: BEAM-4444 > URL: https://issues.apache.org/jira/browse/BEAM-4444 > Project: Beam > Issue Type: New Feature > Components: sdk-py-core >Reporter: Bruce Arctor >Assignee: Chamikara Jayalath >Priority: Major > > Add Parquet Support for the Python SDK. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (BEAM-3098) Upgrade Java grpc version
[ https://issues.apache.org/jira/browse/BEAM-3098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16497475#comment-16497475 ] Chamikara Jayalath commented on BEAM-3098: -- We are actively looking into this now. > Upgrade Java grpc version > - > > Key: BEAM-3098 > URL: https://issues.apache.org/jira/browse/BEAM-3098 > Project: Beam > Issue Type: Improvement > Components: sdk-java-core >Reporter: Solomon Duskis >Assignee: Chamikara Jayalath >Priority: Major > > Beam Java currently depends on grpc 1.2, which was released in March. It > would be great if the dependency could be updated to something newer, like > grpc 1.7.0 -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (BEAM-3098) Upgrade Java grpc version
[ https://issues.apache.org/jira/browse/BEAM-3098?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chamikara Jayalath reassigned BEAM-3098: Assignee: Chamikara Jayalath > Upgrade Java grpc version > - > > Key: BEAM-3098 > URL: https://issues.apache.org/jira/browse/BEAM-3098 > Project: Beam > Issue Type: Improvement > Components: sdk-java-core >Reporter: Solomon Duskis >Assignee: Chamikara Jayalath >Priority: Major > > Beam Java currently depends on grpc 1.2, which was released in March. It > would be great if the dependency could be updated to something newer, like > grpc 1.7.0 -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (BEAM-4254) Upgrade Bigtable client to 1.3
[ https://issues.apache.org/jira/browse/BEAM-4254?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chamikara Jayalath updated BEAM-4254: - Fix Version/s: (was: 2.5.0) > Upgrade Bigtable client to 1.3 > -- > > Key: BEAM-4254 > URL: https://issues.apache.org/jira/browse/BEAM-4254 > Project: Beam > Issue Type: Improvement > Components: io-java-gcp >Reporter: Valentyn Tymofieiev >Assignee: Chamikara Jayalath >Priority: Minor > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (BEAM-4254) Upgrade Bigtable client to 1.3
[ https://issues.apache.org/jira/browse/BEAM-4254?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16479877#comment-16479877 ] Chamikara Jayalath commented on BEAM-4254: -- cc: [~sduskis] Looks like this requires upgrading protobuf and gRPC dependencies which is planned but cannot be done before 2.5.0. > Upgrade Bigtable client to 1.3 > -- > > Key: BEAM-4254 > URL: https://issues.apache.org/jira/browse/BEAM-4254 > Project: Beam > Issue Type: Improvement > Components: io-java-gcp >Reporter: Valentyn Tymofieiev >Assignee: Chamikara Jayalath >Priority: Minor > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (BEAM-4257) Add error reason and table destination to BigQueryIO streaming failed inserts
[ https://issues.apache.org/jira/browse/BEAM-4257?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chamikara Jayalath reassigned BEAM-4257: Assignee: (was: Kenneth Knowles) > Add error reason and table destination to BigQueryIO streaming failed inserts > - > > Key: BEAM-4257 > URL: https://issues.apache.org/jira/browse/BEAM-4257 > Project: Beam > Issue Type: Improvement > Components: io-java-gcp >Reporter: Carlos Alonso >Priority: Minor > > When using `BigQueryIO.Write` and getting `WriteResult.getFailedInserts()` we > get a `PCollection` which is fine, but in order to properly work on > the errors downstream having extended information such as the `InsertError` > fields and the `TableReference` it was routed to would be really valuable. > > My suggestion is to create a new object that contains all that information > and return a `PCollection` of those instead. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (BEAM-4257) Add error reason and table destination to BigQueryIO streaming failed inserts
[ https://issues.apache.org/jira/browse/BEAM-4257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16469535#comment-16469535 ] Chamikara Jayalath commented on BEAM-4257: -- Carlos, it looks like you are working on this. Please contact a PMC on the dev list or Slack to get the JIRA contributor role added to your account. > Add error reason and table destination to BigQueryIO streaming failed inserts > - > > Key: BEAM-4257 > URL: https://issues.apache.org/jira/browse/BEAM-4257 > Project: Beam > Issue Type: Improvement > Components: io-java-gcp >Reporter: Carlos Alonso >Assignee: Kenneth Knowles >Priority: Minor > > When using `BigQueryIO.Write` and getting `WriteResult.getFailedInserts()` we > get a `PCollection` which is fine, but in order to properly work on > the errors downstream, having extended information such as the `InsertError` > fields and the `TableReference` it was routed to would be really valuable. > > My suggestion is to create a new object that contains all that information > and return a `PCollection` of those instead. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (BEAM-4257) Add error reason and table destination to BigQueryIO streaming failed inserts
[ https://issues.apache.org/jira/browse/BEAM-4257?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chamikara Jayalath updated BEAM-4257: - Component/s: (was: sdk-java-core) io-java-gcp > Add error reason and table destination to BigQueryIO streaming failed inserts > - > > Key: BEAM-4257 > URL: https://issues.apache.org/jira/browse/BEAM-4257 > Project: Beam > Issue Type: Improvement > Components: io-java-gcp >Reporter: Carlos Alonso >Assignee: Kenneth Knowles >Priority: Minor > > When using `BigQueryIO.Write` and getting `WriteResult.getFailedInserts()` we > get a `PCollection` which is fine, but in order to properly work on > the errors downstream having extended information such as the `InsertError` > fields and the `TableReference` it was routed to would be really valuable. > > My suggestion is to create a new object that contains all that information > and return a `PCollection` of those instead. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (BEAM-4261) CloudBigtableIO should not try to validate runtime parameters at construction time.
[ https://issues.apache.org/jira/browse/BEAM-4261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16469440#comment-16469440 ] Chamikara Jayalath commented on BEAM-4261: -- Could you mention which parameters you do not want validated? tableId? If it's just a matter of the machine submitting the job not having access to Bigtable, I think withoutValidation() is the proper solution, since validation will work for some users. > CloudBigtableIO should not try to validate runtime parameters at construction > time. > --- > > Key: BEAM-4261 > URL: https://issues.apache.org/jira/browse/BEAM-4261 > Project: Beam > Issue Type: Bug > Components: io-java-gcp >Reporter: Kevin Si >Assignee: Chamikara Jayalath >Priority: Minor > Original Estimate: 48h > Remaining Estimate: 48h > > The workaround for users is to have some default values set and override them > at runtime. > One example of validating a runtime parameter at construction time follows, and > there could be more. > > @Override > public void validate() { > ValueProvider tableId = config.getTableId(); > checkArgument(tableId != null && tableId.isAccessible() && > !tableId.get().isEmpty(), > "tableId was not supplied"); > } > > A reported issue on stackoverflow: > [https://stackoverflow.com/questions/49595921/valueprovider-type-parameters-not-getting-honored-at-the-template-execution-time] > > One concern I have is that if we disable the validation at construction time, > how do we validate it at runtime? Ideally, users should use template > parameter metadata for validation, but that is optional. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
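The deferred-validation idea discussed above can be sketched as follows. This is a simplified, hypothetical stand-in for Beam's ValueProvider pattern (class and function names here are illustrative, not the actual Beam or CloudBigtableIO API): validation is skipped while the value is inaccessible at construction time and enforced once the runner binds it.

```python
# Minimal sketch of deferring validation of a runtime-provided parameter.
# These classes are simplified stand-ins for Beam's ValueProvider concept,
# not the real API.

class RuntimeValueProvider:
    """A value that only becomes available when the pipeline actually runs."""
    def __init__(self):
        self._value = None
        self._set = False

    def set(self, value):
        # Called by the runner at execution time, once template
        # parameters are bound.
        self._value = value
        self._set = True

    def is_accessible(self):
        return self._set

    def get(self):
        if not self._set:
            raise RuntimeError("value not available at construction time")
        return self._value


def validate_table_id(table_id_provider):
    """Validate only when the value is accessible; defer otherwise."""
    if not table_id_provider.is_accessible():
        return  # construction time: template parameter not bound yet
    if not table_id_provider.get():
        raise ValueError("tableId was not supplied")


table_id = RuntimeValueProvider()
validate_table_id(table_id)   # construction time: no error, validation deferred
table_id.set("my-table")
validate_table_id(table_id)   # runtime: value present, validation passes
```

The point of the sketch is that the validate step becomes a no-op until the value is bound, which is what withoutValidation() effectively achieves for template pipelines.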
[jira] [Commented] (BEAM-3944) Convert beam_PerformanceTests_Python to use Gradle
[ https://issues.apache.org/jira/browse/BEAM-3944?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16469273#comment-16469273 ] Chamikara Jayalath commented on BEAM-3944: -- cc: [~altay] [~markflyhigh] > Convert beam_PerformanceTests_Python to use Gradle > -- > > Key: BEAM-3944 > URL: https://issues.apache.org/jira/browse/BEAM-3944 > Project: Beam > Issue Type: Sub-task > Components: testing >Reporter: Chamikara Jayalath >Priority: Major > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (BEAM-4261) CloudBigtableIO should not try to validate runtime parameters at construction time.
[ https://issues.apache.org/jira/browse/BEAM-4261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16469260#comment-16469260 ] Chamikara Jayalath commented on BEAM-4261: -- Have you tried using withoutValidation()? > CloudBigtableIO should not try to validate runtime parameters at construction > time. > --- > > Key: BEAM-4261 > URL: https://issues.apache.org/jira/browse/BEAM-4261 > Project: Beam > Issue Type: Bug > Components: io-java-gcp >Reporter: Kevin Si >Assignee: Chamikara Jayalath >Priority: Minor > Original Estimate: 48h > Remaining Estimate: 48h > > The workaround for users is to have some default values set and override them > at runtime. > One example of validating a runtime parameter at construction time follows, and > there could be more. > > @Override > public void validate() { > ValueProvider tableId = config.getTableId(); > checkArgument(tableId != null && tableId.isAccessible() && > !tableId.get().isEmpty(), > "tableId was not supplied"); > } > > A reported issue on stackoverflow: > [https://stackoverflow.com/questions/49595921/valueprovider-type-parameters-not-getting-honored-at-the-template-execution-time] > > One concern I have is that if we disable the validation at construction time, > how do we validate it at runtime? Ideally, users should use template > parameter metadata for validation, but that is optional. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (BEAM-4265) Add a dead letter queue to Python streaming BigQuery sink
Chamikara Jayalath created BEAM-4265: Summary: Add a dead letter queue to Python streaming BigQuery sink Key: BEAM-4265 URL: https://issues.apache.org/jira/browse/BEAM-4265 Project: Beam Issue Type: New Feature Components: sdk-py-core Reporter: Chamikara Jayalath When writing to BigQuery using streaming writes, the Java SDK supports writing failed records to a dead letter queue: [https://github.com/apache/beam/blob/master/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryIO.java#L1375] This is a very useful feature for long-running pipelines, so we should add this to the Python BQ sink: https://github.com/apache/beam/blob/master/sdks/python/apache_beam/io/gcp/bigquery.py#L1279 -- This message was sent by Atlassian JIRA (v7.6.3#76005)
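The dead-letter-queue pattern requested above can be sketched generically: instead of failing the bundle when a record cannot be written, route the failed record (plus its error) to a side output for later inspection or replay. The insert function and record shapes below are hypothetical placeholders, not the BigQuery client or Beam sink API.

```python
# Sketch of the dead-letter-queue pattern for a streaming sink.
# write_with_dead_letter and fake_insert are illustrative names,
# not part of Beam or the BigQuery API.

def write_with_dead_letter(records, insert_fn):
    """Try each record; return (written, failed), where each failed
    entry carries the record and the error that rejected it."""
    written, failed = [], []
    for record in records:
        try:
            insert_fn(record)
            written.append(record)
        except Exception as e:
            # In a real sink, only non-retriable errors should land here;
            # transient errors would be retried instead.
            failed.append({"record": record, "error": str(e)})
    return written, failed


def fake_insert(record):
    # Stand-in for a streaming insert that rejects malformed rows.
    if "name" not in record:
        raise ValueError("missing required field: name")


ok, dead = write_with_dead_letter([{"name": "a"}, {"id": 1}], fake_insert)
# ok keeps the valid row; dead holds the row missing "name" with its error
```

This mirrors what the linked Java code does with WriteResult.getFailedInserts(): the pipeline keeps running and failed rows surface as data rather than as job failures.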
[jira] [Commented] (BEAM-4264) Java PostCommit Spanner tests are failing due to "Instance not found"
[ https://issues.apache.org/jira/browse/BEAM-4264?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16469202#comment-16469202 ] Chamikara Jayalath commented on BEAM-4264: -- Looks like this is due to https://github.com/apache/beam/pull/4264 > Java PostCommit Spanner tests are failing due to "Instance not found" > - > > Key: BEAM-4264 > URL: https://issues.apache.org/jira/browse/BEAM-4264 > Project: Beam > Issue Type: Bug > Components: io-java-gcp >Reporter: Chamikara Jayalath >Assignee: Mairbek Khadikov >Priority: Blocker > Fix For: 2.5.0 > > > First failure triggered by the commit: > [https://builds.apache.org/job/beam_PostCommit_Java_GradleBuild/316/] > > [https://builds.apache.org/job/beam_PostCommit_Java_GradleBuild/316/testReport/junit/org.apache.beam.sdk.io.gcp.spanner/SpannerWriteIT/testWrite/] > > Caused by: io.grpc.StatusRuntimeException: NOT_FOUND: Instance not found: > projects/apache-beam-testing/instances/mairbek-deleteme resource_type: > "type.googleapis.com/google.spanner.admin.instance.v1.Instance" > resource_name: "projects/apache-beam-testing/instances/mairbek-deleteme" > description: "Instance does not exist." 
at > io.grpc.Status.asRuntimeException(Status.java:540) at > io.grpc.stub.ClientCalls$UnaryStreamToFuture.onClose(ClientCalls.java:439) at > io.grpc.ForwardingClientCallListener.onClose(ForwardingClientCallListener.java:56) > at > com.google.cloud.spanner.spi.v1.SpannerErrorInterceptor$1$1.onClose(SpannerErrorInterceptor.java:100) > at > io.grpc.ForwardingClientCallListener.onClose(ForwardingClientCallListener.java:56) > at > com.google.cloud.spanner.spi.v1.WatchdogInterceptor$MonitoredCall$1.onClose(WatchdogInterceptor.java:190) > at io.grpc.internal.ClientCallImpl.closeObserver(ClientCallImpl.java:428) at > io.grpc.internal.ClientCallImpl.access$100(ClientCallImpl.java:76) at > io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl.close(ClientCallImpl.java:514) > at > io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl.access$700(ClientCallImpl.java:431) > at > io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl$1StreamClosed.runInContext(ClientCallImpl.java:546) > at io.grpc.internal.ContextRunnable.run(ContextRunnable.java:52) at > io.grpc.internal.SerializingExecutor$TaskRunner.run(SerializingExecutor.java:152) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > ... 1 more -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (BEAM-4263) BigQuery connector reads the table size value from a deprecated field
[ https://issues.apache.org/jira/browse/BEAM-4263?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16469162#comment-16469162 ] Chamikara Jayalath commented on BEAM-4263: -- [~kjung520] I assume you are working on this. Please ask on the Beam Slack/dev list so that a PMC member can assign the JIRA contributor role to you. > BigQuery connector reads the table size value from a deprecated field > - > > Key: BEAM-4263 > URL: https://issues.apache.org/jira/browse/BEAM-4263 > Project: Beam > Issue Type: Bug > Components: io-java-gcp >Affects Versions: 3.0.0, 2.5.0 >Reporter: Kenneth Jung >Priority: Minor > > The BigQuery connector in the GCP IO module reads the totalBytesProcessed > value from a deprecated field in the job statistics: > [https://cloud.google.com/bigquery/docs/reference/rest/v2/jobs] > The non-deprecated replacement is the totalBytesProcessed field in the query > statistics. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (BEAM-4263) BigQuery connector reads the table size value from a deprecated field
[ https://issues.apache.org/jira/browse/BEAM-4263?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chamikara Jayalath reassigned BEAM-4263: Assignee: (was: Chamikara Jayalath) > BigQuery connector reads the table size value from a deprecated field > - > > Key: BEAM-4263 > URL: https://issues.apache.org/jira/browse/BEAM-4263 > Project: Beam > Issue Type: Bug > Components: io-java-gcp >Affects Versions: 3.0.0, 2.5.0 >Reporter: Kenneth Jung >Priority: Minor > > The BigQuery connector in the GCP IO module reads the totalBytesProcessed > value from a deprecated field in the job statistics: > [https://cloud.google.com/bigquery/docs/reference/rest/v2/jobs] > The non-deprecated replacement is the totalBytesProcessed field in the query > statistics. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (BEAM-4264) Java PostCommit Spanner tests are failing due to "Instance not found"
[ https://issues.apache.org/jira/browse/BEAM-4264?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16469150#comment-16469150 ] Chamikara Jayalath commented on BEAM-4264: -- cc: [~jkff] > Java PostCommit Spanner tests are failing due to "Instance not found" > - > > Key: BEAM-4264 > URL: https://issues.apache.org/jira/browse/BEAM-4264 > Project: Beam > Issue Type: Bug > Components: io-java-gcp >Reporter: Chamikara Jayalath >Assignee: Mairbek Khadikov >Priority: Blocker > Fix For: 2.5.0 > > > First failure triggered by the commit: > [https://builds.apache.org/job/beam_PostCommit_Java_GradleBuild/316/] > > [https://builds.apache.org/job/beam_PostCommit_Java_GradleBuild/316/testReport/junit/org.apache.beam.sdk.io.gcp.spanner/SpannerWriteIT/testWrite/] > > Caused by: io.grpc.StatusRuntimeException: NOT_FOUND: Instance not found: > projects/apache-beam-testing/instances/mairbek-deleteme resource_type: > "type.googleapis.com/google.spanner.admin.instance.v1.Instance" > resource_name: "projects/apache-beam-testing/instances/mairbek-deleteme" > description: "Instance does not exist." 
at > io.grpc.Status.asRuntimeException(Status.java:540) at > io.grpc.stub.ClientCalls$UnaryStreamToFuture.onClose(ClientCalls.java:439) at > io.grpc.ForwardingClientCallListener.onClose(ForwardingClientCallListener.java:56) > at > com.google.cloud.spanner.spi.v1.SpannerErrorInterceptor$1$1.onClose(SpannerErrorInterceptor.java:100) > at > io.grpc.ForwardingClientCallListener.onClose(ForwardingClientCallListener.java:56) > at > com.google.cloud.spanner.spi.v1.WatchdogInterceptor$MonitoredCall$1.onClose(WatchdogInterceptor.java:190) > at io.grpc.internal.ClientCallImpl.closeObserver(ClientCallImpl.java:428) at > io.grpc.internal.ClientCallImpl.access$100(ClientCallImpl.java:76) at > io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl.close(ClientCallImpl.java:514) > at > io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl.access$700(ClientCallImpl.java:431) > at > io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl$1StreamClosed.runInContext(ClientCallImpl.java:546) > at io.grpc.internal.ContextRunnable.run(ContextRunnable.java:52) at > io.grpc.internal.SerializingExecutor$TaskRunner.run(SerializingExecutor.java:152) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > ... 1 more -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (BEAM-4264) Java PostCommit Spanner tests are failing due to "Instance not found"
[ https://issues.apache.org/jira/browse/BEAM-4264?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chamikara Jayalath updated BEAM-4264: - Issue Type: Bug (was: New Feature) > Java PostCommit Spanner tests are failing due to "Instance not found" > - > > Key: BEAM-4264 > URL: https://issues.apache.org/jira/browse/BEAM-4264 > Project: Beam > Issue Type: Bug > Components: io-java-gcp >Reporter: Chamikara Jayalath >Assignee: Mairbek Khadikov >Priority: Blocker > Fix For: 2.5.0 > > > First failure triggered by the commit: > [https://builds.apache.org/job/beam_PostCommit_Java_GradleBuild/316/] > > [https://builds.apache.org/job/beam_PostCommit_Java_GradleBuild/316/testReport/junit/org.apache.beam.sdk.io.gcp.spanner/SpannerWriteIT/testWrite/] > > Caused by: io.grpc.StatusRuntimeException: NOT_FOUND: Instance not found: > projects/apache-beam-testing/instances/mairbek-deleteme resource_type: > "type.googleapis.com/google.spanner.admin.instance.v1.Instance" > resource_name: "projects/apache-beam-testing/instances/mairbek-deleteme" > description: "Instance does not exist." 
at > io.grpc.Status.asRuntimeException(Status.java:540) at > io.grpc.stub.ClientCalls$UnaryStreamToFuture.onClose(ClientCalls.java:439) at > io.grpc.ForwardingClientCallListener.onClose(ForwardingClientCallListener.java:56) > at > com.google.cloud.spanner.spi.v1.SpannerErrorInterceptor$1$1.onClose(SpannerErrorInterceptor.java:100) > at > io.grpc.ForwardingClientCallListener.onClose(ForwardingClientCallListener.java:56) > at > com.google.cloud.spanner.spi.v1.WatchdogInterceptor$MonitoredCall$1.onClose(WatchdogInterceptor.java:190) > at io.grpc.internal.ClientCallImpl.closeObserver(ClientCallImpl.java:428) at > io.grpc.internal.ClientCallImpl.access$100(ClientCallImpl.java:76) at > io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl.close(ClientCallImpl.java:514) > at > io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl.access$700(ClientCallImpl.java:431) > at > io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl$1StreamClosed.runInContext(ClientCallImpl.java:546) > at io.grpc.internal.ContextRunnable.run(ContextRunnable.java:52) at > io.grpc.internal.SerializingExecutor$TaskRunner.run(SerializingExecutor.java:152) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > ... 1 more -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (BEAM-4264) Java PostCommit Spanner tests are failing due to "Instance not found"
[ https://issues.apache.org/jira/browse/BEAM-4264?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16469148#comment-16469148 ] Chamikara Jayalath commented on BEAM-4264: -- Mairbek, can you fix or revert the PR? > Java PostCommit Spanner tests are failing due to "Instance not found" > - > > Key: BEAM-4264 > URL: https://issues.apache.org/jira/browse/BEAM-4264 > Project: Beam > Issue Type: Bug > Components: io-java-gcp >Reporter: Chamikara Jayalath >Assignee: Mairbek Khadikov >Priority: Blocker > Fix For: 2.5.0 > > > First failure triggered by the commit: > [https://builds.apache.org/job/beam_PostCommit_Java_GradleBuild/316/] > > [https://builds.apache.org/job/beam_PostCommit_Java_GradleBuild/316/testReport/junit/org.apache.beam.sdk.io.gcp.spanner/SpannerWriteIT/testWrite/] > > Caused by: io.grpc.StatusRuntimeException: NOT_FOUND: Instance not found: > projects/apache-beam-testing/instances/mairbek-deleteme resource_type: > "type.googleapis.com/google.spanner.admin.instance.v1.Instance" > resource_name: "projects/apache-beam-testing/instances/mairbek-deleteme" > description: "Instance does not exist." 
at > io.grpc.Status.asRuntimeException(Status.java:540) at > io.grpc.stub.ClientCalls$UnaryStreamToFuture.onClose(ClientCalls.java:439) at > io.grpc.ForwardingClientCallListener.onClose(ForwardingClientCallListener.java:56) > at > com.google.cloud.spanner.spi.v1.SpannerErrorInterceptor$1$1.onClose(SpannerErrorInterceptor.java:100) > at > io.grpc.ForwardingClientCallListener.onClose(ForwardingClientCallListener.java:56) > at > com.google.cloud.spanner.spi.v1.WatchdogInterceptor$MonitoredCall$1.onClose(WatchdogInterceptor.java:190) > at io.grpc.internal.ClientCallImpl.closeObserver(ClientCallImpl.java:428) at > io.grpc.internal.ClientCallImpl.access$100(ClientCallImpl.java:76) at > io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl.close(ClientCallImpl.java:514) > at > io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl.access$700(ClientCallImpl.java:431) > at > io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl$1StreamClosed.runInContext(ClientCallImpl.java:546) > at io.grpc.internal.ContextRunnable.run(ContextRunnable.java:52) at > io.grpc.internal.SerializingExecutor$TaskRunner.run(SerializingExecutor.java:152) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > ... 1 more -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (BEAM-4264) Java PostCommit Spanner tests are failing due to "Instance not found"
Chamikara Jayalath created BEAM-4264: Summary: Java PostCommit Spanner tests are failing due to "Instance not found" Key: BEAM-4264 URL: https://issues.apache.org/jira/browse/BEAM-4264 Project: Beam Issue Type: New Feature Components: io-java-gcp Reporter: Chamikara Jayalath Assignee: Mairbek Khadikov Fix For: 2.5.0 First failure triggered by the commit: [https://builds.apache.org/job/beam_PostCommit_Java_GradleBuild/316/] [https://builds.apache.org/job/beam_PostCommit_Java_GradleBuild/316/testReport/junit/org.apache.beam.sdk.io.gcp.spanner/SpannerWriteIT/testWrite/] Caused by: io.grpc.StatusRuntimeException: NOT_FOUND: Instance not found: projects/apache-beam-testing/instances/mairbek-deleteme resource_type: "type.googleapis.com/google.spanner.admin.instance.v1.Instance" resource_name: "projects/apache-beam-testing/instances/mairbek-deleteme" description: "Instance does not exist." at io.grpc.Status.asRuntimeException(Status.java:540) at io.grpc.stub.ClientCalls$UnaryStreamToFuture.onClose(ClientCalls.java:439) at io.grpc.ForwardingClientCallListener.onClose(ForwardingClientCallListener.java:56) at com.google.cloud.spanner.spi.v1.SpannerErrorInterceptor$1$1.onClose(SpannerErrorInterceptor.java:100) at io.grpc.ForwardingClientCallListener.onClose(ForwardingClientCallListener.java:56) at com.google.cloud.spanner.spi.v1.WatchdogInterceptor$MonitoredCall$1.onClose(WatchdogInterceptor.java:190) at io.grpc.internal.ClientCallImpl.closeObserver(ClientCallImpl.java:428) at io.grpc.internal.ClientCallImpl.access$100(ClientCallImpl.java:76) at io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl.close(ClientCallImpl.java:514) at io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl.access$700(ClientCallImpl.java:431) at io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl$1StreamClosed.runInContext(ClientCallImpl.java:546) at io.grpc.internal.ContextRunnable.run(ContextRunnable.java:52) at 
io.grpc.internal.SerializingExecutor$TaskRunner.run(SerializingExecutor.java:152) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) ... 1 more -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (BEAM-3516) SpannerWriteGroupFn does not respect mutation limits
[ https://issues.apache.org/jira/browse/BEAM-3516?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chamikara Jayalath reassigned BEAM-3516: Assignee: Mairbek Khadikov (was: Chamikara Jayalath) > SpannerWriteGroupFn does not respect mutation limits > > > Key: BEAM-3516 > URL: https://issues.apache.org/jira/browse/BEAM-3516 > Project: Beam > Issue Type: Bug > Components: runner-dataflow >Affects Versions: 2.2.0 >Reporter: Ryan Gordon >Assignee: Mairbek Khadikov >Priority: Major > Time Spent: 3h 40m > Remaining Estimate: 0h > > When using SpannerIO.write(), if it happens to be a large batch or a table > with indexes, it's very possible it can hit the Spanner mutations limit > and fail with the following error: > {quote}Jan 02, 2018 2:42:59 PM > org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process > SEVERE: 2018-01-02T22:42:57.873Z: (3e7c871d215e890b): > com.google.cloud.spanner.SpannerException: INVALID_ARGUMENT: > io.grpc.StatusRuntimeException: INVALID_ARGUMENT: The transaction contains > too many mutations. Insert and update operations count with the multiplicity > of the number of columns they affect. For example, inserting values into one > key column and four non-key columns count as five mutations total for the > insert. Delete and delete range operations count as one mutation regardless > of the number of columns affected. The total mutation count includes any > changes to indexes that the transaction generates. Please reduce the number > of writes, or use fewer indexes. (Maximum number: 2) > links { > description: "Cloud Spanner limits documentation." 
> url: "https://cloud.google.com/spanner/docs/limits"; > } > at > com.google.cloud.spanner.SpannerExceptionFactory.newSpannerExceptionPreformatted(SpannerExceptionFactory.java:119) > at > com.google.cloud.spanner.SpannerExceptionFactory.newSpannerException(SpannerExceptionFactory.java:43) > at > com.google.cloud.spanner.SpannerExceptionFactory.newSpannerException(SpannerExceptionFactory.java:80) > at > com.google.cloud.spanner.spi.v1.GrpcSpannerRpc.get(GrpcSpannerRpc.java:404) > at > com.google.cloud.spanner.spi.v1.GrpcSpannerRpc.commit(GrpcSpannerRpc.java:376) > at > com.google.cloud.spanner.SpannerImpl$SessionImpl$2.call(SpannerImpl.java:729) > at > com.google.cloud.spanner.SpannerImpl$SessionImpl$2.call(SpannerImpl.java:726) > at com.google.cloud.spanner.SpannerImpl.runWithRetries(SpannerImpl.java:200) > at > com.google.cloud.spanner.SpannerImpl$SessionImpl.writeAtLeastOnce(SpannerImpl.java:725) > at > com.google.cloud.spanner.SessionPool$PooledSession.writeAtLeastOnce(SessionPool.java:248) > at > com.google.cloud.spanner.DatabaseClientImpl.writeAtLeastOnce(DatabaseClientImpl.java:37) > at > org.apache.beam.sdk.io.gcp.spanner.SpannerWriteGroupFn.flushBatch(SpannerWriteGroupFn.java:108) > at > org.apache.beam.sdk.io.gcp.spanner.SpannerWriteGroupFn.processElement(SpannerWriteGroupFn.java:79) > {quote} > > As a workaround we can override the "withBatchSizeBytes" to something much > smaller: > {quote}mutations.apply("Write", SpannerIO > .write() > // Artificially reduce the max batch size b/c the batcher currently doesn't > // take into account the 2 mutation multiplicity limit > .withBatchSizeBytes(1024) // 1KB > .withProjectId("#PROJECTID#") > .withInstanceId("#INSTANCE#") > .withDatabaseId("#DATABASE#") > ); > {quote} > While this is not as efficient, it at least allows it to work consistently -- This message was sent by Atlassian JIRA (v7.6.3#76005)
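As a rough aid for choosing a batch size, the mutation-counting rules quoted in the error above can be estimated directly. This is an illustrative sketch only: the dict-based mutation shape is invented here (not the Beam Mutation type), and index-generated mutations are not modeled, so the estimate is a lower bound on the count Spanner actually enforces.

```python
def estimate_mutation_count(mutations):
    """Estimate Spanner's mutation count for a batch.

    Per the rules in the error message: insert and update operations count
    once per affected column; delete and delete-range operations count as
    one regardless of columns. Index-generated mutations are NOT modeled,
    so this is a lower bound. Each mutation is a simplified dict like
    {"op": "insert", "columns": [...]} or {"op": "delete"}.
    """
    total = 0
    for m in mutations:
        if m["op"] in ("insert", "update", "insert_or_update", "replace"):
            total += len(m["columns"])  # one mutation per affected column
        else:
            total += 1  # delete / delete range counts as one
    return total


batch = [
    {"op": "insert", "columns": ["key", "c1", "c2", "c3", "c4"]},  # counts as 5
    {"op": "delete"},                                              # counts as 1
]
# estimate_mutation_count(batch) == 6
```

A batcher that grouped by an estimate like this, rather than by bytes alone, would address the issue more directly than shrinking withBatchSizeBytes.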
[jira] [Assigned] (BEAM-4248) Upgrade Bigquery to com.google.cloud library
[ https://issues.apache.org/jira/browse/BEAM-4248?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chamikara Jayalath reassigned BEAM-4248: Assignee: (was: Chamikara Jayalath) > Upgrade Bigquery to com.google.cloud library > > > Key: BEAM-4248 > URL: https://issues.apache.org/jira/browse/BEAM-4248 > Project: Beam > Issue Type: New Feature > Components: io-java-gcp >Reporter: Andrew Pilloud >Priority: Major > > Bigquery is using the really old com.google.api.services client library. We > should upgrade to the com.google.cloud version which includes new features > and ENUMs for all the constants. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (BEAM-4244) Provide a better way for programmatically handling errors raised while encoding/decoding data
Chamikara Jayalath created BEAM-4244: Summary: Provide a better way for programmatically handling errors raised while encoding/decoding data Key: BEAM-4244 URL: https://issues.apache.org/jira/browse/BEAM-4244 Project: Beam Issue Type: New Feature Components: beam-model, runner-core Reporter: Chamikara Jayalath Beam runners use coders in various stages of a pipeline to encode/decode data. Coders are executed directly by the runner of a pipeline, and users do not have control over exceptions raised during encoding/decoding (these could be due either to malformed/corrupted data provided by users or to intermediate malformed/corrupted data generated during system execution). Currently users can rely on runner-specific worker logging to detect the error and update the pipeline, but it would be better if we could provide a way to programmatically handle these errors. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
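The kind of programmatic error handling proposed above can be sketched as a decoder wrapper that routes failures to a user-supplied handler instead of failing the whole bundle. All names below are illustrative, not Beam's Coder API.

```python
# Sketch: surface decode errors to a user-supplied handler rather than
# crashing the worker. decode_with_handler is a hypothetical helper,
# not part of the Beam SDK.

def decode_with_handler(decode, encoded_elements, on_error):
    """Decode each element; on failure, pass (raw, exception) to on_error
    and continue with the remaining elements."""
    decoded = []
    for raw in encoded_elements:
        try:
            decoded.append(decode(raw))
        except Exception as e:
            on_error(raw, e)  # e.g. log, count, or send to a dead letter queue
    return decoded


bad_records = []
result = decode_with_handler(
    int,                          # toy "coder": parse ints from strings
    ["1", "2", "oops", "3"],
    lambda raw, e: bad_records.append(raw),
)
# result keeps the decodable elements; bad_records captures "oops"
```

In a real runner the handler hook would have to be part of the coder contract so that corrupted elements become observable data instead of opaque worker-log errors, which is essentially what the issue asks for.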
[jira] [Updated] (BEAM-3973) Allow to disable batch API in SpannerIO
[ https://issues.apache.org/jira/browse/BEAM-3973?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chamikara Jayalath updated BEAM-3973: - Priority: Blocker (was: Major) > Allow to disable batch API in SpannerIO > --- > > Key: BEAM-3973 > URL: https://issues.apache.org/jira/browse/BEAM-3973 > Project: Beam > Issue Type: Bug > Components: io-java-gcp >Affects Versions: 2.4.0 >Reporter: Mairbek Khadikov >Assignee: Mairbek Khadikov >Priority: Blocker > Fix For: 2.5.0 > > Time Spent: 3h 40m > Remaining Estimate: 0h > > In 2.4.0, SpannerIO#read has been migrated to use batch API. The batch API > provides abstractions to scale out reads from Spanner, but it requires the > query to be root-partitionable. The root-partitionable queries cover majority > of the use cases, however there are examples when running arbitrary query is > useful. For example, reading all the table names from the > information_schema.* and reading the content of those tables in the next > step. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (BEAM-3973) Allow to disable batch API in SpannerIO
[ https://issues.apache.org/jira/browse/BEAM-3973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16461708#comment-16461708 ] Chamikara Jayalath commented on BEAM-3973: -- I think this is a 2.5.0 blocker. Mairbek, can you confirm? > Allow to disable batch API in SpannerIO > --- > > Key: BEAM-3973 > URL: https://issues.apache.org/jira/browse/BEAM-3973 > Project: Beam > Issue Type: Bug > Components: io-java-gcp >Affects Versions: 2.4.0 >Reporter: Mairbek Khadikov >Assignee: Mairbek Khadikov >Priority: Major > Fix For: 2.5.0 > > Time Spent: 3h 40m > Remaining Estimate: 0h > > In 2.4.0, SpannerIO#read has been migrated to use batch API. The batch API > provides abstractions to scale out reads from Spanner, but it requires the > query to be root-partitionable. The root-partitionable queries cover majority > of the use cases, however there are examples when running arbitrary query is > useful. For example, reading all the table names from the > information_schema.* and reading the content of those tables in the next > step. -- This message was sent by Atlassian JIRA (v7.6.3#76005)