[jira] [Assigned] (BEAM-4543) Remove dependency on googledatastore in favor of google-cloud-datastore.

2018-10-10 Thread Chamikara Jayalath (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4543?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chamikara Jayalath reassigned BEAM-4543:


Assignee: Udi Meiri  (was: Valentyn Tymofieiev)

> Remove dependency on googledatastore in favor of google-cloud-datastore.
> 
>
> Key: BEAM-4543
> URL: https://issues.apache.org/jira/browse/BEAM-4543
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-py-core
>Reporter: Valentyn Tymofieiev
>Assignee: Udi Meiri
>Priority: Minor
>
> The apache-beam[gcp] package depends [1] on the googledatastore package [2].
> We should replace this dependency with google-cloud-datastore [3], which is
> officially supported, has a better release cadence, and supports Python 3.
> [1] 
> https://github.com/apache/beam/blob/fad655462f8fadfdfaab0b7a09cab538f076f94e/sdks/python/setup.py#L126
> [2] [https://pypi.org/project/googledatastore/]
> [3] [https://pypi.org/project/google-cloud-datastore/]
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Comment Edited] (BEAM-5514) BigQueryIO doesn't handle quotaExceeded errors properly

2018-10-09 Thread Chamikara Jayalath (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-5514?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16644101#comment-16644101
 ] 

Chamikara Jayalath edited comment on BEAM-5514 at 10/9/18 9:35 PM:
---

Thanks.

I believe HTTP 403 errors in general are considered non-retriable, so it makes
sense for Dataflow not to retry requests at the client. In fact, the BigQuery
support page provides the following instructions regarding HTTP 403
quotaExceeded errors.

https://cloud.google.com/bigquery/troubleshooting-errors

"View the {{message}} property of the error object for more information about 
which quota was exceeded. To reset or raise a BigQuery quota, [contact 
support|https://cloud.google.com/support]. To modify a custom quota, submit a 
request from the [Google Cloud Platform 
Console|https://console.cloud.google.com/iam-admin/quotas] page."

So basically this asks the user to fix the issue (request a quota increase)
before retrying.

The issue is that, due to the architecture of Dataflow streaming jobs, even
though we do not retry at the client, we do in fact retry all work items
indefinitely. So we end up sending a large number of requests to BigQuery
whenever a user hits a quota error.

was (Author: chamikara):
Kevin,

Sounds like 

> BigQueryIO doesn't handle quotaExceeded errors properly
> ---
>
> Key: BEAM-5514
> URL: https://issues.apache.org/jira/browse/BEAM-5514
> Project: Beam
>  Issue Type: Bug
>  Components: io-java-gcp
>Reporter: Kevin Peterson
>Assignee: Reuven Lax
>Priority: Major
>
> When exceeding a streaming quota for BigQuery insertAll requests, BigQuery 
> returns a 403 with reason "quotaExceeded".
> The current implementation of BigQueryIO does not consider this to be a
> rate-limited exception, and therefore does not perform exponential backoff
> properly, leading to repeated calls to BQ.
> The actual error is in the
> [ApiErrorExtractor|https://github.com/GoogleCloudPlatform/bigdata-interop/blob/master/util/src/main/java/com/google/cloud/hadoop/util/ApiErrorExtractor.java#L263]
> class, which is called from
> [BigQueryServicesImpl|https://github.com/apache/beam/blob/master/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryServicesImpl.java#L739]
> to determine how to retry the failure.
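
For illustration only, the behaviour the description asks for could look
roughly like the sketch below: a 403 whose reason is quotaExceeded is
classified as rate-limited, and retries are spaced with capped exponential
backoff. The class QuotaBackoffSketch, its method names, the inclusion of
rateLimitExceeded, and the delay values are assumptions made for this example;
this is not the actual ApiErrorExtractor or BigQueryServicesImpl code.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper for illustration; not part of Beam or bigdata-interop. */
final class QuotaBackoffSketch {

  /** Treats a 403 with one of these reasons as rate-limited (an assumption). */
  static boolean isRateLimited(int statusCode, String reason) {
    return statusCode == 403
        && ("rateLimitExceeded".equals(reason) || "quotaExceeded".equals(reason));
  }

  /** Delay before retry number attempt (0-based): initial * 2^attempt, capped at max. */
  static long backoffMillis(int attempt, long initialMillis, long maxMillis) {
    long delay = initialMillis;
    // Stop doubling once the cap is reached so the value cannot overflow.
    for (int i = 0; i < attempt && delay < maxMillis; i++) {
      delay *= 2;
    }
    return Math.min(delay, maxMillis);
  }

  /** Returns the delays a caller would sleep for across maxRetries retries. */
  static List<Long> schedule(int maxRetries, long initialMillis, long maxMillis) {
    List<Long> delays = new ArrayList<>();
    for (int attempt = 0; attempt < maxRetries; attempt++) {
      delays.add(backoffMillis(attempt, initialMillis, maxMillis));
    }
    return delays;
  }
}
```

With an initial delay of 1 second and a 60 second cap, schedule(3, 1000, 60000)
yields delays of 1000, 2000 and 4000 ms.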





[jira] [Commented] (BEAM-5514) BigQueryIO doesn't handle quotaExceeded errors properly

2018-10-09 Thread Chamikara Jayalath (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-5514?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16644101#comment-16644101
 ] 

Chamikara Jayalath commented on BEAM-5514:
--

Kevin,

Sounds like 

> BigQueryIO doesn't handle quotaExceeded errors properly
> ---
>
> Key: BEAM-5514
> URL: https://issues.apache.org/jira/browse/BEAM-5514
> Project: Beam
>  Issue Type: Bug
>  Components: io-java-gcp
>Reporter: Kevin Peterson
>Assignee: Reuven Lax
>Priority: Major
>
> When exceeding a streaming quota for BigQuery insertAll requests, BigQuery 
> returns a 403 with reason "quotaExceeded".
> The current implementation of BigQueryIO does not consider this to be a
> rate-limited exception, and therefore does not perform exponential backoff
> properly, leading to repeated calls to BQ.
> The actual error is in the
> [ApiErrorExtractor|https://github.com/GoogleCloudPlatform/bigdata-interop/blob/master/util/src/main/java/com/google/cloud/hadoop/util/ApiErrorExtractor.java#L263]
> class, which is called from
> [BigQueryServicesImpl|https://github.com/apache/beam/blob/master/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryServicesImpl.java#L739]
> to determine how to retry the failure.





[jira] [Updated] (BEAM-5456) Update google-api-client libraries to 1.25

2018-10-09 Thread Chamikara Jayalath (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5456?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chamikara Jayalath updated BEAM-5456:
-
Fix Version/s: (was: 2.8.0)
   2.9.0

> Update google-api-client libraries to 1.25
> --
>
> Key: BEAM-5456
> URL: https://issues.apache.org/jira/browse/BEAM-5456
> Project: Beam
>  Issue Type: Improvement
>  Components: io-java-gcp
>Reporter: Chamikara Jayalath
>Assignee: Chamikara Jayalath
>Priority: Blocker
> Fix For: 2.9.0
>
>
> This version updates authentication URLs
> ([https://github.com/googleapis/google-api-java-client/releases]), which is
> needed for certain features.





[jira] [Commented] (BEAM-5456) Update google-api-client libraries to 1.25

2018-10-09 Thread Chamikara Jayalath (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-5456?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16643879#comment-16643879
 ] 

Chamikara Jayalath commented on BEAM-5456:
--

Moved to 2.9.0.

> Update google-api-client libraries to 1.25
> --
>
> Key: BEAM-5456
> URL: https://issues.apache.org/jira/browse/BEAM-5456
> Project: Beam
>  Issue Type: Improvement
>  Components: io-java-gcp
>Reporter: Chamikara Jayalath
>Assignee: Chamikara Jayalath
>Priority: Blocker
> Fix For: 2.9.0
>
>
> This version updates authentication URLs
> ([https://github.com/googleapis/google-api-java-client/releases]), which is
> needed for certain features.





[jira] [Comment Edited] (BEAM-5514) BigQueryIO doesn't handle quotaExceeded errors properly

2018-10-09 Thread Chamikara Jayalath (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-5514?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16643834#comment-16643834
 ] 

Chamikara Jayalath edited comment on BEAM-5514 at 10/9/18 6:06 PM:
---

I'm trying to determine the priority at which this should be addressed.

[~reuvenlax] is there any reason why we rely on work item retries instead of
retrying BQ streaming write requests with exponential backoff?


was (Author: chamikara):
I'm trying to determine the priority at which this should be addressed.

 

[~reuvenlax] any reason why do rely on workitems retries instead of retrying BQ 
streaming write requests with exponential backoff ?

> BigQueryIO doesn't handle quotaExceeded errors properly
> ---
>
> Key: BEAM-5514
> URL: https://issues.apache.org/jira/browse/BEAM-5514
> Project: Beam
>  Issue Type: Bug
>  Components: io-java-gcp
>Reporter: Kevin Peterson
>Assignee: Reuven Lax
>Priority: Major
>
> When exceeding a streaming quota for BigQuery insertAll requests, BigQuery 
> returns a 403 with reason "quotaExceeded".
> The current implementation of BigQueryIO does not consider this to be a
> rate-limited exception, and therefore does not perform exponential backoff
> properly, leading to repeated calls to BQ.
> The actual error is in the
> [ApiErrorExtractor|https://github.com/GoogleCloudPlatform/bigdata-interop/blob/master/util/src/main/java/com/google/cloud/hadoop/util/ApiErrorExtractor.java#L263]
> class, which is called from
> [BigQueryServicesImpl|https://github.com/apache/beam/blob/master/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryServicesImpl.java#L739]
> to determine how to retry the failure.





[jira] [Assigned] (BEAM-5514) BigQueryIO doesn't handle quotaExceeded errors properly

2018-10-09 Thread Chamikara Jayalath (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5514?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chamikara Jayalath reassigned BEAM-5514:


Assignee: Reuven Lax  (was: Chamikara Jayalath)

> BigQueryIO doesn't handle quotaExceeded errors properly
> ---
>
> Key: BEAM-5514
> URL: https://issues.apache.org/jira/browse/BEAM-5514
> Project: Beam
>  Issue Type: Bug
>  Components: io-java-gcp
>Reporter: Kevin Peterson
>Assignee: Reuven Lax
>Priority: Major
>
> When exceeding a streaming quota for BigQuery insertAll requests, BigQuery 
> returns a 403 with reason "quotaExceeded".
> The current implementation of BigQueryIO does not consider this to be a
> rate-limited exception, and therefore does not perform exponential backoff
> properly, leading to repeated calls to BQ.
> The actual error is in the
> [ApiErrorExtractor|https://github.com/GoogleCloudPlatform/bigdata-interop/blob/master/util/src/main/java/com/google/cloud/hadoop/util/ApiErrorExtractor.java#L263]
> class, which is called from
> [BigQueryServicesImpl|https://github.com/apache/beam/blob/master/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryServicesImpl.java#L739]
> to determine how to retry the failure.





[jira] [Commented] (BEAM-5514) BigQueryIO doesn't handle quotaExceeded errors properly

2018-10-09 Thread Chamikara Jayalath (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-5514?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16643834#comment-16643834
 ] 

Chamikara Jayalath commented on BEAM-5514:
--

I'm trying to determine the priority at which this should be addressed.

[~reuvenlax] is there any reason why we rely on work item retries instead of
retrying BQ streaming write requests with exponential backoff?

> BigQueryIO doesn't handle quotaExceeded errors properly
> ---
>
> Key: BEAM-5514
> URL: https://issues.apache.org/jira/browse/BEAM-5514
> Project: Beam
>  Issue Type: Bug
>  Components: io-java-gcp
>Reporter: Kevin Peterson
>Assignee: Chamikara Jayalath
>Priority: Major
>
> When exceeding a streaming quota for BigQuery insertAll requests, BigQuery 
> returns a 403 with reason "quotaExceeded".
> The current implementation of BigQueryIO does not consider this to be a
> rate-limited exception, and therefore does not perform exponential backoff
> properly, leading to repeated calls to BQ.
> The actual error is in the
> [ApiErrorExtractor|https://github.com/GoogleCloudPlatform/bigdata-interop/blob/master/util/src/main/java/com/google/cloud/hadoop/util/ApiErrorExtractor.java#L263]
> class, which is called from
> [BigQueryServicesImpl|https://github.com/apache/beam/blob/master/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryServicesImpl.java#L739]
> to determine how to retry the failure.





[jira] [Created] (BEAM-5670) Add more integration tests for BigQueryIO

2018-10-05 Thread Chamikara Jayalath (JIRA)
Chamikara Jayalath created BEAM-5670:


 Summary: Add more integration tests for BigQueryIO
 Key: BEAM-5670
 URL: https://issues.apache.org/jira/browse/BEAM-5670
 Project: Beam
  Issue Type: Test
  Components: io-java-gcp, testing
Reporter: Chamikara Jayalath
Assignee: Pablo Estrada


It seems we currently have only a single test that reads directly using a
query:

[https://github.com/apache/beam/blob/328129bf033bc6be16bc8e09af905f37b7516412/examples/java/src/test/java/org/apache/beam/examples/cookbook/BigQueryTornadoesIT.java]

We should consider adding more integration tests. For example:

(1) Read directly from a given table and dataset.
(2) Read from a federated table.
(3) Read using BQ legacy SQL.
(4) Read from a table with nested/repeated fields.
(5) Read from non-standard BQ regions (for example, Japan).

Also, we should consider adding tests for BQ streaming writes once we have
framework support for that.





[jira] [Commented] (BEAM-5036) Optimize FileBasedSink's WriteOperation.moveToOutput()

2018-10-04 Thread Chamikara Jayalath (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-5036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16638584#comment-16638584
 ] 

Chamikara Jayalath commented on BEAM-5036:
--

Should this be marked as a blocker for 2.8.0? The PR is still in review.

> Optimize FileBasedSink's WriteOperation.moveToOutput()
> --
>
> Key: BEAM-5036
> URL: https://issues.apache.org/jira/browse/BEAM-5036
> Project: Beam
>  Issue Type: Improvement
>  Components: io-java-files
>Affects Versions: 2.5.0
>Reporter: Jozef Vilcek
>Assignee: Tim Robertson
>Priority: Major
> Fix For: 2.8.0
>
>  Time Spent: 9h 40m
>  Remaining Estimate: 0h
>
> The moveToOutput() methods in FileBasedSink.WriteOperation implement move as
> copy+delete. It would be better to use a rename(), which can be much more
> efficient on some filesystems.
> The filesystem must support cross-directory rename. BEAM-4861 is related to
> this for the HDFS filesystem.
> Feature was discussed here:
> http://mail-archives.apache.org/mod_mbox/beam-dev/201807.mbox/%3CCAF9t7_4Mp54pQ+vRrJrBh9Vx0=uaknupzd_qdh_qdm9vxll...@mail.gmail.com%3E
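
The difference between the two strategies can be sketched with java.nio.file
on a local filesystem, used here as a stand-in for Beam's filesystem
abstraction. The class MoveToOutputSketch and its move() helper are
hypothetical names; the sketch tries an atomic rename first and falls back to
copy+delete only when the filesystem reports that an atomic move is not
supported.

```java
import java.io.IOException;
import java.nio.file.AtomicMoveNotSupportedException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

/** Hypothetical illustration of rename versus copy+delete; not FileBasedSink code. */
final class MoveToOutputSketch {

  /** Returns true if a rename was used, false if the copy+delete fallback ran. */
  static boolean move(Path source, Path target) throws IOException {
    try {
      // A rename only updates metadata, so its cost does not depend on file size.
      Files.move(source, target, StandardCopyOption.ATOMIC_MOVE);
      return true;
    } catch (AtomicMoveNotSupportedException e) {
      // Fallback: copy every byte, then remove the temporary file.
      Files.copy(source, target, StandardCopyOption.REPLACE_EXISTING);
      Files.delete(source);
      return false;
    }
  }
}
```

Either path leaves the data at the target and removes the source; the boolean
result only reports which strategy was used.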





[jira] [Resolved] (BEAM-5342) Migrate google-api-client libraries to 1.24.1

2018-10-01 Thread Chamikara Jayalath (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5342?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chamikara Jayalath resolved BEAM-5342.
--
   Resolution: Fixed
Fix Version/s: 2.8.0

> Migrate google-api-client libraries to 1.24.1
> -
>
> Key: BEAM-5342
> URL: https://issues.apache.org/jira/browse/BEAM-5342
> Project: Beam
>  Issue Type: Improvement
>  Components: io-java-gcp, runner-dataflow
>Reporter: Chamikara Jayalath
>Assignee: Chamikara Jayalath
>Priority: Major
> Fix For: 2.8.0
>
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> We currently use the 1.23 libraries, which are about a year old. We should
> migrate to the more recent 1.24.1, which fixes several known issues.
>  





[jira] [Created] (BEAM-5517) Update Python BigQuery source to use fastavro module to read exported data

2018-09-26 Thread Chamikara Jayalath (JIRA)
Chamikara Jayalath created BEAM-5517:


 Summary: Update Python BigQuery source to use fastavro module to 
read exported data
 Key: BEAM-5517
 URL: https://issues.apache.org/jira/browse/BEAM-5517
 Project: Beam
  Issue Type: Improvement
  Components: sdk-py-core
Reporter: Chamikara Jayalath
Assignee: Chamikara Jayalath


Currently we use the avro module for reading data exported by BigQuery. Moving
to fastavro should result in a significant performance boost.

Creating this Jira for tracking, but most of the changes should be in the runner
(DataflowRunner) since the current BigQuery source is a native source.





[jira] [Resolved] (BEAM-5408) (Java) Using Compression.GZIP with TFRecordIO

2018-09-20 Thread Chamikara Jayalath (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5408?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chamikara Jayalath resolved BEAM-5408.
--
   Resolution: Fixed
Fix Version/s: 2.8.0

> (Java) Using Compression.GZIP with TFRecordIO
> -
>
> Key: BEAM-5408
> URL: https://issues.apache.org/jira/browse/BEAM-5408
> Project: Beam
>  Issue Type: Bug
>  Components: io-java-gcp
>Affects Versions: 2.4.0
>Reporter: haden lee
>Assignee: Chamikara Jayalath
>Priority: Major
> Fix For: 2.8.0
>
>
> In short, `TFRecordIO.read()` does not seem to work if the entry being read is
> longer than 8,192 bytes (byte[] length). `TFRecordIO.write()` seems to be OK
> with this though (based on some experiments). Perhaps there is some hard-coded
> value for this specific length somewhere in the SDK, and I'm wondering if it
> can be increased or parameterized.
> [I've posted this on 
> StackOverflow|https://stackoverflow.com/questions/52284639/beam-java-sdk-with-tfrecord-and-compression-gzip],
>  but I was advised to report it here.
> Here are the details:
> We're using Beam Java SDK (and Google Cloud Dataflow to run batch jobs) a 
> lot, and we noticed something weird (possibly a bug?) when we tried to use 
> `TFRecordIO` with `Compression.GZIP`. We were able to come up with some 
> sample code that can reproduce the errors we face.
> To be clear, we are using Beam Java SDK 2.4.
> Suppose we have `PCollection` which can be a PC of proto messages, 
> for instance, in byte[] format.
>  We usually write this to GCS (Google Cloud Storage) using Base64 encoding 
> (newline delimited Strings) or using TFRecordIO (without compression). We 
> have had no issue reading the data from GCS in this manner for a very long 
> time (2.5+ years for the former and ~1.5 years for the latter).
> Recently, we tried `TFRecordIO` with the `Compression.GZIP` option, and
> *sometimes* we get an exception because the data is seen as invalid while
> being read. The data itself (the gzip files) is not corrupted; after testing
> various things, we reached the following conclusion.
> When a `byte[]` that is being compressed under `TFRecordIO` is above a certain
> threshold (I'd say at or above 8192 bytes), then
> `TFRecordIO.read().withCompression(Compression.GZIP)` does not work.
> Specifically, it throws the following exception:
>  
> {code:java}
> Exception in thread "main" java.lang.IllegalStateException: Invalid data
>     at org.apache.beam.sdk.repackaged.com.google.common.base.Preconditions.checkState(Preconditions.java:444)
>     at org.apache.beam.sdk.io.TFRecordIO$TFRecordCodec.read(TFRecordIO.java:642)
>     at org.apache.beam.sdk.io.TFRecordIO$TFRecordSource$TFRecordReader.readNextRecord(TFRecordIO.java:526)
>     at org.apache.beam.sdk.io.CompressedSource$CompressedReader.readNextRecord(CompressedSource.java:426)
>     at org.apache.beam.sdk.io.FileBasedSource$FileBasedReader.advanceImpl(FileBasedSource.java:473)
>     at org.apache.beam.sdk.io.FileBasedSource$FileBasedReader.startImpl(FileBasedSource.java:468)
>     at org.apache.beam.sdk.io.OffsetBasedSource$OffsetBasedReader.start(OffsetBasedSource.java:261)
>     at org.apache.beam.runners.direct.BoundedReadEvaluatorFactory$BoundedReadEvaluator.processElement(BoundedReadEvaluatorFactory.java:141)
>     at org.apache.beam.runners.direct.DirectTransformExecutor.processElements(DirectTransformExecutor.java:161)
>     at org.apache.beam.runners.direct.DirectTransformExecutor.run(DirectTransformExecutor.java:125)
>     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>     at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>     at java.lang.Thread.run(Thread.java:748)
> {code}
>  
> This can be reproduced easily, so you can refer to the code at the end. You 
> will also see comments about the byte array length (as I tested with various 
> sizes, I concluded that 8192 is the magic number).
> So I'm wondering if this is a bug or known issue – I couldn't find anything 
> close to this on Apache Beam's Issue Tracker [here][1] but if there is 
> another forum/site I need to check, please let me know!
>  If this is indeed a bug, what would be the right channel to report this?
> —
>  The following code can reproduce the error we have.
> A successful run (with parameters 1, 39, 100) would show the following 
> message at the end:
> {code:java}
> // code placeholder
>  counter metrics from CountDoFn
> [counter] plain_base64_proto_array_len: 8126
> [counter] plain_base64_proto_in: 1
> [counter] plain_base64_proto_val_cnt: 39
> [counter] tfrecord_gz_proto_

[jira] [Resolved] (BEAM-5412) TFRecordIO fails with records larger than 8K

2018-09-20 Thread Chamikara Jayalath (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5412?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chamikara Jayalath resolved BEAM-5412.
--
   Resolution: Fixed
Fix Version/s: 2.8.0

> TFRecordIO fails with records larger than 8K
> 
>
> Key: BEAM-5412
> URL: https://issues.apache.org/jira/browse/BEAM-5412
> Project: Beam
>  Issue Type: Bug
>  Components: io-java-text
>Affects Versions: 2.4.0
>Reporter: Raghu Angadi
>Assignee: Chamikara Jayalath
>Priority: Major
> Fix For: 2.8.0
>
>  Time Spent: 2h 50m
>  Remaining Estimate: 0h
>
> This was reported on
> [Stackoverflow|https://stackoverflow.com/questions/52284639/beam-java-sdk-with-tfrecord-and-compression-gzip].
> The TFRecordIO reader assumes that a single call to {{channel.read()}} returns
> as much as can fit in the input buffer, but {{read()}} can return fewer bytes
> than requested. Assertion failure:
> https://github.com/apache/beam/blob/release-2.4.0/sdks/java/core/src/main/java/org/apache/beam/sdk/io/TFRecordIO.java#L642
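
A minimal sketch of the kind of loop that avoids this assumption, using only
java.nio: it keeps calling read() until the buffer is full or the channel
reports end of stream. ReadFullySketch, readFully() and the chunked() test
channel are hypothetical names for illustration, not the code that was merged
into TFRecordIO.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.ReadableByteChannel;

/** Hypothetical illustration of handling short reads; not TFRecordIO code. */
final class ReadFullySketch {

  /**
   * Reads until the buffer is full or end of stream, and returns the number of
   * bytes read. Assumes a blocking channel; a non-blocking channel that keeps
   * returning 0 would make this loop spin.
   */
  static int readFully(ReadableByteChannel channel, ByteBuffer buffer) throws IOException {
    int total = 0;
    while (buffer.hasRemaining()) {
      int n = channel.read(buffer);
      if (n < 0) {
        break; // end of stream before the buffer was filled
      }
      total += n;
    }
    return total;
  }

  /** Channel over data that returns at most maxChunk bytes per read() call. */
  static ReadableByteChannel chunked(byte[] data, int maxChunk) {
    ByteBuffer source = ByteBuffer.wrap(data);
    return new ReadableByteChannel() {
      private boolean open = true;

      @Override
      public int read(ByteBuffer dst) {
        if (!source.hasRemaining()) {
          return -1;
        }
        int n = Math.min(maxChunk, Math.min(dst.remaining(), source.remaining()));
        ByteBuffer slice = source.duplicate();
        slice.limit(slice.position() + n);
        dst.put(slice);
        source.position(source.position() + n);
        return n;
      }

      @Override
      public boolean isOpen() {
        return open;
      }

      @Override
      public void close() {
        open = false;
      }
    };
  }
}
```

The chunked() channel mimics a decompressing channel that hands out at most
8192 bytes per call, which is the situation described in the report.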





[jira] [Created] (BEAM-5456) Update google-api-client libraries to 1.25

2018-09-20 Thread Chamikara Jayalath (JIRA)
Chamikara Jayalath created BEAM-5456:


 Summary: Update google-api-client libraries to 1.25
 Key: BEAM-5456
 URL: https://issues.apache.org/jira/browse/BEAM-5456
 Project: Beam
  Issue Type: Improvement
  Components: io-java-gcp
Reporter: Chamikara Jayalath
Assignee: Chamikara Jayalath
 Fix For: 2.8.0


This version updates authentication URLs
([https://github.com/googleapis/google-api-java-client/releases]), which is
needed for certain features.





[jira] [Created] (BEAM-5445) Update SpannerIO to support unbounded writes

2018-09-20 Thread Chamikara Jayalath (JIRA)
Chamikara Jayalath created BEAM-5445:


 Summary: Update SpannerIO to support unbounded writes
 Key: BEAM-5445
 URL: https://issues.apache.org/jira/browse/BEAM-5445
 Project: Beam
  Issue Type: Improvement
  Components: io-java-gcp
Reporter: Chamikara Jayalath


Currently, due to a known issue, streaming pipelines that use SpannerIO.Write 
do not actually write to Spanner.





[jira] [Resolved] (BEAM-5432) beam-runners-direct-java fails to build due to "cannot find symbol ... symbol: method create(JobInfo)"

2018-09-20 Thread Chamikara Jayalath (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5432?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chamikara Jayalath resolved BEAM-5432.
--
   Resolution: Fixed
Fix Version/s: Not applicable

> beam-runners-direct-java fails to build due to "cannot find symbol ... 
> symbol:   method create(JobInfo)"
> 
>
> Key: BEAM-5432
> URL: https://issues.apache.org/jira/browse/BEAM-5432
> Project: Beam
>  Issue Type: Bug
>  Components: runner-direct
>Reporter: Chamikara Jayalath
>Assignee: Daniel Oliveira
>Priority: Major
> Fix For: Not applicable
>
>
> Seems to be due to
> [https://github.com/apache/beam/pull/6151].
> ./gradlew :beam-runners-direct-java:build passes without the above PR but
> fails with the following error when it is applied.
>  
> > Task :beam-runners-direct-java:compileJava FAILED
>  
> /usr/local/google/home/chamikara/testing/beam_test_09_19_2018/beam/runners/direct-java/src/main/java/org/apache/beam/runners/direct/portable/ReferenceRunner.java:268:
>  error: cannot find symbol
>  return DockerJobBundleFactory.create(jobInfo);





[jira] [Commented] (BEAM-5432) beam-runners-direct-java fails to build due to "cannot find symbol ... symbol: method create(JobInfo)"

2018-09-20 Thread Chamikara Jayalath (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-5432?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16622402#comment-16622402
 ] 

Chamikara Jayalath commented on BEAM-5432:
--

Thanks. Closing this.

> beam-runners-direct-java fails to build due to "cannot find symbol ... 
> symbol:   method create(JobInfo)"
> 
>
> Key: BEAM-5432
> URL: https://issues.apache.org/jira/browse/BEAM-5432
> Project: Beam
>  Issue Type: Bug
>  Components: runner-direct
>Reporter: Chamikara Jayalath
>Assignee: Daniel Oliveira
>Priority: Major
> Fix For: Not applicable
>
>
> Seems to be due to
> [https://github.com/apache/beam/pull/6151].
> ./gradlew :beam-runners-direct-java:build passes without the above PR but
> fails with the following error when it is applied.
>  
> > Task :beam-runners-direct-java:compileJava FAILED
>  
> /usr/local/google/home/chamikara/testing/beam_test_09_19_2018/beam/runners/direct-java/src/main/java/org/apache/beam/runners/direct/portable/ReferenceRunner.java:268:
>  error: cannot find symbol
>  return DockerJobBundleFactory.create(jobInfo);





[jira] [Updated] (BEAM-5432) beam-runners-direct-java fails to build due to "cannot find symbol ... symbol: method create(JobInfo)"

2018-09-19 Thread Chamikara Jayalath (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5432?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chamikara Jayalath updated BEAM-5432:
-
Description: 
Seems to be due to 
[https://github.com/apache/beam/pull/6151.|https://github.com/apache/beam/pull/6151]

./gradlew :beam-runners-direct-java:build passes without the above PR but fails
with the following error when it is applied.

 

> Task :beam-runners-direct-java:compileJava FAILED
 
/usr/local/google/home/chamikara/testing/beam_test_09_19_2018/beam/runners/direct-java/src/main/java/org/apache/beam/runners/direct/portable/ReferenceRunner.java:268:
 error: cannot find symbol
 return DockerJobBundleFactory.create(jobInfo);

  was:
Seems to be due to [https://github.com/apache/beam/pull/6151.]

./gradlew :beam-runners-direct-java:build passes without the above PR but fails
with the following error when it is applied.

 

> Task :beam-runners-direct-java:compileJava FAILED
/usr/local/google/home/chamikara/testing/beam_test_09_19_2018/beam/runners/direct-java/src/main/java/org/apache/beam/runners/direct/portable/ReferenceRunner.java:268:
 error: cannot find symbol
 return DockerJobBundleFactory.create(jobInfo);


> beam-runners-direct-java fails to build due to "cannot find symbol ... 
> symbol:   method create(JobInfo)"
> 
>
> Key: BEAM-5432
> URL: https://issues.apache.org/jira/browse/BEAM-5432
> Project: Beam
>  Issue Type: Bug
>  Components: runner-direct
>Reporter: Chamikara Jayalath
>Assignee: Daniel Oliveira
>Priority: Major
>
> Seems to be due to
> [https://github.com/apache/beam/pull/6151].
> ./gradlew :beam-runners-direct-java:build passes without the above PR but
> fails with the following error when it is applied.
>  
> > Task :beam-runners-direct-java:compileJava FAILED
>  
> /usr/local/google/home/chamikara/testing/beam_test_09_19_2018/beam/runners/direct-java/src/main/java/org/apache/beam/runners/direct/portable/ReferenceRunner.java:268:
>  error: cannot find symbol
>  return DockerJobBundleFactory.create(jobInfo);





[jira] [Created] (BEAM-5432) beam-runners-direct-java fails to build due to "cannot find symbol ... symbol: method create(JobInfo)"

2018-09-19 Thread Chamikara Jayalath (JIRA)
Chamikara Jayalath created BEAM-5432:


 Summary: beam-runners-direct-java fails to build due to "cannot 
find symbol ... symbol:   method create(JobInfo)"
 Key: BEAM-5432
 URL: https://issues.apache.org/jira/browse/BEAM-5432
 Project: Beam
  Issue Type: Bug
  Components: runner-direct
Reporter: Chamikara Jayalath
Assignee: Daniel Oliveira


Seems to be due to [https://github.com/apache/beam/pull/6151].

./gradlew :beam-runners-direct-java:build passes without the above PR but fails
with the following error when it is applied.

 

> Task :beam-runners-direct-java:compileJava FAILED
/usr/local/google/home/chamikara/testing/beam_test_09_19_2018/beam/runners/direct-java/src/main/java/org/apache/beam/runners/direct/portable/ReferenceRunner.java:268:
 error: cannot find symbol
 return DockerJobBundleFactory.create(jobInfo);





[jira] [Commented] (BEAM-5426) Use both destination and TableDestination for BQ load job IDs

2018-09-18 Thread Chamikara Jayalath (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-5426?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16619823#comment-16619823
 ] 

Chamikara Jayalath commented on BEAM-5426:
--

In that case, how about keeping track of load jobs for different destinations
and failing the job if we detect two load jobs for the same destination? We
should find a way to actively fail in this case, since currently this ends up
as silent data loss.
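
One way to read this proposal, as a rough sketch: remember which destination
first produced a load job for each table and fail as soon as a second,
different destination maps to the same table. LoadJobTrackerSketch and the
string representations of destinations and tables are assumptions for
illustration, not BigQueryIO internals.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical tracker for illustration; not part of BigQueryIO. */
final class LoadJobTrackerSketch {

  // Maps each table spec to the destination that first wrote to it.
  private final Map<String, String> destinationByTable = new HashMap<>();

  /** Records a load into tableSpec and fails if another destination already targets it. */
  void register(String destinationId, String tableSpec) {
    String previous = destinationByTable.putIfAbsent(tableSpec, destinationId);
    if (previous != null && !previous.equals(destinationId)) {
      throw new IllegalStateException(
          "Destinations '" + previous + "' and '" + destinationId
              + "' both map to table '" + tableSpec + "'");
    }
  }
}
```

Registering the same destination twice for a table is accepted; only a
different destination for an already registered table triggers the failure.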

> Use both destination and TableDestination for BQ load job IDs
> -
>
> Key: BEAM-5426
> URL: https://issues.apache.org/jira/browse/BEAM-5426
> Project: Beam
>  Issue Type: Improvement
>  Components: io-java-gcp
>Reporter: Chamikara Jayalath
>Priority: Major
>
> Currently we use TableDestination when creating a unique load job ID for a 
> destination: 
> [https://github.com/apache/beam/blob/master/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryHelpers.java#L359]
>  
> This can result in a data loss issue if a user returns the same 
> TableDestination for different destination IDs. I think we can prevent this 
> if we include both IDs in the BQ load job ID.
>  
> CC: [~reuvenlax]





[jira] [Assigned] (BEAM-5426) Use both destination and TableDestination for BQ load job IDs

2018-09-18 Thread Chamikara Jayalath (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5426?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chamikara Jayalath reassigned BEAM-5426:


Assignee: (was: Chamikara Jayalath)

> Use both destination and TableDestination for BQ load job IDs
> -
>
> Key: BEAM-5426
> URL: https://issues.apache.org/jira/browse/BEAM-5426
> Project: Beam
>  Issue Type: Improvement
>  Components: io-java-gcp
>Reporter: Chamikara Jayalath
>Priority: Major
>
> Currently we use TableDestination when creating a unique load job ID for a 
> destination: 
> [https://github.com/apache/beam/blob/master/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryHelpers.java#L359]
>  
> This can result in a data loss issue if a user returns the same 
> TableDestination for different destination IDs. I think we can prevent this 
> if we include both IDs in the BQ load job ID.
>  
> CC: [~reuvenlax]





[jira] [Created] (BEAM-5426) Use both destination and TableDestination for BQ load job IDs

2018-09-18 Thread Chamikara Jayalath (JIRA)
Chamikara Jayalath created BEAM-5426:


 Summary: Use both destination and TableDestination for BQ load job 
IDs
 Key: BEAM-5426
 URL: https://issues.apache.org/jira/browse/BEAM-5426
 Project: Beam
  Issue Type: Improvement
  Components: io-java-gcp
Reporter: Chamikara Jayalath
Assignee: Chamikara Jayalath


Currently we use TableDestination when creating a unique load job ID for a 
destination: 
[https://github.com/apache/beam/blob/master/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryHelpers.java#L359]

 

This can result in a data loss issue if a user returns the same 
TableDestination for different destination IDs. I think we can prevent this if 
we include both IDs in the BQ load job ID.

 

CC: [~reuvenlax]
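
A minimal Java sketch of the idea, under stated assumptions: the class LoadJobIdSketch, its method names, and the ID layout are hypothetical and are not the code in BigQueryHelpers. It only illustrates that hashing the destination ID together with the table spec keeps two destinations apart even when they resolve to the same TableDestination.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper, not Beam's implementation.
final class LoadJobIdSketch {

  // Builds a job ID from BOTH the user-level destination ID and the table spec.
  static String jobId(String prefix, String destinationId, String tableSpec, int partition) {
    // Length-prefix each field so ("ab", "c") and ("a", "bc") cannot produce the same key.
    String key =
        destinationId.length() + ":" + destinationId + tableSpec.length() + ":" + tableSpec;
    return String.format("%s_%s_%05d", prefix, sha256Hex(key).substring(0, 32), partition);
  }

  // Hex-encodes the SHA-256 digest of the given string.
  static String sha256Hex(String s) {
    try {
      byte[] digest =
          MessageDigest.getInstance("SHA-256").digest(s.getBytes(StandardCharsets.UTF_8));
      StringBuilder sb = new StringBuilder();
      for (byte b : digest) {
        sb.append(String.format("%02x", b));
      }
      return sb.toString();
    } catch (NoSuchAlgorithmException e) {
      // SHA-256 is required to be available on every Java platform.
      throw new IllegalStateException(e);
    }
  }
}
```

With this layout, two destination IDs that map to the same table yield different job IDs, while repeated calls with the same inputs yield the same ID.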





[jira] [Created] (BEAM-5422) Update BigQueryIO DynamicDestinations documentation to clarify usage of getDestination() and getTable()

2018-09-18 Thread Chamikara Jayalath (JIRA)
Chamikara Jayalath created BEAM-5422:


 Summary: Update BigQueryIO DynamicDestinations documentation to 
clarify usage of getDestination() and getTable()
 Key: BEAM-5422
 URL: https://issues.apache.org/jira/browse/BEAM-5422
 Project: Beam
  Issue Type: Improvement
  Components: io-java-gcp
Reporter: Chamikara Jayalath
Assignee: Chamikara Jayalath


Currently, there are some details related to these methods that should be 
further clarified. For example, getTable() is expected to return a unique value 
for each destination.
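
To make the uniqueness expectation concrete, here is a small Java sketch. The class UniqueTableCheck and the method findCollisions are hypothetical names, and the function argument merely stands in for a user's getTable(); nothing here is part of the DynamicDestinations API.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical checker, not part of Beam.
final class UniqueTableCheck {

  // Returns the destinations whose table was already claimed by a different destination.
  static <D> List<D> findCollisions(List<D> destinations, Function<D, String> getTable) {
    Map<String, D> owner = new HashMap<>();
    List<D> collisions = new ArrayList<>();
    for (D destination : destinations) {
      // putIfAbsent returns the earlier owner of this table, or null if there was none.
      D previous = owner.putIfAbsent(getTable.apply(destination), destination);
      if (previous != null && !previous.equals(destination)) {
        collisions.add(destination);
      }
    }
    return collisions;
  }
}
```

A non-empty result means that at least two distinct destinations resolve to the same table, which is the situation the clarified documentation should warn about.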





[jira] [Created] (BEAM-5410) Fail SpannerIO early for unsupported streaming mode

2018-09-17 Thread Chamikara Jayalath (JIRA)
Chamikara Jayalath created BEAM-5410:


 Summary: Fail SpannerIO early for unsupported streaming mode
 Key: BEAM-5410
 URL: https://issues.apache.org/jira/browse/BEAM-5410
 Project: Beam
  Issue Type: Improvement
  Components: io-java-gcp
Affects Versions: 2.8.0
Reporter: Chamikara Jayalath
Assignee: Chamikara Jayalath


Currently SpannerIO does not support streaming mode. We should fail with a 
clear error until this is fixed, and also update the documentation.
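
A fail-fast guard could look roughly like the following Java sketch. StreamingGuardSketch and checkBatchOnly are hypothetical names, the error text is illustrative, and where the streaming flag comes from is left open; this is not SpannerIO code.

```java
// Hypothetical guard, not part of SpannerIO.
final class StreamingGuardSketch {

  // Throws at pipeline construction time instead of failing later during execution.
  static void checkBatchOnly(String transformName, boolean isStreaming) {
    if (isStreaming) {
      throw new UnsupportedOperationException(
          transformName + " does not support streaming mode; run the pipeline in batch mode.");
    }
  }
}
```

Calling such a check while the transform is being expanded would surface the limitation before any work is submitted to a runner.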





[jira] [Commented] (BEAM-5364) BigtableIO source tries to validate table ID even though validation is turned off

2018-09-11 Thread Chamikara Jayalath (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-5364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16611444#comment-16611444
 ] 

Chamikara Jayalath commented on BEAM-5364:
--

Kevin, is this a regression from 2.6.0? If not, this should probably not be a 
release blocker. Nevertheless, I agree that we should fix this soon.

>  BigtableIO source tries to validate table ID even though validation is 
> turned off
> --
>
> Key: BEAM-5364
> URL: https://issues.apache.org/jira/browse/BEAM-5364
> Project: Beam
>  Issue Type: Bug
>  Components: io-java-gcp
>Reporter: Kevin Si
>Assignee: Chamikara Jayalath
>Priority: Blocker
>
> [https://github.com/apache/beam/blob/master/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigtable/BigtableIO.java#L1084]
> The validation can be turned off with the following:
> BigtableIO.read()
>             .withoutValidation() // skip validation when constructing the pipeline.
> A Dataflow template cannot be constructed due to this validation failure.
>  
> Error log when trying to construct a template:
> Exception in thread "main" java.lang.IllegalArgumentException: tableId was not supplied
>         at com.google.common.base.Preconditions.checkArgument(Preconditions.java:122)
>         at org.apache.beam.sdk.io.gcp.bigtable.BigtableIO$BigtableSource.validate(BigtableIO.java:1084)
>         at org.apache.beam.sdk.io.Read$Bounded.expand(Read.java:95)
>         at org.apache.beam.sdk.io.Read$Bounded.expand(Read.java:85)
>         at org.apache.beam.sdk.Pipeline.applyInternal(Pipeline.java:537)
>         at org.apache.beam.sdk.Pipeline.applyTransform(Pipeline.java:471)
>         at org.apache.beam.sdk.values.PBegin.apply(PBegin.java:44)
>         at org.apache.beam.sdk.Pipeline.apply(Pipeline.java:167)
>         at org.apache.beam.sdk.io.gcp.bigtable.BigtableIO$Read.expand(BigtableIO.java:423)
>         at org.apache.beam.sdk.io.gcp.bigtable.BigtableIO$Read.expand(BigtableIO.java:179)
>         at org.apache.beam.sdk.Pipeline.applyInternal(Pipeline.java:537)
>         at org.apache.beam.sdk.Pipeline.applyTransform(Pipeline.java:488)
>         at org.apache.beam.sdk.values.PBegin.apply(PBegin.java:56)
>         at org.apache.beam.sdk.Pipeline.apply(Pipeline.java:182)
>         at com.google.cloud.teleport.bigtable.BigtableToAvro.main(BigtableToAvro.java:89)
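
The behavior the reporter expects can be sketched in Java as below. ValidationSketch and its validate method are hypothetical and greatly simplified; they are not the BigtableIO source, and the real fix may differ. The only point is that the tableId check is skipped when validation has been turned off.

```java
// Hypothetical, simplified stand-in for a source's validation step.
final class ValidationSketch {

  // Checks tableId only when validation is enabled.
  static void validate(boolean validationEnabled, String tableId) {
    if (!validationEnabled) {
      // Honor the opt-out: a template may supply tableId only at run time.
      return;
    }
    if (tableId == null || tableId.isEmpty()) {
      throw new IllegalArgumentException("tableId was not supplied");
    }
  }
}
```

With the guard in place, constructing a pipeline without a tableId succeeds as long as validation is disabled, and still fails with the same message when it is enabled.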





[jira] [Created] (BEAM-5342) Migrate google-api-client libraries to 1.24.1

2018-09-07 Thread Chamikara Jayalath (JIRA)
Chamikara Jayalath created BEAM-5342:


 Summary: Migrate google-api-client libraries to 1.24.1
 Key: BEAM-5342
 URL: https://issues.apache.org/jira/browse/BEAM-5342
 Project: Beam
  Issue Type: Improvement
  Components: io-java-gcp, runner-dataflow
Reporter: Chamikara Jayalath
Assignee: Chamikara Jayalath


We currently use the 1.23 libraries, which are about a year old. We should migrate 
to the more recent 1.24.1, which fixes several known issues.

 





[jira] [Commented] (BEAM-4417) BigqueryIO Numeric datatype Support

2018-09-06 Thread Chamikara Jayalath (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-4417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16606518#comment-16606518
 ] 

Chamikara Jayalath commented on BEAM-4417:
--

Pablo is looking into this.

> BigqueryIO Numeric datatype Support
> ---
>
> Key: BEAM-4417
> URL: https://issues.apache.org/jira/browse/BEAM-4417
> Project: Beam
>  Issue Type: Improvement
>  Components: io-java-gcp
>Affects Versions: 2.4.0
>Reporter: Kishan Kumar
>Assignee: Pablo Estrada
>Priority: Critical
>  Labels: newbie, patch
> Fix For: 2.8.0
>
>  Time Spent: 7h 20m
>  Remaining Estimate: 0h
>
> BigQueryIO.read fails while parsing the Avro file generated when reading 
> data from a table that has columns of the *Numeric* data type. 
> We have gone through the source code on GitHub and noticed that the *Numeric 
> data type is not yet supported.* 
>  
> Caused by: com.google.common.base.VerifyException: Unsupported BigQuery type: 
> NUMERIC
>  
>  





[jira] [Assigned] (BEAM-4417) BigqueryIO Numeric datatype Support

2018-09-06 Thread Chamikara Jayalath (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4417?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chamikara Jayalath reassigned BEAM-4417:


Assignee: Pablo Estrada  (was: Chamikara Jayalath)

> BigqueryIO Numeric datatype Support
> ---
>
> Key: BEAM-4417
> URL: https://issues.apache.org/jira/browse/BEAM-4417
> Project: Beam
>  Issue Type: Improvement
>  Components: io-java-gcp
>Affects Versions: 2.4.0
>Reporter: Kishan Kumar
>Assignee: Pablo Estrada
>Priority: Critical
>  Labels: newbie, patch
> Fix For: 2.8.0
>
>  Time Spent: 7h 20m
>  Remaining Estimate: 0h
>
> BigQueryIO.read fails while parsing the Avro file generated when reading 
> data from a table that has columns of the *Numeric* data type. 
> We have gone through the source code on GitHub and noticed that the *Numeric 
> data type is not yet supported.* 
>  
> Caused by: com.google.common.base.VerifyException: Unsupported BigQuery type: 
> NUMERIC
>  
>  





[jira] [Assigned] (BEAM-3519) GCP IO exposes netty on its API surface, causing conflicts with runners

2018-09-04 Thread Chamikara Jayalath (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-3519?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chamikara Jayalath reassigned BEAM-3519:


Assignee: Ismaël Mejía  (was: Chamikara Jayalath)

> GCP IO exposes netty on its API surface, causing conflicts with runners
> ---
>
> Key: BEAM-3519
> URL: https://issues.apache.org/jira/browse/BEAM-3519
> Project: Beam
>  Issue Type: Bug
>  Components: io-java-gcp
>Reporter: Ismaël Mejía
>Assignee: Ismaël Mejía
>Priority: Critical
>  Time Spent: 4h 40m
>  Remaining Estimate: 0h
>
> The Google Cloud Platform IOs module leaks netty; this causes conflicts, in 
> particular with execution systems that use conflicting versions of such 
> modules. 
> For a case where there is a dependency conflict with the Spark Runner's 
> version of netty, see BEAM-3492.





[jira] [Commented] (BEAM-3519) GCP IO exposes netty on its API surface, causing conflicts with runners

2018-09-04 Thread Chamikara Jayalath (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-3519?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16603172#comment-16603172
 ] 

Chamikara Jayalath commented on BEAM-3519:
--

Netty/gRPC/protobuf dependencies of Beam (including google-cloud-platform) were 
upgraded recently, so I suspect this is no longer an issue. [~iemejia], can 
you confirm?

> GCP IO exposes netty on its API surface, causing conflicts with runners
> ---
>
> Key: BEAM-3519
> URL: https://issues.apache.org/jira/browse/BEAM-3519
> Project: Beam
>  Issue Type: Bug
>  Components: io-java-gcp
>Reporter: Ismaël Mejía
>Assignee: Chamikara Jayalath
>Priority: Critical
>  Time Spent: 4h 40m
>  Remaining Estimate: 0h
>
> The Google Cloud Platform IOs module leaks netty; this causes conflicts, in 
> particular with execution systems that use conflicting versions of such 
> modules. 
> For a case where there is a dependency conflict with the Spark Runner's 
> version of netty, see BEAM-3492.





[jira] [Assigned] (BEAM-4977) Beam Dependency Update Request: io.dropwizard.metrics:metrics-core 4.1.0-rc2

2018-08-27 Thread Chamikara Jayalath (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4977?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chamikara Jayalath reassigned BEAM-4977:


Assignee: Chamikara Jayalath  (was: Scott Wegner)

> Beam Dependency Update Request: io.dropwizard.metrics:metrics-core 4.1.0-rc2
> 
>
> Key: BEAM-4977
> URL: https://issues.apache.org/jira/browse/BEAM-4977
> Project: Beam
>  Issue Type: Sub-task
>  Components: dependencies
>Reporter: Beam JIRA Bot
>Assignee: Chamikara Jayalath
>Priority: Major
>
> 2018-07-25 20:31:12.560286
> Please review and upgrade the io.dropwizard.metrics:metrics-core to 
> the latest version 4.1.0-rc2 
>  
> cc: 
> 2018-08-06 12:13:49.600351
> Please review and upgrade the io.dropwizard.metrics:metrics-core to 
> the latest version 4.1.0-rc2 
>  
> cc: 
> 2018-08-13 12:15:16.600478
> Please review and upgrade the io.dropwizard.metrics:metrics-core to 
> the latest version 4.1.0-rc2 
>  
> cc: 
> 2018-08-20 12:15:28.768620
> Please review and upgrade the io.dropwizard.metrics:metrics-core to 
> the latest version 4.1.0-rc2 
>  
> cc: 
> 2018-08-27 12:16:05.660353
> Please review and upgrade the io.dropwizard.metrics:metrics-core to 
> the latest version 4.1.0-rc2 
>  
> cc: 





[jira] [Commented] (BEAM-4952) Beam Dependency Update Request: org.apache.hbase:hbase-hadoop-compat 2.1.0

2018-08-27 Thread Chamikara Jayalath (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-4952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16594257#comment-16594257
 ] 

Chamikara Jayalath commented on BEAM-4952:
--

Tim, a kind reminder about this and other dependency upgrade JIRAs assigned to 
you by the tool. Feel free to unassign if you don't have cycles to look into 
these.

> Beam Dependency Update Request: org.apache.hbase:hbase-hadoop-compat 2.1.0
> --
>
> Key: BEAM-4952
> URL: https://issues.apache.org/jira/browse/BEAM-4952
> Project: Beam
>  Issue Type: Sub-task
>  Components: dependencies
>Reporter: Beam JIRA Bot
>Assignee: Tim Robertson
>Priority: Major
>
> 2018-07-25 20:28:24.987897
> Please review and upgrade the org.apache.hbase:hbase-hadoop-compat to 
> the latest version 2.1.0 
>  
> cc: 
> 2018-08-06 12:11:58.406173
> Please review and upgrade the org.apache.hbase:hbase-hadoop-compat to 
> the latest version 2.1.0 
>  
> cc: 
> 2018-08-13 12:13:31.045787
> Please review and upgrade the org.apache.hbase:hbase-hadoop-compat to 
> the latest version 2.1.0 
>  
> cc: 
> 2018-08-20 12:14:04.735400
> Please review and upgrade the org.apache.hbase:hbase-hadoop-compat to 
> the latest version 2.1.0 
>  
> cc: 
> 2018-08-27 12:15:07.483727
> Please review and upgrade the org.apache.hbase:hbase-hadoop-compat to 
> the latest version 2.1.0 
>  
> cc: 





[jira] [Assigned] (BEAM-5229) Beam Dependency Update Request: com.commercehub.gradle.plugin:gradle-avro-plugin 0.15.0

2018-08-27 Thread Chamikara Jayalath (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5229?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chamikara Jayalath reassigned BEAM-5229:


Assignee: Chamikara Jayalath  (was: Scott Wegner)

> Beam Dependency Update Request: 
> com.commercehub.gradle.plugin:gradle-avro-plugin 0.15.0
> ---
>
> Key: BEAM-5229
> URL: https://issues.apache.org/jira/browse/BEAM-5229
> Project: Beam
>  Issue Type: Sub-task
>  Components: dependencies
>Reporter: Beam JIRA Bot
>Assignee: Chamikara Jayalath
>Priority: Major
>
> 2018-08-27 12:14:32.746008
> Please review and upgrade the 
> com.commercehub.gradle.plugin:gradle-avro-plugin to the latest version 0.15.0 
>  
> cc: 





[jira] [Commented] (BEAM-4905) Beam Dependency Update Request: de.flapdoodle.embed:de.flapdoodle.embed.process 2.0.5

2018-08-27 Thread Chamikara Jayalath (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-4905?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16594250#comment-16594250
 ] 

Chamikara Jayalath commented on BEAM-4905:
--

[https://github.com/apache/beam/pull/6281] in review.

> Beam Dependency Update Request: 
> de.flapdoodle.embed:de.flapdoodle.embed.process 2.0.5
> -
>
> Key: BEAM-4905
> URL: https://issues.apache.org/jira/browse/BEAM-4905
> Project: Beam
>  Issue Type: Sub-task
>  Components: dependencies
>Reporter: Beam JIRA Bot
>Assignee: Chamikara Jayalath
>Priority: Major
>
> 2018-07-25 20:23:58.022170
> Please review and upgrade the 
> de.flapdoodle.embed:de.flapdoodle.embed.process to the latest version 2.0.5 
>  
> cc: 
> 2018-08-06 12:09:39.047955
> Please review and upgrade the 
> de.flapdoodle.embed:de.flapdoodle.embed.process to the latest version 2.0.5 
>  
> cc: 
> 2018-08-13 12:09:56.445368
> Please review and upgrade the 
> de.flapdoodle.embed:de.flapdoodle.embed.process to the latest version 2.0.5 
>  
> cc: 
> 2018-08-20 12:12:37.728222
> Please review and upgrade the 
> de.flapdoodle.embed:de.flapdoodle.embed.process to the latest version 2.0.5 
>  
> cc: 
> 2018-08-27 12:14:06.408790
> Please review and upgrade the 
> de.flapdoodle.embed:de.flapdoodle.embed.process to the latest version 2.0.5 
>  
> cc: 





[jira] [Assigned] (BEAM-5224) Beam Dependency Update Request: com.gradle:build-scan-plugin 1.16

2018-08-27 Thread Chamikara Jayalath (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5224?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chamikara Jayalath reassigned BEAM-5224:


Assignee: Chamikara Jayalath  (was: Scott Wegner)

> Beam Dependency Update Request: com.gradle:build-scan-plugin 1.16
> -
>
> Key: BEAM-5224
> URL: https://issues.apache.org/jira/browse/BEAM-5224
> Project: Beam
>  Issue Type: Sub-task
>  Components: dependencies
>Reporter: Beam JIRA Bot
>Assignee: Chamikara Jayalath
>Priority: Major
>
> 2018-08-27 12:13:32.215540
> Please review and upgrade the com.gradle:build-scan-plugin to the 
> latest version 1.16 
>  
> cc: 





[jira] [Assigned] (BEAM-4874) Beam Dependency Update Request: com.google.auto.service:auto-service 1.0-rc4

2018-08-27 Thread Chamikara Jayalath (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4874?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chamikara Jayalath reassigned BEAM-4874:


Assignee: Chamikara Jayalath  (was: Scott Wegner)

> Beam Dependency Update Request: com.google.auto.service:auto-service 1.0-rc4
> 
>
> Key: BEAM-4874
> URL: https://issues.apache.org/jira/browse/BEAM-4874
> Project: Beam
>  Issue Type: Sub-task
>  Components: dependencies
>Reporter: Beam JIRA Bot
>Assignee: Chamikara Jayalath
>Priority: Major
>
> 2018-07-25 20:21:08.597317
> Please review and upgrade the com.google.auto.service:auto-service to 
> the latest version 1.0-rc4 
>  
> cc: 
> 2018-08-06 12:08:14.189802
> Please review and upgrade the com.google.auto.service:auto-service to 
> the latest version 1.0-rc4 
>  
> cc: 
> 2018-08-13 12:08:37.495569
> Please review and upgrade the com.google.auto.service:auto-service to 
> the latest version 1.0-rc4 
>  
> cc: 
> 2018-08-20 12:11:49.688170
> Please review and upgrade the com.google.auto.service:auto-service to 
> the latest version 1.0-rc4 
>  
> cc: 
> 2018-08-27 12:13:11.448132
> Please review and upgrade the com.google.auto.service:auto-service to 
> the latest version 1.0-rc4 
>  
> cc: 





[jira] [Assigned] (BEAM-5182) Beam Dependency Update Request: org.assertj:assertj-core 3.11.0

2018-08-27 Thread Chamikara Jayalath (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5182?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chamikara Jayalath reassigned BEAM-5182:


Assignee: Chamikara Jayalath  (was: Scott Wegner)

> Beam Dependency Update Request: org.assertj:assertj-core 3.11.0
> ---
>
> Key: BEAM-5182
> URL: https://issues.apache.org/jira/browse/BEAM-5182
> Project: Beam
>  Issue Type: Sub-task
>  Components: dependencies
>Reporter: Beam JIRA Bot
>Assignee: Chamikara Jayalath
>Priority: Major
>
> 2018-08-20 12:11:47.500445
> Please review and upgrade the org.assertj:assertj-core to the latest 
> version 3.11.0 
>  
> cc: 
> 2018-08-27 12:13:06.123086
> Please review and upgrade the org.assertj:assertj-core to the latest 
> version 3.11.0 
>  
> cc: 





[jira] [Closed] (BEAM-4982) Beam Dependency Update Request: io.netty:netty-transport-native-epoll 5.0.0.Alpha2

2018-08-27 Thread Chamikara Jayalath (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4982?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chamikara Jayalath closed BEAM-4982.

   Resolution: Won't Fix
Fix Version/s: Not applicable

> Beam Dependency Update Request: io.netty:netty-transport-native-epoll 
> 5.0.0.Alpha2
> --
>
> Key: BEAM-4982
> URL: https://issues.apache.org/jira/browse/BEAM-4982
> Project: Beam
>  Issue Type: Sub-task
>  Components: dependencies
>Reporter: Beam JIRA Bot
>Assignee: Chamikara Jayalath
>Priority: Major
> Fix For: Not applicable
>
>
> 2018-07-25 20:31:42.182471
> Please review and upgrade the io.netty:netty-transport-native-epoll 
> to the latest version 5.0.0.Alpha2 
>  
> cc: 
> 2018-08-06 12:14:13.141909
> Please review and upgrade the io.netty:netty-transport-native-epoll 
> to the latest version 5.0.0.Alpha2 
>  
> cc: 
> 2018-08-13 12:15:42.691508
> Please review and upgrade the io.netty:netty-transport-native-epoll 
> to the latest version 5.0.0.Alpha2 
>  
> cc: 
> 2018-08-20 12:15:45.103065
> Please review and upgrade the io.netty:netty-transport-native-epoll 
> to the latest version 5.0.0.Alpha2 
>  
> cc: 
> 2018-08-27 12:16:18.187792
> Please review and upgrade the io.netty:netty-transport-native-epoll 
> to the latest version 5.0.0.Alpha2 
>  
> cc: 





[jira] [Commented] (BEAM-4982) Beam Dependency Update Request: io.netty:netty-transport-native-epoll 5.0.0.Alpha2

2018-08-27 Thread Chamikara Jayalath (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-4982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16594052#comment-16594052
 ] 

Chamikara Jayalath commented on BEAM-4982:
--

I don't think this can be upgraded independently. We have to depend on a matching 
version of this dependency when we upgrade netty/gRPC.

> Beam Dependency Update Request: io.netty:netty-transport-native-epoll 
> 5.0.0.Alpha2
> --
>
> Key: BEAM-4982
> URL: https://issues.apache.org/jira/browse/BEAM-4982
> Project: Beam
>  Issue Type: Sub-task
>  Components: dependencies
>Reporter: Beam JIRA Bot
>Assignee: Chamikara Jayalath
>Priority: Major
>
> 2018-07-25 20:31:42.182471
> Please review and upgrade the io.netty:netty-transport-native-epoll 
> to the latest version 5.0.0.Alpha2 
>  
> cc: 
> 2018-08-06 12:14:13.141909
> Please review and upgrade the io.netty:netty-transport-native-epoll 
> to the latest version 5.0.0.Alpha2 
>  
> cc: 
> 2018-08-13 12:15:42.691508
> Please review and upgrade the io.netty:netty-transport-native-epoll 
> to the latest version 5.0.0.Alpha2 
>  
> cc: 
> 2018-08-20 12:15:45.103065
> Please review and upgrade the io.netty:netty-transport-native-epoll 
> to the latest version 5.0.0.Alpha2 
>  
> cc: 
> 2018-08-27 12:16:18.187792
> Please review and upgrade the io.netty:netty-transport-native-epoll 
> to the latest version 5.0.0.Alpha2 
>  
> cc: 





[jira] [Created] (BEAM-5216) BigQueryIO multi-partitioned write doesn't work for streaming writes

2018-08-24 Thread Chamikara Jayalath (JIRA)
Chamikara Jayalath created BEAM-5216:


 Summary: BigQueryIO multi-partitioned write doesn't work for 
streaming writes
 Key: BEAM-5216
 URL: https://issues.apache.org/jira/browse/BEAM-5216
 Project: Beam
  Issue Type: Bug
  Components: io-java-gcp
Reporter: Chamikara Jayalath


BigQueryIO performs a multi-partitioned write (the MultiPartitionsWriteTables 
step) when the data to be written to a single BQ table exceeds the quota allowed 
by BigQuery (10k files or 11TB of data).

 

When writing using load jobs in streaming mode (with a triggering frequency), we 
hit the following location, where we set CREATE_DISPOSITION to CREATE_NEVER for 
all panes other than the first one. This is fine when we are writing a single 
partition (all panes of a window should write to the same table), but when there 
are multiple partitions this is incorrect, since we need to create temp tables 
for all panes.

[https://github.com/apache/beam/blob/master/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/WriteTables.java#L165]
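
The distinction described above can be sketched in Java as follows. PaneDispositionSketch, forPane, and the writesTempTables flag are hypothetical names used for illustration; this is neither the code in WriteTables nor a confirmed fix, only one possible reading of the intended behavior.

```java
// Hypothetical decision helper, not Beam's WriteTables implementation.
final class PaneDispositionSketch {

  enum CreateDisposition {
    CREATE_IF_NEEDED,
    CREATE_NEVER
  }

  // Picks the create disposition for the load job of a given pane.
  static CreateDisposition forPane(
      CreateDisposition requested, long paneIndex, boolean writesTempTables) {
    if (writesTempTables) {
      // Multiple partitions: every pane loads into temp tables that do not exist yet.
      return CreateDisposition.CREATE_IF_NEEDED;
    }
    // Single partition: only the first pane may create the final table.
    return paneIndex == 0 ? requested : CreateDisposition.CREATE_NEVER;
  }
}
```

In this sketch the single-partition path keeps the existing behavior, while later panes of a multi-partition write are still allowed to create their temp tables.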

 





[jira] [Commented] (BEAM-5148) Implement MongoDB IO for Python SDK

2018-08-23 Thread Chamikara Jayalath (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-5148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16589895#comment-16589895
 ] 

Chamikara Jayalath commented on BEAM-5148:
--

Check out the existing source/sink tests (for example, textio_test) for the set 
of tests that you should consider developing.

> Implement MongoDB IO for Python SDK
> ---
>
> Key: BEAM-5148
> URL: https://issues.apache.org/jira/browse/BEAM-5148
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-py-core
>Affects Versions: 3.0.0
>Reporter: Pascal Gula
>Assignee: Chamikara Jayalath
>Priority: Major
> Fix For: Not applicable
>
>
> Currently the Java SDK has MongoDB support but the Python SDK does not. With 
> current portability efforts, other runners may soon be able to use the Python 
> SDK. Having MongoDB support will allow these runners to execute large-scale 
> jobs using it.
> Since we need this IO component at Peat, we started working on a PyPI package 
> available at this repository: [https://github.com/PEAT-AI/beam-extended]





[jira] [Issue Comment Deleted] (BEAM-5148) Implement MongoDB IO for Python SDK

2018-08-23 Thread Chamikara Jayalath (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5148?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chamikara Jayalath updated BEAM-5148:
-
Comment: was deleted

(was: Checkout existing source/sink tests (for example, textio_test)  for the 
set of tests that you should consider developing.)

> Implement MongoDB IO for Python SDK
> ---
>
> Key: BEAM-5148
> URL: https://issues.apache.org/jira/browse/BEAM-5148
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-py-core
>Affects Versions: 3.0.0
>Reporter: Pascal Gula
>Assignee: Chamikara Jayalath
>Priority: Major
> Fix For: Not applicable
>
>
> Currently the Java SDK has MongoDB support but the Python SDK does not. With 
> current portability efforts, other runners may soon be able to use the Python 
> SDK. Having MongoDB support will allow these runners to execute large-scale 
> jobs using it.
> Since we need this IO component at Peat, we started working on a PyPI package 
> available at this repository: [https://github.com/PEAT-AI/beam-extended]





[jira] [Commented] (BEAM-5148) Implement MongoDB IO for Python SDK

2018-08-23 Thread Chamikara Jayalath (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-5148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16589894#comment-16589894
 ] 

Chamikara Jayalath commented on BEAM-5148:
--

Check out the existing source/sink tests (for example, textio_test) for the set 
of tests that you should consider developing.

> Implement MongoDB IO for Python SDK
> ---
>
> Key: BEAM-5148
> URL: https://issues.apache.org/jira/browse/BEAM-5148
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-py-core
>Affects Versions: 3.0.0
>Reporter: Pascal Gula
>Assignee: Chamikara Jayalath
>Priority: Major
> Fix For: Not applicable
>
>
> Currently the Java SDK has MongoDB support but the Python SDK does not. With 
> current portability efforts, other runners may soon be able to use the Python 
> SDK. Having MongoDB support will allow these runners to execute large-scale 
> jobs using it.
> Since we need this IO component at Peat, we started working on a PyPI package 
> available at this repository: [https://github.com/PEAT-AI/beam-extended]





[jira] [Commented] (BEAM-5148) Implement MongoDB IO for Python SDK

2018-08-22 Thread Chamikara Jayalath (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-5148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16589202#comment-16589202
 ] 

Chamikara Jayalath commented on BEAM-5148:
--

Thanks. Looks good in general. Could you send this for review in the form of a 
Beam pull request so that I can provide detailed comments?

Also, I tried to assign this Jira to you, but it seems that you currently don't 
have the Beam contributor role assigned to your Jira account. Could you send a 
request for this through the dev list or Slack? (A PMC member can add you.)

> Implement MongoDB IO for Python SDK
> ---
>
> Key: BEAM-5148
> URL: https://issues.apache.org/jira/browse/BEAM-5148
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-py-core
>Affects Versions: 3.0.0
>Reporter: Pascal Gula
>Assignee: Chamikara Jayalath
>Priority: Major
> Fix For: Not applicable
>
>
> Currently the Java SDK has MongoDB support but the Python SDK does not. With 
> current portability efforts, other runners may soon be able to use the Python 
> SDK. Having MongoDB support will allow these runners to execute large-scale 
> jobs using it.
> Since we need this IO component at Peat, we started working on a PyPI package 
> available at this repository: [https://github.com/PEAT-AI/beam-extended]





[jira] [Commented] (BEAM-4884) Beam Dependency Update Request: biz.aQute:bndlib 2.0.0.20130123-133441

2018-08-21 Thread Chamikara Jayalath (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-4884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16588028#comment-16588028
 ] 

Chamikara Jayalath commented on BEAM-4884:
--

Passing to Sergey, who wrote TikaIO, which is the only user of this dependency.

> Beam Dependency Update Request: biz.aQute:bndlib 2.0.0.20130123-133441
> --
>
> Key: BEAM-4884
> URL: https://issues.apache.org/jira/browse/BEAM-4884
> Project: Beam
>  Issue Type: Sub-task
>  Components: dependencies
>Reporter: Beam JIRA Bot
>Assignee: Sergey Beryozkin
>Priority: Major
>
> 2018-07-25 20:22:10.667692
> Please review and upgrade the biz.aQute:bndlib to the latest version 
> 2.0.0.20130123-133441 
>  
> cc: 
> 2018-08-02 11:42:45.184687
> Please review and upgrade the biz.aQute:bndlib to the latest version 
> 2.0.0.20130123-133441 
> cc: 
> 2018-08-06 12:08:44.804597
> Please review and upgrade the biz.aQute:bndlib to the latest version 
> 2.0.0.20130123-133441 
>  
> cc: 
> 2018-08-13 12:09:05.718866
> Please review and upgrade the biz.aQute:bndlib to the latest version 
> 2.0.0.20130123-133441 
>  
> cc: 
> 2018-08-20 12:11:56.128242
> Please review and upgrade the biz.aQute:bndlib to the latest version 
> 2.0.0.20130123-133441 
>  
> cc: 





[jira] [Assigned] (BEAM-4884) Beam Dependency Update Request: biz.aQute:bndlib 2.0.0.20130123-133441

2018-08-21 Thread Chamikara Jayalath (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4884?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chamikara Jayalath reassigned BEAM-4884:


Assignee: Sergey Beryozkin  (was: Chamikara Jayalath)

> Beam Dependency Update Request: biz.aQute:bndlib 2.0.0.20130123-133441
> --
>
> Key: BEAM-4884
> URL: https://issues.apache.org/jira/browse/BEAM-4884
> Project: Beam
>  Issue Type: Sub-task
>  Components: dependencies
>Reporter: Beam JIRA Bot
>Assignee: Sergey Beryozkin
>Priority: Major
>
> 2018-07-25 20:22:10.667692
> Please review and upgrade the biz.aQute:bndlib to the latest version 
> 2.0.0.20130123-133441 
>  
> cc: 
> 2018-08-02 11:42:45.184687
> Please review and upgrade the biz.aQute:bndlib to the latest version 
> 2.0.0.20130123-133441 
> cc: 
> 2018-08-06 12:08:44.804597
> Please review and upgrade the biz.aQute:bndlib to the latest version 
> 2.0.0.20130123-133441 
>  
> cc: 
> 2018-08-13 12:09:05.718866
> Please review and upgrade the biz.aQute:bndlib to the latest version 
> 2.0.0.20130123-133441 
>  
> cc: 
> 2018-08-20 12:11:56.128242
> Please review and upgrade the biz.aQute:bndlib to the latest version 
> 2.0.0.20130123-133441 
>  
> cc: 





[jira] [Commented] (BEAM-5087) Beam Dependency Update Request: org.apache.kudu:kudu-client 1.7.1

2018-08-21 Thread Chamikara Jayalath (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-5087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16588025#comment-16588025
 ] 

Chamikara Jayalath commented on BEAM-5087:
--

Passing to Tim, who wrote Kudu IO.

> Beam Dependency Update Request: org.apache.kudu:kudu-client 1.7.1
> -
>
> Key: BEAM-5087
> URL: https://issues.apache.org/jira/browse/BEAM-5087
> Project: Beam
>  Issue Type: Sub-task
>  Components: dependencies
>Reporter: Beam JIRA Bot
>Assignee: Tim Robertson
>Priority: Major
>
> 2018-08-06 12:13:44.769883
> Please review and upgrade the org.apache.kudu:kudu-client to the 
> latest version 1.7.1 
>  
> cc: 
> 2018-08-13 12:15:08.713667
> Please review and upgrade the org.apache.kudu:kudu-client to the 
> latest version 1.7.1 
>  
> cc: 
> 2018-08-20 12:15:23.382955
> Please review and upgrade the org.apache.kudu:kudu-client to the 
> latest version 1.7.1 
>  
> cc: 





[jira] [Assigned] (BEAM-5087) Beam Dependency Update Request: org.apache.kudu:kudu-client 1.7.1

2018-08-21 Thread Chamikara Jayalath (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5087?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chamikara Jayalath reassigned BEAM-5087:


Assignee: Tim Robertson  (was: Chamikara Jayalath)

> Beam Dependency Update Request: org.apache.kudu:kudu-client 1.7.1
> -
>
> Key: BEAM-5087
> URL: https://issues.apache.org/jira/browse/BEAM-5087
> Project: Beam
>  Issue Type: Sub-task
>  Components: dependencies
>Reporter: Beam JIRA Bot
>Assignee: Tim Robertson
>Priority: Major
>
> 2018-08-06 12:13:44.769883
> Please review and upgrade the org.apache.kudu:kudu-client to the 
> latest version 1.7.1 
>  
> cc: 
> 2018-08-13 12:15:08.713667
> Please review and upgrade the org.apache.kudu:kudu-client to the 
> latest version 1.7.1 
>  
> cc: 
> 2018-08-20 12:15:23.382955
> Please review and upgrade the org.apache.kudu:kudu-client to the 
> latest version 1.7.1 
>  
> cc: 





[jira] [Resolved] (BEAM-4922) Beam Dependency Update Request: org.freemarker:freemarker 2.3.28

2018-08-16 Thread Chamikara Jayalath (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4922?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chamikara Jayalath resolved BEAM-4922.
--
   Resolution: Fixed
Fix Version/s: 2.7.0

> Beam Dependency Update Request: org.freemarker:freemarker 2.3.28
> 
>
> Key: BEAM-4922
> URL: https://issues.apache.org/jira/browse/BEAM-4922
> Project: Beam
>  Issue Type: Sub-task
>  Components: dependencies
>Reporter: Beam JIRA Bot
>Assignee: Chamikara Jayalath
>Priority: Major
> Fix For: 2.7.0
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> 2018-07-25 20:25:32.011218
> Please review and upgrade the org.freemarker:freemarker to the latest 
> version 2.3.28 
>  
> cc: 
> 2018-08-06 12:10:28.860355
> Please review and upgrade the org.freemarker:freemarker to the latest 
> version 2.3.28 
>  
> cc: 
> 2018-08-13 12:11:55.293042
> Please review and upgrade the org.freemarker:freemarker to the latest 
> version 2.3.28 
>  
> cc: 





[jira] [Commented] (BEAM-4999) Beam Dependency Update Request: com.carrotsearch.randomizedtesting:randomizedtesting-runner 2.6.3

2018-08-15 Thread Chamikara Jayalath (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-4999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16581855#comment-16581855
 ] 

Chamikara Jayalath commented on BEAM-4999:
--

I'm getting the following exception when trying to build SolrIO with this 
dependency upgrade.

 

java.lang.NoClassDefFoundError: com/carrotsearch/randomizedtesting/generators/RandomInts
    at __randomizedtesting.SeedInfo.seed([7F6D2BEC126F6560]:0)
    at org.apache.lucene.util.TestUtil.nextInt(TestUtil.java:409)
    at org.apache.lucene.index.RandomCodec.(RandomCodec.java:120)
    at org.apache.lucene.util.TestRuleSetupAndRestoreClassEnv.before(TestRuleSetupAndRestoreClassEnv.java:189)
    at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:44)
    at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
    at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:41)
    at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
    at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
    at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
    at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
    at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
    at org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:53)
    at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47)
    at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64)
    at org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:54)
    at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
    at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:368)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.ClassNotFoundException: com.carrotsearch.randomizedtesting.generators.RandomInts
    at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
    ... 18 more

 

Cao, will you be able to look into this, since you added SolrIO, which uses this 
dependency?

> Beam Dependency Update Request: 
> com.carrotsearch.randomizedtesting:randomizedtesting-runner 2.6.3
> -
>
> Key: BEAM-4999
> URL: https://issues.apache.org/jira/browse/BEAM-4999
> Project: Beam
>  Issue Type: Sub-task
>  Components: dependencies
>Reporter: Beam JIRA Bot
>Assignee: Chamikara Jayalath
>Priority: Major
>
> 2018-07-25 20:33:34.278692
> Please review and upgrade the 
> com.carrotsearch.randomizedtesting:randomizedtesting-runner to the latest 
> version 2.6.3 
>  
> cc: 
> 2018-08-06 12:15:09.509698
> Please review and upgrade the 
> com.carrotsearch.randomizedtesting:randomizedtesting-runner to the latest 
> version 2.6.3 
>  
> cc: 
> 2018-08-13 12:16:46.379403
> Please review and upgrade the 
> com.carrotsearch.randomizedtesting:randomizedtesting-runner to the latest 
> version 2.6.3 
>  
> cc: 





[jira] [Assigned] (BEAM-4999) Beam Dependency Update Request: com.carrotsearch.randomizedtesting:randomizedtesting-runner 2.6.3

2018-08-15 Thread Chamikara Jayalath (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4999?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chamikara Jayalath reassigned BEAM-4999:


Assignee: Cao Manh Dat  (was: Chamikara Jayalath)

> Beam Dependency Update Request: 
> com.carrotsearch.randomizedtesting:randomizedtesting-runner 2.6.3
> -
>
> Key: BEAM-4999
> URL: https://issues.apache.org/jira/browse/BEAM-4999
> Project: Beam
>  Issue Type: Sub-task
>  Components: dependencies
>Reporter: Beam JIRA Bot
>Assignee: Cao Manh Dat
>Priority: Major
>
> 2018-07-25 20:33:34.278692
> Please review and upgrade the 
> com.carrotsearch.randomizedtesting:randomizedtesting-runner to the latest 
> version 2.6.3 
>  
> cc: 
> 2018-08-06 12:15:09.509698
> Please review and upgrade the 
> com.carrotsearch.randomizedtesting:randomizedtesting-runner to the latest 
> version 2.6.3 
>  
> cc: 
> 2018-08-13 12:16:46.379403
> Please review and upgrade the 
> com.carrotsearch.randomizedtesting:randomizedtesting-runner to the latest 
> version 2.6.3 
>  
> cc: 





[jira] [Assigned] (BEAM-5105) Move load job poll to finishBundle() method to better parallelize execution

2018-08-08 Thread Chamikara Jayalath (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5105?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chamikara Jayalath reassigned BEAM-5105:


Assignee: (was: Reuven Lax)

> Move load job poll to finishBundle() method to better parallelize execution
> ---
>
> Key: BEAM-5105
> URL: https://issues.apache.org/jira/browse/BEAM-5105
> Project: Beam
>  Issue Type: Improvement
>  Components: io-java-gcp
>Reporter: Chamikara Jayalath
>Priority: Major
>
> It appears that when we write to BigQuery using WriteTablesDoFn we start a 
> load job and wait for that job to finish.
> [https://github.com/apache/beam/blob/master/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/WriteTables.java#L318]
>  
> In cases where we are trying to write a PCollection of tables (for example, 
> when users use the dynamic destinations feature) this relies on dynamic work 
> rebalancing to parallelize execution of load jobs. If the runner does not 
> support dynamic work rebalancing or does not execute dynamic work rebalancing 
> for some reason this could have significant performance drawbacks. For 
> example, scheduling times for load jobs will add up.
>  
> A better approach might be to start load jobs in the process() method but wait 
> for all load jobs to finish in the finishBundle() method. This will parallelize 
> any overheads as well as job execution (assuming more than one job is 
> scheduled by BQ).
>  





[jira] [Commented] (BEAM-5105) Move load job poll to finishBundle() method to better parallelize execution

2018-08-08 Thread Chamikara Jayalath (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-5105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16573562#comment-16573562
 ] 

Chamikara Jayalath commented on BEAM-5105:
--

Thanks. Unassigning from you.

> Move load job poll to finishBundle() method to better parallelize execution
> ---
>
> Key: BEAM-5105
> URL: https://issues.apache.org/jira/browse/BEAM-5105
> Project: Beam
>  Issue Type: Improvement
>  Components: io-java-gcp
>Reporter: Chamikara Jayalath
>Assignee: Reuven Lax
>Priority: Major
>
> It appears that when we write to BigQuery using WriteTablesDoFn we start a 
> load job and wait for that job to finish.
> [https://github.com/apache/beam/blob/master/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/WriteTables.java#L318]
>  
> In cases where we are trying to write a PCollection of tables (for example, 
> when users use the dynamic destinations feature) this relies on dynamic work 
> rebalancing to parallelize execution of load jobs. If the runner does not 
> support dynamic work rebalancing or does not execute dynamic work rebalancing 
> for some reason this could have significant performance drawbacks. For 
> example, scheduling times for load jobs will add up.
>  
> A better approach might be to start load jobs in the process() method but wait 
> for all load jobs to finish in the finishBundle() method. This will parallelize 
> any overheads as well as job execution (assuming more than one job is 
> scheduled by BQ).
>  





[jira] [Commented] (BEAM-5105) Move load job poll to finishBundle() method to better parallelize execution

2018-08-07 Thread Chamikara Jayalath (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-5105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16572535#comment-16572535
 ] 

Chamikara Jayalath commented on BEAM-5105:
--

Reuven, I might be missing drawbacks of this approach. Could you comment?

> Move load job poll to finishBundle() method to better parallelize execution
> ---
>
> Key: BEAM-5105
> URL: https://issues.apache.org/jira/browse/BEAM-5105
> Project: Beam
>  Issue Type: Improvement
>  Components: io-java-gcp
>Reporter: Chamikara Jayalath
>Priority: Major
>
> It appears that when we write to BigQuery using WriteTablesDoFn we start a 
> load job and wait for that job to finish.
> [https://github.com/apache/beam/blob/master/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/WriteTables.java#L318]
>  
> In cases where we are trying to write a PCollection of tables (for example, 
> when users use the dynamic destinations feature) this relies on dynamic work 
> rebalancing to parallelize execution of load jobs. If the runner does not 
> support dynamic work rebalancing or does not execute dynamic work rebalancing 
> for some reason this could have significant performance drawbacks. For 
> example, scheduling times for load jobs will add up.
>  
> A better approach might be to start load jobs in the process() method but wait 
> for all load jobs to finish in the finishBundle() method. This will parallelize 
> any overheads as well as job execution (assuming more than one job is 
> scheduled by BQ).
>  





[jira] [Created] (BEAM-5105) Move load job poll to finishBundle() method to better parallelize execution

2018-08-07 Thread Chamikara Jayalath (JIRA)
Chamikara Jayalath created BEAM-5105:


 Summary: Move load job poll to finishBundle() method to better 
parallelize execution
 Key: BEAM-5105
 URL: https://issues.apache.org/jira/browse/BEAM-5105
 Project: Beam
  Issue Type: Improvement
  Components: io-java-gcp
Reporter: Chamikara Jayalath


It appears that when we write to BigQuery using WriteTablesDoFn we start a load 
job and wait for that job to finish.

[https://github.com/apache/beam/blob/master/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/WriteTables.java#L318]

 

In cases where we are trying to write a PCollection of tables (for example, 
when users use the dynamic destinations feature) this relies on dynamic work 
rebalancing to parallelize execution of load jobs. If the runner does not 
support dynamic work rebalancing or does not execute dynamic work rebalancing 
for some reason this could have significant performance drawbacks. For 
example, scheduling times for load jobs will add up.

 

A better approach might be to start load jobs in the process() method but wait 
for all load jobs to finish in the finishBundle() method. This will parallelize 
any overheads as well as job execution (assuming more than one job is scheduled 
by BQ).
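
To make the proposal concrete, here is a minimal sketch of the "start in process(), wait in finishBundle()" pattern. It is plain Python, not Beam code and not the actual WriteTables implementation: run_load_job, DeferredWaitWriter and write_bundle are hypothetical names, and a thread pool with a short sleep stands in for submitting and polling real BigQuery load jobs.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def run_load_job(table, seconds=0.05):
    """Hypothetical stand-in for one load job (scheduling plus execution)."""
    time.sleep(seconds)
    return table


class DeferredWaitWriter:
    """Toy stand-in for a DoFn: start a job per element, wait once per bundle."""

    def start_bundle(self):
        self._pool = ThreadPoolExecutor(max_workers=8)
        self._pending = []

    def process(self, table):
        # Start the job and return immediately instead of blocking on it.
        self._pending.append(self._pool.submit(run_load_job, table))

    def finish_bundle(self):
        # Block here, once, until every job started in this bundle is done.
        results = [future.result() for future in self._pending]
        self._pool.shutdown()
        return results


def write_bundle(tables):
    """Drive one bundle through the writer the way a runner might."""
    writer = DeferredWaitWriter()
    writer.start_bundle()
    for table in tables:
        writer.process(table)
    return writer.finish_bundle()
```

With this shape the per-job waits overlap within a bundle instead of adding up, even when the runner does no dynamic work rebalancing. Whether this has drawbacks in the real IO (for example around retries) is exactly the open question in this issue.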

 





[jira] [Assigned] (BEAM-5105) Move load job poll to finishBundle() method to better parallelize execution

2018-08-07 Thread Chamikara Jayalath (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5105?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chamikara Jayalath reassigned BEAM-5105:


Assignee: Reuven Lax

> Move load job poll to finishBundle() method to better parallelize execution
> ---
>
> Key: BEAM-5105
> URL: https://issues.apache.org/jira/browse/BEAM-5105
> Project: Beam
>  Issue Type: Improvement
>  Components: io-java-gcp
>Reporter: Chamikara Jayalath
>Assignee: Reuven Lax
>Priority: Major
>
> It appears that when we write to BigQuery using WriteTablesDoFn we start a 
> load job and wait for that job to finish.
> [https://github.com/apache/beam/blob/master/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/WriteTables.java#L318]
>  
> In cases where we are trying to write a PCollection of tables (for example, 
> when users use the dynamic destinations feature) this relies on dynamic work 
> rebalancing to parallelize execution of load jobs. If the runner does not 
> support dynamic work rebalancing or does not execute dynamic work rebalancing 
> for some reason this could have significant performance drawbacks. For 
> example, scheduling times for load jobs will add up.
>  
> A better approach might be to start load jobs in the process() method but wait 
> for all load jobs to finish in the finishBundle() method. This will parallelize 
> any overheads as well as job execution (assuming more than one job is 
> scheduled by BQ).
>  





[jira] [Assigned] (BEAM-5087) Beam Dependency Update Request: org.apache.kudu:kudu-client 1.7.1

2018-08-07 Thread Chamikara Jayalath (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5087?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chamikara Jayalath reassigned BEAM-5087:


Assignee: Chamikara Jayalath

> Beam Dependency Update Request: org.apache.kudu:kudu-client 1.7.1
> -
>
> Key: BEAM-5087
> URL: https://issues.apache.org/jira/browse/BEAM-5087
> Project: Beam
>  Issue Type: Sub-task
>  Components: dependencies
>Reporter: Beam JIRA Bot
>Assignee: Chamikara Jayalath
>Priority: Major
>
> 2018-08-06 12:13:44.769883
> Please review and upgrade the org.apache.kudu:kudu-client to the 
> latest version 1.7.1 
>  
> cc: 





[jira] [Assigned] (BEAM-4999) Beam Dependency Update Request: com.carrotsearch.randomizedtesting:randomizedtesting-runner 2.6.3

2018-08-07 Thread Chamikara Jayalath (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4999?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chamikara Jayalath reassigned BEAM-4999:


Assignee: Chamikara Jayalath

> Beam Dependency Update Request: 
> com.carrotsearch.randomizedtesting:randomizedtesting-runner 2.6.3
> -
>
> Key: BEAM-4999
> URL: https://issues.apache.org/jira/browse/BEAM-4999
> Project: Beam
>  Issue Type: Sub-task
>  Components: dependencies
>Reporter: Beam JIRA Bot
>Assignee: Chamikara Jayalath
>Priority: Major
>
> 2018-07-25 20:33:34.278692
> Please review and upgrade the 
> com.carrotsearch.randomizedtesting:randomizedtesting-runner to the latest 
> version 2.6.3 
>  
> cc: 
> 2018-08-06 12:15:09.509698
> Please review and upgrade the 
> com.carrotsearch.randomizedtesting:randomizedtesting-runner to the latest 
> version 2.6.3 
>  
> cc: 





[jira] [Assigned] (BEAM-4884) Beam Dependency Update Request: biz.aQute:bndlib 2.0.0.20130123-133441

2018-08-07 Thread Chamikara Jayalath (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4884?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chamikara Jayalath reassigned BEAM-4884:


Assignee: Chamikara Jayalath

> Beam Dependency Update Request: biz.aQute:bndlib 2.0.0.20130123-133441
> --
>
> Key: BEAM-4884
> URL: https://issues.apache.org/jira/browse/BEAM-4884
> Project: Beam
>  Issue Type: Sub-task
>  Components: dependencies
>Reporter: Beam JIRA Bot
>Assignee: Chamikara Jayalath
>Priority: Major
>
> 2018-07-25 20:22:10.667692
> Please review and upgrade the biz.aQute:bndlib to the latest version 
> 2.0.0.20130123-133441 
>  
> cc: 
> 2018-08-02 11:42:45.184687
> Please review and upgrade the biz.aQute:bndlib to the latest version 
> 2.0.0.20130123-133441 
> cc: 
> 2018-08-06 12:08:44.804597
> Please review and upgrade the biz.aQute:bndlib to the latest version 
> 2.0.0.20130123-133441 
>  
> cc: 





[jira] [Assigned] (BEAM-4949) Beam Dependency Update Request: com.google.guava:guava 25.1-jre

2018-08-07 Thread Chamikara Jayalath (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4949?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chamikara Jayalath reassigned BEAM-4949:


Assignee: Chamikara Jayalath

> Beam Dependency Update Request: com.google.guava:guava 25.1-jre
> ---
>
> Key: BEAM-4949
> URL: https://issues.apache.org/jira/browse/BEAM-4949
> Project: Beam
>  Issue Type: Sub-task
>  Components: dependencies
>Reporter: Beam JIRA Bot
>Assignee: Chamikara Jayalath
>Priority: Major
>
> 2018-07-25 20:28:04.662092
> Please review and upgrade the com.google.guava:guava to the latest 
> version 25.1-jre 
>  
> cc: 





[jira] [Assigned] (BEAM-4922) Beam Dependency Update Request: org.freemarker:freemarker 2.3.28

2018-08-07 Thread Chamikara Jayalath (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4922?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chamikara Jayalath reassigned BEAM-4922:


Assignee: Chamikara Jayalath

> Beam Dependency Update Request: org.freemarker:freemarker 2.3.28
> 
>
> Key: BEAM-4922
> URL: https://issues.apache.org/jira/browse/BEAM-4922
> Project: Beam
>  Issue Type: Sub-task
>  Components: dependencies
>Reporter: Beam JIRA Bot
>Assignee: Chamikara Jayalath
>Priority: Major
>
> 2018-07-25 20:25:32.011218
> Please review and upgrade the org.freemarker:freemarker to the latest 
> version 2.3.28 
>  
> cc: 
> 2018-08-06 12:10:28.860355
> Please review and upgrade the org.freemarker:freemarker to the latest 
> version 2.3.28 
>  
> cc: 





[jira] [Assigned] (BEAM-4904) Beam Dependency Update Request: de.flapdoodle.embed:de.flapdoodle.embed.mongo 2.1.1

2018-08-07 Thread Chamikara Jayalath (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4904?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chamikara Jayalath reassigned BEAM-4904:


Assignee: Chamikara Jayalath

> Beam Dependency Update Request: de.flapdoodle.embed:de.flapdoodle.embed.mongo 
> 2.1.1
> ---
>
> Key: BEAM-4904
> URL: https://issues.apache.org/jira/browse/BEAM-4904
> Project: Beam
>  Issue Type: Sub-task
>  Components: dependencies
>Reporter: Beam JIRA Bot
>Assignee: Chamikara Jayalath
>Priority: Major
>
> 2018-07-25 20:23:49.911490
> Please review and upgrade the 
> de.flapdoodle.embed:de.flapdoodle.embed.mongo to the latest version 2.1.1 
>  
> cc: 
> 2018-08-06 12:09:30.976479
> Please review and upgrade the 
> de.flapdoodle.embed:de.flapdoodle.embed.mongo to the latest version 2.1.1 
>  
> cc: 





[jira] [Assigned] (BEAM-4905) Beam Dependency Update Request: de.flapdoodle.embed:de.flapdoodle.embed.process 2.0.5

2018-08-07 Thread Chamikara Jayalath (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4905?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chamikara Jayalath reassigned BEAM-4905:


Assignee: Chamikara Jayalath

> Beam Dependency Update Request: 
> de.flapdoodle.embed:de.flapdoodle.embed.process 2.0.5
> -
>
> Key: BEAM-4905
> URL: https://issues.apache.org/jira/browse/BEAM-4905
> Project: Beam
>  Issue Type: Sub-task
>  Components: dependencies
>Reporter: Beam JIRA Bot
>Assignee: Chamikara Jayalath
>Priority: Major
>
> 2018-07-25 20:23:58.022170
> Please review and upgrade the 
> de.flapdoodle.embed:de.flapdoodle.embed.process to the latest version 2.0.5 
>  
> cc: 
> 2018-08-06 12:09:39.047955
> Please review and upgrade the 
> de.flapdoodle.embed:de.flapdoodle.embed.process to the latest version 2.0.5 
>  
> cc: 





[jira] [Created] (BEAM-5057) beam_Release_Gradle_NightlySnapshot failing due to a Javadoc error

2018-08-01 Thread Chamikara Jayalath (JIRA)
Chamikara Jayalath created BEAM-5057:


 Summary: beam_Release_Gradle_NightlySnapshot failing due to a 
Javadoc error
 Key: BEAM-5057
 URL: https://issues.apache.org/jira/browse/BEAM-5057
 Project: Beam
  Issue Type: Bug
  Components: testing
Reporter: Chamikara Jayalath
Assignee: Chamikara Jayalath


[https://builds.apache.org/job/beam_Release_Gradle_NightlySnapshot/127/console]

[https://builds.apache.org/job/beam_Release_Gradle_NightlySnapshot/125/console]

 
* What went wrong:
Execution failed for task ':beam-sdks-java-core:javadoc'.
> Javadoc generation failed. Generated Javadoc options file (useful for 
> troubleshooting): 
> '/home/jenkins/jenkins-slave/workspace/beam_Release_Gradle_NightlySnapshot/src/sdks/java/core/build/tmp/javadoc/javadoc.options'
 





[jira] [Created] (BEAM-4685) Allow writing to time partitioned tables when CreateDisposition==CREATE_NEVER

2018-06-28 Thread Chamikara Jayalath (JIRA)
Chamikara Jayalath created BEAM-4685:


 Summary: Allow writing to time partitioned tables when 
CreateDisposition==CREATE_NEVER
 Key: BEAM-4685
 URL: https://issues.apache.org/jira/browse/BEAM-4685
 Project: Beam
  Issue Type: Improvement
  Components: io-java-gcp
Reporter: Chamikara Jayalath


Writing to a time-partitioned table fails when CreateDisposition==CREATE_NEVER 
with the error "Table with field based partitioning must have a schema".

This seems to be due to BigQuery not setting the schema in this case.

[https://github.com/apache/beam/blob/master/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/WriteTables.java#L114]

 





[jira] [Resolved] (BEAM-4675) Reduce the size of pretty string of BQ load jobs

2018-06-28 Thread Chamikara Jayalath (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4675?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chamikara Jayalath resolved BEAM-4675.
--
Resolution: Fixed

> Reduce the size of pretty string of BQ load jobs
> 
>
> Key: BEAM-4675
> URL: https://issues.apache.org/jira/browse/BEAM-4675
> Project: Beam
>  Issue Type: Improvement
>  Components: io-java-gcp
>Reporter: Chamikara Jayalath
>Assignee: Chamikara Jayalath
>Priority: Major
> Fix For: 2.6.0
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Some of the error logs that contain BQ load jobs can be extremely large and 
> drop the actual error message. Usually this happens due to 'schema' and/or 
> 'sourceUris' of the job configuration of load jobs being very large. I think 
> these properties are not that useful for debugging so we should consider 
> dropping them from error messages.





[jira] [Created] (BEAM-4675) Reduce the size of pretty string of BQ load jobs

2018-06-28 Thread Chamikara Jayalath (JIRA)
Chamikara Jayalath created BEAM-4675:


 Summary: Reduce the size of pretty string of BQ load jobs
 Key: BEAM-4675
 URL: https://issues.apache.org/jira/browse/BEAM-4675
 Project: Beam
  Issue Type: Improvement
  Components: io-java-gcp
Reporter: Chamikara Jayalath
Assignee: Chamikara Jayalath
 Fix For: 2.6.0


Some of the error logs that contain BQ load jobs can be extremely large and 
drop the actual error message. Usually this happens due to 'schema' and/or 
'sourceUris' of the job configuration of load jobs being very large. I think 
these properties are not that useful for debugging so we should consider 
dropping them from error messages.
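
As an illustration of the idea only (the actual fix is in the Java IO), the sketch below drops the two bulky properties before pretty-printing a job. It is plain Python; compact_job_string is a hypothetical helper, and the dict layout configuration -> load -> schema / sourceUris is assumed for the example.

```python
import copy
import json

# Assumption: these are the two bulky properties named in the issue.
BULKY_LOAD_FIELDS = ("schema", "sourceUris")


def compact_job_string(job):
    """Pretty-print a job dict with bulky load-configuration fields removed."""
    trimmed = copy.deepcopy(job)  # leave the caller's job object untouched
    load_config = trimmed.get("configuration", {}).get("load", {})
    for field in BULKY_LOAD_FIELDS:
        load_config.pop(field, None)
    return json.dumps(trimmed, indent=2, sort_keys=True)
```

Working on a copy means the job object itself keeps its schema and source URIs; only the string used for logging is shortened, so the error message is no longer pushed out by them.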





[jira] [Created] (BEAM-4650) Add retry policy to Python BQ streaming sink

2018-06-27 Thread Chamikara Jayalath (JIRA)
Chamikara Jayalath created BEAM-4650:


 Summary: Add retry policy to Python BQ streaming sink
 Key: BEAM-4650
 URL: https://issues.apache.org/jira/browse/BEAM-4650
 Project: Beam
  Issue Type: New Feature
  Components: sdk-py-core
Reporter: Chamikara Jayalath


Java supports specifying a retry policy when performing streaming writes to BQ: 
[https://github.com/apache/beam/blob/master/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/InsertRetryPolicy.java]

 

We should update the Python BQ streaming sink to support this as well.

https://github.com/apache/beam/blob/master/sdks/python/apache_beam/io/gcp/bigquery.py#L1430
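
One possible shape for such a policy on the Python side is sketched below. This is not an existing Beam API: every name here is hypothetical, and the set of error reasons treated as permanent is an assumption made for the example.

```python
# Assumption: error reasons treated as permanent; the real list may differ.
PERSISTENT_REASONS = frozenset({"invalid", "invalidQuery", "notImplemented"})


def always_retry(error_reasons):
    """Policy: retry every failed row."""
    return True


def never_retry(error_reasons):
    """Policy: never retry a failed row."""
    return False


def retry_transient_errors(error_reasons):
    """Policy: retry unless any reported reason is known to be permanent."""
    return not any(reason in PERSISTENT_REASONS for reason in error_reasons)


def split_failed_rows(failed_rows, should_retry):
    """Partition (row, error_reasons) pairs into rows to retry and rows to drop."""
    retry, dead = [], []
    for row, reasons in failed_rows:
        (retry if should_retry(reasons) else dead).append(row)
    return retry, dead
```

Treating a policy as a plain callable keeps the sink code independent of any particular policy; the sink would only need to call split_failed_rows on the rows reported as failed by a streaming insert.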





[jira] [Resolved] (BEAM-4617) Add a dependencies guide

2018-06-22 Thread Chamikara Jayalath (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4617?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chamikara Jayalath resolved BEAM-4617.
--
   Resolution: Fixed
Fix Version/s: Not applicable

> Add a dependencies guide
> 
>
> Key: BEAM-4617
> URL: https://issues.apache.org/jira/browse/BEAM-4617
> Project: Beam
>  Issue Type: Improvement
>  Components: website
>Reporter: Chamikara Jayalath
>Assignee: Chamikara Jayalath
>Priority: Major
> Fix For: Not applicable
>
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> Initial discussion: 
> https://lists.apache.org/thread.html/8738c13ad7e576bc2fef158d2cc6f809e1c238ab8d5164c78484bf54@%3Cdev.beam.apache.org%3E
> Vote: 
> https://lists.apache.org/thread.html/8b9b3768adfc40d3527d1ce5e8a51d90e5782a348a3abfb9e5dc85ef@%3Cdev.beam.apache.org%3E
> Doc: 
> https://docs.google.com/document/d/15m1MziZ5TNd9rh_XN0YYBJfYkt0Oj-Ou9g0KFDPL2aA/edit





[jira] [Created] (BEAM-4617) Add a dependencies guide

2018-06-21 Thread Chamikara Jayalath (JIRA)
Chamikara Jayalath created BEAM-4617:


 Summary: Add a dependencies guide
 Key: BEAM-4617
 URL: https://issues.apache.org/jira/browse/BEAM-4617
 Project: Beam
  Issue Type: Improvement
  Components: website
Reporter: Chamikara Jayalath
Assignee: Chamikara Jayalath


Initial discussion: 
https://lists.apache.org/thread.html/8738c13ad7e576bc2fef158d2cc6f809e1c238ab8d5164c78484bf54@%3Cdev.beam.apache.org%3E

Vote: 
https://lists.apache.org/thread.html/8b9b3768adfc40d3527d1ce5e8a51d90e5782a348a3abfb9e5dc85ef@%3Cdev.beam.apache.org%3E

Doc: 
https://docs.google.com/document/d/15m1MziZ5TNd9rh_XN0YYBJfYkt0Oj-Ou9g0KFDPL2aA/edit





[jira] [Commented] (BEAM-4512) Move DataflowRunner off of Maven build files

2018-06-18 Thread Chamikara Jayalath (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-4512?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16516584#comment-16516584
 ] 

Chamikara Jayalath commented on BEAM-4512:
--

Assigning to Mark, who's looking into this.

> Move DataflowRunner off of Maven build files
> 
>
> Key: BEAM-4512
> URL: https://issues.apache.org/jira/browse/BEAM-4512
> Project: Beam
>  Issue Type: Sub-task
>  Components: runner-dataflow
>Reporter: Chamikara Jayalath
>Assignee: Mark Liu
>Priority: Major
>
> Currently DataflowRunner (internally at Google) depends on Beam's Maven build 
> files. We have to move some internal build targets to use Gradle so that 
> Maven files can be deleted from Beam.





[jira] [Assigned] (BEAM-4512) Move DataflowRunner off of Maven build files

2018-06-18 Thread Chamikara Jayalath (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chamikara Jayalath reassigned BEAM-4512:


Assignee: Mark Liu  (was: Chamikara Jayalath)

> Move DataflowRunner off of Maven build files
> 
>
> Key: BEAM-4512
> URL: https://issues.apache.org/jira/browse/BEAM-4512
> Project: Beam
>  Issue Type: Sub-task
>  Components: runner-dataflow
>Reporter: Chamikara Jayalath
>Assignee: Mark Liu
>Priority: Major
>
> Currently DataflowRunner (internally at Google) depends on Beam's Maven build 
> files. We have to move some internal build targets to use Gradle so that 
> Maven files can be deleted from Beam.





[jira] [Created] (BEAM-4535) Python tests are failing for Windows

2018-06-11 Thread Chamikara Jayalath (JIRA)
Chamikara Jayalath created BEAM-4535:


 Summary: Python tests are failing for Windows
 Key: BEAM-4535
 URL: https://issues.apache.org/jira/browse/BEAM-4535
 Project: Beam
  Issue Type: Bug
  Components: sdk-py-core
Reporter: Chamikara Jayalath
Assignee: Udi Meiri


Error is:

Traceback (most recent call last):
  File "C:\Users\deft-testing-integra\python_sdk_download\apache_beam\io\filebasedsource_test.py", line 532, in test_read_auto_pattern
    compression_type=CompressionTypes.AUTO))
  File "C:\Users\deft-testing-integra\python_sdk_download\apache_beam\io\filebasedsource.py", line 119, in __init__
    self._validate()
  File "C:\Users\deft-testing-integra\python_sdk_download\apache_beam\options\value_provider.py", line 133, in _f
    return fnc(self, *args, **kwargs)
  File "C:\Users\deft-testing-integra\python_sdk_download\apache_beam\io\filebasedsource.py", line 179, in _validate
    'No files found based on the file pattern %s' % pattern)
IOError: No files found based on the file pattern c:\windows\temp\tmpwon5_g\mytemp*





[jira] [Created] (BEAM-4512) Move DataflowRunner off of Maven build files

2018-06-06 Thread Chamikara Jayalath (JIRA)
Chamikara Jayalath created BEAM-4512:


 Summary: Move DataflowRunner off of Maven build files
 Key: BEAM-4512
 URL: https://issues.apache.org/jira/browse/BEAM-4512
 Project: Beam
  Issue Type: Sub-task
  Components: runner-dataflow
Reporter: Chamikara Jayalath
Assignee: Chamikara Jayalath


Currently DataflowRunner (internally at Google) depends on Beam's Maven build 
files. We have to move some internal build targets to use Gradle so that Maven 
files can be deleted from Beam.





[jira] [Commented] (BEAM-3788) Implement a Kafka IO for Python SDK

2018-06-05 Thread Chamikara Jayalath (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-3788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16502169#comment-16502169
 ] 

Chamikara Jayalath commented on BEAM-3788:
--

Email thread regarding this: 
https://lists.apache.org/thread.html/b806fcc2079fbec7bd7ae1dc619c3ff71e33c0aa51555e60f4081013@%3Cdev.beam.apache.org%3E

 

Doc: 
https://docs.google.com/document/d/1ogRS-e-HYYTHsXi_l2zDUUOnvfzEbub3BFkPrYIOawU/edit

> Implement a Kafka IO for Python SDK
> ---
>
> Key: BEAM-3788
> URL: https://issues.apache.org/jira/browse/BEAM-3788
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-py-core
>Reporter: Chamikara Jayalath
>Assignee: Chamikara Jayalath
>Priority: Major
>
> This will be implemented using the Splittable DoFn framework.





[jira] [Commented] (BEAM-4444) Parquet IO for Python SDK

2018-06-04 Thread Chamikara Jayalath (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-4444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16500858#comment-16500858
 ] 

Chamikara Jayalath commented on BEAM-4444:
--

Hi Bruce, is this something you hope to add? Just curious.

> Parquet IO for Python SDK
> -
>
> Key: BEAM-4444
> URL: https://issues.apache.org/jira/browse/BEAM-4444
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-py-core
>Reporter: Bruce Arctor
>Assignee: Chamikara Jayalath
>Priority: Major
>
> Add Parquet Support for the Python SDK.





[jira] [Commented] (BEAM-3098) Upgrade Java grpc version

2018-05-31 Thread Chamikara Jayalath (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-3098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16497475#comment-16497475
 ] 

Chamikara Jayalath commented on BEAM-3098:
--

We are actively looking into this now.

> Upgrade Java grpc version
> -
>
> Key: BEAM-3098
> URL: https://issues.apache.org/jira/browse/BEAM-3098
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-core
>Reporter: Solomon Duskis
>Assignee: Chamikara Jayalath
>Priority: Major
>
> Beam Java currently depends on grpc 1.2, which was released in March. It 
> would be great if the dependency could be updated to something newer, like 
> grpc 1.7.0





[jira] [Assigned] (BEAM-3098) Upgrade Java grpc version

2018-05-31 Thread Chamikara Jayalath (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-3098?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chamikara Jayalath reassigned BEAM-3098:


Assignee: Chamikara Jayalath

> Upgrade Java grpc version
> -
>
> Key: BEAM-3098
> URL: https://issues.apache.org/jira/browse/BEAM-3098
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-core
>Reporter: Solomon Duskis
>Assignee: Chamikara Jayalath
>Priority: Major
>
> Beam Java currently depends on grpc 1.2, which was released in March. It 
> would be great if the dependency could be updated to something newer, like 
> grpc 1.7.0





[jira] [Updated] (BEAM-4254) Upgrade Bigtable client to 1.3

2018-05-17 Thread Chamikara Jayalath (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-4254?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chamikara Jayalath updated BEAM-4254:
-
Fix Version/s: (was: 2.5.0)

> Upgrade Bigtable client to 1.3
> --
>
> Key: BEAM-4254
> URL: https://issues.apache.org/jira/browse/BEAM-4254
> Project: Beam
>  Issue Type: Improvement
>  Components: io-java-gcp
>Reporter: Valentyn Tymofieiev
>Assignee: Chamikara Jayalath
>Priority: Minor
>






[jira] [Commented] (BEAM-4254) Upgrade Bigtable client to 1.3

2018-05-17 Thread Chamikara Jayalath (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-4254?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16479877#comment-16479877
 ] 

Chamikara Jayalath commented on BEAM-4254:
--

cc: [~sduskis]

 

Looks like this requires upgrading the protobuf and gRPC dependencies, which is 
planned but cannot be done before 2.5.0.

> Upgrade Bigtable client to 1.3
> --
>
> Key: BEAM-4254
> URL: https://issues.apache.org/jira/browse/BEAM-4254
> Project: Beam
>  Issue Type: Improvement
>  Components: io-java-gcp
>Reporter: Valentyn Tymofieiev
>Assignee: Chamikara Jayalath
>Priority: Minor
>






[jira] [Assigned] (BEAM-4257) Add error reason and table destination to BigQueryIO streaming failed inserts

2018-05-09 Thread Chamikara Jayalath (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-4257?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chamikara Jayalath reassigned BEAM-4257:


Assignee: (was: Kenneth Knowles)

> Add error reason and table destination to BigQueryIO streaming failed inserts
> -
>
> Key: BEAM-4257
> URL: https://issues.apache.org/jira/browse/BEAM-4257
> Project: Beam
>  Issue Type: Improvement
>  Components: io-java-gcp
>Reporter: Carlos Alonso
>Priority: Minor
>
> When using `BigQueryIO.Write` and calling `WriteResult.getFailedInserts()` we 
> get a `PCollection<TableRow>`, which is fine, but to work on the errors 
> properly downstream it would be really valuable to have extended information 
> such as the `InsertError` fields and the `TableReference` each row was routed to.
>  
> My suggestion is to create a new object that contains all that information 
> and return a `PCollection` of those instead.
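
To illustrate the suggestion above, here is a minimal, language-neutral sketch in plain Python (not Beam code). The `FailedInsert` class, its field names, and the `group_by_table` helper are hypothetical and only show how bundling the row, the destination table, and the error details would let downstream steps act on failures.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass(frozen=True)
class FailedInsert:
    """Hypothetical container for one failed streaming insert."""
    row: Dict[str, Any]   # the row that could not be inserted
    table: str            # destination the row was routed to
    errors: List[Dict[str, str]] = field(default_factory=list)  # reason/message pairs


def group_by_table(failures):
    """Group failed inserts by the table they were routed to."""
    grouped = {}
    for failure in failures:
        grouped.setdefault(failure.table, []).append(failure)
    return grouped


failures = [
    FailedInsert({"id": 1}, "p:d.t1", [{"reason": "invalid", "message": "bad field"}]),
    FailedInsert({"id": 2}, "p:d.t2", [{"reason": "timeout", "message": "retry later"}]),
    FailedInsert({"id": 3}, "p:d.t1", [{"reason": "invalid", "message": "bad field"}]),
]
by_table = group_by_table(failures)
```

With the destination and error reason attached to each element, a downstream step can route or retry failures per table without re-deriving where each row was headed.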





[jira] [Commented] (BEAM-4257) Add error reason and table destination to BigQueryIO streaming failed inserts

2018-05-09 Thread Chamikara Jayalath (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-4257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16469535#comment-16469535
 ] 

Chamikara Jayalath commented on BEAM-4257:
--

Carlos, it looks like you are working on this. Please contact a PMC member on the 
dev list or Slack to get the JIRA contributor role added to your account.

> Add error reason and table destination to BigQueryIO streaming failed inserts
> -
>
> Key: BEAM-4257
> URL: https://issues.apache.org/jira/browse/BEAM-4257
> Project: Beam
>  Issue Type: Improvement
>  Components: io-java-gcp
>Reporter: Carlos Alonso
>Assignee: Kenneth Knowles
>Priority: Minor
>
> When using `BigQueryIO.Write` and calling `WriteResult.getFailedInserts()` we 
> get a `PCollection<TableRow>`, which is fine, but to work on the errors 
> properly downstream it would be really valuable to have extended information 
> such as the `InsertError` fields and the `TableReference` each row was routed to.
>  
> My suggestion is to create a new object that contains all that information 
> and return a `PCollection` of those instead.





[jira] [Updated] (BEAM-4257) Add error reason and table destination to BigQueryIO streaming failed inserts

2018-05-09 Thread Chamikara Jayalath (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-4257?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chamikara Jayalath updated BEAM-4257:
-
Component/s: (was: sdk-java-core)
 io-java-gcp

> Add error reason and table destination to BigQueryIO streaming failed inserts
> -
>
> Key: BEAM-4257
> URL: https://issues.apache.org/jira/browse/BEAM-4257
> Project: Beam
>  Issue Type: Improvement
>  Components: io-java-gcp
>Reporter: Carlos Alonso
>Assignee: Kenneth Knowles
>Priority: Minor
>
> When using `BigQueryIO.Write` and calling `WriteResult.getFailedInserts()` we 
> get a `PCollection<TableRow>`, which is fine, but to work on the errors 
> properly downstream it would be really valuable to have extended information 
> such as the `InsertError` fields and the `TableReference` each row was routed to.
>  
> My suggestion is to create a new object that contains all that information 
> and return a `PCollection` of those instead.





[jira] [Commented] (BEAM-4261) CloudBigtableIO should not try to validate runtime parameters at construction time.

2018-05-09 Thread Chamikara Jayalath (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-4261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16469440#comment-16469440
 ] 

Chamikara Jayalath commented on BEAM-4261:
--

Could you mention which parameters you do not want to be validated? tableId?

If it's just a matter of the machine that submits the job not having access to 
Spanner, I think withoutValidation() is the proper solution, since validation 
will still work for some users.

> CloudBigtableIO should not try to validate runtime parameters at construction 
> time.
> ---
>
> Key: BEAM-4261
> URL: https://issues.apache.org/jira/browse/BEAM-4261
> Project: Beam
>  Issue Type: Bug
>  Components: io-java-gcp
>Reporter: Kevin Si
>Assignee: Chamikara Jayalath
>Priority: Minor
>   Original Estimate: 48h
>  Remaining Estimate: 48h
>
> The workaround for users is to set some default values and override them 
> at runtime.
> One example of validating a runtime parameter at construction time is the 
> following, and there could be more.
>  
>     @Override
>     public void validate() {
>       ValueProvider<String> tableId = config.getTableId();
>       checkArgument(
>           tableId != null && tableId.isAccessible() && !tableId.get().isEmpty(),
>           "tableId was not supplied");
>     }
>  
> A reported issue on stackoverflow: 
> [https://stackoverflow.com/questions/49595921/valueprovider-type-parameters-not-getting-honored-at-the-template-execution-time]
>  
> One concern I have is that if we disable the validation at construction time, 
> how do we validate it at runtime? Ideally, users should use template 
> parameter metadata for validation, but that is optional.
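
One possible way to reconcile construction-time and runtime validation, sketched in plain Python rather than Beam's Java API: validate eagerly only when the value is already accessible, and otherwise defer the check until the value is supplied. The `StaticValue` and `RuntimeValue` classes and the `validate_table_id` function are hypothetical stand-ins for a value-provider abstraction, not actual Beam classes.

```python
class StaticValue:
    """Value known at pipeline construction time (hypothetical stand-in)."""

    def __init__(self, value):
        self._value = value

    def is_accessible(self):
        return True

    def get(self):
        return self._value


class RuntimeValue:
    """Value supplied only when the template is executed (hypothetical stand-in)."""

    def __init__(self):
        self._value = None
        self._is_set = False

    def set(self, value):
        self._value = value
        self._is_set = True

    def is_accessible(self):
        return self._is_set

    def get(self):
        if not self._is_set:
            raise RuntimeError("value is not available yet")
        return self._value


def validate_table_id(table_id):
    """Return True if validated now, False if validation must be deferred."""
    if table_id is None:
        raise ValueError("tableId was not supplied")
    if not table_id.is_accessible():
        return False  # runtime parameter: check again once the value is set
    if not table_id.get():
        raise ValueError("tableId was not supplied")
    return True


deferred = RuntimeValue()
checked_at_construction = validate_table_id(deferred)
deferred.set("my-table")
checked_at_runtime = validate_table_id(deferred)
```

The same function can be called once at construction and again at execution start, so an empty runtime value is still rejected, only later.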





[jira] [Commented] (BEAM-3944) Convert beam_PerformanceTests_Python to use Gradle

2018-05-09 Thread Chamikara Jayalath (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-3944?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16469273#comment-16469273
 ] 

Chamikara Jayalath commented on BEAM-3944:
--

cc:  [~altay]  [~markflyhigh]

> Convert beam_PerformanceTests_Python to use Gradle
> --
>
> Key: BEAM-3944
> URL: https://issues.apache.org/jira/browse/BEAM-3944
> Project: Beam
>  Issue Type: Sub-task
>  Components: testing
>Reporter: Chamikara Jayalath
>Priority: Major
>






[jira] [Commented] (BEAM-4261) CloudBigtableIO should not try to validate runtime parameters at construction time.

2018-05-09 Thread Chamikara Jayalath (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-4261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16469260#comment-16469260
 ] 

Chamikara Jayalath commented on BEAM-4261:
--

Have you tried using 

withoutValidation() ?

> CloudBigtableIO should not try to validate runtime parameters at construction 
> time.
> ---
>
> Key: BEAM-4261
> URL: https://issues.apache.org/jira/browse/BEAM-4261
> Project: Beam
>  Issue Type: Bug
>  Components: io-java-gcp
>Reporter: Kevin Si
>Assignee: Chamikara Jayalath
>Priority: Minor
>   Original Estimate: 48h
>  Remaining Estimate: 48h
>
> The workaround for users is to set some default values and override them 
> at runtime.
> One example of validating a runtime parameter at construction time is the 
> following, and there could be more.
>  
>     @Override
>     public void validate() {
>       ValueProvider<String> tableId = config.getTableId();
>       checkArgument(
>           tableId != null && tableId.isAccessible() && !tableId.get().isEmpty(),
>           "tableId was not supplied");
>     }
>  
> A reported issue on stackoverflow: 
> [https://stackoverflow.com/questions/49595921/valueprovider-type-parameters-not-getting-honored-at-the-template-execution-time]
>  
> One concern I have is that if we disable the validation at construction time, 
> how do we validate it at runtime? Ideally, users should use template 
> parameter metadata for validation, but that is optional.





[jira] [Created] (BEAM-4265) Add a dead letter queue to Python streaming BigQuery sink

2018-05-09 Thread Chamikara Jayalath (JIRA)
Chamikara Jayalath created BEAM-4265:


 Summary: Add a dead letter queue to Python streaming BigQuery sink
 Key: BEAM-4265
 URL: https://issues.apache.org/jira/browse/BEAM-4265
 Project: Beam
  Issue Type: New Feature
  Components: sdk-py-core
Reporter: Chamikara Jayalath


When writing to BigQuery using streaming writes, the Java SDK supports writing 
failed records to a dead letter queue: 
[https://github.com/apache/beam/blob/master/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryIO.java#L1375]

 

This is a very useful feature for long-running pipelines, so we should add it 
to the Python BQ sink: 
https://github.com/apache/beam/blob/master/sdks/python/apache_beam/io/gcp/bigquery.py#L1279
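
As a rough illustration of the dead letter pattern itself (plain Python, not the Beam sink API), the sketch below separates rows that were written from rows that failed, keeping the error text alongside each failed row. `write_with_dead_letter` and `fake_insert` are hypothetical names; a real sink would call the BigQuery streaming insert API instead.

```python
def write_with_dead_letter(rows, insert_fn):
    """Try to insert each row; collect failures instead of failing the whole write."""
    written, dead_letter = [], []
    for row in rows:
        try:
            insert_fn(row)
            written.append(row)
        except Exception as exc:  # a real sink would catch narrower error types
            dead_letter.append({"row": row, "error": str(exc)})
    return written, dead_letter


def fake_insert(row):
    """Hypothetical insert function that rejects rows without an 'id' field."""
    if "id" not in row:
        raise ValueError("missing required field: id")


written, dead_letter = write_with_dead_letter(
    [{"id": 1}, {"name": "x"}, {"id": 2}], fake_insert
)
```

In a pipeline, the second collection would be emitted as a separate output so that a long-running job keeps going while failed records are inspected or replayed later.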





[jira] [Commented] (BEAM-4264) Java PostCommit Spanner tests are failing due to "Instance not found"

2018-05-09 Thread Chamikara Jayalath (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-4264?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16469202#comment-16469202
 ] 

Chamikara Jayalath commented on BEAM-4264:
--

Looks like this is due to https://github.com/apache/beam/pull/4264

> Java PostCommit Spanner tests are failing due to "Instance not found"
> -
>
> Key: BEAM-4264
> URL: https://issues.apache.org/jira/browse/BEAM-4264
> Project: Beam
>  Issue Type: Bug
>  Components: io-java-gcp
>Reporter: Chamikara Jayalath
>Assignee: Mairbek Khadikov
>Priority: Blocker
> Fix For: 2.5.0
>
>
> First failure triggered by the commit: 
> [https://builds.apache.org/job/beam_PostCommit_Java_GradleBuild/316/]
>  
> [https://builds.apache.org/job/beam_PostCommit_Java_GradleBuild/316/testReport/junit/org.apache.beam.sdk.io.gcp.spanner/SpannerWriteIT/testWrite/]
>  
> Caused by: io.grpc.StatusRuntimeException: NOT_FOUND: Instance not found: 
> projects/apache-beam-testing/instances/mairbek-deleteme resource_type: 
> "type.googleapis.com/google.spanner.admin.instance.v1.Instance" 
> resource_name: "projects/apache-beam-testing/instances/mairbek-deleteme" 
> description: "Instance does not exist." at 
> io.grpc.Status.asRuntimeException(Status.java:540) at 
> io.grpc.stub.ClientCalls$UnaryStreamToFuture.onClose(ClientCalls.java:439) at 
> io.grpc.ForwardingClientCallListener.onClose(ForwardingClientCallListener.java:56)
>  at 
> com.google.cloud.spanner.spi.v1.SpannerErrorInterceptor$1$1.onClose(SpannerErrorInterceptor.java:100)
>  at 
> io.grpc.ForwardingClientCallListener.onClose(ForwardingClientCallListener.java:56)
>  at 
> com.google.cloud.spanner.spi.v1.WatchdogInterceptor$MonitoredCall$1.onClose(WatchdogInterceptor.java:190)
>  at io.grpc.internal.ClientCallImpl.closeObserver(ClientCallImpl.java:428) at 
> io.grpc.internal.ClientCallImpl.access$100(ClientCallImpl.java:76) at 
> io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl.close(ClientCallImpl.java:514)
>  at 
> io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl.access$700(ClientCallImpl.java:431)
>  at 
> io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl$1StreamClosed.runInContext(ClientCallImpl.java:546)
>  at io.grpc.internal.ContextRunnable.run(ContextRunnable.java:52) at 
> io.grpc.internal.SerializingExecutor$TaskRunner.run(SerializingExecutor.java:152)
>  at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>  at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>  ... 1 more





[jira] [Commented] (BEAM-4263) BigQuery connector reads the table size value from a deprecated field

2018-05-09 Thread Chamikara Jayalath (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-4263?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16469162#comment-16469162
 ] 

Chamikara Jayalath commented on BEAM-4263:
--

[~kjung520] I assume you are working on this. Please ask on the Beam Slack or dev 
list so that a PMC member can assign the JIRA contributor role to you.

> BigQuery connector reads the table size value from a deprecated field
> -
>
> Key: BEAM-4263
> URL: https://issues.apache.org/jira/browse/BEAM-4263
> Project: Beam
>  Issue Type: Bug
>  Components: io-java-gcp
>Affects Versions: 3.0.0, 2.5.0
>Reporter: Kenneth Jung
>Priority: Minor
>
> The BigQuery connector in the GCP IO module reads the totalBytesProcessed 
> value from a deprecated field in the job statistics:
> [https://cloud.google.com/bigquery/docs/reference/rest/v2/jobs]
> The non-deprecated replacement is the totalBytesProcessed field in the query 
> statistics.
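
A small sketch of the lookup order this implies, in plain Python over a job resource parsed from JSON. It assumes the query-level value lives under `statistics.query.totalBytesProcessed` and the deprecated one directly under `statistics.totalBytesProcessed`, and that 64-bit integers arrive as strings; the function name is hypothetical and this is not the connector's actual code.

```python
def total_bytes_processed(job):
    """Prefer the query statistics field; fall back to the deprecated job-level one."""
    stats = job.get("statistics", {})
    value = stats.get("query", {}).get("totalBytesProcessed")
    if value is None:
        # Deprecated location, kept only as a fallback in this sketch.
        value = stats.get("totalBytesProcessed")
    return None if value is None else int(value)


job = {
    "statistics": {
        "totalBytesProcessed": "10",
        "query": {"totalBytesProcessed": "42"},
    }
}
preferred = total_bytes_processed(job)
```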





[jira] [Assigned] (BEAM-4263) BigQuery connector reads the table size value from a deprecated field

2018-05-09 Thread Chamikara Jayalath (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-4263?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chamikara Jayalath reassigned BEAM-4263:


Assignee: (was: Chamikara Jayalath)

> BigQuery connector reads the table size value from a deprecated field
> -
>
> Key: BEAM-4263
> URL: https://issues.apache.org/jira/browse/BEAM-4263
> Project: Beam
>  Issue Type: Bug
>  Components: io-java-gcp
>Affects Versions: 3.0.0, 2.5.0
>Reporter: Kenneth Jung
>Priority: Minor
>
> The BigQuery connector in the GCP IO module reads the totalBytesProcessed 
> value from a deprecated field in the job statistics:
> [https://cloud.google.com/bigquery/docs/reference/rest/v2/jobs]
> The non-deprecated replacement is the totalBytesProcessed field in the query 
> statistics.





[jira] [Commented] (BEAM-4264) Java PostCommit Spanner tests are failing due to "Instance not found"

2018-05-09 Thread Chamikara Jayalath (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-4264?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16469150#comment-16469150
 ] 

Chamikara Jayalath commented on BEAM-4264:
--

cc: [~jkff]

> Java PostCommit Spanner tests are failing due to "Instance not found"
> -
>
> Key: BEAM-4264
> URL: https://issues.apache.org/jira/browse/BEAM-4264
> Project: Beam
>  Issue Type: Bug
>  Components: io-java-gcp
>Reporter: Chamikara Jayalath
>Assignee: Mairbek Khadikov
>Priority: Blocker
> Fix For: 2.5.0
>
>
> First failure triggered by the commit: 
> [https://builds.apache.org/job/beam_PostCommit_Java_GradleBuild/316/]
>  
> [https://builds.apache.org/job/beam_PostCommit_Java_GradleBuild/316/testReport/junit/org.apache.beam.sdk.io.gcp.spanner/SpannerWriteIT/testWrite/]
>  
> Caused by: io.grpc.StatusRuntimeException: NOT_FOUND: Instance not found: 
> projects/apache-beam-testing/instances/mairbek-deleteme resource_type: 
> "type.googleapis.com/google.spanner.admin.instance.v1.Instance" 
> resource_name: "projects/apache-beam-testing/instances/mairbek-deleteme" 
> description: "Instance does not exist." at 
> io.grpc.Status.asRuntimeException(Status.java:540) at 
> io.grpc.stub.ClientCalls$UnaryStreamToFuture.onClose(ClientCalls.java:439) at 
> io.grpc.ForwardingClientCallListener.onClose(ForwardingClientCallListener.java:56)
>  at 
> com.google.cloud.spanner.spi.v1.SpannerErrorInterceptor$1$1.onClose(SpannerErrorInterceptor.java:100)
>  at 
> io.grpc.ForwardingClientCallListener.onClose(ForwardingClientCallListener.java:56)
>  at 
> com.google.cloud.spanner.spi.v1.WatchdogInterceptor$MonitoredCall$1.onClose(WatchdogInterceptor.java:190)
>  at io.grpc.internal.ClientCallImpl.closeObserver(ClientCallImpl.java:428) at 
> io.grpc.internal.ClientCallImpl.access$100(ClientCallImpl.java:76) at 
> io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl.close(ClientCallImpl.java:514)
>  at 
> io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl.access$700(ClientCallImpl.java:431)
>  at 
> io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl$1StreamClosed.runInContext(ClientCallImpl.java:546)
>  at io.grpc.internal.ContextRunnable.run(ContextRunnable.java:52) at 
> io.grpc.internal.SerializingExecutor$TaskRunner.run(SerializingExecutor.java:152)
>  at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>  at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>  ... 1 more





[jira] [Updated] (BEAM-4264) Java PostCommit Spanner tests are failing due to "Instance not found"

2018-05-09 Thread Chamikara Jayalath (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-4264?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chamikara Jayalath updated BEAM-4264:
-
Issue Type: Bug  (was: New Feature)

> Java PostCommit Spanner tests are failing due to "Instance not found"
> -
>
> Key: BEAM-4264
> URL: https://issues.apache.org/jira/browse/BEAM-4264
> Project: Beam
>  Issue Type: Bug
>  Components: io-java-gcp
>Reporter: Chamikara Jayalath
>Assignee: Mairbek Khadikov
>Priority: Blocker
> Fix For: 2.5.0
>
>
> First failure triggered by the commit: 
> [https://builds.apache.org/job/beam_PostCommit_Java_GradleBuild/316/]
>  
> [https://builds.apache.org/job/beam_PostCommit_Java_GradleBuild/316/testReport/junit/org.apache.beam.sdk.io.gcp.spanner/SpannerWriteIT/testWrite/]
>  
> Caused by: io.grpc.StatusRuntimeException: NOT_FOUND: Instance not found: 
> projects/apache-beam-testing/instances/mairbek-deleteme resource_type: 
> "type.googleapis.com/google.spanner.admin.instance.v1.Instance" 
> resource_name: "projects/apache-beam-testing/instances/mairbek-deleteme" 
> description: "Instance does not exist." at 
> io.grpc.Status.asRuntimeException(Status.java:540) at 
> io.grpc.stub.ClientCalls$UnaryStreamToFuture.onClose(ClientCalls.java:439) at 
> io.grpc.ForwardingClientCallListener.onClose(ForwardingClientCallListener.java:56)
>  at 
> com.google.cloud.spanner.spi.v1.SpannerErrorInterceptor$1$1.onClose(SpannerErrorInterceptor.java:100)
>  at 
> io.grpc.ForwardingClientCallListener.onClose(ForwardingClientCallListener.java:56)
>  at 
> com.google.cloud.spanner.spi.v1.WatchdogInterceptor$MonitoredCall$1.onClose(WatchdogInterceptor.java:190)
>  at io.grpc.internal.ClientCallImpl.closeObserver(ClientCallImpl.java:428) at 
> io.grpc.internal.ClientCallImpl.access$100(ClientCallImpl.java:76) at 
> io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl.close(ClientCallImpl.java:514)
>  at 
> io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl.access$700(ClientCallImpl.java:431)
>  at 
> io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl$1StreamClosed.runInContext(ClientCallImpl.java:546)
>  at io.grpc.internal.ContextRunnable.run(ContextRunnable.java:52) at 
> io.grpc.internal.SerializingExecutor$TaskRunner.run(SerializingExecutor.java:152)
>  at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>  at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>  ... 1 more





[jira] [Commented] (BEAM-4264) Java PostCommit Spanner tests are failing due to "Instance not found"

2018-05-09 Thread Chamikara Jayalath (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-4264?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16469148#comment-16469148
 ] 

Chamikara Jayalath commented on BEAM-4264:
--

Mairbek, can you fix or revert the PR ?

> Java PostCommit Spanner tests are failing due to "Instance not found"
> -
>
> Key: BEAM-4264
> URL: https://issues.apache.org/jira/browse/BEAM-4264
> Project: Beam
>  Issue Type: Bug
>  Components: io-java-gcp
>Reporter: Chamikara Jayalath
>Assignee: Mairbek Khadikov
>Priority: Blocker
> Fix For: 2.5.0
>
>
> First failure triggered by the commit: 
> [https://builds.apache.org/job/beam_PostCommit_Java_GradleBuild/316/]
>  
> [https://builds.apache.org/job/beam_PostCommit_Java_GradleBuild/316/testReport/junit/org.apache.beam.sdk.io.gcp.spanner/SpannerWriteIT/testWrite/]
>  
> Caused by: io.grpc.StatusRuntimeException: NOT_FOUND: Instance not found: 
> projects/apache-beam-testing/instances/mairbek-deleteme resource_type: 
> "type.googleapis.com/google.spanner.admin.instance.v1.Instance" 
> resource_name: "projects/apache-beam-testing/instances/mairbek-deleteme" 
> description: "Instance does not exist." at 
> io.grpc.Status.asRuntimeException(Status.java:540) at 
> io.grpc.stub.ClientCalls$UnaryStreamToFuture.onClose(ClientCalls.java:439) at 
> io.grpc.ForwardingClientCallListener.onClose(ForwardingClientCallListener.java:56)
>  at 
> com.google.cloud.spanner.spi.v1.SpannerErrorInterceptor$1$1.onClose(SpannerErrorInterceptor.java:100)
>  at 
> io.grpc.ForwardingClientCallListener.onClose(ForwardingClientCallListener.java:56)
>  at 
> com.google.cloud.spanner.spi.v1.WatchdogInterceptor$MonitoredCall$1.onClose(WatchdogInterceptor.java:190)
>  at io.grpc.internal.ClientCallImpl.closeObserver(ClientCallImpl.java:428) at 
> io.grpc.internal.ClientCallImpl.access$100(ClientCallImpl.java:76) at 
> io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl.close(ClientCallImpl.java:514)
>  at 
> io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl.access$700(ClientCallImpl.java:431)
>  at 
> io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl$1StreamClosed.runInContext(ClientCallImpl.java:546)
>  at io.grpc.internal.ContextRunnable.run(ContextRunnable.java:52) at 
> io.grpc.internal.SerializingExecutor$TaskRunner.run(SerializingExecutor.java:152)
>  at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>  at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>  ... 1 more





[jira] [Created] (BEAM-4264) Java PostCommit Spanner tests are failing due to "Instance not found"

2018-05-09 Thread Chamikara Jayalath (JIRA)
Chamikara Jayalath created BEAM-4264:


 Summary: Java PostCommit Spanner tests are failing due to 
"Instance not found"
 Key: BEAM-4264
 URL: https://issues.apache.org/jira/browse/BEAM-4264
 Project: Beam
  Issue Type: New Feature
  Components: io-java-gcp
Reporter: Chamikara Jayalath
Assignee: Mairbek Khadikov
 Fix For: 2.5.0


First failure triggered by the commit: 
[https://builds.apache.org/job/beam_PostCommit_Java_GradleBuild/316/]

 

[https://builds.apache.org/job/beam_PostCommit_Java_GradleBuild/316/testReport/junit/org.apache.beam.sdk.io.gcp.spanner/SpannerWriteIT/testWrite/]

 

Caused by: io.grpc.StatusRuntimeException: NOT_FOUND: Instance not found: 
projects/apache-beam-testing/instances/mairbek-deleteme resource_type: 
"type.googleapis.com/google.spanner.admin.instance.v1.Instance" resource_name: 
"projects/apache-beam-testing/instances/mairbek-deleteme" description: 
"Instance does not exist."
 at io.grpc.Status.asRuntimeException(Status.java:540)
 at io.grpc.stub.ClientCalls$UnaryStreamToFuture.onClose(ClientCalls.java:439)
 at io.grpc.ForwardingClientCallListener.onClose(ForwardingClientCallListener.java:56)
 at com.google.cloud.spanner.spi.v1.SpannerErrorInterceptor$1$1.onClose(SpannerErrorInterceptor.java:100)
 at io.grpc.ForwardingClientCallListener.onClose(ForwardingClientCallListener.java:56)
 at com.google.cloud.spanner.spi.v1.WatchdogInterceptor$MonitoredCall$1.onClose(WatchdogInterceptor.java:190)
 at io.grpc.internal.ClientCallImpl.closeObserver(ClientCallImpl.java:428)
 at io.grpc.internal.ClientCallImpl.access$100(ClientCallImpl.java:76)
 at io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl.close(ClientCallImpl.java:514)
 at io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl.access$700(ClientCallImpl.java:431)
 at io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl$1StreamClosed.runInContext(ClientCallImpl.java:546)
 at io.grpc.internal.ContextRunnable.run(ContextRunnable.java:52)
 at io.grpc.internal.SerializingExecutor$TaskRunner.run(SerializingExecutor.java:152)
 at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
 at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
 ... 1 more





[jira] [Assigned] (BEAM-3516) SpannerWriteGroupFn does not respect mutation limits

2018-05-07 Thread Chamikara Jayalath (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3516?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chamikara Jayalath reassigned BEAM-3516:


Assignee: Mairbek Khadikov  (was: Chamikara Jayalath)

> SpannerWriteGroupFn does not respect mutation limits
> 
>
> Key: BEAM-3516
> URL: https://issues.apache.org/jira/browse/BEAM-3516
> Project: Beam
>  Issue Type: Bug
>  Components: runner-dataflow
>Affects Versions: 2.2.0
>Reporter: Ryan Gordon
>Assignee: Mairbek Khadikov
>Priority: Major
>  Time Spent: 3h 40m
>  Remaining Estimate: 0h
>
> When using SpannerIO.write(), a large batch or a table with indexes can 
> easily hit the Spanner mutation limit and fail with the following error:
> {quote}Jan 02, 2018 2:42:59 PM 
> org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
> SEVERE: 2018-01-02T22:42:57.873Z: (3e7c871d215e890b): 
> com.google.cloud.spanner.SpannerException: INVALID_ARGUMENT: 
> io.grpc.StatusRuntimeException: INVALID_ARGUMENT: The transaction contains 
> too many mutations. Insert and update operations count with the multiplicity 
> of the number of columns they affect. For example, inserting values into one 
> key column and four non-key columns count as five mutations total for the 
> insert. Delete and delete range operations count as one mutation regardless 
> of the number of columns affected. The total mutation count includes any 
> changes to indexes that the transaction generates. Please reduce the number 
> of writes, or use fewer indexes. (Maximum number: 2)
> links {
>  description: "Cloud Spanner limits documentation."
>  url: "https://cloud.google.com/spanner/docs/limits"
> }
> at 
> com.google.cloud.spanner.SpannerExceptionFactory.newSpannerExceptionPreformatted(SpannerExceptionFactory.java:119)
>  at 
> com.google.cloud.spanner.SpannerExceptionFactory.newSpannerException(SpannerExceptionFactory.java:43)
>  at 
> com.google.cloud.spanner.SpannerExceptionFactory.newSpannerException(SpannerExceptionFactory.java:80)
>  at 
> com.google.cloud.spanner.spi.v1.GrpcSpannerRpc.get(GrpcSpannerRpc.java:404)
>  at 
> com.google.cloud.spanner.spi.v1.GrpcSpannerRpc.commit(GrpcSpannerRpc.java:376)
>  at 
> com.google.cloud.spanner.SpannerImpl$SessionImpl$2.call(SpannerImpl.java:729)
>  at 
> com.google.cloud.spanner.SpannerImpl$SessionImpl$2.call(SpannerImpl.java:726)
>  at com.google.cloud.spanner.SpannerImpl.runWithRetries(SpannerImpl.java:200)
>  at 
> com.google.cloud.spanner.SpannerImpl$SessionImpl.writeAtLeastOnce(SpannerImpl.java:725)
>  at 
> com.google.cloud.spanner.SessionPool$PooledSession.writeAtLeastOnce(SessionPool.java:248)
>  at 
> com.google.cloud.spanner.DatabaseClientImpl.writeAtLeastOnce(DatabaseClientImpl.java:37)
>  at 
> org.apache.beam.sdk.io.gcp.spanner.SpannerWriteGroupFn.flushBatch(SpannerWriteGroupFn.java:108)
>  at 
> org.apache.beam.sdk.io.gcp.spanner.SpannerWriteGroupFn.processElement(SpannerWriteGroupFn.java:79)
> {quote}
>  
> As a workaround, we can set "withBatchSizeBytes" to something much 
> smaller:
> {quote}mutations.apply("Write", SpannerIO
>    .write()
>    // Artificially reduce the max batch size b/c the batcher currently doesn't
>    // take into account the 2 mutation multiplicity limit
>    .withBatchSizeBytes(1024) // 1KB
>    .withProjectId("#PROJECTID#")
>    .withInstanceId("#INSTANCE#")
>    .withDatabaseId("#DATABASE#")
>  );
> {quote}
> While this is not as efficient, it at least allows the write to work 
> consistently.
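
The error above says that inserts and updates count once per affected column, so a byte-size cap alone does not bound the mutation count. A minimal sketch of batching by cell count instead is shown below. It is illustrative only and is not the SpannerWriteGroupFn implementation: the Write class is a hypothetical stand-in for a Spanner mutation, batchByCells and maxCells are assumed names, and index changes (which the error says also count) are not modelled.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative only: groups writes so each batch stays under a cell-count limit. */
public class MutationBatcher {

    /** Hypothetical stand-in for a Spanner mutation; not the real client class. */
    public static final class Write {
        final String table;
        final int columnCount; // key plus non-key columns touched by an insert/update

        public Write(String table, int columnCount) {
            this.table = table;
            this.columnCount = columnCount;
        }
    }

    /**
     * Splits writes into batches whose summed column count never exceeds maxCells.
     * A single write larger than maxCells ends up in a batch of its own.
     */
    public static List<List<Write>> batchByCells(List<Write> writes, long maxCells) {
        List<List<Write>> batches = new ArrayList<>();
        List<Write> current = new ArrayList<>();
        long cells = 0;
        for (Write w : writes) {
            // Flush the current batch before it would exceed the limit.
            if (!current.isEmpty() && cells + w.columnCount > maxCells) {
                batches.add(current);
                current = new ArrayList<>();
                cells = 0;
            }
            current.add(w);
            cells += w.columnCount;
        }
        if (!current.isEmpty()) {
            batches.add(current);
        }
        return batches;
    }
}
```

Each returned batch could then be committed in its own transaction. The value of maxCells would have to be chosen below the limit documented by Cloud Spanner, leaving headroom for index updates.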





[jira] [Assigned] (BEAM-4248) Upgrade Bigquery to com.google.cloud library

2018-05-07 Thread Chamikara Jayalath (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-4248?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chamikara Jayalath reassigned BEAM-4248:


Assignee: (was: Chamikara Jayalath)

> Upgrade Bigquery to com.google.cloud library
> 
>
> Key: BEAM-4248
> URL: https://issues.apache.org/jira/browse/BEAM-4248
> Project: Beam
>  Issue Type: New Feature
>  Components: io-java-gcp
>Reporter: Andrew Pilloud
>Priority: Major
>
> BigQuery is using the very old com.google.api.services client library. We 
> should upgrade to the com.google.cloud version, which includes new features 
> and enums for all the constants.





[jira] [Created] (BEAM-4244) Provide a better way for programmatically handling errors raised while encoding/decoding data

2018-05-06 Thread Chamikara Jayalath (JIRA)
Chamikara Jayalath created BEAM-4244:


 Summary: Provide a better way for programmatically handling errors 
raised while encoding/decoding data
 Key: BEAM-4244
 URL: https://issues.apache.org/jira/browse/BEAM-4244
 Project: Beam
  Issue Type: New Feature
  Components: beam-model, runner-core
Reporter: Chamikara Jayalath


Beam runners use coders in various stages of a pipeline to encode/decode data. 
Coders are executed directly by the runner of a pipeline, and users do not have 
control over exceptions raised during encoding/decoding (which could be due to 
either malformed/corrupted data provided by users or malformed/corrupted 
intermediate data generated during execution).

Currently, users can rely on runner-specific worker logging to detect the error 
and update the pipeline, but it would be better if we could provide a way to 
handle these errors programmatically.
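
One possible shape for such programmatic handling is sketched below: decode failures are collected next to the successfully decoded values instead of surfacing only in worker logs. This is a plain-Java illustration under stated assumptions, not a Beam API and not a proposal from this issue; SafeDecode, Result, decodeAll and decodeInt are hypothetical names, and the decoder is an ordinary function rather than a Beam Coder.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

/** Illustrative only: routes undecodable inputs to a separate list. */
public class SafeDecode {

    /** Holds decoded values and the raw inputs that failed to decode. */
    public static final class Result<T> {
        public final List<T> decoded = new ArrayList<>();
        public final List<byte[]> failed = new ArrayList<>();
    }

    /** Applies the decoder to every input, capturing failures instead of rethrowing. */
    public static <T> Result<T> decodeAll(List<byte[]> inputs, Function<byte[], T> decoder) {
        Result<T> result = new Result<>();
        for (byte[] raw : inputs) {
            try {
                result.decoded.add(decoder.apply(raw));
            } catch (RuntimeException e) {
                // Keep the raw bytes so the caller can inspect or reprocess them.
                result.failed.add(raw);
            }
        }
        return result;
    }

    /** Example decoder: parses UTF-8 bytes as a decimal integer. */
    public static Integer decodeInt(byte[] raw) {
        return Integer.parseInt(new String(raw, StandardCharsets.UTF_8));
    }
}
```

A caller would invoke SafeDecode.decodeAll(inputs, SafeDecode::decodeInt) and then decide what to do with result.failed, for example writing those records to a separate output.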





[jira] [Updated] (BEAM-3973) Allow to disable batch API in SpannerIO

2018-05-02 Thread Chamikara Jayalath (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3973?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chamikara Jayalath updated BEAM-3973:
-
Priority: Blocker  (was: Major)

> Allow to disable batch API in SpannerIO
> ---
>
> Key: BEAM-3973
> URL: https://issues.apache.org/jira/browse/BEAM-3973
> Project: Beam
>  Issue Type: Bug
>  Components: io-java-gcp
>Affects Versions: 2.4.0
>Reporter: Mairbek Khadikov
>Assignee: Mairbek Khadikov
>Priority: Blocker
> Fix For: 2.5.0
>
>  Time Spent: 3h 40m
>  Remaining Estimate: 0h
>
> In 2.4.0, SpannerIO#read was migrated to use the batch API. The batch API 
> provides abstractions to scale out reads from Spanner, but it requires the 
> query to be root-partitionable. Root-partitionable queries cover the majority 
> of use cases; however, there are cases where running an arbitrary query is 
> useful. For example, reading all the table names from 
> information_schema.* and then reading the content of those tables in the next 
> step.





[jira] [Commented] (BEAM-3973) Allow to disable batch API in SpannerIO

2018-05-02 Thread Chamikara Jayalath (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-3973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16461708#comment-16461708
 ] 

Chamikara Jayalath commented on BEAM-3973:
--

I think this is a 2.5.0 blocker. Mairbek, can you confirm?

> Allow to disable batch API in SpannerIO
> ---
>
> Key: BEAM-3973
> URL: https://issues.apache.org/jira/browse/BEAM-3973
> Project: Beam
>  Issue Type: Bug
>  Components: io-java-gcp
>Affects Versions: 2.4.0
>Reporter: Mairbek Khadikov
>Assignee: Mairbek Khadikov
>Priority: Major
> Fix For: 2.5.0
>
>  Time Spent: 3h 40m
>  Remaining Estimate: 0h
>
> In 2.4.0, SpannerIO#read was migrated to use the batch API. The batch API 
> provides abstractions to scale out reads from Spanner, but it requires the 
> query to be root-partitionable. Root-partitionable queries cover the majority 
> of use cases; however, there are cases where running an arbitrary query is 
> useful. For example, reading all the table names from 
> information_schema.* and then reading the content of those tables in the next 
> step.




