Re: Cannot create subscription because pipeline option 'project' not specified

2018-07-26 Thread Andrew Pilloud
Your second stack trace isn't going through SQL. It looks like you are using
the normal test path there. Have you tried setting the project in both places?

On Thu, Jul 26, 2018, 5:48 PM Rui Wang  wrote:

> Ah,  SET project = apache-beam-testing; gives the following exception:
>
> io.grpc.StatusRuntimeException: PERMISSION_DENIED: User not authorized to 
> perform this action.
>   at 
> io.grpc.stub.ClientCalls.toStatusRuntimeException(ClientCalls.java:222)
>   at io.grpc.stub.ClientCalls.getUnchecked(ClientCalls.java:203)
>   at io.grpc.stub.ClientCalls.blockingUnaryCall(ClientCalls.java:132)
>   at 
> com.google.pubsub.v1.PublisherGrpc$PublisherBlockingStub.createTopic(PublisherGrpc.java:666)
>   at 
> org.apache.beam.sdk.io.gcp.pubsub.PubsubGrpcClient.createTopic(PubsubGrpcClient.java:302)
>   at 
> org.apache.beam.sdk.io.gcp.pubsub.TestPubsub.initializePubsub(TestPubsub.java:102)
>   at 
> org.apache.beam.sdk.io.gcp.pubsub.TestPubsub.access$200(TestPubsub.java:42)
>   at 
> org.apache.beam.sdk.io.gcp.pubsub.TestPubsub$1.evaluate(TestPubsub.java:85)
>
>
> which should not happen because jenkins should already have credentials to
> access GCP.
>
> -Rui
>
> On Thu, Jul 26, 2018 at 3:30 PM Andrew Pilloud 
> wrote:
>
>> Beam SQL CLI does not accept beamTestPipelineOptions. Also, gradle is
>> invoking your test not the Beam SQL CLI. You'll need to set the options in
>> your integration test by executing 'SET project = ...' in the Beam SQL
>> connection you've launched for test.
>>
>> Andrew
>>
>> On Thu, Jul 26, 2018 at 3:06 PM Rui Wang  wrote:
>>
>>> The code path of reading pubsub through BeamSQL goes through this line
>>> of code:
>>>
>>> PubsubIO.Read read = 
>>> PubsubIO.readMessagesWithAttributes().fromTopic(getTopic());
>>>
>>> -Rui
>>>
>>> On Thu, Jul 26, 2018 at 2:58 PM Rui Wang  wrote:
>>>
 Hi Community,

 I am facing a runtime exception when I try to read from pubsub by Beam
 SQL in JUnit tests (PR: https://github.com/apache/beam/pull/6006). The
 exception is "Cannot create subscription because pipeline option 'project'
 not specified". Based on existing JUnit tests which also read from
 pubsub by Beam SQL, I added the following code to my build.gradle and
 it seems didn't work. Is there someone who could know what's wrong in my
 .gradle file?


 task endToEndTest(type: Test) {
   group = "Verification"
   def gcpProject = project.findProperty('gcpProject') ?: 
 'apache-beam-testing'
   def gcsTempRoot = project.findProperty('gcsTempRoot') ?: 
 'gs://temp-storage-for-end-to-end-tests/'

   // Disable Gradle cache (it should not be used because the IT's won't 
 run).
   outputs.upToDateWhen { false }

   def pipelineOptions = [
   "--project=${gcpProject}",
   "--tempLocation=${gcsTempRoot}",
   "--blockOnRun=false"]

   systemProperty "beamTestPipelineOptions", 
 JsonOutput.toJson(pipelineOptions)

   include '**/BeamSqlLineIT.class'
   classpath = 
 project(":beam-sdks-java-extensions-sql-jdbc").sourceSets.test.runtimeClasspath
   testClassesDirs = 
 files(project(":beam-sdks-java-extensions-sql-jdbc").sourceSets.test.output.classesDirs)
   useJUnit { }
 }

 task postCommit {
   group = "Verification"
   description = "Various integration tests"
   dependsOn endToEndTest
 }


 -Rui

>>>


Re: Cannot create subscription because pipeline option 'project' not specified

2018-07-26 Thread Rui Wang
Ah, SET project = apache-beam-testing; gives the following exception:

io.grpc.StatusRuntimeException: PERMISSION_DENIED: User not authorized to perform this action.
  at io.grpc.stub.ClientCalls.toStatusRuntimeException(ClientCalls.java:222)
  at io.grpc.stub.ClientCalls.getUnchecked(ClientCalls.java:203)
  at io.grpc.stub.ClientCalls.blockingUnaryCall(ClientCalls.java:132)
  at com.google.pubsub.v1.PublisherGrpc$PublisherBlockingStub.createTopic(PublisherGrpc.java:666)
  at org.apache.beam.sdk.io.gcp.pubsub.PubsubGrpcClient.createTopic(PubsubGrpcClient.java:302)
  at org.apache.beam.sdk.io.gcp.pubsub.TestPubsub.initializePubsub(TestPubsub.java:102)
  at org.apache.beam.sdk.io.gcp.pubsub.TestPubsub.access$200(TestPubsub.java:42)
  at org.apache.beam.sdk.io.gcp.pubsub.TestPubsub$1.evaluate(TestPubsub.java:85)


This should not happen, because Jenkins should already have credentials to
access GCP.

-Rui

On Thu, Jul 26, 2018 at 3:30 PM Andrew Pilloud  wrote:

> Beam SQL CLI does not accept beamTestPipelineOptions. Also, gradle is
> invoking your test not the Beam SQL CLI. You'll need to set the options in
> your integration test by executing 'SET project = ...' in the Beam SQL
> connection you've launched for test.
>
> Andrew
>
> On Thu, Jul 26, 2018 at 3:06 PM Rui Wang  wrote:
>
>> The code path of reading pubsub through BeamSQL goes through this line of
>> code:
>>
>> PubsubIO.Read read = 
>> PubsubIO.readMessagesWithAttributes().fromTopic(getTopic());
>>
>> -Rui
>>
>> On Thu, Jul 26, 2018 at 2:58 PM Rui Wang  wrote:
>>
>>> Hi Community,
>>>
>>> I am facing a runtime exception when I try to read from pubsub by Beam
>>> SQL in JUnit tests (PR: https://github.com/apache/beam/pull/6006). The
>>> exception is "Cannot create subscription because pipeline option 'project'
>>> not specified". Based on existing JUnit tests which also read from
>>> pubsub by Beam SQL, I added the following code to my build.gradle and
>>> it seems didn't work. Is there someone who could know what's wrong in my
>>> .gradle file?
>>>
>>>
>>> task endToEndTest(type: Test) {
>>>   group = "Verification"
>>>   def gcpProject = project.findProperty('gcpProject') ?: 
>>> 'apache-beam-testing'
>>>   def gcsTempRoot = project.findProperty('gcsTempRoot') ?: 
>>> 'gs://temp-storage-for-end-to-end-tests/'
>>>
>>>   // Disable Gradle cache (it should not be used because the IT's won't 
>>> run).
>>>   outputs.upToDateWhen { false }
>>>
>>>   def pipelineOptions = [
>>>   "--project=${gcpProject}",
>>>   "--tempLocation=${gcsTempRoot}",
>>>   "--blockOnRun=false"]
>>>
>>>   systemProperty "beamTestPipelineOptions", 
>>> JsonOutput.toJson(pipelineOptions)
>>>
>>>   include '**/BeamSqlLineIT.class'
>>>   classpath = 
>>> project(":beam-sdks-java-extensions-sql-jdbc").sourceSets.test.runtimeClasspath
>>>   testClassesDirs = 
>>> files(project(":beam-sdks-java-extensions-sql-jdbc").sourceSets.test.output.classesDirs)
>>>   useJUnit { }
>>> }
>>>
>>> task postCommit {
>>>   group = "Verification"
>>>   description = "Various integration tests"
>>>   dependsOn endToEndTest
>>> }
>>>
>>>
>>> -Rui
>>>
>>


Re: Cannot create subscription because pipeline option 'project' not specified

2018-07-26 Thread Andrew Pilloud
The Beam SQL CLI does not accept beamTestPipelineOptions. Also, Gradle is
invoking your test, not the Beam SQL CLI. You'll need to set the options in
your integration test by executing 'SET project = ...' in the Beam SQL
connection you've launched for the test.

Andrew
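
A minimal sketch of the suggestion above, assuming the integration test already holds an open java.sql.Connection to Beam SQL. The class and method names (SqlPipelineOptions, setStatement, apply) are hypothetical, and the unquoted value simply mirrors the 'SET project = ...' form used in this thread; whether a particular value needs quoting is not covered here.

```java
import java.sql.Connection;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.Map;

// Hypothetical helper, not part of Beam: sets pipeline options through SQL
// instead of through the beamTestPipelineOptions system property.
final class SqlPipelineOptions {

  /** Builds one statement of the form "SET name = value". */
  static String setStatement(String name, String value) {
    return "SET " + name + " = " + value;
  }

  /** Executes one SET statement per option on an already-open connection. */
  static void apply(Connection connection, Map<String, String> options) throws SQLException {
    try (Statement statement = connection.createStatement()) {
      for (Map.Entry<String, String> option : options.entrySet()) {
        statement.execute(setStatement(option.getKey(), option.getValue()));
      }
    }
  }
}
```

In a test, apply(...) would be called once, right after the connection is opened and before any query that reads from Pub/Sub, so that 'project' is set on the same connection that builds the pipeline.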

On Thu, Jul 26, 2018 at 3:06 PM Rui Wang  wrote:

> The code path of reading pubsub through BeamSQL goes through this line of
> code:
>
> PubsubIO.Read read = 
> PubsubIO.readMessagesWithAttributes().fromTopic(getTopic());
>
> -Rui
>
> On Thu, Jul 26, 2018 at 2:58 PM Rui Wang  wrote:
>
>> Hi Community,
>>
>> I am facing a runtime exception when I try to read from pubsub by Beam
>> SQL in JUnit tests (PR: https://github.com/apache/beam/pull/6006). The
>> exception is "Cannot create subscription because pipeline option 'project'
>> not specified". Based on existing JUnit tests which also read from
>> pubsub by Beam SQL, I added the following code to my build.gradle and it
>> seems didn't work. Is there someone who could know what's wrong in my
>> .gradle file?
>>
>>
>> task endToEndTest(type: Test) {
>>   group = "Verification"
>>   def gcpProject = project.findProperty('gcpProject') ?: 
>> 'apache-beam-testing'
>>   def gcsTempRoot = project.findProperty('gcsTempRoot') ?: 
>> 'gs://temp-storage-for-end-to-end-tests/'
>>
>>   // Disable Gradle cache (it should not be used because the IT's won't run).
>>   outputs.upToDateWhen { false }
>>
>>   def pipelineOptions = [
>>   "--project=${gcpProject}",
>>   "--tempLocation=${gcsTempRoot}",
>>   "--blockOnRun=false"]
>>
>>   systemProperty "beamTestPipelineOptions", 
>> JsonOutput.toJson(pipelineOptions)
>>
>>   include '**/BeamSqlLineIT.class'
>>   classpath = 
>> project(":beam-sdks-java-extensions-sql-jdbc").sourceSets.test.runtimeClasspath
>>   testClassesDirs = 
>> files(project(":beam-sdks-java-extensions-sql-jdbc").sourceSets.test.output.classesDirs)
>>   useJUnit { }
>> }
>>
>> task postCommit {
>>   group = "Verification"
>>   description = "Various integration tests"
>>   dependsOn endToEndTest
>> }
>>
>>
>> -Rui
>>
>


Re: Cannot create subscription because pipeline option 'project' not specified

2018-07-26 Thread Rui Wang
The code path for reading Pub/Sub through Beam SQL goes through this line of
code:

PubsubIO.Read read =
    PubsubIO.readMessagesWithAttributes().fromTopic(getTopic());

-Rui

On Thu, Jul 26, 2018 at 2:58 PM Rui Wang  wrote:

> Hi Community,
>
> I am facing a runtime exception when I try to read from pubsub by Beam SQL
> in JUnit tests (PR: https://github.com/apache/beam/pull/6006). The
> exception is "Cannot create subscription because pipeline option 'project'
> not specified". Based on existing JUnit tests which also read from pubsub
> by Beam SQL, I added the following code to my build.gradle and it seems
> didn't work. Is there someone who could know what's wrong in my .gradle
> file?
>
>
> task endToEndTest(type: Test) {
>   group = "Verification"
>   def gcpProject = project.findProperty('gcpProject') ?: 'apache-beam-testing'
>   def gcsTempRoot = project.findProperty('gcsTempRoot') ?: 
> 'gs://temp-storage-for-end-to-end-tests/'
>
>   // Disable Gradle cache (it should not be used because the IT's won't run).
>   outputs.upToDateWhen { false }
>
>   def pipelineOptions = [
>   "--project=${gcpProject}",
>   "--tempLocation=${gcsTempRoot}",
>   "--blockOnRun=false"]
>
>   systemProperty "beamTestPipelineOptions", JsonOutput.toJson(pipelineOptions)
>
>   include '**/BeamSqlLineIT.class'
>   classpath = 
> project(":beam-sdks-java-extensions-sql-jdbc").sourceSets.test.runtimeClasspath
>   testClassesDirs = 
> files(project(":beam-sdks-java-extensions-sql-jdbc").sourceSets.test.output.classesDirs)
>   useJUnit { }
> }
>
> task postCommit {
>   group = "Verification"
>   description = "Various integration tests"
>   dependsOn endToEndTest
> }
>
>
> -Rui
>


Cannot create subscription because pipeline option 'project' not specified

2018-07-26 Thread Rui Wang
Hi Community,

I am getting a runtime exception when I try to read from Pub/Sub through
Beam SQL in JUnit tests (PR: https://github.com/apache/beam/pull/6006). The
exception is "Cannot create subscription because pipeline option 'project'
not specified". Based on existing JUnit tests that also read from Pub/Sub
through Beam SQL, I added the following code to my build.gradle, but it
doesn't seem to work. Does anyone know what's wrong in my .gradle file?


task endToEndTest(type: Test) {
  group = "Verification"
  def gcpProject = project.findProperty('gcpProject') ?: 'apache-beam-testing'
  def gcsTempRoot = project.findProperty('gcsTempRoot') ?: 'gs://temp-storage-for-end-to-end-tests/'

  // Disable Gradle cache (it should not be used because the ITs won't run).
  outputs.upToDateWhen { false }

  def pipelineOptions = [
    "--project=${gcpProject}",
    "--tempLocation=${gcsTempRoot}",
    "--blockOnRun=false"]

  systemProperty "beamTestPipelineOptions", JsonOutput.toJson(pipelineOptions)

  include '**/BeamSqlLineIT.class'
  classpath = project(":beam-sdks-java-extensions-sql-jdbc").sourceSets.test.runtimeClasspath
  testClassesDirs = files(project(":beam-sdks-java-extensions-sql-jdbc").sourceSets.test.output.classesDirs)
  useJUnit { }
}

task postCommit {
  group = "Verification"
  description = "Various integration tests"
  dependsOn endToEndTest
}


-Rui


Re: [PROPOSAL] Prepare Beam 2.6.0 release

2018-07-26 Thread Pablo Estrada
Hello everyone,
I wanted to give an update on the state of the release, as there hasn't been
news on this for a while.
We have found a few issues that broke postcommits a few weeks back, but we
hadn't noticed. Some people are tackling these to try to stabilize the
release branch[1].

In the meantime, the release has been blocked, but Boyuan Zhang has taken
advantage of this to code up a few scripts to try and automate release
steps. (Thanks Boyuan!). We will try these as soon as the release is
unblocked.

Best
-P.

[1] https://github.com/apache/beam/pull/6072

On Wed, Jul 18, 2018 at 11:03 AM Pablo Estrada  wrote:

> Hello all!
> I've cut the release branch (release-2.6.0), with some help from Ahmet and
> Boyuan. From now on, please cherry-pick 2.6.0 blockers into the branch.
> Now we start stabilizing it.
>
> Thanks!
>
> -P.
>
> On Tue, Jul 17, 2018 at 9:34 PM Jean-Baptiste Onofré 
> wrote:
>
>> Hi Pablo,
>>
>> I'm investigating this issue, but it's a little long process.
>>
>> So, I propose you start with the release process,  cutting the branch,
>> and then, I will create a cherry-pick PR for this one.
>>
>> Regards
>> JB
>>
>> On 17/07/2018 20:19, Pablo Estrada wrote:
>> > Checking once more:
>> > What does the community think we should do
>> > about https://issues.apache.org/jira/browse/BEAM-4750 ? Should I bump it
>> > to 2.7.0?
>> > Best
>> > -P.
>> >
>> > On Fri, Jul 13, 2018 at 5:15 PM Ahmet Altay wrote:
>> >
>> > Update:  https://issues.apache.org/jira/browse/BEAM-4784 is not a
>> > release blocker, details in the JIRA issue.
>> >
>> > On Fri, Jul 13, 2018 at 11:12 AM, Thomas Weise wrote:
>> >
>> > Can one of our Python experts please take a look
>> > at https://issues.apache.org/jira/browse/BEAM-4784 and advise if
>> > this should be addressed for the release?
>> >
>> > Thanks,
>> > Thomas
>> >
>> >
>> > On Fri, Jul 13, 2018 at 11:02 AM Ahmet Altay wrote:
>> >
>> >
>> >
>> > On Fri, Jul 13, 2018 at 10:48 AM, Pablo Estrada wrote:
>> >
>> > Hi all,
>> > I've triaged most issues marked for 2.6.0 release. I've
>> > localized two that need a decision / attention:
>> >
>> > - https://issues.apache.org/jira/browse/BEAM-4417 -
>> > Bigquery IO Numeric Datatype Support. Cham is not
>> > available to fix this at the moment, but this is a
>> > critical issue. Is anyone able to tackle this / should
>> > we bump this to next release?
>> >
>> >
>> > I bumped this to the next release. I think Cham will be the
>> > best person to address it when he is back. And with the
>> > regular release cadence, it would not be delayed by much.
>> >
>> >
>> >
>> > - https://issues.apache.org/jira/browse/BEAM-4750 -
>> > Performance degradation due to some safeguards in
>> > beam-sdks-java-core. JB, are you looking to fix this?
>> > Should we bump? I had the impression that it was an easy
>> > fix, but I'm not sure.
>> >
>> > If you're aware of any other issue that needs to be
>> > included as a release blocker, please report it to me.
>> > Best
>> > -P.
>> >
>> > On Thu, Jul 12, 2018 at 2:15 AM Etienne Chauchot wrote:
>> >
>> > +1,
>> >
>> > Thanks for volunteering Pablo, thanks also to have
>> > caught tickets that I forgot to close :)
>> >
>> > Etienne
>> >
>> > On Wednesday, July 11, 2018 at 12:55 -0700, Alan Myrvold wrote:
>> >> +1 Thanks for volunteering, Pablo
>> >>
>> >> On Wed, Jul 11, 2018 at 11:49 AM Jason Kuster wrote:
>> >>> +1 sounds great
>> >>>
>> >>> On Wed, Jul 11, 2018 at 11:06 AM Thomas Weise wrote:
>>  +1
>> 
>>  Thanks for volunteering, Pablo!
>> 
>>  On Mon, Jul 9, 2018 at 9:56 PM Jean-Baptiste Onofré wrote:
>> > +1
>> >
>> > I planned to send the proposal as well ;)
>> >
>> > Regards
>> > JB
>> >
>> > On 09/07/2018 23:16, Pablo Estrada wrote:
>> > > Hello 

Re: FileBasedSink.WriteOperation copy instead of move?

2018-07-26 Thread Chamikara Jayalath
Yeah, please file a JIRA.

- Cham

On Thu, Jul 26, 2018 at 11:33 AM Jozef Vilcek  wrote:

> Yes, rename can be tricky with cross-directory. This is related
> https://issues.apache.org/jira/browse/BEAM-4861
> I guess I can file a JIRA for this, right?
>
> On Thu, Jul 26, 2018 at 7:31 PM Chamikara Jayalath 
> wrote:
>
>> Also, we'll have to use StandardMoveOptions.IGNORE_MISSING_FILES for
>> supporting failures of the rename step. I think this is a good change to do
>> if the change significantly improves the performance of some of the
>> FileSystems (note that some FileSystems, for example GCS, implement rename
>> in the form of a copy+delete, so there will be no significant performance
>> improvements for such FileSystems).
>>
>> -Cham
>>
>> On Thu, Jul 26, 2018 at 10:14 AM Reuven Lax  wrote:
>>
>>> We might be able to replace this with Filesystem.rename(). One thing to
>>> keep in mind - the destination files might be in a different directory, so
>>> we would need to make sure that all Filesystems support cross-directory
>>> rename.
>>>
>>> On Thu, Jul 26, 2018 at 9:58 AM Lukasz Cwik  wrote:
>>>
 +dev

 On Thu, Jul 26, 2018 at 2:40 AM Jozef Vilcek 
 wrote:

> Hello,
>
> just came across FileBasedSink.WriteOperation class which does have
> moveToOutput() method. Implementation does a Filesystem.copy() instead of
> "move". With large files I find it quite inefficient if the underlying FS
> supports more efficient ways, so I wonder what is the story behind it? Must
> it be a copy?
>
>
> https://github.com/apache/beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/io/FileBasedSink.java#L761
>



Re: FileBasedSink.WriteOperation copy instead of move?

2018-07-26 Thread Jozef Vilcek
Yes, cross-directory rename can be tricky. This is related to
https://issues.apache.org/jira/browse/BEAM-4861
I guess I can file a JIRA for this, right?

On Thu, Jul 26, 2018 at 7:31 PM Chamikara Jayalath 
wrote:

> Also, we'll have to use StandardMoveOptions.IGNORE_MISSING_FILES for
> supporting failures of the rename step. I think this is a good change to do
> if the change significantly improves the performance of some of the
> FileSystems (note that some FileSystems, for example GCS, implement rename
> in the form of a copy+delete, so there will be no significant performance
> improvements for such FileSystems).
>
> -Cham
>
> On Thu, Jul 26, 2018 at 10:14 AM Reuven Lax  wrote:
>
>> We might be able to replace this with Filesystem.rename(). One thing to
>> keep in mind - the destination files might be in a different directory, so
>> we would need to make sure that all Filesystems support cross-directory
>> rename.
>>
>> On Thu, Jul 26, 2018 at 9:58 AM Lukasz Cwik  wrote:
>>
>>> +dev
>>>
>>> On Thu, Jul 26, 2018 at 2:40 AM Jozef Vilcek 
>>> wrote:
>>>
 Hello,

 just came across FileBasedSink.WriteOperation class which does have
 moveToOutput() method. Implementation does a Filesystem.copy() instead of
 "move". With large files I find it quite inefficient if the underlying FS
 supports more efficient ways, so I wonder what is the story behind it? Must
 it be a copy?


 https://github.com/apache/beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/io/FileBasedSink.java#L761

>>>


Re: FileBasedSink.WriteOperation copy instead of move?

2018-07-26 Thread Chamikara Jayalath
Also, we'll have to use StandardMoveOptions.IGNORE_MISSING_FILES to
support failures (and retries) of the rename step. I think this is a good
change to make if it significantly improves the performance of some of the
FileSystems (note that some FileSystems, for example GCS, implement rename
in the form of a copy+delete, so there will be no significant performance
improvements for such FileSystems).

-Cham
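
To illustrate the trade-off discussed above, here is a rough sketch using only java.nio.file rather than Beam's FileSystems API. MoveSketch and moveToOutput are hypothetical names, and the ignoreMissing flag is only an analogue of StandardMoveOptions.IGNORE_MISSING_FILES: it lets a retried step skip a source that an earlier attempt already moved. The fallback branch shows the copy+delete form of a rename.

```java
import java.io.IOException;
import java.nio.file.AtomicMoveNotSupportedException;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical local-filesystem analogue of a "move" finalize step.
final class MoveSketch {

  /**
   * Moves src to dest, which may be in a different directory.
   * Returns false when src is missing and ignoreMissing is set.
   */
  static boolean moveToOutput(Path src, Path dest, boolean ignoreMissing) throws IOException {
    if (Files.notExists(src)) {
      if (ignoreMissing) {
        return false; // An earlier attempt probably moved it already.
      }
      throw new NoSuchFileException(src.toString());
    }
    Path parent = dest.getParent();
    if (parent != null) {
      Files.createDirectories(parent);
    }
    try {
      // Cheap when the filesystem can rename in place.
      Files.move(src, dest, StandardCopyOption.ATOMIC_MOVE);
    } catch (AtomicMoveNotSupportedException e) {
      // Rename implemented as copy+delete: no faster than a plain copy.
      Files.copy(src, dest, StandardCopyOption.REPLACE_EXISTING);
      Files.delete(src);
    }
    return true;
  }
}
```

Calling moveToOutput twice with ignoreMissing set moves the file on the first call and returns false on the second, which is the behaviour a retried finalize step would rely on.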

On Thu, Jul 26, 2018 at 10:14 AM Reuven Lax  wrote:

> We might be able to replace this with Filesystem.rename(). One thing to
> keep in mind - the destination files might be in a different directory, so
> we would need to make sure that all Filesystems support cross-directory
> rename.
>
> On Thu, Jul 26, 2018 at 9:58 AM Lukasz Cwik  wrote:
>
>> +dev
>>
>> On Thu, Jul 26, 2018 at 2:40 AM Jozef Vilcek 
>> wrote:
>>
>>> Hello,
>>>
>>> just came across FileBasedSink.WriteOperation class which does have
>>> moveToOutput() method. Implementation does a Filesystem.copy() instead of
>>> "move". With large files I find it quite inefficient if the underlying FS
>>> supports more efficient ways, so I wonder what is the story behind it? Must
>>> it be a copy?
>>>
>>>
>>> https://github.com/apache/beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/io/FileBasedSink.java#L761
>>>
>>


Re: FileBasedSink.WriteOperation copy instead of move?

2018-07-26 Thread Lukasz Cwik
+dev

On Thu, Jul 26, 2018 at 2:40 AM Jozef Vilcek  wrote:

> Hello,
>
> just came across FileBasedSink.WriteOperation class which does have
> moveToOutput() method. Implementation does a Filesystem.copy() instead of
> "move". With large files I find it quite inefficient if the underlying FS
> supports more efficient ways, so I wonder what is the story behind it? Must
> it be a copy?
>
>
> https://github.com/apache/beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/io/FileBasedSink.java#L761
>