Build failed in Jenkins: beam_PostRelease_NightlySnapshot #136

2018-03-15 Thread Apache Jenkins Server
See 


Changes:

[yifanzou] BEAM-3339 Mobile gaming automation for Java nightly snapshot

--
[...truncated 3.34 MB...]
  "code" : 404,
  "errors" : [ {
"domain" : "global",
"message" : "Not found: Table 
apache-beam-testing:beam_postrelease_mobile_gaming.leaderboard_DataflowRunner_team",
"reason" : "notFound"
  } ],
  "message" : "Not found: Table 
apache-beam-testing:beam_postrelease_mobile_gaming.leaderboard_DataflowRunner_team",
  "status" : "NOT_FOUND"
}

com.google.api.client.googleapis.json.GoogleJsonResponseException.from(GoogleJsonResponseException.java:146)

com.google.api.client.googleapis.services.json.AbstractGoogleJsonClientRequest.newExceptionOnError(AbstractGoogleJsonClientRequest.java:113)

com.google.api.client.googleapis.services.json.AbstractGoogleJsonClientRequest.newExceptionOnError(AbstractGoogleJsonClientRequest.java:40)

com.google.api.client.googleapis.services.AbstractGoogleClientRequest$1.interceptResponse(AbstractGoogleClientRequest.java:321)
com.google.api.client.http.HttpRequest.execute(HttpRequest.java:1065)

com.google.api.client.googleapis.services.AbstractGoogleClientRequest.executeUnparsed(AbstractGoogleClientRequest.java:419)

com.google.api.client.googleapis.services.AbstractGoogleClientRequest.executeUnparsed(AbstractGoogleClientRequest.java:352)

com.google.api.client.googleapis.services.AbstractGoogleClientRequest.execute(AbstractGoogleClientRequest.java:469)

org.apache.beam.sdk.io.gcp.bigquery.BigQueryServicesImpl$DatasetServiceImpl.lambda$insertAll$0(BigQueryServicesImpl.java:714)
java.util.concurrent.FutureTask.run(FutureTask.java:266)

java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)

java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
java.lang.Thread.run(Thread.java:745)
java.lang.RuntimeException: 
com.google.api.client.googleapis.json.GoogleJsonResponseException: 404 Not Found
{
  "code" : 404,
  "errors" : [ {
"domain" : "global",
"message" : "Not found: Table 
apache-beam-testing:beam_postrelease_mobile_gaming.leaderboard_DataflowRunner_team",
"reason" : "notFound"
  } ],
  "message" : "Not found: Table 
apache-beam-testing:beam_postrelease_mobile_gaming.leaderboard_DataflowRunner_team",
  "status" : "NOT_FOUND"
}

org.apache.beam.sdk.io.gcp.bigquery.BigQueryServicesImpl$DatasetServiceImpl.insertAll(BigQueryServicesImpl.java:767)

org.apache.beam.sdk.io.gcp.bigquery.BigQueryServicesImpl$DatasetServiceImpl.insertAll(BigQueryServicesImpl.java:802)

org.apache.beam.sdk.io.gcp.bigquery.StreamingWriteFn.flushRows(StreamingWriteFn.java:121)

org.apache.beam.sdk.io.gcp.bigquery.StreamingWriteFn.finishBundle(StreamingWriteFn.java:96)
Caused by: com.google.api.client.googleapis.json.GoogleJsonResponseException: 
404 Not Found
{
  "code" : 404,
  "errors" : [ {
"domain" : "global",
"message" : "Not found: Table 
apache-beam-testing:beam_postrelease_mobile_gaming.leaderboard_DataflowRunner_team",
"reason" : "notFound"
  } ],
  "message" : "Not found: Table 
apache-beam-testing:beam_postrelease_mobile_gaming.leaderboard_DataflowRunner_team",
  "status" : "NOT_FOUND"
}

com.google.api.client.googleapis.json.GoogleJsonResponseException.from(GoogleJsonResponseException.java:146)

com.google.api.client.googleapis.services.json.AbstractGoogleJsonClientRequest.newExceptionOnError(AbstractGoogleJsonClientRequest.java:113)

com.google.api.client.googleapis.services.json.AbstractGoogleJsonClientRequest.newExceptionOnError(AbstractGoogleJsonClientRequest.java:40)

com.google.api.client.googleapis.services.AbstractGoogleClientRequest$1.interceptResponse(AbstractGoogleClientRequest.java:321)
com.google.api.client.http.HttpRequest.execute(HttpRequest.java:1065)

com.google.api.client.googleapis.services.AbstractGoogleClientRequest.executeUnparsed(AbstractGoogleClientRequest.java:419)

com.google.api.client.googleapis.services.AbstractGoogleClientRequest.executeUnparsed(AbstractGoogleClientRequest.java:352)

com.google.api.client.googleapis.services.AbstractGoogleClientRequest.execute(AbstractGoogleClientRequest.java:469)

org.apache.beam.sdk.io.gcp.bigquery.BigQueryServicesImpl$DatasetServiceImpl.lambda$insertAll$0(BigQueryServicesImpl.java:714)
java.util.concurrent.FutureTask.run(FutureTask.java:266)

java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)

java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
java.lang.Thread.run(Thread.java:745)
java.lang.RuntimeException: 
com.google.api.client.googleapis.json.GoogleJsonResponseException: 404 Not Found
{
  "code" 

[jira] [Work logged] (BEAM-3339) Create post-release testing of the nightly snapshots

2018-03-15 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3339?focusedWorklogId=81081=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-81081
 ]

ASF GitHub Bot logged work on BEAM-3339:


Author: ASF GitHub Bot
Created on: 16/Mar/18 05:31
Start Date: 16/Mar/18 05:31
Worklog Time Spent: 10m 
  Work Description: yifanzou commented on issue #4788: [BEAM-3339] Mobile 
gaming automation for Java nightly snapshot on core runners
URL: https://github.com/apache/beam/pull/4788#issuecomment-373608021
 
 
   Run Dataflow PostRelease


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 81081)
Time Spent: 66h 10m  (was: 66h)

> Create post-release testing of the nightly snapshots
> 
>
> Key: BEAM-3339
> URL: https://issues.apache.org/jira/browse/BEAM-3339
> Project: Beam
>  Issue Type: Improvement
>  Components: testing
>Reporter: Alan Myrvold
>Assignee: Jason Kuster
>Priority: Major
>  Time Spent: 66h 10m
>  Remaining Estimate: 0h
>
> The nightly java snapshots in 
> https://repository.apache.org/content/groups/snapshots/org/apache/beam should 
> be verified by following the 
> https://beam.apache.org/get-started/quickstart-java/ instructions, to verify 
> that the release is usable.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Jenkins build is back to normal : beam_PostRelease_NightlySnapshot #135

2018-03-15 Thread Apache Jenkins Server
See 




[jira] [Work logged] (BEAM-3571) Validate Go SDK encodes values the same as other SDKs

2018-03-15 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3571?focusedWorklogId=81069=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-81069
 ]

ASF GitHub Bot logged work on BEAM-3571:


Author: ASF GitHub Bot
Created on: 16/Mar/18 05:03
Start Date: 16/Mar/18 05:03
Worklog Time Spent: 10m 
  Work Description: herohde commented on issue #4843:  [BEAM-3571] Correct 
the Go SDK's EventTime encoding
URL: https://github.com/apache/beam/pull/4843#issuecomment-373604403
 
 
   retest this please




Issue Time Tracking
---

Worklog Id: (was: 81069)
Time Spent: 2h  (was: 1h 50m)

> Validate Go SDK encodes values the same as other SDKs
> -
>
> Key: BEAM-3571
> URL: https://issues.apache.org/jira/browse/BEAM-3571
> Project: Beam
>  Issue Type: Task
>  Components: sdk-go
>Reporter: Robert Burke
>Priority: Major
>  Time Spent: 2h
>  Remaining Estimate: 0h
>
> EventTime, windows, length prefixing, etc. encodings and decodings should be 
> validated to match the other SDKs. This issue is to sanity check for and 
> resolve these differences when they are found.





[jira] [Work logged] (BEAM-3339) Create post-release testing of the nightly snapshots

2018-03-15 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3339?focusedWorklogId=81068=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-81068
 ]

ASF GitHub Bot logged work on BEAM-3339:


Author: ASF GitHub Bot
Created on: 16/Mar/18 05:01
Start Date: 16/Mar/18 05:01
Worklog Time Spent: 10m 
  Work Description: yifanzou commented on issue #4788: [BEAM-3339] Mobile 
gaming automation for Java nightly snapshot on core runners
URL: https://github.com/apache/beam/pull/4788#issuecomment-373604097
 
 
   Run Dataflow PostRelease




Issue Time Tracking
---

Worklog Id: (was: 81068)
Time Spent: 66h  (was: 65h 50m)

> Create post-release testing of the nightly snapshots
> 
>
> Key: BEAM-3339
> URL: https://issues.apache.org/jira/browse/BEAM-3339
> Project: Beam
>  Issue Type: Improvement
>  Components: testing
>Reporter: Alan Myrvold
>Assignee: Jason Kuster
>Priority: Major
>  Time Spent: 66h
>  Remaining Estimate: 0h
>
> The nightly java snapshots in 
> https://repository.apache.org/content/groups/snapshots/org/apache/beam should 
> be verified by following the 
> https://beam.apache.org/get-started/quickstart-java/ instructions, to verify 
> that the release is usable.





Build failed in Jenkins: beam_PostRelease_NightlySnapshot #134

2018-03-15 Thread Apache Jenkins Server
See 


Changes:

[yifanzou] BEAM-3339 Mobile gaming automation for Java nightly snapshot

--
[...truncated 1023.55 KB...]

   
Waiting on bqjob_r66cfd65d6b8fdcdb_01622d21afe6_1 ... (31s) Current status: 
PENDING

   
Waiting on bqjob_r66cfd65d6b8fdcdb_01622d21afe6_1 ... (32s) Current status: 
PENDING

   
Waiting on bqjob_r66cfd65d6b8fdcdb_01622d21afe6_1 ... (33s) Current status: 
PENDING

   
Waiting on bqjob_r66cfd65d6b8fdcdb_01622d21afe6_1 ... (34s) Current status: 
PENDING

   
Waiting on bqjob_r66cfd65d6b8fdcdb_01622d21afe6_1 ... (35s) Current status: 
PENDING

   
Waiting on bqjob_r66cfd65d6b8fdcdb_01622d21afe6_1 ... (36s) Current status: 
PENDING

   
Waiting on bqjob_r66cfd65d6b8fdcdb_01622d21afe6_1 ... (37s) Current status: 
PENDING

   
Waiting on bqjob_r66cfd65d6b8fdcdb_01622d21afe6_1 ... (38s) Current status: 
PENDING

   
Waiting on bqjob_r66cfd65d6b8fdcdb_01622d21afe6_1 ... (39s) Current status: 
PENDING

   
Waiting on bqjob_r66cfd65d6b8fdcdb_01622d21afe6_1 ... (40s) Current status: 
PENDING

   
Waiting on bqjob_r66cfd65d6b8fdcdb_01622d21afe6_1 ... (41s) Current status: 
PENDING

   
Waiting on bqjob_r66cfd65d6b8fdcdb_01622d21afe6_1 ... (42s) Current status: 
PENDING

   
Waiting on bqjob_r66cfd65d6b8fdcdb_01622d21afe6_1 ... (43s) Current status: 
PENDING

   
Waiting on bqjob_r66cfd65d6b8fdcdb_01622d21afe6_1 ... (44s) Current status: 
PENDING

   
Waiting on bqjob_r66cfd65d6b8fdcdb_01622d21afe6_1 ... (45s) Current status: 
PENDING

   
Waiting on bqjob_r66cfd65d6b8fdcdb_01622d21afe6_1 ... (46s) Current status: 
PENDING

   
Waiting on bqjob_r66cfd65d6b8fdcdb_01622d21afe6_1 ... (47s) Current status: 
PENDING

   
Waiting on bqjob_r66cfd65d6b8fdcdb_01622d21afe6_1 ... (48s) Current status: 
PENDING

   
..DELAY(339961,
 1)
{timestamp_ms=1521175315000}
late data for: user4_BananaNumbat,BananaNumbat,10,1521175315000,2018-03-15 
21:47:35.906
Waiting on bqjob_r66cfd65d6b8fdcdb_01622d21afe6_1 ... (49s) Current status: 
PENDING

   
Waiting on bqjob_r66cfd65d6b8fdcdb_01622d21afe6_1 ... (50s) Current status: 
PENDING

   
Waiting on bqjob_r66cfd65d6b8fdcdb_01622d21afe6_1 ... (51s) Current status: 
PENDING

   
Waiting on bqjob_r66cfd65d6b8fdcdb_01622d21afe6_1 ... (52s) Current status: 
PENDING

   
Waiting on bqjob_r66cfd65d6b8fdcdb_01622d21afe6_1 ... (53s) Current status: 
PENDING

   
Waiting on bqjob_r66cfd65d6b8fdcdb_01622d21afe6_1 ... (54s) Current status: 
PENDING

   
Waiting on bqjob_r66cfd65d6b8fdcdb_01622d21afe6_1 ... (55s) Current status: 
PENDING
   

Build failed in Jenkins: beam_PostRelease_NightlySnapshot #133

2018-03-15 Thread Apache Jenkins Server
See 


--
[...truncated 219.32 KB...]
Waiting on bqjob_r54109d94b4df4eba_01622cfc9fb1_1 ... (20s) Current status: 
PENDING

   
Waiting on bqjob_r54109d94b4df4eba_01622cfc9fb1_1 ... (21s) Current status: 
PENDING

   
Waiting on bqjob_r54109d94b4df4eba_01622cfc9fb1_1 ... (22s) Current status: 
PENDING

   
Waiting on bqjob_r54109d94b4df4eba_01622cfc9fb1_1 ... (23s) Current status: 
PENDING

   
Waiting on bqjob_r54109d94b4df4eba_01622cfc9fb1_1 ... (24s) Current status: 
PENDING

   
Waiting on bqjob_r54109d94b4df4eba_01622cfc9fb1_1 ... (25s) Current status: 
PENDING

   
Waiting on bqjob_r54109d94b4df4eba_01622cfc9fb1_1 ... (26s) Current status: 
PENDING

   
Waiting on bqjob_r54109d94b4df4eba_01622cfc9fb1_1 ... (27s) Current status: 
PENDING

   
Waiting on bqjob_r54109d94b4df4eba_01622cfc9fb1_1 ... (28s) Current status: 
PENDING

   
Waiting on bqjob_r54109d94b4df4eba_01622cfc9fb1_1 ... (29s) Current status: 
PENDING

   
Waiting on bqjob_r54109d94b4df4eba_01622cfc9fb1_1 ... (30s) Current status: 
PENDING

   
Waiting on bqjob_r54109d94b4df4eba_01622cfc9fb1_1 ... (31s) Current status: 
PENDING

   
Waiting on bqjob_r54109d94b4df4eba_01622cfc9fb1_1 ... (32s) Current status: 
PENDING

   
Waiting on bqjob_r54109d94b4df4eba_01622cfc9fb1_1 ... (33s) Current status: 
PENDING

   
Waiting on bqjob_r54109d94b4df4eba_01622cfc9fb1_1 ... (34s) Current status: 
PENDING

   
Waiting on bqjob_r54109d94b4df4eba_01622cfc9fb1_1 ... (35s) Current status: 
PENDING

   
Waiting on bqjob_r54109d94b4df4eba_01622cfc9fb1_1 ... (36s) Current status: 
PENDING

   
Waiting on bqjob_r54109d94b4df4eba_01622cfc9fb1_1 ... (37s) Current status: 
PENDING

   
Waiting on bqjob_r54109d94b4df4eba_01622cfc9fb1_1 ... (38s) Current status: 
PENDING

   
Waiting on bqjob_r54109d94b4df4eba_01622cfc9fb1_1 ... (39s) Current status: 
PENDING

   
Waiting on bqjob_r54109d94b4df4eba_01622cfc9fb1_1 ... (40s) Current status: 
PENDING

   
Waiting on bqjob_r54109d94b4df4eba_01622cfc9fb1_1 ... (41s) Current status: 
PENDING

   
Waiting on bqjob_r54109d94b4df4eba_01622cfc9fb1_1 ... (42s) Current status: 
PENDING

   
Waiting on bqjob_r54109d94b4df4eba_01622cfc9fb1_1 ... (43s) Current status: 
PENDING

   
Waiting on bqjob_r54109d94b4df4eba_01622cfc9fb1_1 ... (44s) Current status: 
PENDING

   
Waiting on bqjob_r54109d94b4df4eba_01622cfc9fb1_1 ... (45s) Current status: 
PENDING

   
Waiting on bqjob_r54109d94b4df4eba_01622cfc9fb1_1 ... (46s) Current status: 
PENDING

   
Waiting on 

[jira] [Work logged] (BEAM-3339) Create post-release testing of the nightly snapshots

2018-03-15 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3339?focusedWorklogId=81058=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-81058
 ]

ASF GitHub Bot logged work on BEAM-3339:


Author: ASF GitHub Bot
Created on: 16/Mar/18 04:22
Start Date: 16/Mar/18 04:22
Worklog Time Spent: 10m 
  Work Description: yifanzou commented on issue #4788: [BEAM-3339] Mobile 
gaming automation for Java nightly snapshot on core runners
URL: https://github.com/apache/beam/pull/4788#issuecomment-373599490
 
 
   Run Dataflow PostRelease




Issue Time Tracking
---

Worklog Id: (was: 81058)
Time Spent: 65h 50m  (was: 65h 40m)

> Create post-release testing of the nightly snapshots
> 
>
> Key: BEAM-3339
> URL: https://issues.apache.org/jira/browse/BEAM-3339
> Project: Beam
>  Issue Type: Improvement
>  Components: testing
>Reporter: Alan Myrvold
>Assignee: Jason Kuster
>Priority: Major
>  Time Spent: 65h 50m
>  Remaining Estimate: 0h
>
> The nightly java snapshots in 
> https://repository.apache.org/content/groups/snapshots/org/apache/beam should 
> be verified by following the 
> https://beam.apache.org/get-started/quickstart-java/ instructions, to verify 
> that the release is usable.





Build failed in Jenkins: beam_PostRelease_NightlySnapshot #132

2018-03-15 Thread Apache Jenkins Server
See 


Changes:

[yifanzou] BEAM-3339 Mobile gaming automation for Java nightly snapshot

--
[...truncated 1.18 MB...]
Waiting on bqjob_r57c183232e8d1bc9_01622ce8926b_1 ... (48s) Current status: 
PENDING

   
Waiting on bqjob_r57c183232e8d1bc9_01622ce8926b_1 ... (49s) Current status: 
PENDING

   
Waiting on bqjob_r57c183232e8d1bc9_01622ce8926b_1 ... (50s) Current status: 
PENDING

   
Waiting on bqjob_r57c183232e8d1bc9_01622ce8926b_1 ... (51s) Current status: 
PENDING

   
Waiting on bqjob_r57c183232e8d1bc9_01622ce8926b_1 ... (52s) Current status: 
PENDING

   
Waiting on bqjob_r57c183232e8d1bc9_01622ce8926b_1 ... (53s) Current status: 
PENDING

   
Waiting on bqjob_r57c183232e8d1bc9_01622ce8926b_1 ... (54s) Current status: 
PENDING

   
Waiting on bqjob_r57c183232e8d1bc9_01622ce8926b_1 ... (55s) Current status: 
PENDING

   
Waiting on bqjob_r57c183232e8d1bc9_01622ce8926b_1 ... (56s) Current status: 
PENDING

   
Waiting on bqjob_r57c183232e8d1bc9_01622ce8926b_1 ... (57s) Current status: 
PENDING

   
Waiting on bqjob_r57c183232e8d1bc9_01622ce8926b_1 ... (58s) Current status: 
PENDING

   
Waiting on bqjob_r57c183232e8d1bc9_01622ce8926b_1 ... (59s) Current status: 
PENDING

   
Waiting on bqjob_r57c183232e8d1bc9_01622ce8926b_1 ... (60s) Current status: 
PENDING

   
Waiting on bqjob_r57c183232e8d1bc9_01622ce8926b_1 ... (61s) Current status: 
PENDING

   
Waiting on bqjob_r57c183232e8d1bc9_01622ce8926b_1 ... (62s) Current status: 
PENDING

   
Waiting on bqjob_r57c183232e8d1bc9_01622ce8926b_1 ... (63s) Current status: 
PENDING

   
Waiting on bqjob_r57c183232e8d1bc9_01622ce8926b_1 ... (64s) Current status: 
PENDING

   
Waiting on bqjob_r57c183232e8d1bc9_01622ce8926b_1 ... (65s) Current status: 
PENDING

   
Waiting on bqjob_r57c183232e8d1bc9_01622ce8926b_1 ... (66s) Current status: 
PENDING

   
Waiting on bqjob_r57c183232e8d1bc9_01622ce8926b_1 ... (67s) Current status: 
PENDING

   
Waiting on bqjob_r57c183232e8d1bc9_01622ce8926b_1 ... (68s) Current status: 
PENDING

   
Waiting on bqjob_r57c183232e8d1bc9_01622ce8926b_1 ... (69s) Current status: 
PENDING

   
Waiting on bqjob_r57c183232e8d1bc9_01622ce8926b_1 ... (70s) Current status: 
PENDING

   
Waiting on bqjob_r57c183232e8d1bc9_01622ce8926b_1 ... (71s) Current status: 
PENDING

   
Waiting on bqjob_r57c183232e8d1bc9_01622ce8926b_1 ... (72s) Current status: 
PENDING

   
Waiting on bqjob_r57c183232e8d1bc9_01622ce8926b_1 ... (73s) Current status: 
PENDING

   
Waiting on bqjob_r57c183232e8d1bc9_01622ce8926b_1 ... (74s) Current status: 
PENDING
 

[jira] [Work logged] (BEAM-3339) Create post-release testing of the nightly snapshots

2018-03-15 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3339?focusedWorklogId=81056=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-81056
 ]

ASF GitHub Bot logged work on BEAM-3339:


Author: ASF GitHub Bot
Created on: 16/Mar/18 03:52
Start Date: 16/Mar/18 03:52
Worklog Time Spent: 10m 
  Work Description: yifanzou commented on issue #4788: [BEAM-3339] Mobile 
gaming automation for Java nightly snapshot on core runners
URL: https://github.com/apache/beam/pull/4788#issuecomment-373595857
 
 
   Run Dataflow PostRelease




Issue Time Tracking
---

Worklog Id: (was: 81056)
Time Spent: 65h 40m  (was: 65.5h)

> Create post-release testing of the nightly snapshots
> 
>
> Key: BEAM-3339
> URL: https://issues.apache.org/jira/browse/BEAM-3339
> Project: Beam
>  Issue Type: Improvement
>  Components: testing
>Reporter: Alan Myrvold
>Assignee: Jason Kuster
>Priority: Major
>  Time Spent: 65h 40m
>  Remaining Estimate: 0h
>
> The nightly java snapshots in 
> https://repository.apache.org/content/groups/snapshots/org/apache/beam should 
> be verified by following the 
> https://beam.apache.org/get-started/quickstart-java/ instructions, to verify 
> that the release is usable.





[jira] [Work logged] (BEAM-3339) Create post-release testing of the nightly snapshots

2018-03-15 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3339?focusedWorklogId=81055=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-81055
 ]

ASF GitHub Bot logged work on BEAM-3339:


Author: ASF GitHub Bot
Created on: 16/Mar/18 03:33
Start Date: 16/Mar/18 03:33
Worklog Time Spent: 10m 
  Work Description: yifanzou commented on a change in pull request #4788: 
[BEAM-3339] Mobile gaming automation for Java nightly snapshot on core runners
URL: https://github.com/apache/beam/pull/4788#discussion_r174989948
 
 

 ##
 File path: build_rules.gradle
 ##
 @@ -518,24 +518,33 @@ ext.applyAvroNature = {
 
 // A class defining the set of configurable properties for createJavaQuickstartValidationTask
 class JavaQuickstartConfiguration {
-  // Name for the quickstart is required.
-  // Used both for the test name runQuickstartJava${name}
-  // and also for the script name, quickstart-java-${name}.toLowerCase().
-  String name
+  // Type [Quickstart, MobileGaming] for the postrelease validation is required.
+  // Used both for the test name run${type}Java${runner}
+  // and also for the script name, ${type}-java-${runner}.toLowerCase().
+  String type
 
-  // gcpProject sets the gcpProject argument when executing the quickstart.
+  // runner [Direct, Dataflow, Spark, Flink, FlinkLocal, Apex]
+  String runner
+
+  // gcpProject sets the gcpProject argument when executing examples.
 
 Review comment:
   They are used by both Quickstart and MobileGaming except for "dataset", 
which is MobileGaming specific.
   
   "type", "runner" -- required;
   "gcpProject", "gcsBucket" -- used by examples that work with gcloud;
   "pubsubTopic" -- required by MobileGaming at this moment, but it's also good 
to have it since we are going to automate streaming wordcount examples in the 
future.
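
   As a rough illustration of the naming scheme described in the diff comments
   above, the following Java sketch derives the test name and the script name
   from "type" and "runner". The class PostReleaseNames and its two methods are
   hypothetical and do not exist in build_rules.gradle; the sketch also assumes
   that toLowerCase() applies to the whole script name, as the old
   quickstart-java-${name} comment suggests.

```java
import java.util.Locale;

// Hypothetical helper, not part of build_rules.gradle. It only mirrors the
// naming patterns stated in the JavaQuickstartConfiguration comments.
final class PostReleaseNames {
    private PostReleaseNames() {}

    // Test name pattern from the diff: run${type}Java${runner}
    static String taskName(String type, String runner) {
        return "run" + type + "Java" + runner;
    }

    // Script name pattern from the diff: ${type}-java-${runner}.toLowerCase()
    // Assumption: the whole name is lower-cased, not just the runner part.
    static String scriptName(String type, String runner) {
        return (type + "-java-" + runner).toLowerCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        // Example values taken from the lists in the diff comments.
        System.out.println(taskName("MobileGaming", "Dataflow"));
        System.out.println(scriptName("MobileGaming", "Dataflow"));
    }
}
```

   With type "MobileGaming" and runner "Dataflow" this prints
   runMobileGamingJavaDataflow and mobilegaming-java-dataflow.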




Issue Time Tracking
---

Worklog Id: (was: 81055)
Time Spent: 65.5h  (was: 65h 20m)

> Create post-release testing of the nightly snapshots
> 
>
> Key: BEAM-3339
> URL: https://issues.apache.org/jira/browse/BEAM-3339
> Project: Beam
>  Issue Type: Improvement
>  Components: testing
>Reporter: Alan Myrvold
>Assignee: Jason Kuster
>Priority: Major
>  Time Spent: 65.5h
>  Remaining Estimate: 0h
>
> The nightly java snapshots in 
> https://repository.apache.org/content/groups/snapshots/org/apache/beam should 
> be verified by following the 
> https://beam.apache.org/get-started/quickstart-java/ instructions, to verify 
> that the release is usable.





[jira] [Work logged] (BEAM-3339) Create post-release testing of the nightly snapshots

2018-03-15 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3339?focusedWorklogId=81054=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-81054
 ]

ASF GitHub Bot logged work on BEAM-3339:


Author: ASF GitHub Bot
Created on: 16/Mar/18 03:32
Start Date: 16/Mar/18 03:32
Worklog Time Spent: 10m 
  Work Description: yifanzou commented on a change in pull request #4788: 
[BEAM-3339] Mobile gaming automation for Java nightly snapshot on core runners
URL: https://github.com/apache/beam/pull/4788#discussion_r174989948
 
 

 ##
 File path: build_rules.gradle
 ##
 @@ -518,24 +518,33 @@ ext.applyAvroNature = {
 
 // A class defining the set of configurable properties for createJavaQuickstartValidationTask
 class JavaQuickstartConfiguration {
-  // Name for the quickstart is required.
-  // Used both for the test name runQuickstartJava${name}
-  // and also for the script name, quickstart-java-${name}.toLowerCase().
-  String name
+  // Type [Quickstart, MobileGaming] for the postrelease validation is required.
+  // Used both for the test name run${type}Java${runner}
+  // and also for the script name, ${type}-java-${runner}.toLowerCase().
+  String type
 
-  // gcpProject sets the gcpProject argument when executing the quickstart.
+  // runner [Direct, Dataflow, Spark, Flink, FlinkLocal, Apex]
+  String runner
+
+  // gcpProject sets the gcpProject argument when executing examples.
 
 Review comment:
   They are used by both Quickstart and MobileGaming except for "dataset", 
which is MobileGaming specific.
   
   "type", "runner" -- required;
   "gcpProject", "gcsBucket", "pubsubTopic" -- used by examples that work with gcloud;
   "pubsubTopic" -- required by MobileGaming at this moment, but it's also good 
to have it since we are going to automate streaming wordcount examples in the 
future.




Issue Time Tracking
---

Worklog Id: (was: 81054)
Time Spent: 65h 20m  (was: 65h 10m)

> Create post-release testing of the nightly snapshots
> 
>
> Key: BEAM-3339
> URL: https://issues.apache.org/jira/browse/BEAM-3339
> Project: Beam
>  Issue Type: Improvement
>  Components: testing
>Reporter: Alan Myrvold
>Assignee: Jason Kuster
>Priority: Major
>  Time Spent: 65h 20m
>  Remaining Estimate: 0h
>
> The nightly java snapshots in 
> https://repository.apache.org/content/groups/snapshots/org/apache/beam should 
> be verified by following the 
> https://beam.apache.org/get-started/quickstart-java/ instructions, to verify 
> that the release is usable.





Jenkins build is back to normal : beam_PostCommit_Python_ValidatesRunner_Dataflow #1122

2018-03-15 Thread Apache Jenkins Server
See 




[jira] [Work logged] (BEAM-3339) Create post-release testing of the nightly snapshots

2018-03-15 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3339?focusedWorklogId=81052=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-81052
 ]

ASF GitHub Bot logged work on BEAM-3339:


Author: ASF GitHub Bot
Created on: 16/Mar/18 03:28
Start Date: 16/Mar/18 03:28
Worklog Time Spent: 10m 
  Work Description: yifanzou commented on a change in pull request #4788: 
[BEAM-3339] Mobile gaming automation for Java nightly snapshot on core runners
URL: https://github.com/apache/beam/pull/4788#discussion_r174989948
 
 

 ##
 File path: build_rules.gradle
 ##
 @@ -518,24 +518,33 @@ ext.applyAvroNature = {
 
 // A class defining the set of configurable properties for createJavaQuickstartValidationTask
 class JavaQuickstartConfiguration {
-  // Name for the quickstart is required.
-  // Used both for the test name runQuickstartJava${name}
-  // and also for the script name, quickstart-java-${name}.toLowerCase().
-  String name
+  // Type [Quickstart, MobileGaming] for the postrelease validation is required.
+  // Used both for the test name run${type}Java${runner}
+  // and also for the script name, ${type}-java-${runner}.toLowerCase().
+  String type
 
-  // gcpProject sets the gcpProject argument when executing the quickstart.
+  // runner [Direct, Dataflow, Spark, Flink, FlinkLocal, Apex]
+  String runner
+
+  // gcpProject sets the gcpProject argument when executing examples.
 
 Review comment:
   They are used by both Quickstart and MobileGaming except for "dataset", 
which is MobileGaming specific.




Issue Time Tracking
---

Worklog Id: (was: 81052)
Time Spent: 65h 10m  (was: 65h)

> Create post-release testing of the nightly snapshots
> 
>
> Key: BEAM-3339
> URL: https://issues.apache.org/jira/browse/BEAM-3339
> Project: Beam
>  Issue Type: Improvement
>  Components: testing
>Reporter: Alan Myrvold
>Assignee: Jason Kuster
>Priority: Major
>  Time Spent: 65h 10m
>  Remaining Estimate: 0h
>
> The nightly java snapshots in 
> https://repository.apache.org/content/groups/snapshots/org/apache/beam should 
> be verified by following the 
> https://beam.apache.org/get-started/quickstart-java/ instructions, to verify 
> that the release is usable.





[jira] [Work logged] (BEAM-3339) Create post-release testing of the nightly snapshots

2018-03-15 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3339?focusedWorklogId=81050=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-81050
 ]

ASF GitHub Bot logged work on BEAM-3339:


Author: ASF GitHub Bot
Created on: 16/Mar/18 03:22
Start Date: 16/Mar/18 03:22
Worklog Time Spent: 10m 
  Work Description: yifanzou commented on issue #4788: [BEAM-3339] Mobile 
gaming automation for Java nightly snapshot on core runners
URL: https://github.com/apache/beam/pull/4788#issuecomment-373592072
 
 
   Run Dataflow PostRelease




Issue Time Tracking
---

Worklog Id: (was: 81050)
Time Spent: 65h  (was: 64h 50m)

> Create post-release testing of the nightly snapshots
> 
>
> Key: BEAM-3339
> URL: https://issues.apache.org/jira/browse/BEAM-3339
> Project: Beam
>  Issue Type: Improvement
>  Components: testing
>Reporter: Alan Myrvold
>Assignee: Jason Kuster
>Priority: Major
>  Time Spent: 65h
>  Remaining Estimate: 0h
>
> The nightly java snapshots in 
> https://repository.apache.org/content/groups/snapshots/org/apache/beam should 
> be verified by following the 
> https://beam.apache.org/get-started/quickstart-java/ instructions, to verify 
> that the release is usable.





Build failed in Jenkins: beam_PostRelease_NightlySnapshot #131

2018-03-15 Thread Apache Jenkins Server
See 


--
[...truncated 2.07 MB...]

Waiting on bqjob_r7975a2b25ecb3b4d_01622cd0f827_1 ... (0s) Current status: PENDING
[...25 identical PENDING status lines, 1s through 25s...]
Waiting on bqjob_r7975a2b25ecb3b4d_01622cd0f827_1 ... (26s) Current status: PENDING

   
Waiting on 

Jenkins build is still unstable: beam_PostCommit_Java_ValidatesRunner_Dataflow #5161

2018-03-15 Thread Apache Jenkins Server
See 




[jira] [Comment Edited] (BEAM-3110) The transform Read(UnboundedKafkaSource) is currently not supported

2018-03-15 Thread Ivan Li (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-3110?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16401410#comment-16401410
 ] 

Ivan Li edited comment on BEAM-3110 at 3/16/18 3:02 AM:


I have the same issue in my Gradle Beam project. If you are using the Shadow
plugin (http://imperceptiblethoughts.com/shadow/), this can be resolved by
merging service files, as described at
http://imperceptiblethoughts.com/shadow/#controlling_jar_content_merging:
{code:java}
shadowJar {
 mergeServiceFiles()
}

{code}
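
To illustrate why merging service files can matter, here is a minimal sketch in plain Python. It is a model only, not how the Shadow plugin is implemented. The assumption is that several dependency jars each ship a file with the same path under META-INF/services/, and that without merging only one copy survives in the fat jar (here the first one, which is itself an assumption). The service interface and provider class names are hypothetical.

```python
from collections import OrderedDict

# Hypothetical service file path; real names depend on the libraries involved.
SERVICE_FILE = "META-INF/services/com.example.Registrar"


def build_fat_jar(jars, merge_service_files):
    """Combine a list of {path: text} mappings into one mapping."""
    combined = OrderedDict()
    for jar in jars:
        for path, text in jar.items():
            is_service = path.startswith("META-INF/services/")
            if merge_service_files and is_service and path in combined:
                # Merge: append provider lines that are not listed yet.
                existing = combined[path].splitlines()
                for line in text.splitlines():
                    if line and line not in existing:
                        existing.append(line)
                combined[path] = "\n".join(existing)
            elif path not in combined:
                # Assumed default: the first copy wins, later copies are dropped.
                combined[path] = text
    return combined


def providers(fat_jar, service_file=SERVICE_FILE):
    """Return the provider class names listed in one service file."""
    return [line for line in fat_jar.get(service_file, "").splitlines() if line]


# Two hypothetical dependency jars that both declare a provider.
jar_a = {SERVICE_FILE: "com.example.core.CoreRegistrar"}
jar_b = {SERVICE_FILE: "com.example.kafka.KafkaRegistrar"}

unmerged = providers(build_fat_jar([jar_a, jar_b], merge_service_files=False))
merged = providers(build_fat_jar([jar_a, jar_b], merge_service_files=True))
```

In this model, `unmerged` lists only the provider from the first jar, while `merged` lists both. A provider that is dropped this way cannot be discovered at runtime, which is one plausible reason a transform might be reported as unsupported.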


was (Author: email2liyang):
I meat the same issue in my gradle beam project, if you are using 

[http://imperceptiblethoughts.com/shadow/,] this can be resolved by 
http://imperceptiblethoughts.com/shadow/#controlling_jar_content_merging

{code}

shadowJar {
 mergeServiceFiles()
}

{code}

> The transform Read(UnboundedKafkaSource) is currently not supported
> ---
>
> Key: BEAM-3110
> URL: https://issues.apache.org/jira/browse/BEAM-3110
> Project: Beam
>  Issue Type: Bug
>  Components: runner-flink
>Affects Versions: 2.2.0, 2.3.0
>Reporter: Xu Mingmin
>Assignee: Xu Mingmin
>Priority: Major
> Fix For: 2.3.0
>
>
> I see this issue when submitting a job to Flink cluster. It appears after 
> build {{2.2.0-20170912.083349-51}}.
> {code}
> org.apache.flink.client.program.ProgramInvocationException: The main method 
> caused an error.
>   at 
> org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:545)
>   at 
> org.apache.flink.client.program.PackagedProgram.invokeInteractiveModeForExecution(PackagedProgram.java:419)
>   at 
> org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:381)
>   at 
> org.apache.flink.client.CliFrontend.executeProgram(CliFrontend.java:838)
>   at org.apache.flink.client.CliFrontend.run(CliFrontend.java:259)
>   at 
> org.apache.flink.client.CliFrontend.parseParameters(CliFrontend.java:1086)
>   at org.apache.flink.client.CliFrontend$2.call(CliFrontend.java:1133)
>   at org.apache.flink.client.CliFrontend$2.call(CliFrontend.java:1130)
>   at 
> org.apache.flink.runtime.security.HadoopSecurityContext$1.run(HadoopSecurityContext.java:43)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:415)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
>   at 
> org.apache.flink.runtime.security.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:40)
>   at org.apache.flink.client.CliFrontend.main(CliFrontend.java:1129)
> Caused by: java.lang.UnsupportedOperationException: The transform 
> Read(UnboundedKafkaSource) is currently not supported.
>   at 
> org.apache.beam.runners.flink.FlinkStreamingPipelineTranslator.visitPrimitiveTransform(FlinkStreamingPipelineTranslator.java:113)
>   at 
> org.apache.beam.sdk.runners.TransformHierarchy$Node.visit(TransformHierarchy.java:666)
>   at 
> org.apache.beam.sdk.runners.TransformHierarchy$Node.visit(TransformHierarchy.java:658)
>   at 
> org.apache.beam.sdk.runners.TransformHierarchy$Node.visit(TransformHierarchy.java:658)
>   at 
> org.apache.beam.sdk.runners.TransformHierarchy$Node.visit(TransformHierarchy.java:658)
>   at 
> org.apache.beam.sdk.runners.TransformHierarchy$Node.access$600(TransformHierarchy.java:311)
>   at 
> org.apache.beam.sdk.runners.TransformHierarchy.visit(TransformHierarchy.java:245)
>   at org.apache.beam.sdk.Pipeline.traverseTopologically(Pipeline.java:451)
>   at 
> org.apache.beam.runners.flink.FlinkPipelineTranslator.translate(FlinkPipelineTranslator.java:38)
>   at 
> org.apache.beam.runners.flink.FlinkStreamingPipelineTranslator.translate(FlinkStreamingPipelineTranslator.java:69)
>   at 
> org.apache.beam.runners.flink.FlinkPipelineExecutionEnvironment.translate(FlinkPipelineExecutionEnvironment.java:104)
>   at org.apache.beam.runners.flink.FlinkRunner.run(FlinkRunner.java:113)
>   at org.apache.beam.sdk.Pipeline.run(Pipeline.java:304)
>   at org.apache.beam.sdk.Pipeline.run(Pipeline.java:290)
> {code} 





[jira] [Commented] (BEAM-3110) The transform Read(UnboundedKafkaSource) is currently not supported

2018-03-15 Thread Ivan Li (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-3110?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16401410#comment-16401410
 ] 

Ivan Li commented on BEAM-3110:
---

I have the same issue in my Gradle Beam project. If you are using the Shadow
plugin (http://imperceptiblethoughts.com/shadow/), this can be resolved by
merging service files, as described at
http://imperceptiblethoughts.com/shadow/#controlling_jar_content_merging:

{code}

shadowJar {
 mergeServiceFiles()
}

{code}






[jira] [Work logged] (BEAM-3339) Create post-release testing of the nightly snapshots

2018-03-15 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3339?focusedWorklogId=81046&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-81046
 ]

ASF GitHub Bot logged work on BEAM-3339:


Author: ASF GitHub Bot
Created on: 16/Mar/18 02:57
Start Date: 16/Mar/18 02:57
Worklog Time Spent: 10m 
  Work Description: yifanzou commented on issue #4788: [BEAM-3339] Mobile 
gaming automation for Java nightly snapshot on core runners
URL: https://github.com/apache/beam/pull/4788#issuecomment-373588561
 
 
   Run Dataflow PostRelease




Issue Time Tracking
---

Worklog Id: (was: 81046)
Time Spent: 64h 50m  (was: 64h 40m)

> Create post-release testing of the nightly snapshots
> 
>
> Key: BEAM-3339
> URL: https://issues.apache.org/jira/browse/BEAM-3339
> Project: Beam
>  Issue Type: Improvement
>  Components: testing
>Reporter: Alan Myrvold
>Assignee: Jason Kuster
>Priority: Major
>  Time Spent: 64h 50m
>  Remaining Estimate: 0h
>
> The nightly java snapshots in 
> https://repository.apache.org/content/groups/snapshots/org/apache/beam should 
> be verified by following the 
> https://beam.apache.org/get-started/quickstart-java/ instructions, to verify 
> that the release is usable.





Jenkins build is back to stable : beam_PostCommit_Java_MavenInstall #6216

2018-03-15 Thread Apache Jenkins Server
See 




Build failed in Jenkins: beam_PostRelease_NightlySnapshot #130

2018-03-15 Thread Apache Jenkins Server
See 


--
[...truncated 42.71 KB...]
[INFO] Downloading from central: 
https://repo.maven.apache.org/maven2/org/apache/maven/archetype/archetype-descriptor/3.0.1/archetype-descriptor-3.0.1.jar
[INFO] Downloading from central: 
https://repo.maven.apache.org/maven2/org/apache/maven/archetype/archetype-common/3.0.1/archetype-common-3.0.1.jar
[INFO] Downloading from central: 
https://repo.maven.apache.org/maven2/org/codehaus/plexus/plexus-component-annotations/1.6/plexus-component-annotations-1.6.jar
[INFO] Downloading from central: 
https://repo.maven.apache.org/maven2/net/sourceforge/jchardet/jchardet/1.0/jchardet-1.0.jar
[INFO] Downloaded from central: 
https://repo.maven.apache.org/maven2/org/apache/maven/archetype/archetype-descriptor/3.0.1/archetype-descriptor-3.0.1.jar
 (24 kB at 679 kB/s)
[INFO] Downloading from central: 
https://repo.maven.apache.org/maven2/dom4j/dom4j/1.6.1/dom4j-1.6.1.jar
[INFO] Downloaded from central: 
https://repo.maven.apache.org/maven2/org/codehaus/plexus/plexus-component-annotations/1.6/plexus-component-annotations-1.6.jar
 (4.3 kB at 79 kB/s)
[INFO] Downloading from central: 
https://repo.maven.apache.org/maven2/xml-apis/xml-apis/1.0.b2/xml-apis-1.0.b2.jar
[INFO] Downloaded from central: 
https://repo.maven.apache.org/maven2/org/apache/maven/archetype/archetype-catalog/3.0.1/archetype-catalog-3.0.1.jar
 (19 kB at 293 kB/s)
[INFO] Downloading from central: 
https://repo.maven.apache.org/maven2/jdom/jdom/1.0/jdom-1.0.jar
[INFO] Downloaded from central: 
https://repo.maven.apache.org/maven2/net/sourceforge/jchardet/jchardet/1.0/jchardet-1.0.jar
 (27 kB at 391 kB/s)
[INFO] Downloading from central: 
https://repo.maven.apache.org/maven2/org/apache/maven/maven-artifact/3.0/maven-artifact-3.0.jar
[INFO] Downloaded from central: 
https://repo.maven.apache.org/maven2/org/apache/maven/archetype/archetype-common/3.0.1/archetype-common-3.0.1.jar
 (331 kB at 4.2 MB/s)
[INFO] Downloading from central: 
https://repo.maven.apache.org/maven2/org/apache/maven/maven-settings-builder/3.0/maven-settings-builder-3.0.jar
[INFO] Downloaded from central: 
https://repo.maven.apache.org/maven2/org/apache/maven/maven-artifact/3.0/maven-artifact-3.0.jar
 (52 kB at 509 kB/s)
[INFO] Downloading from central: 
https://repo.maven.apache.org/maven2/commons-io/commons-io/2.2/commons-io-2.2.jar
[INFO] Downloaded from central: 
https://repo.maven.apache.org/maven2/org/apache/maven/maven-settings-builder/3.0/maven-settings-builder-3.0.jar
 (38 kB at 363 kB/s)
[INFO] Downloading from central: 
https://repo.maven.apache.org/maven2/org/codehaus/plexus/plexus-velocity/1.1.8/plexus-velocity-1.1.8.jar
[INFO] Downloaded from central: 
https://repo.maven.apache.org/maven2/dom4j/dom4j/1.6.1/dom4j-1.6.1.jar (314 kB 
at 2.9 MB/s)
[INFO] Downloading from central: 
https://repo.maven.apache.org/maven2/org/apache/velocity/velocity/1.7/velocity-1.7.jar
[INFO] Downloaded from central: 
https://repo.maven.apache.org/maven2/xml-apis/xml-apis/1.0.b2/xml-apis-1.0.b2.jar
 (109 kB at 976 kB/s)
[INFO] Downloading from central: 
https://repo.maven.apache.org/maven2/commons-lang/commons-lang/2.4/commons-lang-2.4.jar
[INFO] Downloaded from central: 
https://repo.maven.apache.org/maven2/jdom/jdom/1.0/jdom-1.0.jar (153 kB at 1.3 
MB/s)
[INFO] Downloading from central: 
https://repo.maven.apache.org/maven2/org/apache/maven/wagon/wagon-provider-api/2.8/wagon-provider-api-2.8.jar
[INFO] Downloaded from central: 
https://repo.maven.apache.org/maven2/org/codehaus/plexus/plexus-velocity/1.1.8/plexus-velocity-1.1.8.jar
 (7.9 kB at 61 kB/s)
[INFO] Downloading from central: 
https://repo.maven.apache.org/maven2/org/codehaus/groovy/groovy/1.8.3/groovy-1.8.3.jar
[INFO] Downloaded from central: 
https://repo.maven.apache.org/maven2/commons-io/commons-io/2.2/commons-io-2.2.jar
 (174 kB at 1.2 MB/s)
[INFO] Downloading from central: 
https://repo.maven.apache.org/maven2/antlr/antlr/2.7.7/antlr-2.7.7.jar
[INFO] Downloaded from central: 
https://repo.maven.apache.org/maven2/org/apache/maven/wagon/wagon-provider-api/2.8/wagon-provider-api-2.8.jar
 (53 kB at 358 kB/s)
[INFO] Downloading from central: 
https://repo.maven.apache.org/maven2/asm/asm/3.2/asm-3.2.jar
[INFO] Downloaded from central: 
https://repo.maven.apache.org/maven2/asm/asm/3.2/asm-3.2.jar (43 kB at 244 kB/s)
[INFO] Downloading from central: 
https://repo.maven.apache.org/maven2/asm/asm-commons/3.2/asm-commons-3.2.jar
[INFO] Downloaded from central: 
https://repo.maven.apache.org/maven2/antlr/antlr/2.7.7/antlr-2.7.7.jar (445 kB 
at 2.4 MB/s)
[INFO] Downloading from central: 
https://repo.maven.apache.org/maven2/asm/asm-util/3.2/asm-util-3.2.jar
[INFO] Downloaded from central: 
https://repo.maven.apache.org/maven2/org/apache/velocity/velocity/1.7/velocity-1.7.jar
 (450 kB at 2.4 MB/s)
[INFO] Downloading from central: 

Build failed in Jenkins: beam_PostRelease_NightlySnapshot #129

2018-03-15 Thread Apache Jenkins Server
See 


--
[...truncated 38.72 KB...]
[INFO] Downloaded from central: 
https://repo.maven.apache.org/maven2/antlr/antlr/2.7.7/antlr-2.7.7.pom (632 B 
at 20 kB/s)
[INFO] Downloading from central: 
https://repo.maven.apache.org/maven2/asm/asm/3.2/asm-3.2.pom
[INFO] Downloaded from central: 
https://repo.maven.apache.org/maven2/asm/asm/3.2/asm-3.2.pom (264 B at 10 kB/s)
[INFO] Downloading from central: 
https://repo.maven.apache.org/maven2/asm/asm-parent/3.2/asm-parent-3.2.pom
[INFO] Downloaded from central: 
https://repo.maven.apache.org/maven2/asm/asm-parent/3.2/asm-parent-3.2.pom (4.4 
kB at 161 kB/s)
[INFO] Downloading from central: 
https://repo.maven.apache.org/maven2/asm/asm-commons/3.2/asm-commons-3.2.pom
[INFO] Downloaded from central: 
https://repo.maven.apache.org/maven2/asm/asm-commons/3.2/asm-commons-3.2.pom 
(415 B at 15 kB/s)
[INFO] Downloading from central: 
https://repo.maven.apache.org/maven2/asm/asm-tree/3.2/asm-tree-3.2.pom
[INFO] Downloaded from central: 
https://repo.maven.apache.org/maven2/asm/asm-tree/3.2/asm-tree-3.2.pom (404 B 
at 16 kB/s)
[INFO] Downloading from central: 
https://repo.maven.apache.org/maven2/asm/asm-util/3.2/asm-util-3.2.pom
[INFO] Downloaded from central: 
https://repo.maven.apache.org/maven2/asm/asm-util/3.2/asm-util-3.2.pom (409 B 
at 17 kB/s)
[INFO] Downloading from central: 
https://repo.maven.apache.org/maven2/asm/asm-analysis/3.2/asm-analysis-3.2.pom
[INFO] Downloaded from central: 
https://repo.maven.apache.org/maven2/asm/asm-analysis/3.2/asm-analysis-3.2.pom 
(417 B at 11 kB/s)
[INFO] Downloading from central: 
https://repo.maven.apache.org/maven2/org/codehaus/plexus/plexus-interactivity-api/1.0-alpha-6/plexus-interactivity-api-1.0-alpha-6.pom
[INFO] Downloaded from central: 
https://repo.maven.apache.org/maven2/org/codehaus/plexus/plexus-interactivity-api/1.0-alpha-6/plexus-interactivity-api-1.0-alpha-6.pom
 (726 B at 28 kB/s)
[INFO] Downloading from central: 
https://repo.maven.apache.org/maven2/org/codehaus/plexus/plexus-interactivity/1.0-alpha-6/plexus-interactivity-1.0-alpha-6.pom
[INFO] Downloaded from central: 
https://repo.maven.apache.org/maven2/org/codehaus/plexus/plexus-interactivity/1.0-alpha-6/plexus-interactivity-1.0-alpha-6.pom
 (1.1 kB at 43 kB/s)
[INFO] Downloading from central: 
https://repo.maven.apache.org/maven2/org/codehaus/plexus/plexus-components/1.1.9/plexus-components-1.1.9.pom
[INFO] Downloaded from central: 
https://repo.maven.apache.org/maven2/org/codehaus/plexus/plexus-components/1.1.9/plexus-components-1.1.9.pom
 (2.4 kB at 93 kB/s)
[INFO] Downloading from central: 
https://repo.maven.apache.org/maven2/org/codehaus/plexus/plexus/1.0.10/plexus-1.0.10.pom
[INFO] Downloaded from central: 
https://repo.maven.apache.org/maven2/org/codehaus/plexus/plexus/1.0.10/plexus-1.0.10.pom
 (8.2 kB at 317 kB/s)
[INFO] Downloading from central: 
https://repo.maven.apache.org/maven2/org/apache/maven/shared/maven-script-interpreter/1.0/maven-script-interpreter-1.0.pom
[INFO] Downloaded from central: 
https://repo.maven.apache.org/maven2/org/apache/maven/shared/maven-script-interpreter/1.0/maven-script-interpreter-1.0.pom
 (3.8 kB at 148 kB/s)
[INFO] Downloading from central: 
https://repo.maven.apache.org/maven2/org/apache/maven/shared/maven-shared-components/17/maven-shared-components-17.pom
[INFO] Downloaded from central: 
https://repo.maven.apache.org/maven2/org/apache/maven/shared/maven-shared-components/17/maven-shared-components-17.pom
 (8.7 kB at 322 kB/s)
[INFO] Downloading from central: 
https://repo.maven.apache.org/maven2/org/beanshell/bsh/2.0b4/bsh-2.0b4.pom
[INFO] Downloaded from central: 
https://repo.maven.apache.org/maven2/org/beanshell/bsh/2.0b4/bsh-2.0b4.pom (1.2 
kB at 46 kB/s)
[INFO] Downloading from central: 
https://repo.maven.apache.org/maven2/org/beanshell/beanshell/2.0b4/beanshell-2.0b4.pom
[INFO] Downloaded from central: 
https://repo.maven.apache.org/maven2/org/beanshell/beanshell/2.0b4/beanshell-2.0b4.pom
 (1.4 kB at 54 kB/s)
[INFO] Downloading from central: 
https://repo.maven.apache.org/maven2/org/apache/ant/ant/1.8.1/ant-1.8.1.pom
[INFO] Downloaded from central: 
https://repo.maven.apache.org/maven2/org/apache/ant/ant/1.8.1/ant-1.8.1.pom 
(8.8 kB at 338 kB/s)
[INFO] Downloading from central: 
https://repo.maven.apache.org/maven2/org/apache/ant/ant-parent/1.8.1/ant-parent-1.8.1.pom
[INFO] Downloaded from central: 
https://repo.maven.apache.org/maven2/org/apache/ant/ant-parent/1.8.1/ant-parent-1.8.1.pom
 (4.3 kB at 165 kB/s)
[INFO] Downloading from central: 
https://repo.maven.apache.org/maven2/org/apache/maven/archetype/archetype-catalog/3.0.1/archetype-catalog-3.0.1.jar
[INFO] Downloading from central: 
https://repo.maven.apache.org/maven2/org/apache/maven/archetype/archetype-common/3.0.1/archetype-common-3.0.1.jar
[INFO] Downloaded from central: 

Build failed in Jenkins: beam_PerformanceTests_Spark #1471

2018-03-15 Thread Apache Jenkins Server
See 


Changes:

[sidhom] Add Java bounded read overrides

[coheigea] Replacing size() == 0 with isEmpty()

[lukasz.gajowy] [BEAM-3798] Remove error check on dataflow when getting batch 
job state

[XuMingmin] [SQL] Add support for DOT expression (#4863)

--
[...truncated 88.19 KB...]
'apache-beam-testing:bqjob_r5063501109f307ea_01622c6027ac_1': Invalid schema
update. Field timestamp has changed type from TIMESTAMP to FLOAT

STDERR: 
/usr/lib/google-cloud-sdk/platform/bq/third_party/oauth2client/contrib/gce.py:73:
 UserWarning: You have requested explicit scopes to be used with a GCE service 
account.
Using this argument will have no effect on the actual scopes for tokens
requested. These scopes are set at VM instance creation time and
can't be overridden in the request.

  warnings.warn(_SCOPES_WARNING)
Upload complete.
Waiting on bqjob_r5063501109f307ea_01622c6027ac_1 ... (0s) Current status: RUNNING
Waiting on bqjob_r5063501109f307ea_01622c6027ac_1 ... (0s) Current status: DONE
2018-03-16 01:15:23,810 cb9865ca MainThread INFO Retrying exception running 
IssueRetryableCommand: Command returned a non-zero exit code.

2018-03-16 01:15:44,538 cb9865ca MainThread INFO Running: bq load 
--autodetect --source_format=NEWLINE_DELIMITED_JSON 
beam_performance.pkb_results 

2018-03-16 01:15:46,914 cb9865ca MainThread INFO Ran: {bq load --autodetect 
--source_format=NEWLINE_DELIMITED_JSON beam_performance.pkb_results 

  ReturnCode:1
STDOUT: 

BigQuery error in load operation: Error processing job
'apache-beam-testing:bqjob_r6fd432861dc9614b_01622c608169_1': Invalid schema
update. Field timestamp has changed type from TIMESTAMP to FLOAT

STDERR: 
/usr/lib/google-cloud-sdk/platform/bq/third_party/oauth2client/contrib/gce.py:73:
 UserWarning: You have requested explicit scopes to be used with a GCE service 
account.
Using this argument will have no effect on the actual scopes for tokens
requested. These scopes are set at VM instance creation time and
can't be overridden in the request.

  warnings.warn(_SCOPES_WARNING)
Upload complete.
Waiting on bqjob_r6fd432861dc9614b_01622c608169_1 ... (0s) Current status: RUNNING
Waiting on bqjob_r6fd432861dc9614b_01622c608169_1 ... (0s) Current status: DONE
2018-03-16 01:15:46,914 cb9865ca MainThread INFO Retrying exception running 
IssueRetryableCommand: Command returned a non-zero exit code.

2018-03-16 01:16:15,417 cb9865ca MainThread INFO Running: bq load 
--autodetect --source_format=NEWLINE_DELIMITED_JSON 
beam_performance.pkb_results 

2018-03-16 01:16:17,669 cb9865ca MainThread INFO Ran: {bq load --autodetect 
--source_format=NEWLINE_DELIMITED_JSON beam_performance.pkb_results 

  ReturnCode:1
STDOUT: 

BigQuery error in load operation: Error processing job
'apache-beam-testing:bqjob_r5fcd23f058c0a3e7_01622c60fa00_1': Invalid schema
update. Field timestamp has changed type from TIMESTAMP to FLOAT

STDERR: 
/usr/lib/google-cloud-sdk/platform/bq/third_party/oauth2client/contrib/gce.py:73:
 UserWarning: You have requested explicit scopes to be used with a GCE service 
account.
Using this argument will have no effect on the actual scopes for tokens
requested. These scopes are set at VM instance creation time and
can't be overridden in the request.

  warnings.warn(_SCOPES_WARNING)
Upload complete.
Waiting on bqjob_r5fcd23f058c0a3e7_01622c60fa00_1 ... (0s) Current status: RUNNING
Waiting on bqjob_r5fcd23f058c0a3e7_01622c60fa00_1 ... (0s) Current status: DONE
2018-03-16 01:16:17,669 cb9865ca MainThread INFO Retrying exception running 
IssueRetryableCommand: Command returned a non-zero exit code.

2018-03-16 01:16:40,498 cb9865ca MainThread INFO Running: bq load 
--autodetect --source_format=NEWLINE_DELIMITED_JSON 
beam_performance.pkb_results 

2018-03-16 01:16:42,679 cb9865ca MainThread INFO Ran: {bq load --autodetect 
--source_format=NEWLINE_DELIMITED_JSON beam_performance.pkb_results 

  ReturnCode:1
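
The repeated load failure above is a type conflict: the existing table declares `timestamp` as TIMESTAMP, while the load with `--autodetect` infers FLOAT. As a rough illustration only, the Python sketch below flags such conflicts before a load is attempted. The `infer_type` rules are a simplification and not BigQuery's actual inference logic, and the schema and sample record are hypothetical.

```python
import json


def infer_type(value):
    """Naive type inference for one JSON value (an assumption, not BigQuery's rules)."""
    if isinstance(value, bool):
        return "BOOLEAN"
    if isinstance(value, int):
        return "INTEGER"
    if isinstance(value, float):
        return "FLOAT"
    return "STRING"


def schema_conflicts(existing_schema, json_line):
    """Return (field, existing_type, inferred_type) for every mismatching field."""
    record = json.loads(json_line)
    conflicts = []
    for field, value in record.items():
        inferred = infer_type(value)
        expected = existing_schema.get(field)
        if expected is not None and expected != inferred:
            conflicts.append((field, expected, inferred))
    return conflicts


# Hypothetical table schema and one newline-delimited JSON record.
existing = {"timestamp": "TIMESTAMP", "metric": "STRING"}
sample = '{"timestamp": 1521162923.81, "metric": "run_time"}'

conflicts = schema_conflicts(existing, sample)
```

Under these assumptions, a numeric epoch value in the `timestamp` field is inferred as FLOAT and reported as conflicting with the declared TIMESTAMP column. Retrying the same load cannot fix a conflict of this kind.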

[jira] [Work logged] (BEAM-3339) Create post-release testing of the nightly snapshots

2018-03-15 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3339?focusedWorklogId=81041&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-81041
 ]

ASF GitHub Bot logged work on BEAM-3339:


Author: ASF GitHub Bot
Created on: 16/Mar/18 01:45
Start Date: 16/Mar/18 01:45
Worklog Time Spent: 10m 
  Work Description: yifanzou commented on issue #4788: [BEAM-3339] Mobile 
gaming automation for Java nightly snapshot on core runners
URL: https://github.com/apache/beam/pull/4788#issuecomment-373577423
 
 
   Run Dataflow PostRelease




Issue Time Tracking
---

Worklog Id: (was: 81041)
Time Spent: 64h 40m  (was: 64.5h)

> Create post-release testing of the nightly snapshots
> 
>
> Key: BEAM-3339
> URL: https://issues.apache.org/jira/browse/BEAM-3339
> Project: Beam
>  Issue Type: Improvement
>  Components: testing
>Reporter: Alan Myrvold
>Assignee: Jason Kuster
>Priority: Major
>  Time Spent: 64h 40m
>  Remaining Estimate: 0h
>
> The nightly java snapshots in 
> https://repository.apache.org/content/groups/snapshots/org/apache/beam should 
> be verified by following the 
> https://beam.apache.org/get-started/quickstart-java/ instructions, to verify 
> that the release is usable.





[jira] [Work logged] (BEAM-3246) BigtableIO should merge splits if they exceed 15K

2018-03-15 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3246?focusedWorklogId=81035&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-81035
 ]

ASF GitHub Bot logged work on BEAM-3246:


Author: ASF GitHub Bot
Created on: 16/Mar/18 01:24
Start Date: 16/Mar/18 01:24
Worklog Time Spent: 10m 
  Work Description: arkash commented on issue #4517: [BEAM-3246] Bigtable: 
Merge splits if they exceed 15K
URL: https://github.com/apache/beam/pull/4517#issuecomment-373574146
 
 
   @sduskis I have completed the discussed changes and added additional test
cases. Please review.




Issue Time Tracking
---

Worklog Id: (was: 81035)
Time Spent: 3h 20m  (was: 3h 10m)

> BigtableIO should merge splits if they exceed 15K
> -
>
> Key: BEAM-3246
> URL: https://issues.apache.org/jira/browse/BEAM-3246
> Project: Beam
>  Issue Type: Bug
>  Components: io-java-gcp
>Reporter: Solomon Duskis
>Assignee: Solomon Duskis
>Priority: Major
>  Time Spent: 3h 20m
>  Remaining Estimate: 0h
>
> A customer hit a problem with a large number of splits.  CloudBigtableIO fixes 
> that here 
> https://github.com/GoogleCloudPlatform/cloud-bigtable-client/blob/master/bigtable-dataflow-parent/bigtable-hbase-beam/src/main/java/com/google/cloud/bigtable/beam/CloudBigtableIO.java#L241
> BigtableIO should have similar logic.
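
One way to picture the requested behaviour is to merge adjacent key ranges until the number of splits is at or below a limit. The Python sketch below is a hypothetical illustration, not the logic used in CloudBigtableIO or BigtableIO. The `(start_key, end_key)` tuple representation and the `max_splits` parameter are assumptions, and the input ranges are assumed to be sorted and contiguous.

```python
import math


def merge_splits(splits, max_splits):
    """Merge sorted, contiguous (start, end) ranges into at most max_splits ranges."""
    if max_splits < 1:
        raise ValueError("max_splits must be at least 1")
    if len(splits) <= max_splits:
        return list(splits)
    # Number of neighbouring ranges folded into each merged range.
    group = math.ceil(len(splits) / max_splits)
    merged = []
    for i in range(0, len(splits), group):
        chunk = splits[i:i + group]
        # Keep the first start key and the last end key of the chunk.
        merged.append((chunk[0][0], chunk[-1][1]))
    return merged


# Hypothetical key ranges; real splits would come from the table's sample row keys.
example_splits = [("a", "b"), ("b", "c"), ("c", "d"), ("d", "e"), ("e", "f")]
example_merged = merge_splits(example_splits, max_splits=2)
```

Grouping a fixed number of neighbours keeps the merged ranges contiguous and covering the same key space, but it ignores how much data each range holds. A size-aware merge would need per-split size estimates.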





[jira] [Work logged] (BEAM-3339) Create post-release testing of the nightly snapshots

2018-03-15 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3339?focusedWorklogId=81036&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-81036
 ]

ASF GitHub Bot logged work on BEAM-3339:


Author: ASF GitHub Bot
Created on: 16/Mar/18 01:28
Start Date: 16/Mar/18 01:28
Worklog Time Spent: 10m 
  Work Description: yifanzou commented on issue #4788: [BEAM-3339] Mobile 
gaming automation for Java nightly snapshot on core runners
URL: https://github.com/apache/beam/pull/4788#issuecomment-373574631
 
 
   Run Seed Job




Issue Time Tracking
---

Worklog Id: (was: 81036)
Time Spent: 64.5h  (was: 64h 20m)

> Create post-release testing of the nightly snapshots
> 
>
> Key: BEAM-3339
> URL: https://issues.apache.org/jira/browse/BEAM-3339
> Project: Beam
>  Issue Type: Improvement
>  Components: testing
>Reporter: Alan Myrvold
>Assignee: Jason Kuster
>Priority: Major
>  Time Spent: 64.5h
>  Remaining Estimate: 0h
>
> The nightly java snapshots in 
> https://repository.apache.org/content/groups/snapshots/org/apache/beam should 
> be verified by following the 
> https://beam.apache.org/get-started/quickstart-java/ instructions, to verify 
> that the release is usable.





Build failed in Jenkins: beam_PerformanceTests_Python #1027

2018-03-15 Thread Apache Jenkins Server
See 


Changes:

[sidhom] Add Java bounded read overrides

[coheigea] Replacing size() == 0 with isEmpty()

[lukasz.gajowy] [BEAM-3798] Remove error check on dataflow when getting batch 
job state

[XuMingmin] [SQL] Add support for DOT expression (#4863)

--
[...truncated 493.58 KB...]
[INFO] 
[INFO] --- maven-shade-plugin:3.1.0:shade (bundle-and-repackage) @ 
beam-sdks-java-maven-archetypes-examples ---
[INFO] Excluding org.apache.beam:beam-examples-java:jar:2.5.0-SNAPSHOT from the 
shaded jar.
[INFO] Excluding org.apache.beam:beam-sdks-java-core:jar:2.5.0-SNAPSHOT from 
the shaded jar.
[INFO] Excluding com.google.guava:guava-testlib:jar:20.0 from the shaded jar.
[INFO] Excluding com.google.errorprone:error_prone_annotations:jar:2.0.15 from 
the shaded jar.
[INFO] Excluding com.google.protobuf:protobuf-java:jar:3.2.0 from the shaded 
jar.
[INFO] Excluding com.github.stephenc.findbugs:findbugs-annotations:jar:1.3.9-1 
from the shaded jar.
[INFO] Excluding com.fasterxml.jackson.core:jackson-core:jar:2.8.9 from the 
shaded jar.
[INFO] Excluding com.fasterxml.jackson.core:jackson-annotations:jar:2.8.9 from 
the shaded jar.
[INFO] Excluding com.fasterxml.jackson.core:jackson-databind:jar:2.8.9 from the 
shaded jar.
[INFO] Excluding net.bytebuddy:byte-buddy:jar:1.7.10 from the shaded jar.
[INFO] Excluding org.xerial.snappy:snappy-java:jar:1.1.4 from the shaded jar.
[INFO] Excluding org.apache.commons:commons-compress:jar:1.14 from the shaded 
jar.
[INFO] Excluding org.apache.commons:commons-lang3:jar:3.6 from the shaded jar.
[INFO] Excluding 
org.apache.beam:beam-sdks-java-extensions-google-cloud-platform-core:jar:2.5.0-SNAPSHOT
 from the shaded jar.
[INFO] Excluding com.google.http-client:google-http-client-jackson2:jar:1.22.0 
from the shaded jar.
[INFO] Excluding com.google.cloud.bigdataoss:gcsio:jar:1.4.5 from the shaded 
jar.
[INFO] Excluding 
com.google.apis:google-api-services-cloudresourcemanager:jar:v1-rev6-1.22.0 
from the shaded jar.
[INFO] Excluding 
com.google.apis:google-api-services-storage:jar:v1-rev71-1.22.0 from the shaded 
jar.
[INFO] Excluding 
org.apache.beam:beam-sdks-java-io-google-cloud-platform:jar:2.5.0-SNAPSHOT from 
the shaded jar.
[INFO] Excluding 
org.apache.beam:beam-sdks-java-extensions-protobuf:jar:2.5.0-SNAPSHOT from the 
shaded jar.
[INFO] Excluding io.grpc:grpc-core:jar:1.2.0 from the shaded jar.
[INFO] Excluding io.grpc:grpc-context:jar:1.2.0 from the shaded jar.
[INFO] Excluding com.google.instrumentation:instrumentation-api:jar:0.3.0 from 
the shaded jar.
[INFO] Excluding com.google.api:gax-grpc:jar:0.20.0 from the shaded jar.
[INFO] Excluding io.grpc:grpc-protobuf:jar:1.2.0 from the shaded jar.
[INFO] Excluding com.google.api:api-common:jar:1.0.0-rc2 from the shaded jar.
[INFO] Excluding com.google.auto.value:auto-value:jar:1.5.3 from the shaded jar.
[INFO] Excluding com.google.api:gax:jar:1.3.1 from the shaded jar.
[INFO] Excluding org.threeten:threetenbp:jar:1.3.3 from the shaded jar.
[INFO] Excluding com.google.cloud:google-cloud-core-grpc:jar:1.2.0 from the 
shaded jar.
[INFO] Excluding com.google.protobuf:protobuf-java-util:jar:3.2.0 from the 
shaded jar.
[INFO] Excluding com.google.code.gson:gson:jar:2.7 from the shaded jar.
[INFO] Excluding com.google.api.grpc:grpc-google-cloud-pubsub-v1:jar:0.1.18 
from the shaded jar.
[INFO] Excluding com.google.api.grpc:proto-google-cloud-pubsub-v1:jar:0.1.18 
from the shaded jar.
[INFO] Excluding com.google.api.grpc:proto-google-iam-v1:jar:0.1.18 from the 
shaded jar.
[INFO] Excluding io.grpc:grpc-auth:jar:1.2.0 from the shaded jar.
[INFO] Excluding io.grpc:grpc-netty:jar:1.2.0 from the shaded jar.
[INFO] Excluding io.netty:netty-codec-http2:jar:4.1.8.Final from the shaded jar.
[INFO] Excluding io.netty:netty-codec-http:jar:4.1.8.Final from the shaded jar.
[INFO] Excluding io.netty:netty-handler-proxy:jar:4.1.8.Final from the shaded 
jar.
[INFO] Excluding io.netty:netty-codec-socks:jar:4.1.8.Final from the shaded jar.
[INFO] Excluding io.netty:netty-handler:jar:4.1.8.Final from the shaded jar.
[INFO] Excluding io.netty:netty-buffer:jar:4.1.8.Final from the shaded jar.
[INFO] Excluding io.netty:netty-common:jar:4.1.8.Final from the shaded jar.
[INFO] Excluding io.netty:netty-transport:jar:4.1.8.Final from the shaded jar.
[INFO] Excluding io.netty:netty-resolver:jar:4.1.8.Final from the shaded jar.
[INFO] Excluding io.netty:netty-codec:jar:4.1.8.Final from the shaded jar.
[INFO] Excluding io.grpc:grpc-stub:jar:1.2.0 from the shaded jar.
[INFO] Excluding io.grpc:grpc-all:jar:1.2.0 from the shaded jar.
[INFO] Excluding io.grpc:grpc-okhttp:jar:1.2.0 from the shaded jar.
[INFO] Excluding com.squareup.okhttp:okhttp:jar:2.5.0 from the shaded jar.
[INFO] Excluding com.squareup.okio:okio:jar:1.6.0 from the shaded jar.
[INFO] Excluding io.grpc:grpc-protobuf-lite:jar:1.2.0 from the shaded jar.
[INFO] 

Jenkins build is back to normal : beam_PerformanceTests_TFRecordIOIT #254

2018-03-15 Thread Apache Jenkins Server
See 




Jenkins build is back to normal : beam_PerformanceTests_JDBC #334

2018-03-15 Thread Apache Jenkins Server
See 




Jenkins build is back to normal : beam_PerformanceTests_HadoopInputFormat #23

2018-03-15 Thread Apache Jenkins Server
See 




[jira] [Work logged] (BEAM-3339) Create post-release testing of the nightly snapshots

2018-03-15 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3339?focusedWorklogId=81033&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-81033
 ]

ASF GitHub Bot logged work on BEAM-3339:


Author: ASF GitHub Bot
Created on: 16/Mar/18 00:53
Start Date: 16/Mar/18 00:53
Worklog Time Spent: 10m 
  Work Description: yifanzou commented on issue #4788: [BEAM-3339] Mobile 
gaming automation for Java nightly snapshot on core runners
URL: https://github.com/apache/beam/pull/4788#issuecomment-373569087
 
 
   Run Dataflow PostRelease


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 81033)
Time Spent: 64h 20m  (was: 64h 10m)

> Create post-release testing of the nightly snapshots
> 
>
> Key: BEAM-3339
> URL: https://issues.apache.org/jira/browse/BEAM-3339
> Project: Beam
>  Issue Type: Improvement
>  Components: testing
>Reporter: Alan Myrvold
>Assignee: Jason Kuster
>Priority: Major
>  Time Spent: 64h 20m
>  Remaining Estimate: 0h
>
> The nightly java snapshots in 
> https://repository.apache.org/content/groups/snapshots/org/apache/beam should 
> be verified by following the 
> https://beam.apache.org/get-started/quickstart-java/ instructions, to verify 
> that the release is usable.





[jira] [Work logged] (BEAM-3339) Create post-release testing of the nightly snapshots

2018-03-15 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3339?focusedWorklogId=81032&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-81032
 ]

ASF GitHub Bot logged work on BEAM-3339:


Author: ASF GitHub Bot
Created on: 16/Mar/18 00:40
Start Date: 16/Mar/18 00:40
Worklog Time Spent: 10m 
  Work Description: yifanzou commented on issue #4788: [BEAM-3339] Mobile 
gaming automation for Java nightly snapshot on core runners
URL: https://github.com/apache/beam/pull/4788#issuecomment-373566965
 
 
   Run Seed Job




Issue Time Tracking
---

Worklog Id: (was: 81032)
Time Spent: 64h 10m  (was: 64h)

> Create post-release testing of the nightly snapshots
> 
>
> Key: BEAM-3339
> URL: https://issues.apache.org/jira/browse/BEAM-3339
> Project: Beam
>  Issue Type: Improvement
>  Components: testing
>Reporter: Alan Myrvold
>Assignee: Jason Kuster
>Priority: Major
>  Time Spent: 64h 10m
>  Remaining Estimate: 0h
>
> The nightly java snapshots in 
> https://repository.apache.org/content/groups/snapshots/org/apache/beam should 
> be verified by following the 
> https://beam.apache.org/get-started/quickstart-java/ instructions, to verify 
> that the release is usable.





Build failed in Jenkins: beam_PostCommit_Python_ValidatesRunner_Dataflow #1121

2018-03-15 Thread Apache Jenkins Server
See 


Changes:

[lukasz.gajowy] [BEAM-3798] Remove error check on dataflow when getting batch 
job state

--
[...truncated 124.24 KB...]
  File was already downloaded /tmp/dataflow-requirements-cache/six-1.11.0.tar.gz
test_as_dict_twice (apache_beam.transforms.sideinputs_test.SideInputsTest) ... 
ok
Collecting funcsigs>=1 (from mock->-r postcommit_requirements.txt (line 2))
  File was already downloaded 
/tmp/dataflow-requirements-cache/funcsigs-1.0.2.tar.gz
Collecting pbr>=0.11 (from mock->-r postcommit_requirements.txt (line 2))
  File was already downloaded /tmp/dataflow-requirements-cache/pbr-3.1.1.tar.gz
Successfully downloaded pyhamcrest mock setuptools six funcsigs pbr
Collecting pyhamcrest (from -r postcommit_requirements.txt (line 1))
  File was already downloaded 
/tmp/dataflow-requirements-cache/PyHamcrest-1.9.0.tar.gz
Collecting mock (from -r postcommit_requirements.txt (line 2))
  File was already downloaded /tmp/dataflow-requirements-cache/mock-2.0.0.tar.gz
test_as_list_twice (apache_beam.transforms.sideinputs_test.SideInputsTest) ... 
ok
Collecting setuptools (from pyhamcrest->-r postcommit_requirements.txt (line 1))
  File was already downloaded 
/tmp/dataflow-requirements-cache/setuptools-38.6.0.zip
Collecting six (from pyhamcrest->-r postcommit_requirements.txt (line 1))
  File was already downloaded /tmp/dataflow-requirements-cache/six-1.11.0.tar.gz
Collecting funcsigs>=1 (from mock->-r postcommit_requirements.txt (line 2))
  File was already downloaded 
/tmp/dataflow-requirements-cache/funcsigs-1.0.2.tar.gz
Collecting pbr>=0.11 (from mock->-r postcommit_requirements.txt (line 2))
  File was already downloaded /tmp/dataflow-requirements-cache/pbr-3.1.1.tar.gz
Collecting pyhamcrest (from -r postcommit_requirements.txt (line 1))
  File was already downloaded 
/tmp/dataflow-requirements-cache/PyHamcrest-1.9.0.tar.gz
Successfully downloaded pyhamcrest mock setuptools six funcsigs pbr
Collecting mock (from -r postcommit_requirements.txt (line 2))
  File was already downloaded /tmp/dataflow-requirements-cache/mock-2.0.0.tar.gz
Collecting setuptools (from pyhamcrest->-r postcommit_requirements.txt (line 1))
  File was already downloaded 
/tmp/dataflow-requirements-cache/setuptools-38.6.0.zip
Collecting six (from pyhamcrest->-r postcommit_requirements.txt (line 1))
  File was already downloaded /tmp/dataflow-requirements-cache/six-1.11.0.tar.gz
Collecting funcsigs>=1 (from mock->-r postcommit_requirements.txt (line 2))
  File was already downloaded 
/tmp/dataflow-requirements-cache/funcsigs-1.0.2.tar.gz
Collecting pbr>=0.11 (from mock->-r postcommit_requirements.txt (line 2))
  File was already downloaded /tmp/dataflow-requirements-cache/pbr-3.1.1.tar.gz
Successfully downloaded pyhamcrest mock setuptools six funcsigs pbr
test_as_singleton_with_different_defaults 
(apache_beam.transforms.sideinputs_test.SideInputsTest) ... ok
Collecting pyhamcrest (from -r postcommit_requirements.txt (line 1))
  File was already downloaded 
/tmp/dataflow-requirements-cache/PyHamcrest-1.9.0.tar.gz
Collecting mock (from -r postcommit_requirements.txt (line 2))
  File was already downloaded /tmp/dataflow-requirements-cache/mock-2.0.0.tar.gz
Collecting setuptools (from pyhamcrest->-r postcommit_requirements.txt (line 1))
  File was already downloaded 
/tmp/dataflow-requirements-cache/setuptools-38.6.0.zip
Collecting six (from pyhamcrest->-r postcommit_requirements.txt (line 1))
  File was already downloaded /tmp/dataflow-requirements-cache/six-1.11.0.tar.gz
Collecting funcsigs>=1 (from mock->-r postcommit_requirements.txt (line 2))
  File was already downloaded 
/tmp/dataflow-requirements-cache/funcsigs-1.0.2.tar.gz
Collecting pbr>=0.11 (from mock->-r postcommit_requirements.txt (line 2))
  File was already downloaded /tmp/dataflow-requirements-cache/pbr-3.1.1.tar.gz
Successfully downloaded pyhamcrest mock setuptools six funcsigs pbr
test_as_singleton_without_unique_labels 
(apache_beam.transforms.sideinputs_test.SideInputsTest) ... ok
Collecting pyhamcrest (from -r postcommit_requirements.txt (line 1))
:318:
 SNIMissingWarning: An HTTPS request has been made, but the SNI (Subject Name 
Indication) extension to TLS is not available on this platform. This may cause 
the server to present an incorrect TLS certificate, which can cause validation 
failures. You can upgrade to a newer version of Python to solve this. For more 
information, see 
https://urllib3.readthedocs.io/en/latest/security.html#snimissingwarning.
  SNIMissingWarning

Jenkins build is still unstable: beam_PostCommit_Java_ValidatesRunner_Dataflow #5160

2018-03-15 Thread Apache Jenkins Server
See 




[jira] [Work logged] (BEAM-3418) Python Fnapi - Support Multiple SDK workers on a single VM

2018-03-15 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3418?focusedWorklogId=81030&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-81030
 ]

ASF GitHub Bot logged work on BEAM-3418:


Author: ASF GitHub Bot
Created on: 16/Mar/18 00:20
Start Date: 16/Mar/18 00:20
Worklog Time Spent: 10m 
  Work Description: angoenka commented on a change in pull request #4587: 
[BEAM-3418] Send worker_id in all grpc channels to runner harness
URL: https://github.com/apache/beam/pull/4587#discussion_r174969852
 
 

 ##
 File path: sdks/python/apache_beam/runners/dataflow/internal/apiclient.py
 ##
 @@ -176,6 +176,13 @@ def __init__(self, packages, options, environment_version, pipeline_url):
     if self.debug_options.experiments:
       for experiment in self.debug_options.experiments:
         self.proto.experiments.append(experiment)
+    # Add MULTIPLE_SDK_CONTAINERS flag if it's not already present. Do not add
+    # the flag if 'NO_MULTIPLE_SDK_CONTAINERS' is present.
+    # TODO: Clean up MULTIPLE_SDK_CONTAINERS once we deprecate Python SDK
+    # versions up to 2.4.
+    if ('MULTIPLE_SDK_CONTAINERS' not in self.proto.experiments and
 
 Review comment:
   @robertwb @aaltay I am planning to make this feature opt-out for new SDKs. 
Should we keep it opt-in instead?
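
The opt-out versus opt-in choice raised in this comment can be sketched in isolation. The following is a minimal Python illustration, not the code from the pull request: both function names are hypothetical, and the second half of the opt-out condition is inferred from the diff comment ("Do not add the flag if 'NO_MULTIPLE_SDK_CONTAINERS' is present"), because the quoted hunk is cut off after the first clause.

```python
MULTI = 'MULTIPLE_SDK_CONTAINERS'
NO_MULTI = 'NO_MULTIPLE_SDK_CONTAINERS'


def experiments_opt_out(requested):
    """Opt-out: enable the feature unless the user explicitly disables it.

    Hypothetical helper; the inferred rule is "add the flag when neither
    the flag nor its NO_ counterpart is already present".
    """
    experiments = list(requested)
    if MULTI not in experiments and NO_MULTI not in experiments:
        experiments.append(MULTI)
    return experiments


def experiments_opt_in(requested):
    """Opt-in: leave the list untouched, so the feature is active only
    when the user passed the flag themselves."""
    return list(requested)


if __name__ == '__main__':
    for requested in ([], [NO_MULTI], [MULTI], ['some_other_experiment']):
        print(requested, '->', experiments_opt_out(requested))
```

Under opt-out, a user who passes no experiments at all still gets the flag; under opt-in, the same user gets the old behavior. That difference is the trade-off the reviewers are weighing for 2.5.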




Issue Time Tracking
---

Worklog Id: (was: 81030)
Time Spent: 50m  (was: 40m)

> Python Fnapi - Support Multiple SDK workers on a single VM
> --
>
> Key: BEAM-3418
> URL: https://issues.apache.org/jira/browse/BEAM-3418
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-py-harness
>Reporter: Ankur Goenka
>Assignee: Ankur Goenka
>Priority: Major
>  Labels: performance, portability
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> Support multiple python SDK process on a VM to fully utilize a machine.
> Each SDK Process will work in isolation and interact with Runner Harness 
> independently.





[jira] [Work logged] (BEAM-3418) Python Fnapi - Support Multiple SDK workers on a single VM

2018-03-15 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3418?focusedWorklogId=81028&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-81028
 ]

ASF GitHub Bot logged work on BEAM-3418:


Author: ASF GitHub Bot
Created on: 16/Mar/18 00:15
Start Date: 16/Mar/18 00:15
Worklog Time Spent: 10m 
  Work Description: angoenka commented on a change in pull request #4587: 
[BEAM-3418] Send worker_id in all grpc channels to runner harness
URL: https://github.com/apache/beam/pull/4587#discussion_r174969164
 
 

 ##
 File path: sdks/python/apache_beam/runners/dataflow/internal/apiclient.py
 ##
 @@ -176,6 +176,13 @@ def __init__(self, packages, options, environment_version, pipeline_url):
     if self.debug_options.experiments:
       for experiment in self.debug_options.experiments:
         self.proto.experiments.append(experiment)
+    # Add MULTIPLE_SDK_CONTAINERS flag if it's not already present. Do not add
+    # the flag if 'NO_MULTIPLE_SDK_CONTAINERS' is present.
+    # TODO: Clean up MULTIPLE_SDK_CONTAINERS once we deprecate Python SDK
+    # versions up to 2.4.
+    if ('MULTIPLE_SDK_CONTAINERS' not in self.proto.experiments and
 
 Review comment:
   I expect this CL to land in 2.5.
   In a way, this flag is required to help the router distinguish between old 
SDKs (up to 2.4) and new SDKs (2.5 onward). Once no SDK older than 2.5 remains, 
we no longer need to distinguish between SDKs, at least for MultiSdk 
functionality, and the feature automatically becomes the default.




Issue Time Tracking
---

Worklog Id: (was: 81028)
Time Spent: 40m  (was: 0.5h)

> Python Fnapi - Support Multiple SDK workers on a single VM
> --
>
> Key: BEAM-3418
> URL: https://issues.apache.org/jira/browse/BEAM-3418
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-py-harness
>Reporter: Ankur Goenka
>Assignee: Ankur Goenka
>Priority: Major
>  Labels: performance, portability
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Support multiple python SDK process on a VM to fully utilize a machine.
> Each SDK Process will work in isolation and interact with Runner Harness 
> independently.





[jira] [Work logged] (BEAM-3418) Python Fnapi - Support Multiple SDK workers on a single VM

2018-03-15 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3418?focusedWorklogId=81027&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-81027
 ]

ASF GitHub Bot logged work on BEAM-3418:


Author: ASF GitHub Bot
Created on: 16/Mar/18 00:09
Start Date: 16/Mar/18 00:09
Worklog Time Spent: 10m 
  Work Description: tvalentyn commented on a change in pull request #4587: 
[BEAM-3418] Send worker_id in all grpc channels to runner harness
URL: https://github.com/apache/beam/pull/4587#discussion_r174968455
 
 

 ##
 File path: sdks/python/apache_beam/runners/dataflow/internal/apiclient.py
 ##
 @@ -176,6 +176,13 @@ def __init__(self, packages, options, environment_version, pipeline_url):
     if self.debug_options.experiments:
       for experiment in self.debug_options.experiments:
         self.proto.experiments.append(experiment)
+    # Add MULTIPLE_SDK_CONTAINERS flag if it's not already present. Do not add
+    # the flag if 'NO_MULTIPLE_SDK_CONTAINERS' is present.
+    # TODO: Clean up MULTIPLE_SDK_CONTAINERS once we deprecate Python SDK
+    # versions up to 2.4.
+    if ('MULTIPLE_SDK_CONTAINERS' not in self.proto.experiments and
 
 Review comment:
   How confident are we in making this the default behavior for 2.5?




Issue Time Tracking
---

Worklog Id: (was: 81027)
Time Spent: 0.5h  (was: 20m)

> Python Fnapi - Support Multiple SDK workers on a single VM
> --
>
> Key: BEAM-3418
> URL: https://issues.apache.org/jira/browse/BEAM-3418
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-py-harness
>Reporter: Ankur Goenka
>Assignee: Ankur Goenka
>Priority: Major
>  Labels: performance, portability
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Support multiple python SDK process on a VM to fully utilize a machine.
> Each SDK Process will work in isolation and interact with Runner Harness 
> independently.





Jenkins build became unstable: beam_PostCommit_Java_MavenInstall #6215

2018-03-15 Thread Apache Jenkins Server
See 




[jira] [Work logged] (BEAM-3798) Performance tests flaky due to Dataflow transient errors

2018-03-15 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3798?focusedWorklogId=81021&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-81021
 ]

ASF GitHub Bot logged work on BEAM-3798:


Author: ASF GitHub Bot
Created on: 15/Mar/18 23:26
Start Date: 15/Mar/18 23:26
Worklog Time Spent: 10m 
  Work Description: lukecwik closed pull request #4871: [BEAM-3798] Remove 
error check on dataflow when getting batch job state
URL: https://github.com/apache/beam/pull/4871
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git 
a/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/TestDataflowRunner.java
 
b/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/TestDataflowRunner.java
index e163fe8d674..8679a952284 100644
--- 
a/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/TestDataflowRunner.java
+++ 
b/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/TestDataflowRunner.java
@@ -181,7 +181,7 @@ private boolean waitForStreamingJobTermination(
   }
 
   /**
-   * Return {@code true} if the job succeeded or {@code false} if it 
terminated in any other manner.
+   * Return {@code true} if job state is {@code State.DONE}. {@code false} 
otherwise.
*/
   private boolean waitForBatchJobTermination(
   DataflowPipelineJob job, ErrorMonitorMessagesHandler messageHandler) {
@@ -195,7 +195,7 @@ private boolean waitForBatchJobTermination(
 return false;
   }
 
-  return job.getState() == State.DONE && !messageHandler.hasSeenError();
+  return job.getState() == State.DONE;
 }
   }
 
diff --git 
a/runners/google-cloud-dataflow-java/src/test/java/org/apache/beam/runners/dataflow/TestDataflowRunnerTest.java
 
b/runners/google-cloud-dataflow-java/src/test/java/org/apache/beam/runners/dataflow/TestDataflowRunnerTest.java
index f382e4b6ed2..cf54556a093 100644
--- 
a/runners/google-cloud-dataflow-java/src/test/java/org/apache/beam/runners/dataflow/TestDataflowRunnerTest.java
+++ 
b/runners/google-cloud-dataflow-java/src/test/java/org/apache/beam/runners/dataflow/TestDataflowRunnerTest.java
@@ -121,6 +121,40 @@ public void testRunBatchJobThatSucceeds() throws Exception 
{
 assertEquals(mockJob, runner.run(p, mockRunner));
   }
 
+  /**
+   * Job success on Dataflow means that it handled transient errors (if any) 
successfully
+   * by retrying failed bundles.
+   */
+  @Test
+  public void testRunBatchJobThatSucceedsDespiteTransientErrors() throws 
Exception {
+Pipeline p = Pipeline.create(options);
+PCollection pc = p.apply(Create.of(1, 2, 3));
+PAssert.that(pc).containsInAnyOrder(1, 2, 3);
+
+DataflowPipelineJob mockJob = Mockito.mock(DataflowPipelineJob.class);
+when(mockJob.getState()).thenReturn(State.DONE);
+when(mockJob.getProjectId()).thenReturn("test-project");
+when(mockJob.getJobId()).thenReturn("test-job");
+when(mockJob.waitUntilFinish(any(Duration.class), 
any(JobMessagesHandler.class)))
+  .thenAnswer(
+invocation -> {
+  JobMessage message = new JobMessage();
+  message.setMessageText("TransientError");
+  message.setTime(TimeUtil.toCloudTime(Instant.now()));
+  message.setMessageImportance("JOB_MESSAGE_ERROR");
+  ((JobMessagesHandler) 
invocation.getArguments()[1]).process(Arrays.asList(message));
+  return State.DONE;
+});
+
+DataflowRunner mockRunner = Mockito.mock(DataflowRunner.class);
+when(mockRunner.run(any(Pipeline.class))).thenReturn(mockJob);
+
+TestDataflowRunner runner = 
TestDataflowRunner.fromOptionsAndClient(options, mockClient);
+when(mockClient.getJobMetrics(anyString()))
+  .thenReturn(generateMockMetricResponse(true /* success */, true /* 
tentative */));
+assertEquals(mockJob, runner.run(p, mockRunner));
+  }
+
   /**
* Tests that when a batch job terminates in a failure state even if all 
assertions
* passed, it throws an error to that effect.
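
The behavioral change in the diff above reduces to a single predicate. The following Python sketch restates it outside the Java code base; the function names are hypothetical, the state is modeled as a plain string, and `seen_error` stands in for `messageHandler.hasSeenError()`.

```python
def batch_job_succeeded_before(state, seen_error):
    # Old rule: a DONE job still counted as failed if any
    # error-level job message had been observed along the way.
    return state == 'DONE' and not seen_error


def batch_job_succeeded_after(state, seen_error):
    # New rule: only the terminal state matters. Per the new test's
    # Javadoc, a job that reaches DONE has already handled any
    # transient errors by retrying the failed bundles.
    return state == 'DONE'


if __name__ == '__main__':
    # A job that logged a transient error but still finished.
    print(batch_job_succeeded_before('DONE', seen_error=True))
    print(batch_job_succeeded_after('DONE', seen_error=True))
```

The two rules differ only for a job that ends in `DONE` after logging an error, which is the case the new `testRunBatchJobThatSucceedsDespiteTransientErrors` test exercises.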


 




Issue Time Tracking
---

Worklog Id: (was: 81021)
Time Spent: 40m  (was: 0.5h)

> Performance tests flaky due to Dataflow transient errors
> 
>
> Key: BEAM-3798
> URL: 

[beam] branch master updated (ba2c648 -> 9f21706)

2018-03-15 Thread lcwik
This is an automated email from the ASF dual-hosted git repository.

lcwik pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/beam.git.


from ba2c648  Merge pull request #4815: Replace size() == 0 with isEmpty()
 add 221f783  [BEAM-3798] Remove error check on dataflow when getting batch 
job state
 new 9f21706  [BEAM-3798] Remove error check on dataflow when getting batch 
job state

The 1 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:
 .../beam/runners/dataflow/TestDataflowRunner.java  |  4 +--
 .../runners/dataflow/TestDataflowRunnerTest.java   | 34 ++
 2 files changed, 36 insertions(+), 2 deletions(-)

-- 
To stop receiving notification emails like this one, please contact
lc...@apache.org.


[beam] 01/01: [BEAM-3798] Remove error check on dataflow when getting batch job state

2018-03-15 Thread lcwik
This is an automated email from the ASF dual-hosted git repository.

lcwik pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/beam.git

commit 9f2170630cbc2d31d6f5da301a17ef1179d745c1
Merge: ba2c648 221f783
Author: Lukasz Cwik 
AuthorDate: Thu Mar 15 16:26:47 2018 -0700

[BEAM-3798] Remove error check on dataflow when getting batch job state

 .../beam/runners/dataflow/TestDataflowRunner.java  |  4 +--
 .../runners/dataflow/TestDataflowRunnerTest.java   | 34 ++
 2 files changed, 36 insertions(+), 2 deletions(-)

-- 
To stop receiving notification emails like this one, please contact
lc...@apache.org.


[jira] [Work logged] (BEAM-3249) Use Gradle to build/release project

2018-03-15 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3249?focusedWorklogId=81020&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-81020
 ]

ASF GitHub Bot logged work on BEAM-3249:


Author: ASF GitHub Bot
Created on: 15/Mar/18 23:24
Start Date: 15/Mar/18 23:24
Worklog Time Spent: 10m 
  Work Description: lukecwik commented on a change in pull request #4812:  
[BEAM-3249] Publish java artifacts using gradle
URL: https://github.com/apache/beam/pull/4812#discussion_r174961399
 
 

 ##
 File path: sdks/java/maven-archetypes/examples/build.gradle
 ##
 @@ -17,10 +17,53 @@
  */
 
 apply from: project(":").file("build_rules.gradle")
-applyJavaNature()
+applyJavaNature(artifactId: "beam-sdks-java-maven-archetypes-examples")
 
 description = "Apache Beam :: SDKs :: Java :: Maven Archetypes :: Examples"
 
+// Extract the version from a library:version dependency.
+def ver = { return it.substring(1+it.lastIndexOf(':')) }
+
+processResources {
+  filter org.apache.tools.ant.filters.ReplaceTokens, tokens: [
+'project.version':  version,
+'bigquery.version': ver(project.library.java.google_api_services_bigquery),
+'google-clients.version': ver(project.library.java.google_api_client),
+'guava.version': ver(project.library.java.guava),
+'hamcrest.version': ver(project.library.java.hamcrest_library),
+'jackson.version': ver(project.library.java.jackson_core),
+'joda.version': ver(project.library.java.joda_time),
+'junit.version': ver(project.library.java.junit),
+'pubsub.version': ver(project.library.java.google_api_services_pubsub),
+'slf4j.version': ver(project.library.java.slf4j_api),
+'spark.version': ver(project.library.java.spark_core),
+'hadoop.version': ver(project.library.java.hadoop_client),
+'mockito.version': ver(project.library.java.mockito_core),
+'maven-compiler-plugin.version': '3.7.0',
 
 Review comment:
   It would be useful to elevate all the Maven dependencies to the project root 
build.gradle file so that they are not duplicated for each archetype.




Issue Time Tracking
---

Worklog Id: (was: 81020)
Time Spent: 5h 50m  (was: 5h 40m)

> Use Gradle to build/release project
> ---
>
> Key: BEAM-3249
> URL: https://issues.apache.org/jira/browse/BEAM-3249
> Project: Beam
>  Issue Type: Improvement
>  Components: build-system, testing
>Reporter: Luke Cwik
>Assignee: Luke Cwik
>Priority: Major
>  Time Spent: 5h 50m
>  Remaining Estimate: 0h
>
> I have collected data by running several builds against master using Gradle 
> and Maven without using Gradle's support for incremental builds.
> Gradle (mins)
> min: 25.04
> max: 160.14
> median: 45.78
> average: 52.19
> stdev: 30.80
> Maven (mins)
> min: 56.86
> max: 216.55
> median: 87.93
> average: 109.10
> stdev: 48.01
> I excluded a few timeouts (240 mins) that happened during the Maven build 
> from its numbers but we can see conclusively that Gradle is about twice as 
> fast for the build when compared to Maven when run using Jenkins.
> Original dev@ thread: 
> https://lists.apache.org/thread.html/225dddcfc78f39bbb296a0d2bbef1caf37e17677c7e5573f0b6fe253@%3Cdev.beam.apache.org%3E
> The data is available here 
> https://docs.google.com/spreadsheets/d/1MHVjF-xoI49_NJqEQakUgnNIQ7Qbjzu8Y1q_h3dbF1M/edit?usp=sharing
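
The "about twice as fast" claim in the issue description can be checked against the quoted figures. This is a small Python sketch that only divides the Maven numbers by the Gradle numbers as given above; the dictionary layout and variable names are illustrative, not taken from the linked spreadsheet.

```python
# Build times in minutes, copied from the issue description.
gradle = {'min': 25.04, 'max': 160.14, 'median': 45.78, 'average': 52.19}
maven = {'min': 56.86, 'max': 216.55, 'median': 87.93, 'average': 109.10}

# Ratio > 1 means Maven took that many times longer than Gradle.
ratios = {stat: maven[stat] / gradle[stat] for stat in gradle}

if __name__ == '__main__':
    for stat, ratio in ratios.items():
        print(f'{stat}: Maven/Gradle = {ratio:.2f}')
```

The median and average ratios come out close to 2, which matches the description; the ratio of the maxima is noticeably smaller, so the claim is best read as a statement about typical builds.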





[jira] [Work logged] (BEAM-3249) Use Gradle to build/release project

2018-03-15 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3249?focusedWorklogId=81019&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-81019
 ]

ASF GitHub Bot logged work on BEAM-3249:


Author: ASF GitHub Bot
Created on: 15/Mar/18 23:23
Start Date: 15/Mar/18 23:23
Worklog Time Spent: 10m 
  Work Description: lukecwik commented on a change in pull request #4812:  
[BEAM-3249] Publish java artifacts using gradle
URL: https://github.com/apache/beam/pull/4812#discussion_r174961399
 
 

 ##
 File path: sdks/java/maven-archetypes/examples/build.gradle
 ##
 @@ -17,10 +17,53 @@
  */
 
 apply from: project(":").file("build_rules.gradle")
-applyJavaNature()
+applyJavaNature(artifactId: "beam-sdks-java-maven-archetypes-examples")
 
 description = "Apache Beam :: SDKs :: Java :: Maven Archetypes :: Examples"
 
+// Extract the version from a library:version dependency.
+def ver = { return it.substring(1+it.lastIndexOf(':')) }
+
+processResources {
+  filter org.apache.tools.ant.filters.ReplaceTokens, tokens: [
+'project.version':  version,
+'bigquery.version': ver(project.library.java.google_api_services_bigquery),
+'google-clients.version': ver(project.library.java.google_api_client),
+'guava.version': ver(project.library.java.guava),
+'hamcrest.version': ver(project.library.java.hamcrest_library),
+'jackson.version': ver(project.library.java.jackson_core),
+'joda.version': ver(project.library.java.joda_time),
+'junit.version': ver(project.library.java.junit),
+'pubsub.version': ver(project.library.java.google_api_services_pubsub),
+'slf4j.version': ver(project.library.java.slf4j_api),
+'spark.version': ver(project.library.java.spark_core),
+'hadoop.version': ver(project.library.java.hadoop_client),
+'mockito.version': ver(project.library.java.mockito_core),
+'maven-compiler-plugin.version': '3.7.0',
 
 Review comment:
   It would be useful to elevate all the Maven dependencies to the project root 
build.gradle file so that they are not duplicated for each archetype.




Issue Time Tracking
---

Worklog Id: (was: 81019)
Time Spent: 5h 40m  (was: 5.5h)

> Use Gradle to build/release project
> ---
>
> Key: BEAM-3249
> URL: https://issues.apache.org/jira/browse/BEAM-3249
> Project: Beam
>  Issue Type: Improvement
>  Components: build-system, testing
>Reporter: Luke Cwik
>Assignee: Luke Cwik
>Priority: Major
>  Time Spent: 5h 40m
>  Remaining Estimate: 0h
>
> I have collected data by running several builds against master using Gradle 
> and Maven without using Gradle's support for incremental builds.
> Gradle (mins)
> min: 25.04
> max: 160.14
> median: 45.78
> average: 52.19
> stdev: 30.80
> Maven (mins)
> min: 56.86
> max: 216.55
> median: 87.93
> average: 109.10
> stdev: 48.01
> I excluded a few timeouts (240 mins) that happened during the Maven build 
> from its numbers but we can see conclusively that Gradle is about twice as 
> fast for the build when compared to Maven when run using Jenkins.
> Original dev@ thread: 
> https://lists.apache.org/thread.html/225dddcfc78f39bbb296a0d2bbef1caf37e17677c7e5573f0b6fe253@%3Cdev.beam.apache.org%3E
> The data is available here 
> https://docs.google.com/spreadsheets/d/1MHVjF-xoI49_NJqEQakUgnNIQ7Qbjzu8Y1q_h3dbF1M/edit?usp=sharing
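
As a rough check of the "about twice as fast" claim, the Maven/Gradle ratios can be computed from the summary figures quoted above. This is only a sketch: the dictionary and function names are illustrative, and only the numbers come from the issue description.

```python
# Build-time summary figures quoted in the issue description, in minutes.
gradle = {"min": 25.04, "max": 160.14, "median": 45.78, "average": 52.19}
maven = {"min": 56.86, "max": 216.55, "median": 87.93, "average": 109.10}


def speedup(stat):
    """Return how many times longer the Maven build took for one statistic."""
    return maven[stat] / gradle[stat]


for stat in ("min", "median", "average"):
    print("%s: Maven/Gradle = %.2f" % (stat, speedup(stat)))
```

The median and average ratios come out at roughly 1.9 and 2.1, which is consistent with the "about twice as fast" summary. The maximum is less informative because the Maven timeouts were excluded from its numbers.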



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-3249) Use Gradle to build/release project

2018-03-15 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3249?focusedWorklogId=81018&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-81018
 ]

ASF GitHub Bot logged work on BEAM-3249:


Author: ASF GitHub Bot
Created on: 15/Mar/18 23:23
Start Date: 15/Mar/18 23:23
Worklog Time Spent: 10m 
  Work Description: lukecwik commented on a change in pull request #4812:  
[BEAM-3249] Publish java artifacts using gradle
URL: https://github.com/apache/beam/pull/4812#discussion_r174961281
 
 

 ##
 File path: sdks/java/maven-archetypes/examples/build.gradle
 ##
 @@ -17,10 +17,53 @@
  */
 
 apply from: project(":").file("build_rules.gradle")
-applyJavaNature()
+applyJavaNature(artifactId: "beam-sdks-java-maven-archetypes-examples")
 
 description = "Apache Beam :: SDKs :: Java :: Maven Archetypes :: Examples"
 
+// Extract the version from a library:version dependency.
+def ver = { return it.substring(1+it.lastIndexOf(':')) }
 
 Review comment:
   This doesn't work if the version has a classifier attached to it.
   Please create a Dependency object and call getVersion on it:
   
https://docs.gradle.org/current/javadoc/org/gradle/api/artifacts/Dependency.html#getVersion--
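
The failure mode described in this comment can be illustrated with a small sketch. It is written in Python rather than Groovy and is not the reviewer's proposed fix (that is Gradle's Dependency.getVersion()). The coordinates `org.example:some-lib` and both function names are hypothetical, and the second function assumes Gradle's string notation `group:name:version[:classifier][@extension]`.

```python
def ver_by_last_colon(notation):
    # Mirrors the closure in the diff: take everything after the last ':'.
    return notation[notation.rfind(":") + 1:]


def ver_by_position(notation):
    # Assumption: notation is group:name:version[:classifier][@extension],
    # so the version is always the third colon-separated field.
    parts = notation.split("@")[0].split(":")
    return parts[2] if len(parts) > 2 else None


plain = "org.example:some-lib:1.2.3"
classified = "org.example:some-lib:1.2.3:tests"

print(ver_by_last_colon(plain))       # the version
print(ver_by_last_colon(classified))  # the classifier, not the version
print(ver_by_position(classified))    # the version
```

With a classifier present, the last-colon approach returns the classifier instead of the version, which is why asking a parsed dependency object for its version is the more robust choice.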




Issue Time Tracking
---

Worklog Id: (was: 81018)
Time Spent: 5.5h  (was: 5h 20m)

> Use Gradle to build/release project
> ---
>
> Key: BEAM-3249
> URL: https://issues.apache.org/jira/browse/BEAM-3249
> Project: Beam
>  Issue Type: Improvement
>  Components: build-system, testing
>Reporter: Luke Cwik
>Assignee: Luke Cwik
>Priority: Major
>  Time Spent: 5.5h
>  Remaining Estimate: 0h


[jira] [Work logged] (BEAM-3060) Add performance tests for commonly used file-based I/O PTransforms

2018-03-15 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3060?focusedWorklogId=81015&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-81015
 ]

ASF GitHub Bot logged work on BEAM-3060:


Author: ASF GitHub Bot
Created on: 15/Mar/18 23:21
Start Date: 15/Mar/18 23:21
Worklog Time Spent: 10m 
  Work Description: chamikaramj commented on a change in pull request 
#4870: [BEAM-3060] Fixing mvn dependency issue when runnning filebasedIOIT t…
URL: https://github.com/apache/beam/pull/4870#discussion_r174959895
 
 

 ##
 File path: pom.xml
 ##
 @@ -186,6 +186,8 @@
 nothing
 0.20.0
 
 
 Review comment:
   Please remove the empty line.




Issue Time Tracking
---

Worklog Id: (was: 81015)
Time Spent: 3h 20m  (was: 3h 10m)

> Add performance tests for commonly used file-based I/O PTransforms
> --
>
> Key: BEAM-3060
> URL: https://issues.apache.org/jira/browse/BEAM-3060
> Project: Beam
>  Issue Type: Test
>  Components: sdk-java-core
>Reporter: Chamikara Jayalath
>Assignee: Szymon Nieradka
>Priority: Major
>  Time Spent: 3h 20m
>  Remaining Estimate: 0h
>
> We recently added a performance testing framework [1] that can be used to do 
> the following.
> (1) Execute Beam tests using PerfkitBenchmarker
> (2) Manage Kubernetes-based deployments of data stores.
> (3) Easily publish benchmark results. 
> I think it will be useful to add performance tests for commonly used 
> file-based I/O PTransforms using this framework. I suggest looking into the 
> following formats initially.
> (1) AvroIO
> (2) TextIO
> (3) Compressed text using TextIO
> (4) TFRecordIO
> It should be possible to run these tests easily for various Beam runners 
> (Direct, Dataflow, Flink, Spark, etc.) and file systems (GCS, local, HDFS, 
> etc.).
> In the initial version, tests can be made manually triggerable for PRs 
> through Jenkins. Later, we could make some of these tests run periodically 
> and publish benchmark results (to BigQuery) through PerfkitBenchmarker.
> [1] https://beam.apache.org/documentation/io/testing/





[jira] [Work logged] (BEAM-3060) Add performance tests for commonly used file-based I/O PTransforms

2018-03-15 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3060?focusedWorklogId=81014&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-81014
 ]

ASF GitHub Bot logged work on BEAM-3060:


Author: ASF GitHub Bot
Created on: 15/Mar/18 23:21
Start Date: 15/Mar/18 23:21
Worklog Time Spent: 10m 
  Work Description: chamikaramj commented on a change in pull request 
#4870: [BEAM-3060] Fixing mvn dependency issue when runnning filebasedIOIT t…
URL: https://github.com/apache/beam/pull/4870#discussion_r174959913
 
 

 ##
 File path: sdks/java/io/file-based-io-tests/pom.xml
 ##
 @@ -294,6 +294,12 @@
 ${apache.hadoop.version}
 runtime
 
+
+javax.xml.bind
+jaxb-api
 
 Review comment:
   Thanks. Can you also define the JAXB dependencies from 
https://github.com/apache/beam/blob/master/sdks/java/io/xml/pom.xml at the root 
level, and update that component to use the version defined at the root level? 
Also, can't we use 2.2.3 instead of 2.2.2, which seems to be pretty old? I 
think we could run into issues if we have to use both 2.2.0 and 2.2.3 for 
different components.




Issue Time Tracking
---

Worklog Id: (was: 81014)
Time Spent: 3h 10m  (was: 3h)

> Add performance tests for commonly used file-based I/O PTransforms
> --
>
> Key: BEAM-3060
> URL: https://issues.apache.org/jira/browse/BEAM-3060
> Project: Beam
>  Issue Type: Test
>  Components: sdk-java-core
>Reporter: Chamikara Jayalath
>Assignee: Szymon Nieradka
>Priority: Major
>  Time Spent: 3h 10m
>  Remaining Estimate: 0h


[jira] [Work logged] (BEAM-3060) Add performance tests for commonly used file-based I/O PTransforms

2018-03-15 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3060?focusedWorklogId=81016&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-81016
 ]

ASF GitHub Bot logged work on BEAM-3060:


Author: ASF GitHub Bot
Created on: 15/Mar/18 23:21
Start Date: 15/Mar/18 23:21
Worklog Time Spent: 10m 
  Work Description: chamikaramj commented on issue #4870: [BEAM-3060] 
Fixing mvn dependency issue when runnning filebasedIOIT t…
URL: https://github.com/apache/beam/pull/4870#issuecomment-373553567
 
 
   cc: @kennknowles @lukecwik 




Issue Time Tracking
---

Worklog Id: (was: 81016)
Time Spent: 3.5h  (was: 3h 20m)

> Add performance tests for commonly used file-based I/O PTransforms
> --
>
> Key: BEAM-3060
> URL: https://issues.apache.org/jira/browse/BEAM-3060
> Project: Beam
>  Issue Type: Test
>  Components: sdk-java-core
>Reporter: Chamikara Jayalath
>Assignee: Szymon Nieradka
>Priority: Major
>  Time Spent: 3.5h
>  Remaining Estimate: 0h


Build failed in Jenkins: beam_PostRelease_NightlySnapshot #128

2018-03-15 Thread Apache Jenkins Server
See 


--
[...truncated 54.42 KB...]
[INFO] Downloaded from central: 
https://repo.maven.apache.org/maven2/org/apache/maven/shared/maven-invoker/2.2/maven-invoker-2.2.jar
 (30 kB at 55 kB/s)
[INFO] Downloading from central: 
https://repo.maven.apache.org/maven2/com/google/code/findbugs/jsr305/2.0.1/jsr305-2.0.1.jar
[INFO] Downloaded from central: 
https://repo.maven.apache.org/maven2/org/apache/maven/shared/maven-artifact-transfer/0.9.0/maven-artifact-transfer-0.9.0.jar
 (123 kB at 224 kB/s)
[INFO] Downloaded from central: 
https://repo.maven.apache.org/maven2/org/apache/maven/shared/maven-common-artifact-filters/3.0.0/maven-common-artifact-filters-3.0.0.jar
 (57 kB at 104 kB/s)
[INFO] Downloading from central: 
https://repo.maven.apache.org/maven2/commons-codec/commons-codec/1.6/commons-codec-1.6.jar
[INFO] Downloading from central: 
https://repo.maven.apache.org/maven2/org/slf4j/slf4j-api/1.7.5/slf4j-api-1.7.5.jar
[INFO] Downloaded from central: 
https://repo.maven.apache.org/maven2/org/apache/maven/shared/maven-shared-utils/3.0.0/maven-shared-utils-3.0.0.jar
 (155 kB at 271 kB/s)
[INFO] Downloading from central: 
https://repo.maven.apache.org/maven2/commons-collections/commons-collections/3.2.1/commons-collections-3.2.1.jar
[INFO] Downloaded from central: 
https://repo.maven.apache.org/maven2/com/google/code/findbugs/jsr305/2.0.1/jsr305-2.0.1.jar
 (32 kB at 56 kB/s)
[INFO] Downloading from central: 
https://repo.maven.apache.org/maven2/org/apache/maven/shared/maven-script-interpreter/1.0/maven-script-interpreter-1.0.jar
[INFO] Downloaded from central: 
https://repo.maven.apache.org/maven2/org/slf4j/slf4j-api/1.7.5/slf4j-api-1.7.5.jar
 (26 kB at 45 kB/s)
[INFO] Downloading from central: 
https://repo.maven.apache.org/maven2/org/beanshell/bsh/2.0b4/bsh-2.0b4.jar
[INFO] Downloaded from central: 
https://repo.maven.apache.org/maven2/org/apache/maven/shared/maven-script-interpreter/1.0/maven-script-interpreter-1.0.jar
 (21 kB at 35 kB/s)
[INFO] Downloading from central: 
https://repo.maven.apache.org/maven2/org/apache/ant/ant/1.8.1/ant-1.8.1.jar
[INFO] Downloaded from central: 
https://repo.maven.apache.org/maven2/commons-codec/commons-codec/1.6/commons-codec-1.6.jar
 (233 kB at 387 kB/s)
[INFO] Downloaded from central: 
https://repo.maven.apache.org/maven2/org/beanshell/bsh/2.0b4/bsh-2.0b4.jar (282 
kB at 452 kB/s)
[INFO] Downloaded from central: 
https://repo.maven.apache.org/maven2/commons-collections/commons-collections/3.2.1/commons-collections-3.2.1.jar
 (575 kB at 877 kB/s)
[INFO] Downloaded from central: 
https://repo.maven.apache.org/maven2/org/apache/ant/ant/1.8.1/ant-1.8.1.jar 
(1.5 MB at 2.1 MB/s)
[INFO] Downloaded from central: 
https://repo.maven.apache.org/maven2/org/codehaus/groovy/groovy/1.8.3/groovy-1.8.3.jar
 (5.5 MB at 6.1 MB/s)
[INFO] Generating project in Batch mode
[INFO] Archetype repository not defined. Using the one from 
[org.apache.beam:beam-sdks-java-maven-archetypes-examples:2.3.0] found in 
catalog remote
[INFO] Downloading from test.release: 
https://repository.apache.org/content/repositories/snapshots/org/apache/beam/beam-sdks-java-maven-archetypes-examples/2.5.0-SNAPSHOT/maven-metadata.xml
[INFO] Downloaded from test.release: 
https://repository.apache.org/content/repositories/snapshots/org/apache/beam/beam-sdks-java-maven-archetypes-examples/2.5.0-SNAPSHOT/maven-metadata.xml
 (1.5 kB at 4.7 kB/s)
[INFO] Downloading from test.release: 
https://repository.apache.org/content/repositories/snapshots/org/apache/beam/beam-sdks-java-maven-archetypes-examples/2.5.0-SNAPSHOT/beam-sdks-java-maven-archetypes-examples-2.5.0-20180315.081450-11.pom
[INFO] Downloaded from test.release: 
https://repository.apache.org/content/repositories/snapshots/org/apache/beam/beam-sdks-java-maven-archetypes-examples/2.5.0-SNAPSHOT/beam-sdks-java-maven-archetypes-examples-2.5.0-20180315.081450-11.pom
 (5.7 kB at 23 kB/s)
[INFO] Downloading from test.release: 
https://repository.apache.org/content/repositories/snapshots/org/apache/beam/beam-sdks-java-maven-archetypes-parent/2.5.0-SNAPSHOT/maven-metadata.xml
[INFO] Downloaded from test.release: 
https://repository.apache.org/content/repositories/snapshots/org/apache/beam/beam-sdks-java-maven-archetypes-parent/2.5.0-SNAPSHOT/maven-metadata.xml
 (632 B at 2.8 kB/s)
[INFO] Downloading from test.release: 
https://repository.apache.org/content/repositories/snapshots/org/apache/beam/beam-sdks-java-maven-archetypes-parent/2.5.0-SNAPSHOT/beam-sdks-java-maven-archetypes-parent-2.5.0-20180315.080652-11.pom
[INFO] Downloaded from test.release: 
https://repository.apache.org/content/repositories/snapshots/org/apache/beam/beam-sdks-java-maven-archetypes-parent/2.5.0-SNAPSHOT/beam-sdks-java-maven-archetypes-parent-2.5.0-20180315.080652-11.pom
 (3.9 kB at 17 kB/s)
[INFO] Downloading from test.release: 

[jira] [Resolved] (BEAM-3757) Shuffle read failed using python 2.2.0

2018-03-15 Thread Ahmet Altay (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3757?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmet Altay resolved BEAM-3757.
---
   Resolution: Fixed
 Assignee: (was: Thomas Groh)
Fix Version/s: Not applicable

Closing this, based on the latest comment.

> Shuffle read failed using python 2.2.0
> --
>
> Key: BEAM-3757
> URL: https://issues.apache.org/jira/browse/BEAM-3757
> Project: Beam
>  Issue Type: Bug
>  Components: runner-dataflow
>Affects Versions: 2.2.0
> Environment: gcp, macos
>Reporter: Jonathan Delfour
>Priority: Major
> Fix For: Not applicable
>
>
> Hi,
> First issue is that the beam 2.3.0 python SDK is apparently not working on 
> GCP. It gets stuck: 
> {noformat}
> Workflow failed. Causes: (bf832d44290fbf41): The Dataflow appears to be 
> stuck. You can get help with Cloud Dataflow at 
> https://cloud.google.com/dataflow/support. 
> {noformat}
> I tried two times.
> Reverting back to 2.2.0: it usually works but today, after > 1 hour of 
> processing, and 30 workers used, I get a failure with these in the logs:
> {noformat}
> Traceback (most recent call last):
>   File 
> "/usr/local/lib/python2.7/dist-packages/dataflow_worker/batchworker.py", line 
> 582, in do_work
> work_executor.execute()
>   File "/usr/local/lib/python2.7/dist-packages/dataflow_worker/executor.py", 
> line 167, in execute
> op.start()
>   File "dataflow_worker/shuffle_operations.py", line 49, in 
> dataflow_worker.shuffle_operations.GroupedShuffleReadOperation.start
> def start(self):
>   File "dataflow_worker/shuffle_operations.py", line 50, in 
> dataflow_worker.shuffle_operations.GroupedShuffleReadOperation.start
> with self.scoped_start_state:
>   File "dataflow_worker/shuffle_operations.py", line 65, in 
> dataflow_worker.shuffle_operations.GroupedShuffleReadOperation.start
> with self.shuffle_source.reader() as reader:
>   File "dataflow_worker/shuffle_operations.py", line 67, in 
> dataflow_worker.shuffle_operations.GroupedShuffleReadOperation.start
> for key_values in reader:
>   File "/usr/local/lib/python2.7/dist-packages/dataflow_worker/shuffle.py", 
> line 406, in __iter__
> for entry in entries_iterator:
>   File "/usr/local/lib/python2.7/dist-packages/dataflow_worker/shuffle.py", 
> line 248, in next
> return next(self.iterator)
>   File "/usr/local/lib/python2.7/dist-packages/dataflow_worker/shuffle.py", 
> line 206, in __iter__
> chunk, next_position = self.reader.Read(start_position, end_position)
>   File "third_party/windmill/shuffle/python/shuffle_client.pyx", line 138, in 
> shuffle_client.PyShuffleReader.Read
> IOError: Shuffle read failed: INTERNAL: Received RST_STREAM with error code 2 
>  talking to my-dataflow-02271107-756f-harness-2p65:12346
> {noformat}
> I also get an informational message:
> {noformat}
> Refusing to split  at 0x7f03a00fe790> at '\x00\x00\x00\t\x1d\x14\x87\xa3\x00\x01': unstarted
> {noformat}
> For the flow, I am extracting data from BQ, cleaning using pandas, exporting 
> as a csv file, gzipping and uploading the compressed file to a bucket using 
> decompressive transcoding (csv export, gzip compression and upload are in the 
> same 'worker' as they are done in the same beam.DoFn).
> PS: I can't find a reasonable way to export the logs from GCP, but I can 
> privately send the log file I have of the run on my machine (the log of the 
> pipeline).





[jira] [Work logged] (BEAM-3761) Fix Python 3 cmp function

2018-03-15 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3761?focusedWorklogId=81008&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-81008
 ]

ASF GitHub Bot logged work on BEAM-3761:


Author: ASF GitHub Bot
Created on: 15/Mar/18 22:57
Start Date: 15/Mar/18 22:57
Worklog Time Spent: 10m 
  Work Description: aaltay commented on issue #4774: [BEAM-3761]Fix Python 
3 cmp usage
URL: https://github.com/apache/beam/pull/4774#issuecomment-373549047
 
 
   retest this please




Issue Time Tracking
---

Worklog Id: (was: 81008)
Time Spent: 6h 20m  (was: 6h 10m)

> Fix Python 3 cmp function
> -
>
> Key: BEAM-3761
> URL: https://issues.apache.org/jira/browse/BEAM-3761
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-py-core
>Reporter: holdenk
>Priority: Major
>  Time Spent: 6h 20m
>  Remaining Estimate: 0h
>
> Various functions that existed in Python 2 don't exist in Python 3. This Jira is 
> to fix the use of cmp (which will often involve rewriting __cmp__ as well).
>  
> Note: there are existing PRs for basestring and unicode 
> ( https://github.com/apache/beam/pull/4697 , 
> https://github.com/apache/beam/pull/4730 ).
>  
> Note: once all of the missing names/functions are fixed, we can enable F821 in 
> flake8 for Python 3.
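
The kind of __cmp__ rewrite this issue describes can be sketched as follows. This is a minimal illustration, not code from the Beam pull request: the `Point` class and its fields are hypothetical, and it uses `functools.total_ordering` from the standard library to derive the remaining comparison methods from `__eq__` and `__lt__`.

```python
from functools import total_ordering


@total_ordering
class Point(object):
    """Hypothetical value class that previously relied on __cmp__."""

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def _key(self):
        return (self.x, self.y)

    def __eq__(self, other):
        if type(self) is not type(other):
            return NotImplemented
        return self._key() == other._key()

    def __ne__(self, other):
        # Python 2 does not derive != from __eq__, so define it explicitly.
        return not self == other

    def __lt__(self, other):
        if type(self) is not type(other):
            return NotImplemented
        return self._key() < other._key()

    def __hash__(self):
        # Defining __eq__ disables the default hash on Python 3.
        return hash(self._key())


points = sorted([Point(2, 1), Point(1, 5), Point(1, 2)])
print([(p.x, p.y) for p in points])
```

Returning NotImplemented for mismatched types lets Python fall back to its default handling instead of comparing type objects, which the old cmp(type(self), type(other)) pattern did and which is not supported on Python 3.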





[jira] [Work logged] (BEAM-3060) Add performance tests for commonly used file-based I/O PTransforms

2018-03-15 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3060?focusedWorklogId=81005&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-81005
 ]

ASF GitHub Bot logged work on BEAM-3060:


Author: ASF GitHub Bot
Created on: 15/Mar/18 22:55
Start Date: 15/Mar/18 22:55
Worklog Time Spent: 10m 
  Work Description: szewi commented on a change in pull request #4870: 
[BEAM-3060] Fixing mvn dependency issue when runnning filebasedIOIT t…
URL: https://github.com/apache/beam/pull/4870#discussion_r174956941
 
 

 ##
 File path: sdks/java/io/file-based-io-tests/pom.xml
 ##
 @@ -294,6 +294,12 @@
 ${apache.hadoop.version}
 runtime
 
+
+javax.xml.bind
+jaxb-api
 
 Review comment:
   No problem, I will move it to beam/pom.xml, but I need to run local tests 
before another pull.




Issue Time Tracking
---

Worklog Id: (was: 81005)
Time Spent: 3h  (was: 2h 50m)

> Add performance tests for commonly used file-based I/O PTransforms
> --
>
> Key: BEAM-3060
> URL: https://issues.apache.org/jira/browse/BEAM-3060
> Project: Beam
>  Issue Type: Test
>  Components: sdk-java-core
>Reporter: Chamikara Jayalath
>Assignee: Szymon Nieradka
>Priority: Major
>  Time Spent: 3h
>  Remaining Estimate: 0h


[jira] [Work logged] (BEAM-3761) Fix Python 3 cmp function

2018-03-15 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3761?focusedWorklogId=81001&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-81001
 ]

ASF GitHub Bot logged work on BEAM-3761:


Author: ASF GitHub Bot
Created on: 15/Mar/18 22:51
Start Date: 15/Mar/18 22:51
Worklog Time Spent: 10m 
  Work Description: aaltay commented on a change in pull request #4774: 
[BEAM-3761]Fix Python 3 cmp usage
URL: https://github.com/apache/beam/pull/4774#discussion_r174956257
 
 

 ##
 File path: sdks/python/apache_beam/testing/test_stream.py
 ##
 @@ -46,44 +46,65 @@ class Event(object):
 
   __metaclass__ = ABCMeta
 
-  def __cmp__(self, other):
-if type(self) is not type(other):
-  return cmp(type(self), type(other))
-return self._typed_cmp(other)
-
-  @abstractmethod
-  def _typed_cmp(self, other):
-raise NotImplementedError
-
 
+@total_ordering
 class ElementEvent(Event):
   """Element-producing test stream event."""
 
   def __init__(self, timestamped_values):
 self.timestamped_values = timestamped_values
 
-  def _typed_cmp(self, other):
-return cmp(self.timestamped_values, other.timestamped_values)
+  def __eq__(self, other):
+return (type(self) is type(other) and
+self.timestamped_values == other.timestamped_values)
+
+  def __ne__(self, other):
 
 Review comment:
   Thank you.




Issue Time Tracking
---

Worklog Id: (was: 81001)
Time Spent: 5h 50m  (was: 5h 40m)

> Fix Python 3 cmp function
> -
>
> Key: BEAM-3761
> URL: https://issues.apache.org/jira/browse/BEAM-3761
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-py-core
>Reporter: holdenk
>Priority: Major
>  Time Spent: 5h 50m
>  Remaining Estimate: 0h


[jira] [Work logged] (BEAM-3060) Add performance tests for commonly used file-based I/O PTransforms

2018-03-15 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3060?focusedWorklogId=81004&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-81004
 ]

ASF GitHub Bot logged work on BEAM-3060:


Author: ASF GitHub Bot
Created on: 15/Mar/18 22:51
Start Date: 15/Mar/18 22:51
Worklog Time Spent: 10m 
  Work Description: szewi commented on a change in pull request #4870: 
[BEAM-3060] Fixing mvn dependency issue when runnning filebasedIOIT t…
URL: https://github.com/apache/beam/pull/4870#discussion_r174956360
 
 

 ##
 File path: sdks/java/io/file-based-io-tests/pom.xml
 ##
 @@ -294,6 +294,12 @@
 ${apache.hadoop.version}
 runtime
 
+
+javax.xml.bind
+jaxb-api
 
 Review comment:
   Added to dependencyManagement in java/io. The version was removed from the 
file-based pom.xml because it is defined in dependencyManagement, and that one 
will be used. To be consistent with the scopes in dependencyManagement, the scope 
was also changed to test, which was sufficient to run the tests.




Issue Time Tracking
---

Worklog Id: (was: 81004)
Time Spent: 2h 50m  (was: 2h 40m)

> Add performance tests for commonly used file-based I/O PTransforms
> --
>
> Key: BEAM-3060
> URL: https://issues.apache.org/jira/browse/BEAM-3060
> Project: Beam
>  Issue Type: Test
>  Components: sdk-java-core
>Reporter: Chamikara Jayalath
>Assignee: Szymon Nieradka
>Priority: Major
>  Time Spent: 2h 50m
>  Remaining Estimate: 0h


[jira] [Work logged] (BEAM-3060) Add performance tests for commonly used file-based I/O PTransforms

2018-03-15 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3060?focusedWorklogId=81003&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-81003
 ]

ASF GitHub Bot logged work on BEAM-3060:


Author: ASF GitHub Bot
Created on: 15/Mar/18 22:51
Start Date: 15/Mar/18 22:51
Worklog Time Spent: 10m 
  Work Description: chamikaramj commented on a change in pull request 
#4870: [BEAM-3060] Fixing mvn dependency issue when runnning filebasedIOIT t…
URL: https://github.com/apache/beam/pull/4870#discussion_r174956339
 
 

 ##
 File path: sdks/java/io/file-based-io-tests/pom.xml
 ##
 @@ -294,6 +294,12 @@
 ${apache.hadoop.version}
 runtime
 
+
+javax.xml.bind
+jaxb-api
 
 Review comment:
   Sorry, by root level I meant beam/pom.xml




Issue Time Tracking
---

Worklog Id: (was: 81003)
Time Spent: 2h 40m  (was: 2.5h)

> Add performance tests for commonly used file-based I/O PTransforms
> --
>
> Key: BEAM-3060
> URL: https://issues.apache.org/jira/browse/BEAM-3060
> Project: Beam
>  Issue Type: Test
>  Components: sdk-java-core
>Reporter: Chamikara Jayalath
>Assignee: Szymon Nieradka
>Priority: Major
>  Time Spent: 2h 40m
>  Remaining Estimate: 0h


[jira] [Work logged] (BEAM-3818) Add support for the streaming side inputs in the Python DirectRunner

2018-03-15 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3818?focusedWorklogId=80995=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-80995
 ]

ASF GitHub Bot logged work on BEAM-3818:


Author: ASF GitHub Bot
Created on: 15/Mar/18 22:50
Start Date: 15/Mar/18 22:50
Worklog Time Spent: 10m 
  Work Description: charlesccychen commented on a change in pull request 
#4838: [BEAM-3818] Add support for streaming side inputs in the DirectRunner
URL: https://github.com/apache/beam/pull/4838#discussion_r174956108
 
 

 ##
 File path: sdks/python/apache_beam/runners/direct/evaluation_context.py
 ##
 @@ -217,11 +241,12 @@ def handle_result(
 self._side_inputs_container.add_values(
 view,
 committed_bundle.get_elements_iterable(make_copy=True))
-  if (self.get_execution_context(result.transform)
-  .watermarks.input_watermark
-  == WatermarkManager.WATERMARK_POS_INF):
-self._pending_unblocked_tasks.extend(
-self._side_inputs_container.finalize_value_and_get_tasks(view))
+
+  tasks = self._watermark_manager.update_watermarks(
 
 Review comment:
   
   Please add a comment that these tasks come from unblocked side inputs.




Issue Time Tracking
---

Worklog Id: (was: 80995)
Time Spent: 20m  (was: 10m)

> Add support for the streaming side inputs in the Python DirectRunner
> 
>
> Key: BEAM-3818
> URL: https://issues.apache.org/jira/browse/BEAM-3818
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-py-core
>Reporter: María GH
>Assignee: María GH
>Priority: Minor
> Fix For: 3.0.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> The streaming DirectRunner should support streaming side input semantics.  
> Currently, side inputs are only available for globally-windowed side input 
> PCollections.
> Also, empty side inputs cause a pipeline stall.





[jira] [Work logged] (BEAM-3818) Add support for the streaming side inputs in the Python DirectRunner

2018-03-15 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3818?focusedWorklogId=80996=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-80996
 ]

ASF GitHub Bot logged work on BEAM-3818:


Author: ASF GitHub Bot
Created on: 15/Mar/18 22:50
Start Date: 15/Mar/18 22:50
Worklog Time Spent: 10m 
  Work Description: charlesccychen commented on a change in pull request 
#4838: [BEAM-3818] Add support for streaming side inputs in the DirectRunner
URL: https://github.com/apache/beam/pull/4838#discussion_r174956111
 
 

 ##
 File path: sdks/python/apache_beam/runners/direct/evaluation_context.py
 ##
 @@ -56,6 +56,11 @@ def __init__(self, view):
 self.value = None
 self.has_result = False
 
+  def __repr__(self):
+elements = ', '.join([
+str(elm) for elm in self.elements] if self.elements else [''])
 
 Review comment:
   
   Can you do the following?
   
   ```
   elements_string = (', '.join(str(elm) for elm in self.elements)
                      if self.elements else '')
   ```
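
   A minimal sketch of why the two forms are interchangeable. The class name
   `SideInputViewSketch` and its method names are hypothetical stand-ins, not
   the real `_SideInputView`; only the `elements` attribute is taken from the
   diff above.

   ```python
   class SideInputViewSketch(object):
       """Hypothetical stand-in for _SideInputView; only `elements` matters."""

       def __init__(self, elements):
           self.elements = elements

       def repr_original(self):
           # Form from the diff: list comprehension with a [''] fallback.
           return ', '.join(
               [str(elm) for elm in self.elements] if self.elements else [''])

       def repr_suggested(self):
           # Form from the review comment: generator expression, '' fallback.
           return (', '.join(str(elm) for elm in self.elements)
                   if self.elements else '')
   ```

   Both methods return the empty string when `elements` is empty or `None`,
   and the same comma-separated string otherwise; the suggested form simply
   avoids building the intermediate lists.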




Issue Time Tracking
---

Worklog Id: (was: 80996)
Time Spent: 20m  (was: 10m)

> Add support for the streaming side inputs in the Python DirectRunner
> 
>
> Key: BEAM-3818
> URL: https://issues.apache.org/jira/browse/BEAM-3818
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-py-core
>Reporter: María GH
>Assignee: María GH
>Priority: Minor
> Fix For: 3.0.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> The streaming DirectRunner should support streaming side input semantics.  
> Currently, side inputs are only available for globally-windowed side input 
> PCollections.
> Also, empty side inputs cause a pipeline stall.





[jira] [Work logged] (BEAM-3818) Add support for the streaming side inputs in the Python DirectRunner

2018-03-15 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3818?focusedWorklogId=80998=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-80998
 ]

ASF GitHub Bot logged work on BEAM-3818:


Author: ASF GitHub Bot
Created on: 15/Mar/18 22:50
Start Date: 15/Mar/18 22:50
Worklog Time Spent: 10m 
  Work Description: charlesccychen commented on a change in pull request 
#4838: [BEAM-3818] Add support for streaming side inputs in the DirectRunner
URL: https://github.com/apache/beam/pull/4838#discussion_r174956113
 
 

 ##
 File path: sdks/python/apache_beam/testing/test_stream_test.py
 ##
 @@ -245,6 +247,82 @@ def fired_elements(elem):
 # TODO(BEAM-3377): Remove after assert_that in streaming is fixed.
 self.assertEqual([('k', ['a'])], result)
 
+  def test_basic_execution_sideinputs_batch(self):
+
+# TODO(BEAM-3377): Remove after assert_that in streaming is fixed.
+global result # pylint: disable=global-variable-undefined
+result = []
+
+def recorded_elements(elem):
+  result.append(elem)
+  return elem
+
+options = PipelineOptions()
+options.view_as(StandardOptions).streaming = True
+p = TestPipeline(options=options)
+
+main_stream = (p
+   | 'main TestStream' >> TestStream()
+   .advance_watermark_to(10)
+   .add_elements(['e']))
+side = (p
+| beam.Create([2, 1, 4])
+| beam.Map(lambda t: window.TimestampedValue(t, t)))
+
+class RecordFn(beam.DoFn):
+  def process(self,
+  elm=beam.DoFn.ElementParam,
+  ts=beam.DoFn.TimestampParam,
+  side=beam.DoFn.SideInputParam):
+yield (elm, ts, side)
+
+records = main_stream | beam.ParDo(RecordFn(), beam.pvalue.AsList(side)) | beam.Map(recorded_elements) # pylint: disable=line-too-long, unused-variable
+p.run()
+
+# TODO(BEAM-3377): Remove after assert_that in streaming is fixed.
+self.assertEqual([('e', Timestamp(10), [2, 1, 4])], result)
+
+  def test_basic_execution_sideinputs(self):
+
+# TODO(BEAM-3377): Remove after assert_that in streaming is fixed.
+global result # pylint: disable=global-variable-undefined
+result = []
+
+def recorded_elements(elem):
+  result.append(elem)
+  return elem
+
+options = PipelineOptions()
+options.view_as(StandardOptions).streaming = True
+p = TestPipeline(options=options)
+
+main_stream = (p
+   | 'main TestStream' >> TestStream()
+   .advance_watermark_to(10)
+   .add_elements(['e'])
+   .advance_processing_time(11))
+# TODO(mariagh): Fix this
 
 Review comment:
   
   Please make this comment more detailed, or remove.




Issue Time Tracking
---

Worklog Id: (was: 80998)
Time Spent: 20m  (was: 10m)

> Add support for the streaming side inputs in the Python DirectRunner
> 
>
> Key: BEAM-3818
> URL: https://issues.apache.org/jira/browse/BEAM-3818
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-py-core
>Reporter: María GH
>Assignee: María GH
>Priority: Minor
> Fix For: 3.0.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> The streaming DirectRunner should support streaming side input semantics.  
> Currently, side inputs are only available for globally-windowed side input 
> PCollections.
> Also, empty side inputs cause a pipeline stall.





[jira] [Work logged] (BEAM-3818) Add support for the streaming side inputs in the Python DirectRunner

2018-03-15 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3818?focusedWorklogId=81000=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-81000
 ]

ASF GitHub Bot logged work on BEAM-3818:


Author: ASF GitHub Bot
Created on: 15/Mar/18 22:50
Start Date: 15/Mar/18 22:50
Worklog Time Spent: 10m 
  Work Description: charlesccychen commented on a change in pull request 
#4838: [BEAM-3818] Add support for streaming side inputs in the DirectRunner
URL: https://github.com/apache/beam/pull/4838#discussion_r174956109
 
 

 ##
 File path: sdks/python/apache_beam/runners/direct/evaluation_context.py
 ##
 @@ -99,6 +113,19 @@ def finalize_value_and_get_tasks(self, side_input):
   view.has_result = True
   return result
 
+  def update_watermarks_for_transform(self, ptransform, watermark):
+# Collect tasks that get unblocked as the workflow progresses
 
 Review comment:
   
   nit: period at end of sentence.




Issue Time Tracking
---

Worklog Id: (was: 81000)
Time Spent: 0.5h  (was: 20m)

> Add support for the streaming side inputs in the Python DirectRunner
> 
>
> Key: BEAM-3818
> URL: https://issues.apache.org/jira/browse/BEAM-3818
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-py-core
>Reporter: María GH
>Assignee: María GH
>Priority: Minor
> Fix For: 3.0.0
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> The streaming DirectRunner should support streaming side input semantics.  
> Currently, side inputs are only available for globally-windowed side input 
> PCollections.
> Also, empty side inputs cause a pipeline stall.





[jira] [Work logged] (BEAM-3818) Add support for the streaming side inputs in the Python DirectRunner

2018-03-15 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3818?focusedWorklogId=80997=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-80997
 ]

ASF GitHub Bot logged work on BEAM-3818:


Author: ASF GitHub Bot
Created on: 15/Mar/18 22:50
Start Date: 15/Mar/18 22:50
Worklog Time Spent: 10m 
  Work Description: charlesccychen commented on a change in pull request 
#4838: [BEAM-3818] Add support for streaming side inputs in the DirectRunner
URL: https://github.com/apache/beam/pull/4838#discussion_r174956110
 
 

 ##
 File path: sdks/python/apache_beam/runners/direct/evaluation_context.py
 ##
 @@ -67,8 +72,17 @@ class _SideInputsContainer(object):
   def __init__(self, views):
 self._lock = threading.Lock()
 self._views = {}
+self._transform_to_views = collections.defaultdict(list)
+
 for view in views:
   self._views[view] = _SideInputView(view)
+  self._transform_to_views[view.pvalue.producer].append(view)
+
+  def __repr__(self):
+views = ', '.join([
+str(elm) for elm in self._views.values()
+] if self._views.values() else [])
 
 Review comment:
   
   Same as above for `_SideInputView.__repr__`.




Issue Time Tracking
---

Worklog Id: (was: 80997)
Time Spent: 20m  (was: 10m)

> Add support for the streaming side inputs in the Python DirectRunner
> 
>
> Key: BEAM-3818
> URL: https://issues.apache.org/jira/browse/BEAM-3818
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-py-core
>Reporter: María GH
>Assignee: María GH
>Priority: Minor
> Fix For: 3.0.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> The streaming DirectRunner should support streaming side input semantics.  
> Currently, side inputs are only available for globally-windowed side input 
> PCollections.
> Also, empty side inputs cause a pipeline stall.





[jira] [Work logged] (BEAM-3818) Add support for the streaming side inputs in the Python DirectRunner

2018-03-15 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3818?focusedWorklogId=80999=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-80999
 ]

ASF GitHub Bot logged work on BEAM-3818:


Author: ASF GitHub Bot
Created on: 15/Mar/18 22:50
Start Date: 15/Mar/18 22:50
Worklog Time Spent: 10m 
  Work Description: charlesccychen commented on a change in pull request 
#4838: [BEAM-3818] Add support for streaming side inputs in the DirectRunner
URL: https://github.com/apache/beam/pull/4838#discussion_r174956114
 
 

 ##
 File path: sdks/python/apache_beam/testing/test_stream_test.py
 ##
 @@ -245,6 +247,82 @@ def fired_elements(elem):
 # TODO(BEAM-3377): Remove after assert_that in streaming is fixed.
 self.assertEqual([('k', ['a'])], result)
 
+  def test_basic_execution_sideinputs_batch(self):
+
+# TODO(BEAM-3377): Remove after assert_that in streaming is fixed.
+global result # pylint: disable=global-variable-undefined
+result = []
+
+def recorded_elements(elem):
+  result.append(elem)
+  return elem
+
+options = PipelineOptions()
+options.view_as(StandardOptions).streaming = True
+p = TestPipeline(options=options)
+
+main_stream = (p
+   | 'main TestStream' >> TestStream()
+   .advance_watermark_to(10)
+   .add_elements(['e']))
+side = (p
+| beam.Create([2, 1, 4])
+| beam.Map(lambda t: window.TimestampedValue(t, t)))
+
+class RecordFn(beam.DoFn):
+  def process(self,
+  elm=beam.DoFn.ElementParam,
+  ts=beam.DoFn.TimestampParam,
+  side=beam.DoFn.SideInputParam):
+yield (elm, ts, side)
+
+records = main_stream | beam.ParDo(RecordFn(), beam.pvalue.AsList(side)) | beam.Map(recorded_elements) # pylint: disable=line-too-long, unused-variable
 
 Review comment:
   
   Can you use the multi-line form to avoid the line-too-long?  You can also 
just avoid creating the `records` variable.
   
   ```
   (main_stream
    | beam.ParDo(RecordFn, ...)
    | beam.Map(...))
   ```




Issue Time Tracking
---

Worklog Id: (was: 80999)
Time Spent: 0.5h  (was: 20m)

> Add support for the streaming side inputs in the Python DirectRunner
> 
>
> Key: BEAM-3818
> URL: https://issues.apache.org/jira/browse/BEAM-3818
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-py-core
>Reporter: María GH
>Assignee: María GH
>Priority: Minor
> Fix For: 3.0.0
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> The streaming DirectRunner should support streaming side input semantics.  
> Currently, side inputs are only available for globally-windowed side input 
> PCollections.
> Also, empty side inputs cause a pipeline stall.





[jira] [Work logged] (BEAM-3339) Create post-release testing of the nightly snapshots

2018-03-15 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3339?focusedWorklogId=80994=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-80994
 ]

ASF GitHub Bot logged work on BEAM-3339:


Author: ASF GitHub Bot
Created on: 15/Mar/18 22:48
Start Date: 15/Mar/18 22:48
Worklog Time Spent: 10m 
  Work Description: yifanzou commented on issue #4788: [BEAM-3339] Mobile 
gaming automation for Java nightly snapshot on core runners
URL: https://github.com/apache/beam/pull/4788#issuecomment-373547375
 
 
   Run Dataflow PostRelease




Issue Time Tracking
---

Worklog Id: (was: 80994)
Time Spent: 64h  (was: 63h 50m)

> Create post-release testing of the nightly snapshots
> 
>
> Key: BEAM-3339
> URL: https://issues.apache.org/jira/browse/BEAM-3339
> Project: Beam
>  Issue Type: Improvement
>  Components: testing
>Reporter: Alan Myrvold
>Assignee: Jason Kuster
>Priority: Major
>  Time Spent: 64h
>  Remaining Estimate: 0h
>
> The nightly java snapshots in 
> https://repository.apache.org/content/groups/snapshots/org/apache/beam should 
> be verified by following the 
> https://beam.apache.org/get-started/quickstart-java/ instructions, to verify 
> that the release is usable.





[jira] [Assigned] (BEAM-2339) Jenkins cross JDK version test on Windows

2018-03-15 Thread Davor Bonaci (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-2339?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Davor Bonaci reassigned BEAM-2339:
--

Assignee: (was: Davor Bonaci)

> Jenkins cross JDK version test on Windows
> -
>
> Key: BEAM-2339
> URL: https://issues.apache.org/jira/browse/BEAM-2339
> Project: Beam
>  Issue Type: Task
>  Components: build-system, testing
>Reporter: Mark Liu
>Priority: Major
>
> We can set the OS variant to choose Windows for the Jenkins test, which can be 
> combined with the JDK version test, so that we can have cross-OS / cross-JDK 
> version tests. 
> This discussion came from 
> https://github.com/apache/beam/pull/3184#pullrequestreview-39303400





[jira] [Created] (BEAM-3861) Build test infra for end-to-end streaming test in Python SDK

2018-03-15 Thread Mark Liu (JIRA)
Mark Liu created BEAM-3861:
--

 Summary: Build test infra for end-to-end streaming test in Python 
SDK
 Key: BEAM-3861
 URL: https://issues.apache.org/jira/browse/BEAM-3861
 Project: Beam
  Issue Type: Task
  Components: testing
Reporter: Mark Liu
Assignee: Mark Liu








[jira] [Resolved] (BEAM-1584) Support clean-up step in integration test

2018-03-15 Thread Mark Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-1584?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Liu resolved BEAM-1584.

   Resolution: Fixed
Fix Version/s: Not applicable

> Support clean-up step in integration test
> -
>
> Key: BEAM-1584
> URL: https://issues.apache.org/jira/browse/BEAM-1584
> Project: Beam
>  Issue Type: Task
>  Components: testing
>Reporter: Mark Liu
>Assignee: Mark Liu
>Priority: Major
> Fix For: Not applicable
>
>
> Idea comes from: 
> https://github.com/apache/beam/pull/2064/files/628fafed098ac5550356a201c6ccdcdcc2e9604e
> Integration tests in all SDKs should be able to do clean-up at the end of 
> each run.





[jira] [Assigned] (BEAM-2339) Jenkins cross JDK version test on Windows

2018-03-15 Thread Mark Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-2339?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Liu reassigned BEAM-2339:
--

Assignee: Davor Bonaci  (was: Mark Liu)

> Jenkins cross JDK version test on Windows
> -
>
> Key: BEAM-2339
> URL: https://issues.apache.org/jira/browse/BEAM-2339
> Project: Beam
>  Issue Type: Task
>  Components: build-system, testing
>Reporter: Mark Liu
>Assignee: Davor Bonaci
>Priority: Major
>
> We can set the OS variant to choose Windows for the Jenkins test, which can be 
> combined with the JDK version test, so that we can have cross-OS / cross-JDK 
> version tests. 
> This discussion came from 
> https://github.com/apache/beam/pull/3184#pullrequestreview-39303400





[jira] [Closed] (BEAM-3841) Python TestDataflowRunner should override run_pipeline

2018-03-15 Thread Mark Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3841?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Liu closed BEAM-3841.
--

> Python TestDataflowRunner should override run_pipeline
> ---
>
> Key: BEAM-3841
> URL: https://issues.apache.org/jira/browse/BEAM-3841
> Project: Beam
>  Issue Type: Bug
>  Components: testing
>Reporter: Mark Liu
>Assignee: Mark Liu
>Priority: Major
> Fix For: Not applicable
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> [TestDataflowRunner|https://github.com/apache/beam/blob/master/sdks/python/apache_beam/runners/dataflow/test_dataflow_runner.py]
>  is inherited from 
> [DataflowRunner|https://github.com/apache/beam/blob/master/sdks/python/apache_beam/runners/dataflow/dataflow_runner.py].
>  Basically, it wraps DataflowRunner.run_pipeline and provides more test 
> actions. 
> However, DataflowRunner.run was renamed to run_pipeline in [this 
> commit|https://github.com/apache/beam/commit/8cf222d3db1188aff5432af548961fc670f97635],
>  but the run function in TestDataflowRunner didn't change.
> We should change it accordingly.
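
The problem described above can be sketched with plain Python classes. This is
a simplified illustration only: `Runner`, `StaleTestRunner`, and
`FixedTestRunner` are hypothetical names, not Beam's actual classes, and the
string return values stand in for real pipeline results.

```python
class Runner(object):
    """Hypothetical base runner; after the rename, callers use run_pipeline."""

    def run_pipeline(self, pipeline):
        return 'ran %s' % pipeline


class StaleTestRunner(Runner):
    # Still overrides the old method name, so a caller that invokes
    # run_pipeline never reaches the extra test actions defined here.
    def run(self, pipeline):
        return 'tested ' + super(StaleTestRunner, self).run_pipeline(pipeline)


class FixedTestRunner(Runner):
    # Overrides the new method name, so the extra test actions are applied.
    def run_pipeline(self, pipeline):
        return 'tested ' + super(FixedTestRunner, self).run_pipeline(pipeline)
```

Calling `run_pipeline` on the stale subclass silently falls through to the
base implementation, which is why a rename in the parent class has to be
mirrored in every subclass that wraps the renamed method.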





[jira] [Resolved] (BEAM-3841) Python TestDataflowRunner should override run_pipeline

2018-03-15 Thread Mark Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3841?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Liu resolved BEAM-3841.

   Resolution: Fixed
Fix Version/s: Not applicable

> Python TestDataflowRunner should override run_pipeline
> ---
>
> Key: BEAM-3841
> URL: https://issues.apache.org/jira/browse/BEAM-3841
> Project: Beam
>  Issue Type: Bug
>  Components: testing
>Reporter: Mark Liu
>Assignee: Mark Liu
>Priority: Major
> Fix For: Not applicable
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> [TestDataflowRunner|https://github.com/apache/beam/blob/master/sdks/python/apache_beam/runners/dataflow/test_dataflow_runner.py]
>  is inherited from 
> [DataflowRunner|https://github.com/apache/beam/blob/master/sdks/python/apache_beam/runners/dataflow/dataflow_runner.py].
>  Basically, it wraps DataflowRunner.run_pipeline and provides more test 
> actions. 
> However, DataflowRunner.run was renamed to run_pipeline in [this 
> commit|https://github.com/apache/beam/commit/8cf222d3db1188aff5432af548961fc670f97635],
>  but the run function in TestDataflowRunner didn't change.
> We should change it accordingly.





[jira] [Commented] (BEAM-3841) Python TestDataflowRunner should override run_pipeline

2018-03-15 Thread Mark Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-3841?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16401182#comment-16401182
 ] 

Mark Liu commented on BEAM-3841:


Fix PR [4856|https://github.com/apache/beam/pull/4856] is merged. 

> Python TestDataflowRunner should override run_pipeline
> ---
>
> Key: BEAM-3841
> URL: https://issues.apache.org/jira/browse/BEAM-3841
> Project: Beam
>  Issue Type: Bug
>  Components: testing
>Reporter: Mark Liu
>Assignee: Mark Liu
>Priority: Major
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> [TestDataflowRunner|https://github.com/apache/beam/blob/master/sdks/python/apache_beam/runners/dataflow/test_dataflow_runner.py]
>  is inherited from 
> [DataflowRunner|https://github.com/apache/beam/blob/master/sdks/python/apache_beam/runners/dataflow/dataflow_runner.py].
>  Basically, it wraps DataflowRunner.run_pipeline and provides more test 
> actions. 
> However, DataflowRunner.run was renamed to run_pipeline in [this 
> commit|https://github.com/apache/beam/commit/8cf222d3db1188aff5432af548961fc670f97635],
>  but the run function in TestDataflowRunner didn't change.
> We should change it accordingly.





[beam] branch master updated (bc9c97a -> ba2c648)

2018-03-15 Thread tgroh
This is an automated email from the ASF dual-hosted git repository.

tgroh pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/beam.git.


from bc9c97a  Merge pull request #4776: Add impulse override for read 
transforms
 add 9fa56ea  Replacing size() == 0 with isEmpty()
 new ba2c648  Merge pull request #4815: Replace size() == 0 with isEmpty()

The 1 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:
 .../beam/examples/cookbook/BigQueryTornadoesTest.java  |  2 +-
 .../beam/runners/core/triggers/TriggerStateMachine.java|  2 +-
 .../streaming/state/FlinkBroadcastStateInternals.java  |  2 +-
 .../streaming/state/FlinkKeyGroupStateInternals.java   |  2 +-
 .../runners/dataflow/DataflowPipelineTranslatorTest.java   |  2 +-
 .../runners/spark/translation/TransformTranslator.java |  2 +-
 .../java/org/apache/beam/sdk/coders/CoderRegistry.java |  2 +-
 .../src/main/java/org/apache/beam/sdk/io/TextSource.java   |  2 +-
 .../apache/beam/sdk/transforms/ApproximateQuantiles.java   |  2 +-
 .../main/java/org/apache/beam/sdk/transforms/Create.java   |  2 +-
 .../apache/beam/sdk/transforms/reflect/DoFnSignatures.java |  2 +-
 .../org/apache/beam/sdk/transforms/windowing/Trigger.java  |  2 +-
 .../beam/sdk/transforms/display/DisplayDataMatchers.java   |  2 +-
 .../operator/date/BeamSqlCurrentDateExpression.java|  2 +-
 .../org/apache/beam/sdk/io/gcp/bigquery/BigQueryIO.java|  2 +-
 .../org/apache/beam/sdk/io/gcp/bigquery/WriteRename.java   |  2 +-
 .../apache/beam/sdk/io/gcp/pubsub/PubsubJsonClient.java|  2 +-
 .../org/apache/beam/sdk/io/gcp/spanner/OrderedCode.java| 14 +++---
 .../src/main/java/org/apache/beam/sdk/io/jdbc/JdbcIO.java  |  2 +-
 19 files changed, 25 insertions(+), 25 deletions(-)

-- 
To stop receiving notification emails like this one, please contact
tg...@apache.org.


[beam] 01/01: Merge pull request #4815: Replace size() == 0 with isEmpty()

2018-03-15 Thread tgroh
This is an automated email from the ASF dual-hosted git repository.

tgroh pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/beam.git

commit ba2c6484e9eacaed047bf7c560a2a1e6cf37ab93
Merge: bc9c97a 9fa56ea
Author: Thomas Groh 
AuthorDate: Thu Mar 15 14:58:59 2018 -0700

Merge pull request #4815: Replace size() == 0 with isEmpty()

Replacing size() == 0 with isEmpty()

 .../beam/examples/cookbook/BigQueryTornadoesTest.java  |  2 +-
 .../beam/runners/core/triggers/TriggerStateMachine.java|  2 +-
 .../streaming/state/FlinkBroadcastStateInternals.java  |  2 +-
 .../streaming/state/FlinkKeyGroupStateInternals.java   |  2 +-
 .../runners/dataflow/DataflowPipelineTranslatorTest.java   |  2 +-
 .../runners/spark/translation/TransformTranslator.java |  2 +-
 .../java/org/apache/beam/sdk/coders/CoderRegistry.java |  2 +-
 .../src/main/java/org/apache/beam/sdk/io/TextSource.java   |  2 +-
 .../apache/beam/sdk/transforms/ApproximateQuantiles.java   |  2 +-
 .../main/java/org/apache/beam/sdk/transforms/Create.java   |  2 +-
 .../apache/beam/sdk/transforms/reflect/DoFnSignatures.java |  2 +-
 .../org/apache/beam/sdk/transforms/windowing/Trigger.java  |  2 +-
 .../beam/sdk/transforms/display/DisplayDataMatchers.java   |  2 +-
 .../operator/date/BeamSqlCurrentDateExpression.java|  2 +-
 .../org/apache/beam/sdk/io/gcp/bigquery/BigQueryIO.java|  2 +-
 .../org/apache/beam/sdk/io/gcp/bigquery/WriteRename.java   |  2 +-
 .../apache/beam/sdk/io/gcp/pubsub/PubsubJsonClient.java|  2 +-
 .../org/apache/beam/sdk/io/gcp/spanner/OrderedCode.java| 14 +++---
 .../src/main/java/org/apache/beam/sdk/io/jdbc/JdbcIO.java  |  2 +-
 19 files changed, 25 insertions(+), 25 deletions(-)

-- 
To stop receiving notification emails like this one, please contact
tg...@apache.org.


Jenkins build is back to normal : beam_PostCommit_Python_ValidatesRunner_Dataflow #1119

2018-03-15 Thread Apache Jenkins Server
See 




[jira] [Work logged] (BEAM-3819) Add withLimit() option to KinesisIO

2018-03-15 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3819?focusedWorklogId=80987=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-80987
 ]

ASF GitHub Bot logged work on BEAM-3819:


Author: ASF GitHub Bot
Created on: 15/Mar/18 21:37
Start Date: 15/Mar/18 21:37
Worklog Time Spent: 10m 
  Work Description: jbonofre commented on issue #4851: [BEAM-3819] Add 
withLimit() option to KinesisIO
URL: https://github.com/apache/beam/pull/4851#issuecomment-373531082
 
 
   Generally speaking, we try to avoid exposing user configuration in the IO 
(see 
https://beam.apache.org/contribute/ptransform-style-guide/#what-parameters-to-expose).
 I wonder whether it's worth exposing `limit` to the user, as it seems more of 
an IO constraint to me.
   
   Thoughts?




Issue Time Tracking
---

Worklog Id: (was: 80987)
Time Spent: 50m  (was: 40m)

> Add withLimit() option to KinesisIO
> ---
>
> Key: BEAM-3819
> URL: https://issues.apache.org/jira/browse/BEAM-3819
> Project: Beam
>  Issue Type: Improvement
>  Components: io-java-kinesis
>Reporter: Jean-Baptiste Onofré
>Assignee: Alexey Romanenko
>Priority: Major
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> In some cases, the user might need to set the {{limit}} on the 
> {{SimplifiedKinesisClient}}, especially for performance reasons, depending on 
> the number of records.





[jira] [Work logged] (BEAM-3819) Add withLimit() option to KinesisIO

2018-03-15 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3819?focusedWorklogId=80985=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-80985
 ]

ASF GitHub Bot logged work on BEAM-3819:


Author: ASF GitHub Bot
Created on: 15/Mar/18 21:29
Start Date: 15/Mar/18 21:29
Worklog Time Spent: 10m 
  Work Description: pawel-kaczmarczyk commented on issue #4851: [BEAM-3819] 
Add withLimit() option to KinesisIO
URL: https://github.com/apache/beam/pull/4851#issuecomment-373528960
 
 
   I like the idea and the code looks good. I would only consider changing the 
property name, as `limit` is not very informative. Maybe `requestRecordsLimit`?




Issue Time Tracking
---

Worklog Id: (was: 80985)
Time Spent: 40m  (was: 0.5h)

> Add withLimit() option to KinesisIO
> ---
>
> Key: BEAM-3819
> URL: https://issues.apache.org/jira/browse/BEAM-3819
> Project: Beam
>  Issue Type: Improvement
>  Components: io-java-kinesis
>Reporter: Jean-Baptiste Onofré
>Assignee: Alexey Romanenko
>Priority: Major
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> In some cases, the user might need to set the {{limit}} on the 
> {{SimplifiedKinesisClient}}, especially for performance reasons, depending on 
> the number of records.





[jira] [Work logged] (BEAM-3500) JdbcIO: Improve connection management

2018-03-15 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3500?focusedWorklogId=80984=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-80984
 ]

ASF GitHub Bot logged work on BEAM-3500:


Author: ASF GitHub Bot
Created on: 15/Mar/18 21:28
Start Date: 15/Mar/18 21:28
Worklog Time Spent: 10m 
  Work Description: jkff commented on a change in pull request #4461: 
[BEAM-3500] "Attach" JDBC connection to the bundle and add DataSourceFactory 
allowing full control of the way the DataSource is created
URL: https://github.com/apache/beam/pull/4461#discussion_r174938254
 
 

 ##
 File path: 
sdks/java/io/jdbc/src/main/java/org/apache/beam/sdk/io/jdbc/JdbcIO.java
 ##
 @@ -327,8 +332,25 @@ DataSource buildDatasource() throws Exception{
 if (getConnectionProperties() != null && 
getConnectionProperties().get() != null) {
   
basicDataSource.setConnectionProperties(getConnectionProperties().get());
 }
-return basicDataSource;
+current = basicDataSource;
   }
+
+  // wrapping the datasource as a pooling datasource
+  DataSourceConnectionFactory connectionFactory = new 
DataSourceConnectionFactory(current);
+  PoolableConnectionFactory poolableConnectionFactory =
+  new PoolableConnectionFactory(connectionFactory, null);
+  GenericObjectPoolConfig poolConfig = new GenericObjectPoolConfig();
+  poolConfig.setMaxTotal(1);
+  poolConfig.setMinIdle(0);
+  poolConfig.setMinEvictableIdleTimeMillis(1);
 
 Review comment:
   I don't know if it's worth verifying the behavior automatically. I'm just 
asking you to manually run a streaming pipeline with JdbcIO and confirm that it 
actually creates and releases connections when the pipeline is idle for some 
time (maybe enable logging in dbcp somewhere), or at least confirm that the 
pipeline works at all for a few minutes.




Issue Time Tracking
---

Worklog Id: (was: 80984)
Time Spent: 5.5h  (was: 5h 20m)

> JdbcIO: Improve connection management
> -
>
> Key: BEAM-3500
> URL: https://issues.apache.org/jira/browse/BEAM-3500
> Project: Beam
>  Issue Type: Improvement
>  Components: io-java-jdbc
>Affects Versions: 2.2.0
>Reporter: Pawel Bartoszek
>Assignee: Jean-Baptiste Onofré
>Priority: Major
>  Time Spent: 5.5h
>  Remaining Estimate: 0h
>
> The JdbcIO write DoFn acquires a connection in {{@Setup}} and releases it in 
> {{@Teardown}}, which means that the connection might stay open for days 
> in a streaming job. Keeping a single connection open for so long is 
> risky, as it is exposed to database issues, network issues, etc.
> *Taking connection from the pool when it is actually needed*
> I suggest that the connection be taken from the connection pool in the 
> {{executeBatch}} method and released when the batch is flushed. This will 
> allow the pool to take care of any unhealthy connections that are returned.
> *Make JdbcIO accept data source factory*
>  It would be nice if JdbcIO accepted a DataSourceFactory rather than the 
> DataSource itself. I am saying that because the sink checks whether the 
> DataSource implements the `Serializable` interface, which makes it impossible 
> to pass BasicDataSource (used internally by the sink), as it doesn't implement 
> this interface. Something like:
> {code:java}
> interface DataSourceFactory extends Serializable{
>  DataSource create();
> }
> {code}
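
The two suggestions in this issue can be illustrated together. What follows is only a minimal sketch under stated assumptions, not the code from the pull request: `DataSourceFactorySketch`, `stubDataSource`, `stubConnection`, `flushBatch` and `roundTrip` are hypothetical names, and the stub `DataSource` is a `java.lang.reflect.Proxy` that never talks to a database. It shows that a serializable factory survives Java serialization even though the `DataSource` it creates does not need to, and that a connection can be borrowed only for the duration of one batch flush.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.lang.reflect.Proxy;
import java.sql.Connection;
import java.util.concurrent.atomic.AtomicInteger;
import javax.sql.DataSource;

public class DataSourceFactorySketch {

  /** Serializable recipe for a DataSource, as proposed in the issue. */
  public interface DataSourceFactory extends Serializable {
    DataSource create();
  }

  // Counters used only to observe the stub; a real pool would track this itself.
  public static final AtomicInteger OPENED = new AtomicInteger();
  public static final AtomicInteger CLOSED = new AtomicInteger();

  /** Hypothetical stand-in for a pooled DataSource; only getConnection() is supported. */
  public static DataSource stubDataSource() {
    return (DataSource) Proxy.newProxyInstance(
        DataSourceFactorySketch.class.getClassLoader(),
        new Class<?>[] {DataSource.class},
        (proxy, method, methodArgs) -> {
          if (method.getName().equals("getConnection")) {
            OPENED.incrementAndGet();
            return stubConnection();
          }
          throw new UnsupportedOperationException(method.getName());
        });
  }

  /** Hypothetical stand-in for a Connection; only close() is supported. */
  public static Connection stubConnection() {
    return (Connection) Proxy.newProxyInstance(
        DataSourceFactorySketch.class.getClassLoader(),
        new Class<?>[] {Connection.class},
        (proxy, method, methodArgs) -> {
          if (method.getName().equals("close")) {
            CLOSED.incrementAndGet();
            return null;
          }
          throw new UnsupportedOperationException(method.getName());
        });
  }

  /** Borrows a connection for one batch only and returns it when the batch is done. */
  public static void flushBatch(DataSource dataSource) throws Exception {
    try (Connection connection = dataSource.getConnection()) {
      // A real writer would execute its batched statements on `connection` here.
    }
  }

  /** Serializes and deserializes the factory, roughly as a runner does when shipping a DoFn. */
  public static DataSourceFactory roundTrip(DataSourceFactory factory) throws Exception {
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
      out.writeObject(factory);
    }
    try (ObjectInputStream in =
        new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
      return (DataSourceFactory) in.readObject();
    }
  }

  public static void main(String[] args) throws Exception {
    // The method reference is serializable because DataSourceFactory extends Serializable.
    DataSourceFactory factory = roundTrip(DataSourceFactorySketch::stubDataSource);
    DataSource dataSource = factory.create();
    flushBatch(dataSource);
    flushBatch(dataSource);
    System.out.println("opened=" + OPENED.get() + ", closed=" + CLOSED.get());
  }
}
```

In this sketch the `DataSource` is created on the worker after deserialization, so it never has to be `Serializable` itself; whether the real sink creates it in `@Setup` or elsewhere is not something the sketch claims.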





Jenkins build is still unstable: beam_PostCommit_Java_ValidatesRunner_Dataflow #5159

2018-03-15 Thread Apache Jenkins Server
See 




Build failed in Jenkins: beam_PostRelease_NightlySnapshot #127

2018-03-15 Thread Apache Jenkins Server
See 


--
[...truncated 42.08 KB...]
[INFO] Downloading from central: 
https://repo.maven.apache.org/maven2/org/apache/ant/ant/1.8.1/ant-1.8.1.pom
[INFO] Downloaded from central: 
https://repo.maven.apache.org/maven2/org/apache/ant/ant/1.8.1/ant-1.8.1.pom 
(8.8 kB at 275 kB/s)
[INFO] Downloading from central: 
https://repo.maven.apache.org/maven2/org/apache/ant/ant-parent/1.8.1/ant-parent-1.8.1.pom
[INFO] Downloaded from central: 
https://repo.maven.apache.org/maven2/org/apache/ant/ant-parent/1.8.1/ant-parent-1.8.1.pom
 (4.3 kB at 74 kB/s)
[INFO] Downloading from central: 
https://repo.maven.apache.org/maven2/org/apache/maven/archetype/archetype-catalog/3.0.1/archetype-catalog-3.0.1.jar
[INFO] Downloading from central: 
https://repo.maven.apache.org/maven2/org/apache/maven/archetype/archetype-descriptor/3.0.1/archetype-descriptor-3.0.1.jar
[INFO] Downloading from central: 
https://repo.maven.apache.org/maven2/org/apache/maven/archetype/archetype-common/3.0.1/archetype-common-3.0.1.jar
[INFO] Downloading from central: 
https://repo.maven.apache.org/maven2/net/sourceforge/jchardet/jchardet/1.0/jchardet-1.0.jar
[INFO] Downloading from central: 
https://repo.maven.apache.org/maven2/org/codehaus/plexus/plexus-component-annotations/1.6/plexus-component-annotations-1.6.jar
[INFO] Downloaded from central: 
https://repo.maven.apache.org/maven2/org/apache/maven/archetype/archetype-catalog/3.0.1/archetype-catalog-3.0.1.jar
 (19 kB at 142 kB/s)
[INFO] Downloading from central: 
https://repo.maven.apache.org/maven2/dom4j/dom4j/1.6.1/dom4j-1.6.1.jar
[INFO] Downloaded from central: 
https://repo.maven.apache.org/maven2/org/apache/maven/archetype/archetype-descriptor/3.0.1/archetype-descriptor-3.0.1.jar
 (24 kB at 127 kB/s)
[INFO] Downloading from central: 
https://repo.maven.apache.org/maven2/xml-apis/xml-apis/1.0.b2/xml-apis-1.0.b2.jar
[INFO] Downloaded from central: 
https://repo.maven.apache.org/maven2/net/sourceforge/jchardet/jchardet/1.0/jchardet-1.0.jar
 (27 kB at 127 kB/s)
[INFO] Downloading from central: 
https://repo.maven.apache.org/maven2/jdom/jdom/1.0/jdom-1.0.jar
[INFO] Downloaded from central: 
https://repo.maven.apache.org/maven2/org/codehaus/plexus/plexus-component-annotations/1.6/plexus-component-annotations-1.6.jar
 (4.3 kB at 20 kB/s)
[INFO] Downloading from central: 
https://repo.maven.apache.org/maven2/org/apache/maven/maven-artifact/3.0/maven-artifact-3.0.jar
[INFO] Downloaded from central: 
https://repo.maven.apache.org/maven2/xml-apis/xml-apis/1.0.b2/xml-apis-1.0.b2.jar
 (109 kB at 734 kB/s)
[INFO] Downloading from central: 
https://repo.maven.apache.org/maven2/org/apache/maven/maven-settings-builder/3.0/maven-settings-builder-3.0.jar
[INFO] Downloaded from central: 
https://repo.maven.apache.org/maven2/org/apache/maven/maven-artifact/3.0/maven-artifact-3.0.jar
 (52 kB at 315 kB/s)
[INFO] Downloading from central: 
https://repo.maven.apache.org/maven2/commons-io/commons-io/2.2/commons-io-2.2.jar
[INFO] Downloaded from central: 
https://repo.maven.apache.org/maven2/org/apache/maven/maven-settings-builder/3.0/maven-settings-builder-3.0.jar
 (38 kB at 221 kB/s)
[INFO] Downloading from central: 
https://repo.maven.apache.org/maven2/org/codehaus/plexus/plexus-velocity/1.1.8/plexus-velocity-1.1.8.jar
[INFO] Downloaded from central: 
https://repo.maven.apache.org/maven2/dom4j/dom4j/1.6.1/dom4j-1.6.1.jar (314 kB 
at 1.6 MB/s)
[INFO] Downloading from central: 
https://repo.maven.apache.org/maven2/org/apache/velocity/velocity/1.7/velocity-1.7.jar
[INFO] Downloaded from central: 
https://repo.maven.apache.org/maven2/jdom/jdom/1.0/jdom-1.0.jar (153 kB at 871 
kB/s)
[INFO] Downloading from central: 
https://repo.maven.apache.org/maven2/commons-lang/commons-lang/2.4/commons-lang-2.4.jar
[INFO] Downloaded from central: 
https://repo.maven.apache.org/maven2/org/codehaus/plexus/plexus-velocity/1.1.8/plexus-velocity-1.1.8.jar
 (7.9 kB at 37 kB/s)
[INFO] Downloading from central: 
https://repo.maven.apache.org/maven2/org/apache/maven/wagon/wagon-provider-api/2.8/wagon-provider-api-2.8.jar
[INFO] Downloaded from central: 
https://repo.maven.apache.org/maven2/commons-io/commons-io/2.2/commons-io-2.2.jar
 (174 kB at 751 kB/s)
[INFO] Downloading from central: 
https://repo.maven.apache.org/maven2/org/codehaus/groovy/groovy/1.8.3/groovy-1.8.3.jar
[INFO] Downloaded from central: 
https://repo.maven.apache.org/maven2/org/apache/maven/archetype/archetype-common/3.0.1/archetype-common-3.0.1.jar
 (331 kB at 867 kB/s)
[INFO] Downloading from central: 
https://repo.maven.apache.org/maven2/antlr/antlr/2.7.7/antlr-2.7.7.jar
[INFO] Downloaded from central: 
https://repo.maven.apache.org/maven2/commons-lang/commons-lang/2.4/commons-lang-2.4.jar
 (262 kB at 919 kB/s)
[INFO] Downloading from central: 
https://repo.maven.apache.org/maven2/asm/asm/3.2/asm-3.2.jar
[INFO] Downloaded from central: 

[jira] [Work logged] (BEAM-3339) Create post-release testing of the nightly snapshots

2018-03-15 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3339?focusedWorklogId=80983&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-80983
 ]

ASF GitHub Bot logged work on BEAM-3339:


Author: ASF GitHub Bot
Created on: 15/Mar/18 21:17
Start Date: 15/Mar/18 21:17
Worklog Time Spent: 10m 
  Work Description: yifanzou commented on issue #4788: [BEAM-3339] Mobile 
gaming automation for Java nightly snapshot on core runners
URL: https://github.com/apache/beam/pull/4788#issuecomment-373525799
 
 
   Run Dataflow PostRelease




Issue Time Tracking
---

Worklog Id: (was: 80983)
Time Spent: 63h 50m  (was: 63h 40m)

> Create post-release testing of the nightly snapshots
> 
>
> Key: BEAM-3339
> URL: https://issues.apache.org/jira/browse/BEAM-3339
> Project: Beam
>  Issue Type: Improvement
>  Components: testing
>Reporter: Alan Myrvold
>Assignee: Jason Kuster
>Priority: Major
>  Time Spent: 63h 50m
>  Remaining Estimate: 0h
>
> The nightly java snapshots in 
> https://repository.apache.org/content/groups/snapshots/org/apache/beam should 
> be verified by following the 
> https://beam.apache.org/get-started/quickstart-java/ instructions, to verify 
> that the release is usable.





[jira] [Updated] (BEAM-3855) Add Go SDK support for protobuf coder

2018-03-15 Thread Henning Rohde (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3855?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Henning Rohde updated BEAM-3855:

Description: This JIRA is for something functional. We might want to use 
the coder registry for a more general solution, when implemented. 

> Add Go SDK support for protobuf coder
> -
>
> Key: BEAM-3855
> URL: https://issues.apache.org/jira/browse/BEAM-3855
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-go
>Reporter: Willy Lulciuc
>Assignee: Bill Neubauer
>Priority: Major
>
> This JIRA is for something functional. We might want to use the coder 
> registry for a more general solution, when implemented. 





[jira] [Assigned] (BEAM-3855) Add Go SDK support for protobuf coder

2018-03-15 Thread Henning Rohde (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3855?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Henning Rohde reassigned BEAM-3855:
---

Assignee: Bill Neubauer  (was: Henning Rohde)

> Add Go SDK support for protobuf coder
> -
>
> Key: BEAM-3855
> URL: https://issues.apache.org/jira/browse/BEAM-3855
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-go
>Reporter: Willy Lulciuc
>Assignee: Bill Neubauer
>Priority: Major
>






[jira] [Work logged] (BEAM-3339) Create post-release testing of the nightly snapshots

2018-03-15 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3339?focusedWorklogId=80982&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-80982
 ]

ASF GitHub Bot logged work on BEAM-3339:


Author: ASF GitHub Bot
Created on: 15/Mar/18 21:13
Start Date: 15/Mar/18 21:13
Worklog Time Spent: 10m 
  Work Description: yifanzou commented on issue #4788: [BEAM-3339] Mobile 
gaming automation for Java nightly snapshot on core runners
URL: https://github.com/apache/beam/pull/4788#issuecomment-373524828
 
 
   Run Seed Job




Issue Time Tracking
---

Worklog Id: (was: 80982)
Time Spent: 63h 40m  (was: 63.5h)

> Create post-release testing of the nightly snapshots
> 
>
> Key: BEAM-3339
> URL: https://issues.apache.org/jira/browse/BEAM-3339
> Project: Beam
>  Issue Type: Improvement
>  Components: testing
>Reporter: Alan Myrvold
>Assignee: Jason Kuster
>Priority: Major
>  Time Spent: 63h 40m
>  Remaining Estimate: 0h
>
> The nightly java snapshots in 
> https://repository.apache.org/content/groups/snapshots/org/apache/beam should 
> be verified by following the 
> https://beam.apache.org/get-started/quickstart-java/ instructions, to verify 
> that the release is usable.





[jira] [Assigned] (BEAM-3860) Use side input for Go text/bigquery IO

2018-03-15 Thread Henning Rohde (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3860?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Henning Rohde reassigned BEAM-3860:
---

Assignee: (was: Henning Rohde)

> Use side input for Go text/bigquery IO
> --
>
> Key: BEAM-3860
> URL: https://issues.apache.org/jira/browse/BEAM-3860
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-go
>Reporter: Henning Rohde
>Priority: Minor
>
> We use a GBK instead of side input for these IO to work around the lack of 
> such support on non-direct runners. We should revert that when side input on 
> FnAPI is implemented.





[jira] [Work logged] (BEAM-3060) Add performance tests for commonly used file-based I/O PTransforms

2018-03-15 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3060?focusedWorklogId=80980&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-80980
 ]

ASF GitHub Bot logged work on BEAM-3060:


Author: ASF GitHub Bot
Created on: 15/Mar/18 21:01
Start Date: 15/Mar/18 21:01
Worklog Time Spent: 10m 
  Work Description: chamikaramj commented on a change in pull request 
#4870: [BEAM-3060] Fixing mvn dependency issue when runnning filebasedIOIT t…
URL: https://github.com/apache/beam/pull/4870#discussion_r174931346
 
 

 ##
 File path: sdks/java/io/file-based-io-tests/pom.xml
 ##
 @@ -294,6 +294,12 @@
 ${apache.hadoop.version}
 runtime
 
+
+javax.xml.bind
+jaxb-api
 
 Review comment:
   Please add this to root level dependencyManagement so that we use the same 
version of jaxb-api across components.




Issue Time Tracking
---

Worklog Id: (was: 80980)
Time Spent: 2.5h  (was: 2h 20m)

> Add performance tests for commonly used file-based I/O PTransforms
> --
>
> Key: BEAM-3060
> URL: https://issues.apache.org/jira/browse/BEAM-3060
> Project: Beam
>  Issue Type: Test
>  Components: sdk-java-core
>Reporter: Chamikara Jayalath
>Assignee: Szymon Nieradka
>Priority: Major
>  Time Spent: 2.5h
>  Remaining Estimate: 0h
>
> We recently added a performance testing framework [1] that can be used to do 
> the following.
> (1) Execute Beam tests using PerfkitBenchmarker
> (2) Manage Kubernetes-based deployments of data stores.
> (3) Easily publish benchmark results. 
> I think it will be useful to add performance tests for commonly used 
> file-based I/O PTransforms using this framework. I suggest looking into the 
> following formats initially.
> (1) AvroIO
> (2) TextIO
> (3) Compressed text using TextIO
> (4) TFRecordIO
> It should be possible to run these tests easily for various Beam runners 
> (Direct, Dataflow, Flink, Spark, etc.) and file systems (GCS, local, HDFS, 
> etc.).
> In the initial version, tests can be made manually triggerable for PRs 
> through Jenkins. Later, we could make some of these tests run periodically 
> and publish benchmark results (to BigQuery) through PerfkitBenchmarker.
> [1] https://beam.apache.org/documentation/io/testing/





[jira] [Work logged] (BEAM-3817) Incompatible input encoding running Tornadoes example on dataflow

2018-03-15 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3817?focusedWorklogId=80979&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-80979
 ]

ASF GitHub Bot logged work on BEAM-3817:


Author: ASF GitHub Bot
Created on: 15/Mar/18 20:44
Start Date: 15/Mar/18 20:44
Worklog Time Spent: 10m 
  Work Description: herohde commented on issue #4840: [BEAM-3817] Switch Go 
SDK BQ write to not use side input
URL: https://github.com/apache/beam/pull/4840#issuecomment-373516905
 
 
   @robertwb Done. PTAL




Issue Time Tracking
---

Worklog Id: (was: 80979)
Time Spent: 50m  (was: 40m)

> Incompatible input encoding running Tornadoes example on dataflow
> -
>
> Key: BEAM-3817
> URL: https://issues.apache.org/jira/browse/BEAM-3817
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-go
>Reporter: Braden Bassingthwaite
>Assignee: Henning Rohde
>Priority: Major
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> Trying to run:
> go run tornadoes.go --output=:bbass.tornadoes --project  
> --runner dataflow --staging_location=gs://bbass/tornadoes 
> --worker_harness_container_image=gcr.io//beam/go
> Found here:
> [https://github.com/apache/beam/blob/master/sdks/go/examples/cookbook/tornadoes/tornadoes.go]
> I can run it locally but I get the error on Dataflow:
> (8fa522c2bb03a769): Workflow failed. Causes: (8fa522c2bb03ab04): Incompatible 
> input encoding. 
>  
> I built the worker_harness_container_image using:
> mvn clean install -DskipTests -Pbuild-containers 
> -Ddocker-repository-root=gcr.io//beam
>  
> Thanks!
>  
> Very excited to start using the golang beam sdk! great work!
>  





[jira] [Commented] (BEAM-3858) Data from JdbcIO.read() cannot pass to next transform on ApexRunner

2018-03-15 Thread JIRA

[ 
https://issues.apache.org/jira/browse/BEAM-3858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16401085#comment-16401085
 ] 

Jean-Baptiste Onofré commented on BEAM-3858:


Let me investigate to identify if the issue is in the IO or actually in the 
runner (in the DoFn support).

> Data from JdbcIO.read() cannot pass to next transform on ApexRunner
> ---
>
> Key: BEAM-3858
> URL: https://issues.apache.org/jira/browse/BEAM-3858
> Project: Beam
>  Issue Type: Bug
>  Components: io-java-jdbc, runner-apex
>Affects Versions: 2.3.0
> Environment: ubuntu16.04
>Reporter: huangjianhuang
>Assignee: Jean-Baptiste Onofré
>Priority: Major
>
> {code:java}
> public static void testJDBCRead(Pipeline pipeline) {
>     System.out.println("in testJDBCRead()");
>     pipeline.apply(JdbcIO.<String>read()
>             .withDataSourceConfiguration(JdbcIO.DataSourceConfiguration.create(
>                     "com.mysql.jdbc.Driver",
>                     "jdbc:mysql://localhost:3307/libra")
>                     .withUsername("root")
>                     .withPassword("123456"))
>             .withQuery("SELECT * FROM o_flow_account_login limit 3")
>             .withCoder(StringUtf8Coder.of())
>             .withRowMapper(new JdbcIO.RowMapper<String>() {
>                 public String mapRow(ResultSet resultSet) throws Exception {
>                     System.out.println("maprow");
>                     return "tmp";
>                 }
>             })
>     )
>             .apply(ParDo.of(new DoFn<String, String>() {
>                 @ProcessElement
>                 public void process(ProcessContext context) {
>                     System.out.println("??");
>                     context.output(" ");
>                 }
>             }));
> }
> {code}
> On DirectRunner or FlinkRunner, screen shows:
> {code:java}
> maprow
> maprow
> maprow
> ??
> ??
> ??
> {code}
> however on ApexRunner, screen only shows:
> {code:java}
> maprow
> maprow
> maprow
> {code}
>  





[jira] [Work logged] (BEAM-3500) JdbcIO: Improve connection management

2018-03-15 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3500?focusedWorklogId=80977&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-80977
 ]

ASF GitHub Bot logged work on BEAM-3500:


Author: ASF GitHub Bot
Created on: 15/Mar/18 20:35
Start Date: 15/Mar/18 20:35
Worklog Time Spent: 10m 
  Work Description: jbonofre commented on a change in pull request #4461: 
[BEAM-3500] "Attach" JDBC connection to the bundle and add DataSourceFactory 
allowing full control of the way the DataSource is created
URL: https://github.com/apache/beam/pull/4461#discussion_r174923818
 
 

 ##
 File path: 
sdks/java/io/jdbc/src/main/java/org/apache/beam/sdk/io/jdbc/JdbcIO.java
 ##
 @@ -327,8 +332,25 @@ DataSource buildDatasource() throws Exception{
 if (getConnectionProperties() != null && 
getConnectionProperties().get() != null) {
   
basicDataSource.setConnectionProperties(getConnectionProperties().get());
 }
-return basicDataSource;
+current = basicDataSource;
   }
+
+  // wrapping the datasource as a pooling datasource
+  DataSourceConnectionFactory connectionFactory = new 
DataSourceConnectionFactory(current);
+  PoolableConnectionFactory poolableConnectionFactory =
+  new PoolableConnectionFactory(connectionFactory, null);
+  GenericObjectPoolConfig poolConfig = new GenericObjectPoolConfig();
+  poolConfig.setMaxTotal(1);
+  poolConfig.setMinIdle(0);
+  poolConfig.setMinEvictableIdleTimeMillis(1);
 
 Review comment:
   Basically, you mean trying to implement a test to illustrate/test ?




Issue Time Tracking
---

Worklog Id: (was: 80977)
Time Spent: 5h 20m  (was: 5h 10m)

> JdbcIO: Improve connection management
> -
>
> Key: BEAM-3500
> URL: https://issues.apache.org/jira/browse/BEAM-3500
> Project: Beam
>  Issue Type: Improvement
>  Components: io-java-jdbc
>Affects Versions: 2.2.0
>Reporter: Pawel Bartoszek
>Assignee: Jean-Baptiste Onofré
>Priority: Major
>  Time Spent: 5h 20m
>  Remaining Estimate: 0h
>
> The JdbcIO write DoFn acquires a connection in {{@Setup}} and releases it in 
> {{@Teardown}}, which means that the connection might stay open for days 
> in a streaming job. Keeping a single connection open for so long is 
> risky, as it is exposed to database issues, network issues, etc.
> *Taking connection from the pool when it is actually needed*
> I suggest that the connection be taken from the connection pool in the 
> {{executeBatch}} method and released when the batch is flushed. This will 
> allow the pool to take care of any unhealthy connections that are returned.
> *Make JdbcIO accept data source factory*
>  It would be nice if JdbcIO accepted a DataSourceFactory rather than the 
> DataSource itself. I am saying that because the sink checks whether the 
> DataSource implements the `Serializable` interface, which makes it impossible 
> to pass BasicDataSource (used internally by the sink), as it doesn't implement 
> this interface. Something like:
> {code:java}
> interface DataSourceFactory extends Serializable{
>  DataSource create();
> }
> {code}





Jenkins build became unstable: beam_PostCommit_Java_MavenInstall #6213

2018-03-15 Thread Apache Jenkins Server
See 




[jira] [Assigned] (BEAM-3860) Use side input for Go text/bigquery IO

2018-03-15 Thread Henning Rohde (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3860?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Henning Rohde reassigned BEAM-3860:
---

Assignee: Henning Rohde

> Use side input for Go text/bigquery IO
> --
>
> Key: BEAM-3860
> URL: https://issues.apache.org/jira/browse/BEAM-3860
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-go
>Reporter: Henning Rohde
>Assignee: Henning Rohde
>Priority: Minor
>
> We use a GBK instead of side input for these IO to work around the lack of 
> such support on non-direct runners. We should revert that when side input on 
> FnAPI is implemented.





[jira] [Created] (BEAM-3860) Use side input for Go text/bigquery IO

2018-03-15 Thread Henning Rohde (JIRA)
Henning Rohde created BEAM-3860:
---

 Summary: Use side input for Go text/bigquery IO
 Key: BEAM-3860
 URL: https://issues.apache.org/jira/browse/BEAM-3860
 Project: Beam
  Issue Type: Improvement
  Components: sdk-go
Reporter: Henning Rohde


We use a GBK instead of side input for these IO to work around the lack of such 
support on non-direct runners. We should revert that when side input on FnAPI 
is implemented.





Build failed in Jenkins: beam_PostCommit_Python_ValidatesRunner_Dataflow #1118

2018-03-15 Thread Apache Jenkins Server
See 


Changes:

[sidhom] Add Java bounded read overrides

--
[...truncated 123.38 KB...]
  {
"key": "fn", 
"label": "Transform Function", 
"namespace": "apache_beam.transforms.core.CallableWrapperDoFn", 
"type": "STRING", 
"value": ""
  }, 
  {
"key": "fn", 
"label": "Transform Function", 
"namespace": "apache_beam.transforms.core.ParDo", 
"shortValue": "CallableWrapperDoFn", 
"type": "STRING", 
"value": "apache_beam.transforms.core.CallableWrapperDoFn"
  }
], 
"non_parallel_inputs": {}, 
"output_info": [
  {
"encoding": {
  "@type": "kind:windowed_value", 
  "component_encodings": [
{
  "@type": 
"FastPrimitivesCoder$eNprYEpOLEhMzkiNT0pNzNVLzk9JLSqGUlxuicUlAUWZuZklmWWpxc4gQa5CBs3GQsbaQqZQ/vi0xJycpMTk7Hiw+kJmPEYFZCZn56RCjWABGsFaW8iWVJykBwDlGS3/",
 
  "component_encodings": [
{
  "@type": 
"FastPrimitivesCoder$eNprYEpOLEhMzkiNT0pNzNVLzk9JLSqGUlxuicUlAUWZuZklmWWpxc4gQa5CBs3GQsbaQqZQ/vi0xJycpMTk7Hiw+kJmPEYFZCZn56RCjWABGsFaW8iWVJykBwDlGS3/",
 
  "component_encodings": []
}, 
{
  "@type": 
"FastPrimitivesCoder$eNprYEpOLEhMzkiNT0pNzNVLzk9JLSqGUlxuicUlAUWZuZklmWWpxc4gQa5CBs3GQsbaQqZQ/vi0xJycpMTk7Hiw+kJmPEYFZCZn56RCjWABGsFaW8iWVJykBwDlGS3/",
 
  "component_encodings": []
}
  ], 
  "is_pair_like": true
}, 
{
  "@type": "kind:global_window"
}
  ], 
  "is_wrapper": true
}, 
"output_name": "out", 
"user_name": "assert_that/Unkey.out"
  }
], 
"parallel_input": {
  "@type": "OutputReference", 
  "output_name": "out", 
  "step_name": "s12"
}, 
"serialized_fn": "", 
"user_name": "assert_that/Unkey"
  }
}, 
{
  "kind": "ParallelDo", 
  "name": "s14", 
  "properties": {
"display_data": [
  {
"key": "fn", 
"label": "Transform Function", 
"namespace": "apache_beam.transforms.core.CallableWrapperDoFn", 
"type": "STRING", 
"value": "_equal"
  }, 
  {
"key": "fn", 
"label": "Transform Function", 
"namespace": "apache_beam.transforms.core.ParDo", 
"shortValue": "CallableWrapperDoFn", 
"type": "STRING", 
"value": "apache_beam.transforms.core.CallableWrapperDoFn"
  }
], 
"non_parallel_inputs": {}, 
"output_info": [
  {
"encoding": {
  "@type": "kind:windowed_value", 
  "component_encodings": [
{
  "@type": 
"FastPrimitivesCoder$eNprYEpOLEhMzkiNT0pNzNVLzk9JLSqGUlxuicUlAUWZuZklmWWpxc4gQa5CBs3GQsbaQqZQ/vi0xJycpMTk7Hiw+kJmPEYFZCZn56RCjWABGsFaW8iWVJykBwDlGS3/",
 
  "component_encodings": [
{
  "@type": 
"FastPrimitivesCoder$eNprYEpOLEhMzkiNT0pNzNVLzk9JLSqGUlxuicUlAUWZuZklmWWpxc4gQa5CBs3GQsbaQqZQ/vi0xJycpMTk7Hiw+kJmPEYFZCZn56RCjWABGsFaW8iWVJykBwDlGS3/",
 
  "component_encodings": []
}, 
{
  "@type": 
"FastPrimitivesCoder$eNprYEpOLEhMzkiNT0pNzNVLzk9JLSqGUlxuicUlAUWZuZklmWWpxc4gQa5CBs3GQsbaQqZQ/vi0xJycpMTk7Hiw+kJmPEYFZCZn56RCjWABGsFaW8iWVJykBwDlGS3/",
 
  "component_encodings": []
}
  ], 
  "is_pair_like": true
}, 
{
  "@type": "kind:global_window"
}
  ], 
  "is_wrapper": true
}, 
"output_name": "out", 
"user_name": "assert_that/Match.out"
  }
], 
"parallel_input": {
  "@type": "OutputReference", 
  "output_name": "out", 
  "step_name": "s13"
}, 
"serialized_fn": "", 
"user_name": "assert_that/Match"
  }
}
  ], 
  "type": "JOB_TYPE_BATCH"
}
root: INFO: Create job: 
root: INFO: Created job with id: [2018-03-15_12_17_41-16077882345294994391]
root: INFO: To access the Dataflow monitoring console, please navigate to 
https://console.cloud.google.com/dataflow/jobsDetail/locations/us-central1/jobs/2018-03-15_12_17_41-16077882345294994391?project=apache-beam-testing
root: INFO: Job 2018-03-15_12_17_41-16077882345294994391 is in state 

Jenkins build became unstable: beam_PostCommit_Java_ValidatesRunner_Dataflow #5158

2018-03-15 Thread Apache Jenkins Server
See 




[jira] [Comment Edited] (BEAM-3820) SolrIO: Allow changing batchSize for writes

2018-03-15 Thread Tim Robertson (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-3820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16400372#comment-16400372
 ] 

Tim Robertson edited comment on BEAM-3820 at 3/15/18 7:11 PM:
--

Thanks for engaging in this discussion [~jkff]. I'll try and provide context 
and justification here in the hope of changing your mind.

On this specific issue I am preparing pipelines to test ingestion speed of SOLR 
from Avro files using Beam on Spark. I am preparing this with configuration to 
run on different hardware - preproduction and production. The SOLR schema is 
approx 200 fields of text and I'm currently ingesting at 250M docs/hr on the 
preproduction cluster.

So far, the tuning I have performed:
h5. Spark

Obvious stuff, like the number of executors (JVMs), the number of tasks each 
runs concurrently, and the memory for each JVM.
h5. Parallelism

Because Beam does not allow a repartition of the underlying Spark RDD, the 
total parallelism of how the blocks are read from {{Avro}} can be tuned using 
e.g. {{--conf spark.default.parallelism=1000}}. Combined with the number of 
concurrent tasks you allow Spark to run, this controls the number of concurrent 
clients {{SOLR}} sees. Without this, the Avro reader can present fewer tasks 
than you wish, and failures can result in large amounts being retried 
(BEAM-3848).
h5. SOLR server tuning

Running with defaults can result in CPU resource starvation on underpowered 
machines so I've tuned the following:
 - Increasing {{solr.hdfs.nrtcachingdirectory.maxmergesizemb}} and 
{{solr.hdfs.nrtcachingdirectory.maxcachedmb}} to reduce File IO. Defaults can 
lead to very high load on the {{HDFS NameNode}}
 - Increasing {{ramBufferSizeMB}} to {{1024MB}} to further reduce File IO.
 - Reducing the {{maxIndexingThreads}} and relaxing the aggressiveness of the 
{{TieredMergePolicy}} to reduce CPU pressure on the server.
 - Heap, off-heap memory and the HDFS cache size

h5. The problem

From my understanding I believe the {{SolrIO}} works as follows.
 - It accumulates a batch (hard coded to 1000) of {{SolrInputDocument}} which 
it then passes to the underlying {{SolrJ}} client which is an instance of a 
{{CloudSolrClient}}.
 - The {{CloudSolrClient}} will accept the batch, then negotiate with ZooKeeper 
to identify the current shard leader for each document based on the routing 
defined in the Solr collection (implicit routing in my case).
 - The batch of 1000 is then split into sub collections destined for the target 
shards. They are sent in parallel to the servers hosting each necessary shard 
leader.
 - Each leader then writes to the SOLR TLog for the shard and forwards to a 
replica (if any are configured) before returning {{HTTP 200}} and confirming 
receipt of the batch.
 - Once all confirmations are successfully received the {{CloudSolrClient}} 
returns to the {{SolrIO}}.

If my understanding is correct it is important to note that the Beam batch of 
1000 docs is actually split up based on the configured number of SOLR shards, 
servers and based on the routing characteristics of the documents contained. 
The comments in the {{SolrIO}} suggest that the author might have assumed the 
batch would be a single http call to one server:
{quote}{{// 1000 for batch size is good enough in many cases,}}
 {{// ex: if document size is large, around 10KB, the request's size will be 
around 10MB}}
 {{// if document seize is small, around 1KB, the request's size will be around 
1MB}}
{quote}
I don't see how Beam could ever really optimise for this, though. There are 
cases (like mine) where, as the pipeline developer, I know my routing strategy, 
my network bandwidth and my target environments. I would run with different 
profiles for each environment to control the true batch sizes observed at the 
SOLR server.
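
To make the sizing argument concrete, here is a back-of-the-envelope sketch. It assumes documents are spread evenly over the shards a batch touches, which real routing will not guarantee; the class and method names are hypothetical, and the 10KB document size is the figure from the quoted comment.

```java
// Sketch only: back-of-the-envelope sizing for per-shard requests.
// Assumes documents are spread evenly over the shards a batch touches.
final class RequestSizeSketch {

  /** Average bytes per shard request for one client-side batch. */
  static long avgRequestBytes(int batchSize, long docBytes, int shardsTouched) {
    requirePositive(batchSize, docBytes, shardsTouched);
    return (batchSize * docBytes) / shardsTouched;
  }

  /** Client-side batch size needed to reach a target request size at each shard. */
  static long batchSizeFor(long targetRequestBytes, long docBytes, int shardsTouched) {
    requirePositive(targetRequestBytes, docBytes, shardsTouched);
    return (targetRequestBytes * shardsTouched) / docBytes;
  }

  private static void requirePositive(long... values) {
    for (long v : values) {
      if (v <= 0) {
        throw new IllegalArgumentException("all arguments must be positive");
      }
    }
  }

  public static void main(String[] args) {
    // 1000 docs of ~10KB: about 10MB as one request, about 2.5MB each over 4 shards.
    System.out.println(avgRequestBytes(1000, 10_000L, 1));
    System.out.println(avgRequestBytes(1000, 10_000L, 4));
    // Batch size that would put about 10MB on each of 4 shards.
    System.out.println(batchSizeFor(10_000_000L, 10_000L, 4));
  }
}
```

The second method is the per-environment knob argued for here: with more shards, a larger client-side batch is needed to see the same request size at each server.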

 

 


was (Author: timrobertson100):
Thanks for engaging in this discussion [~jkff]. I'll try and provide context 
and justification here in the hope of changing your mind.

On this specific issue I am preparing pipelines to test ingestion speed of SOLR 
from Avro files using Beam on Spark. I am preparing this with configuration to 
run on different hardware - preproduction and production. The SOLR schema is 
approx 200 fields of text and I'm currently ingesting at 250M docs/hr on the 
preproduction cluster.

So far, the tuning I have performed:
h5. Spark

Obvious stuff, like the number of executors (JVMs), the number of tasks each 
runs concurrently, and the memory for each JVM.
h5. Parallelism

Because Beam does not allow a repartition of the underlying Spark RDD, the 
total parallelism of how the blocks are read from {{Avro}} can be tuned using 
e.g. {{--conf spark.default.parallelism=1000}}. Combined with the number of 
concurrent tasks you allow Spark to run, this controls the number of concurrent 
clients {{SOLR}} sees. Without this the Avro reader can present fewer tasks 
than you wish and failures can result in large amounts being 

[jira] [Work logged] (BEAM-3327) Add abstractions to manage Environment Instance lifecycles.

2018-03-15 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3327?focusedWorklogId=80948&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-80948
 ]

ASF GitHub Bot logged work on BEAM-3327:


Author: ASF GitHub Bot
Created on: 15/Mar/18 19:02
Start Date: 15/Mar/18 19:02
Worklog Time Spent: 10m 
  Work Description: tgroh commented on issue #4751: [BEAM-3327] Implement 
simple Docker container manager
URL: https://github.com/apache/beam/pull/4751#issuecomment-373488164
 
 
   run java gradle precommit


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 80948)
Time Spent: 4h 40m  (was: 4.5h)

> Add abstractions to manage Environment Instance lifecycles.
> ---
>
> Key: BEAM-3327
> URL: https://issues.apache.org/jira/browse/BEAM-3327
> Project: Beam
>  Issue Type: New Feature
>  Components: runner-core
>Reporter: Thomas Groh
>Assignee: Ben Sidhom
>Priority: Major
>  Labels: portability
>  Time Spent: 4h 40m
>  Remaining Estimate: 0h
>
> This permits remote stage execution for arbitrary environments



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-2898) Flink supports chaining/fusion of single-SDK stages

2018-03-15 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-2898?focusedWorklogId=80946&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-80946
 ]

ASF GitHub Bot logged work on BEAM-2898:


Author: ASF GitHub Bot
Created on: 15/Mar/18 19:01
Start Date: 15/Mar/18 19:01
Worklog Time Spent: 10m 
  Work Description: tgroh commented on a change in pull request #4783: 
[BEAM-2898] Support Impulse transforms in Flink batch runner
URL: https://github.com/apache/beam/pull/4783#discussion_r174896008
 
 

 ##
 File path: 
runners/flink/src/main/java/org/apache/beam/runners/flink/translation/wrappers/ImpulseInputFormat.java
 ##
 @@ -0,0 +1,96 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.runners.flink.translation.wrappers;
+
+import org.apache.beam.sdk.util.WindowedValue;
+import org.apache.flink.api.common.io.DefaultInputSplitAssigner;
+import org.apache.flink.api.common.io.RichInputFormat;
+import org.apache.flink.api.common.io.statistics.BaseStatistics;
+import org.apache.flink.configuration.Configuration;
+import org.apache.flink.core.io.GenericInputSplit;
+import org.apache.flink.core.io.InputSplitAssigner;
+
+/** Flink input format that implements impulses. */
+public class ImpulseInputFormat extends RichInputFormat<WindowedValue<byte[]>, GenericInputSplit> {
+
+  // Whether the input format has remaining output that has not yet been read.
+  private boolean availableOutput = false;
+
+  public ImpulseInputFormat() {}
+
+  @Override
+  public void configure(Configuration configuration) {
+    // Do nothing.
+  }
+
+  @Override
+  public BaseStatistics getStatistics(BaseStatistics baseStatistics) {
+    return new BaseStatistics() {
+      @Override
+      public long getTotalInputSize() {
+        return 1;
+      }
+
+      @Override
+      public long getNumberOfRecords() {
+        return 1;
+      }
+
+      @Override
+      public float getAverageRecordWidth() {
+        return 1;
+      }
+    };
+  }
+
+  @Override
+  public GenericInputSplit[] createInputSplits(int numSplits) {
+    // Always return a single split because only one global "impulse" will ever be sent.
+    return new GenericInputSplit[]{new GenericInputSplit(1, 1)};
+  }
+
+  @Override
+  public InputSplitAssigner getInputSplitAssigner(GenericInputSplit[] genericInputSplits) {
+    return new DefaultInputSplitAssigner(genericInputSplits);
+  }
+
+  @Override
+  public void open(GenericInputSplit genericInputSplit) {
+    availableOutput = true;
+  }
+
+  @Override
+  public boolean reachedEnd() {
+    return !availableOutput;
+  }
+
+  @Override
+  public WindowedValue<byte[]> nextRecord(WindowedValue<byte[]> windowedValue) {
+    availableOutput = false;
 
 Review comment:
   This could be a `checkArgument` before you mutate the state (minor 
preference, no strong feelings)
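
A minimal sketch of what this suggestion might look like, using a stand-in class rather than the Flink input format. The quoted hunk is cut off after the assignment, so the rest of the real method is unknown; the class name and the returned value are hypothetical, and a plain `if`/`throw` stands in for a Guava-style precondition call.

```java
// Sketch only: a one-shot source that validates its state before mutating it.
// Stand-in for the pattern under review; not the actual ImpulseInputFormat.
final class OneShotSourceSketch {

  // Whether there is remaining output that has not yet been read.
  private boolean availableOutput = false;

  void open() {
    availableOutput = true;
  }

  boolean reachedEnd() {
    return !availableOutput;
  }

  String nextRecord() {
    // Check first, so that a failed call leaves the state untouched.
    if (!availableOutput) {
      throw new IllegalStateException("nextRecord() called with no output available");
    }
    availableOutput = false;
    return "impulse"; // placeholder for the single emitted element
  }

  public static void main(String[] args) {
    OneShotSourceSketch source = new OneShotSourceSketch();
    source.open();
    System.out.println(source.nextRecord());
    System.out.println(source.reachedEnd());
  }
}
```

Putting the check ahead of the assignment means a misuse is reported without first clearing the flag, which is the ordering the comment asks for.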




Issue Time Tracking
---

Worklog Id: (was: 80946)
Time Spent: 3h 10m  (was: 3h)

> Flink supports chaining/fusion of single-SDK stages
> ---
>
> Key: BEAM-2898
> URL: https://issues.apache.org/jira/browse/BEAM-2898
> Project: Beam
>  Issue Type: Sub-task
>  Components: runner-flink
>Reporter: Henning Rohde
>Priority: Major
>  Labels: portability
>  Time Spent: 3h 10m
>  Remaining Estimate: 0h
>
> The Fn API supports fused stages, which avoids unnecessarily round-tripping 
> the data over the Fn API between stages. The Flink runner should use that 
> capability for better performance.





[jira] [Work logged] (BEAM-2898) Flink supports chaining/fusion of single-SDK stages

2018-03-15 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-2898?focusedWorklogId=80939&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-80939
 ]

ASF GitHub Bot logged work on BEAM-2898:


Author: ASF GitHub Bot
Created on: 15/Mar/18 19:01
Start Date: 15/Mar/18 19:01
Worklog Time Spent: 10m 
  Work Description: tgroh commented on a change in pull request #4783: 
[BEAM-2898] Support Impulse transforms in Flink batch runner
URL: https://github.com/apache/beam/pull/4783#discussion_r173033660
 
 

 ##
 File path: 
runners/core-construction-java/src/main/java/org/apache/beam/runners/core/construction/graph/GreedyPipelineFuser.java
 ##
 @@ -72,10 +72,10 @@ private GreedyPipelineFuser(Pipeline p) {
   /**
    * Fuses a {@link Pipeline} into a collection of {@link ExecutableStage}s.
    *
-   * This fuser expects each PTransform to have exactly one input. This means that pipelines must
-   * use Impulse/ParDo transformations rather than read nodes. The utilities in
-   * {@link org.apache.beam.runners.core.construction.JavaReadViaImpulse} can be used to translate
-   * non-compliant pipelines.
+   * This fuser expects each PTransform which has no inputs to have an associated environment.
 
 Review comment:
   s/an associated environment/no associated environment




Issue Time Tracking
---

Worklog Id: (was: 80939)
Time Spent: 2.5h  (was: 2h 20m)

> Flink supports chaining/fusion of single-SDK stages
> ---
>
> Key: BEAM-2898
> URL: https://issues.apache.org/jira/browse/BEAM-2898
> Project: Beam
>  Issue Type: Sub-task
>  Components: runner-flink
>Reporter: Henning Rohde
>Priority: Major
>  Labels: portability
>  Time Spent: 2.5h
>  Remaining Estimate: 0h
>
> The Fn API supports fused stages, which avoids unnecessarily round-tripping 
> the data over the Fn API between stages. The Flink runner should use that 
> capability for better performance.





[jira] [Work logged] (BEAM-2898) Flink supports chaining/fusion of single-SDK stages

2018-03-15 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-2898?focusedWorklogId=80945&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-80945
 ]

ASF GitHub Bot logged work on BEAM-2898:


Author: ASF GitHub Bot
Created on: 15/Mar/18 19:01
Start Date: 15/Mar/18 19:01
Worklog Time Spent: 10m 
  Work Description: tgroh commented on a change in pull request #4783: 
[BEAM-2898] Support Impulse transforms in Flink batch runner
URL: https://github.com/apache/beam/pull/4783#discussion_r174896407
 
 

 ##
 File path: sdks/java/core/build.gradle
 ##
 @@ -66,6 +66,7 @@ dependencies {
   shadow library.java.junit
   shadow "org.tukaani:xz:1.5"
   shadowTest project(":model:fn-execution").sourceSets.test.output
+  shadowTest project(path: ":runners:core-construction-java", configuration: "shadow")
 
 Review comment:
   What introduced this edge?
   
   This also can't be represented in maven, and right now that worries me 
(though hopefully less soon)




Issue Time Tracking
---

Worklog Id: (was: 80945)
Time Spent: 3h  (was: 2h 50m)

> Flink supports chaining/fusion of single-SDK stages
> ---
>
> Key: BEAM-2898
> URL: https://issues.apache.org/jira/browse/BEAM-2898
> Project: Beam
>  Issue Type: Sub-task
>  Components: runner-flink
>Reporter: Henning Rohde
>Priority: Major
>  Labels: portability
>  Time Spent: 3h
>  Remaining Estimate: 0h
>
> The Fn API supports fused stages, which avoids unnecessarily round-tripping 
> the data over the Fn API between stages. The Flink runner should use that 
> capability for better performance.





[jira] [Work logged] (BEAM-2898) Flink supports chaining/fusion of single-SDK stages

2018-03-15 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-2898?focusedWorklogId=80944&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-80944
 ]

ASF GitHub Bot logged work on BEAM-2898:


Author: ASF GitHub Bot
Created on: 15/Mar/18 19:01
Start Date: 15/Mar/18 19:01
Worklog Time Spent: 10m 
  Work Description: tgroh commented on a change in pull request #4783: 
[BEAM-2898] Support Impulse transforms in Flink batch runner
URL: https://github.com/apache/beam/pull/4783#discussion_r174893743
 
 

 ##
 File path: 
runners/core-construction-java/src/main/java/org/apache/beam/runners/core/construction/graph/GreedyPipelineFuser.java
 ##
 @@ -69,6 +69,15 @@ private GreedyPipelineFuser(Pipeline p) {
 fusePipeline(groupSiblings(rootConsumers));
   }
 
+  /**
+   * Fuses a {@link Pipeline} into a collection of {@link ExecutableStage}s.
+   *
+   * This fuser expects each ExecutableStage to have exactly one input. This means that pipelines
+   * must be rooted at Impulse, or other runner-executed primitive transforms, instead of primitive
+   * Read nodes. The utilities in
+   * {@link org.apache.beam.runners.core.construction.JavaReadViaImpulse} can be used to translate
+   * non-compliant pipelines.
 
 Review comment:
   This does kind of have an associated `TODO` for unbounded reads; 
https://issues.apache.org/jira/browse/BEAM-3859 is the (just-authored) issue to 
link against.
   
   'can be used to translate non-compliant pipelines -> can be used to convert 
bounded pipelines using the `Read` primitive.'




Issue Time Tracking
---

Worklog Id: (was: 80944)

> Flink supports chaining/fusion of single-SDK stages
> ---
>
> Key: BEAM-2898
> URL: https://issues.apache.org/jira/browse/BEAM-2898
> Project: Beam
>  Issue Type: Sub-task
>  Components: runner-flink
>Reporter: Henning Rohde
>Priority: Major
>  Labels: portability
>  Time Spent: 2h 50m
>  Remaining Estimate: 0h
>
> The Fn API supports fused stages, which avoids unnecessarily round-tripping 
> the data over the Fn API between stages. The Flink runner should use that 
> capability for better performance.





[jira] [Work logged] (BEAM-2898) Flink supports chaining/fusion of single-SDK stages

2018-03-15 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-2898?focusedWorklogId=80941&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-80941
 ]

ASF GitHub Bot logged work on BEAM-2898:


Author: ASF GitHub Bot
Created on: 15/Mar/18 19:01
Start Date: 15/Mar/18 19:01
Worklog Time Spent: 10m 
  Work Description: tgroh commented on a change in pull request #4783: 
[BEAM-2898] Support Impulse transforms in Flink batch runner
URL: https://github.com/apache/beam/pull/4783#discussion_r174893625
 
 

 ##
 File path: 
runners/core-construction-java/src/main/java/org/apache/beam/runners/core/construction/graph/GreedyPipelineFuser.java
 ##
 @@ -69,6 +69,15 @@ private GreedyPipelineFuser(Pipeline p) {
 fusePipeline(groupSiblings(rootConsumers));
   }
 
+  /**
+   * Fuses a {@link Pipeline} into a collection of {@link ExecutableStage}s.
 
 Review comment:
   `{@link ExecutableStage ExecutableStages}` is our normal style.




Issue Time Tracking
---

Worklog Id: (was: 80941)
Time Spent: 2h 40m  (was: 2.5h)

> Flink supports chaining/fusion of single-SDK stages
> ---
>
> Key: BEAM-2898
> URL: https://issues.apache.org/jira/browse/BEAM-2898
> Project: Beam
>  Issue Type: Sub-task
>  Components: runner-flink
>Reporter: Henning Rohde
>Priority: Major
>  Labels: portability
>  Time Spent: 2h 40m
>  Remaining Estimate: 0h
>
> The Fn API supports fused stages, which avoids unnecessarily round-tripping 
> the data over the Fn API between stages. The Flink runner should use that 
> capability for better performance.





[jira] [Work logged] (BEAM-2898) Flink supports chaining/fusion of single-SDK stages

2018-03-15 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-2898?focusedWorklogId=80940&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-80940
 ]

ASF GitHub Bot logged work on BEAM-2898:


Author: ASF GitHub Bot
Created on: 15/Mar/18 19:01
Start Date: 15/Mar/18 19:01
Worklog Time Spent: 10m 
  Work Description: tgroh commented on a change in pull request #4783: 
[BEAM-2898] Support Impulse transforms in Flink batch runner
URL: https://github.com/apache/beam/pull/4783#discussion_r173033901
 
 

 ##
 File path: 
runners/core-construction-java/src/main/java/org/apache/beam/runners/core/construction/graph/GreedyPipelineFuser.java
 ##
 @@ -72,10 +72,10 @@ private GreedyPipelineFuser(Pipeline p) {
   /**
    * Fuses a {@link Pipeline} into a collection of {@link ExecutableStage}s.
    *
-   * This fuser expects each PTransform to have exactly one input. This means that pipelines must
-   * use Impulse/ParDo transformations rather than read nodes. The utilities in
-   * {@link org.apache.beam.runners.core.construction.JavaReadViaImpulse} can be used to translate
-   * non-compliant pipelines.
+   * This fuser expects each PTransform which has no inputs to have an associated environment.
+   * This means that pipelines must use Impulse/ParDo transformations rather than read nodes. The
 
 Review comment:
   "This means that pipelines must be rooted at Impulse, or other 
runner-executed primitive transforms, instead of primitive Read nodes."




Issue Time Tracking
---

Worklog Id: (was: 80940)
Time Spent: 2h 40m  (was: 2.5h)

> Flink supports chaining/fusion of single-SDK stages
> ---
>
> Key: BEAM-2898
> URL: https://issues.apache.org/jira/browse/BEAM-2898
> Project: Beam
>  Issue Type: Sub-task
>  Components: runner-flink
>Reporter: Henning Rohde
>Priority: Major
>  Labels: portability
>  Time Spent: 2h 40m
>  Remaining Estimate: 0h
>
> The Fn API supports fused stages, which avoids unnecessarily round-tripping 
> the data over the Fn API between stages. The Flink runner should use that 
> capability for better performance.





[jira] [Work logged] (BEAM-2898) Flink supports chaining/fusion of single-SDK stages

2018-03-15 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-2898?focusedWorklogId=80942&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-80942
 ]

ASF GitHub Bot logged work on BEAM-2898:


Author: ASF GitHub Bot
Created on: 15/Mar/18 19:01
Start Date: 15/Mar/18 19:01
Worklog Time Spent: 10m 
  Work Description: tgroh commented on a change in pull request #4783: 
[BEAM-2898] Support Impulse transforms in Flink batch runner
URL: https://github.com/apache/beam/pull/4783#discussion_r174895596
 
 

 ##
 File path: 
runners/core-construction-java/src/test/java/org/apache/beam/runners/core/construction/JavaReadViaImpulseTest.java
 ##
 @@ -89,14 +96,50 @@ public void testSplitSourceFn() {
   public void testReadFromSourceFn() {
     BoundedSource<Long> source = CountingSource.upTo(10L);
     PCollection<BoundedSource<Long>> sourcePC =
-        (PCollection)
-            p.apply(Create.of(source).withCoder(SerializableCoder.of((Class) BoundedSource.class)));
-    PCollection<Long> elems = sourcePC.apply(ParDo.of(new ReadFromBoundedSourceFn<>()));
+      p.apply(Create.of(source)
+          .withCoder(new JavaReadViaImpulse.BoundedSourceCoder<>()));
+    PCollection<Long> elems = sourcePC.apply(ParDo.of(new ReadFromBoundedSourceFn<>()))
+        .setCoder(VarLongCoder.of());
 
     PAssert.that(elems).containsInAnyOrder(0L, 9L, 8L, 1L, 2L, 7L, 6L, 3L, 4L, 5L);
     p.run();
   }
 
+  @Test
+  @Category(NeedsRunner.class)
+  public void testReadToImpulseOverride() {
+    BoundedSource<Long> source = CountingSource.upTo(10L);
+    // Use an explicit read transform to ensure the override is exercised.
+    PCollection<Long> input = p.apply(Read.from(source));
+    PAssert.that(input).containsInAnyOrder(0L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L);
+    p.replaceAll(Collections.singletonList(JavaReadViaImpulse.boundedOverride()));
+    p.traverseTopologically(new Pipeline.PipelineVisitor() {
 
 Review comment:
   extend PipelineVisitor.Defaults, and get rid of `enterPipeline`, 
`visitValue`, and `leave*`
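
The pattern being suggested can be sketched with stand-in types. The interface below is a simplified, hypothetical imitation of a pipeline visitor and is not Beam's `Pipeline.PipelineVisitor`; it only shows why extending a no-op base class lets the test drop the empty overrides.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Sketch only: stand-in types that mimic a visitor with a "Defaults" base class.
// These are NOT Beam's types, and the method set is simplified.
final class VisitorDefaultsSketch {

  interface Visitor {
    void enterPipeline();

    void visitPrimitiveTransform(String name);

    void visitValue(String name);

    void leavePipeline();
  }

  /** No-op implementations, so subclasses override only what they need. */
  static class Defaults implements Visitor {
    @Override
    public void enterPipeline() {}

    @Override
    public void visitPrimitiveTransform(String name) {}

    @Override
    public void visitValue(String name) {}

    @Override
    public void leavePipeline() {}
  }

  /** Walks a flat list of transform names, calling every hook in order. */
  static void traverse(List<String> transforms, Visitor visitor) {
    visitor.enterPipeline();
    for (String transform : transforms) {
      visitor.visitPrimitiveTransform(transform);
      visitor.visitValue(transform + ".out");
    }
    visitor.leavePipeline();
  }

  public static void main(String[] args) {
    List<String> seen = new ArrayList<>();
    // Only the hook of interest is overridden; the rest fall back to the no-ops.
    traverse(Arrays.asList("Impulse", "ParDo"), new Defaults() {
      @Override
      public void visitPrimitiveTransform(String name) {
        seen.add(name);
      }
    });
    System.out.println(seen);
  }
}
```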




Issue Time Tracking
---

Worklog Id: (was: 80942)
Time Spent: 2h 50m  (was: 2h 40m)

> Flink supports chaining/fusion of single-SDK stages
> ---
>
> Key: BEAM-2898
> URL: https://issues.apache.org/jira/browse/BEAM-2898
> Project: Beam
>  Issue Type: Sub-task
>  Components: runner-flink
>Reporter: Henning Rohde
>Priority: Major
>  Labels: portability
>  Time Spent: 2h 50m
>  Remaining Estimate: 0h
>
> The Fn API supports fused stages, which avoids unnecessarily round-tripping 
> the data over the Fn API between stages. The Flink runner should use that 
> capability for better performance.




