[jira] [Work logged] (BEAM-4723) Enhance Datetime*Expression Datetime Type

2018-08-20 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4723?focusedWorklogId=136392&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-136392
 ]

ASF GitHub Bot logged work on BEAM-4723:


Author: ASF GitHub Bot
Created on: 21/Aug/18 05:46
Start Date: 21/Aug/18 05:46
Worklog Time Spent: 10m 
  Work Description: vectorijk edited a comment on issue #5926: [BEAM-4723] 
[SQL] Support datetime type minus time interval
URL: https://github.com/apache/beam/pull/5926#issuecomment-414556642
 
 
   rebased onto master. thanks! @yifanzou 


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 136392)
Time Spent: 6h 50m  (was: 6h 40m)

> Enhance Datetime*Expression Datetime Type
> -
>
> Key: BEAM-4723
> URL: https://issues.apache.org/jira/browse/BEAM-4723
> Project: Beam
>  Issue Type: Bug
>  Components: dsl-sql
>Reporter: Kai Jiang
>Assignee: Kai Jiang
>Priority: Major
>  Time Spent: 6h 50m
>  Remaining Estimate: 0h
>
> Datetime*Expression currently supports only the timestamp type for its first 
> operand. We should let it accept all datetime types (Datetime_Types).
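The behavior under discussion ("datetime type minus time interval" in the PR title) amounts to ordinary datetime arithmetic, which BeamSQL should apply uniformly across its datetime types. A minimal illustration in plain Python (not Beam SQL code; purely to show the expected semantics):

```python
from datetime import datetime, timedelta

# Expected semantics of "ts - INTERVAL '1' HOUR" for a datetime-typed operand.
ts = datetime(2018, 8, 21, 5, 46, 0)
result = ts - timedelta(hours=1)
print(result)  # 2018-08-21 04:46:00
```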



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-4723) Enhance Datetime*Expression Datetime Type

2018-08-20 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4723?focusedWorklogId=136391&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-136391
 ]

ASF GitHub Bot logged work on BEAM-4723:


Author: ASF GitHub Bot
Created on: 21/Aug/18 05:45
Start Date: 21/Aug/18 05:45
Worklog Time Spent: 10m 
  Work Description: vectorijk commented on issue #5926: [BEAM-4723] [SQL] 
Support datetime type minus time interval
URL: https://github.com/apache/beam/pull/5926#issuecomment-414556642
 
 
   rebased onto master




Issue Time Tracking
---

Worklog Id: (was: 136391)
Time Spent: 6h 40m  (was: 6.5h)

> Enhance Datetime*Expression Datetime Type
> -
>
> Key: BEAM-4723
> URL: https://issues.apache.org/jira/browse/BEAM-4723
> Project: Beam
>  Issue Type: Bug
>  Components: dsl-sql
>Reporter: Kai Jiang
>Assignee: Kai Jiang
>Priority: Major
>  Time Spent: 6h 40m
>  Remaining Estimate: 0h
>
> Datetime*Expression currently supports only the timestamp type for its first 
> operand. We should let it accept all datetime types (Datetime_Types).





[jira] [Commented] (BEAM-4696) Execute Jenkins website tests in a Docker container

2018-08-20 Thread Udi Meiri (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-4696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16586827#comment-16586827
 ] 

Udi Meiri commented on BEAM-4696:
-

I've managed to run a local Jenkins in a Docker container (using [~Ardagan]'s 
dockerized-jenkins).
I set it up so that Jenkins has authorization to launch Docker jobs on the host 
Docker daemon.
Right now it manages to start the ruby:2.5 image, but that image doesn't have 
access to /tmp inside the Jenkins container:

  /bin/sh: 0: Can't open /tmp/jenkins6814128182425397587.sh 

Current incomplete state: 
https://github.com/apache/beam/commit/6fb3248792a4c8a9c10868df222b8301a13a3286
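The failure above is consistent with the ruby:2.5 container being a sibling of the Jenkins container: paths such as /tmp resolve on the host Docker daemon, not inside the Jenkins container. One common workaround is to bind-mount a shared directory into the sibling container. A sketch of the docker invocation (assumptions only; this is not the actual Jenkins configuration):

```python
# Build the argv for launching a sibling container with a bind mount so the
# Jenkins-generated script is visible inside it. Illustrative sketch only.
def docker_run_argv(image, shared_dir, command):
  return [
      'docker', 'run', '--rm',
      '-v', '%s:%s' % (shared_dir, shared_dir),  # share the script directory
      image, '/bin/sh', '-c', command,
  ]

argv = docker_run_argv('ruby:2.5', '/tmp',
                       'sh /tmp/jenkins6814128182425397587.sh')
```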

> Execute Jenkins website tests in a Docker container
> ---
>
> Key: BEAM-4696
> URL: https://issues.apache.org/jira/browse/BEAM-4696
> Project: Beam
>  Issue Type: Improvement
>  Components: testing, website
>Reporter: Scott Wegner
>Assignee: Scott Wegner
>Priority: Major
>
> Currently, the website tests run in a vanilla Linux environment, which 
> requires a prerequisite step to install Ruby. The install script is flaky and 
> adds extra time to the job.
> Instead, we should run the website pre-commits inside the pre-built ruby:2.5 
> docker image so that we don't need to worry about installing extra 
> dependencies.





Build failed in Jenkins: beam_PostCommit_Java_GradleBuild #1306

2018-08-20 Thread Apache Jenkins Server
See 


--
[...truncated 20.31 MB...]
INFO: 2018-08-21T01:19:49.956Z: Fusing adjacent ParDo, Read, Write, and 
Flatten operations
Aug 21, 2018 1:19:57 AM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-08-21T01:19:50.001Z: Elided trivial flatten 
Aug 21, 2018 1:19:57 AM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-08-21T01:19:50.049Z: Fusing consumer SpannerIO.Write/Write 
mutations to Cloud Spanner/Wait.OnSignal/Wait/Map into SpannerIO.Write/Write 
mutations to Cloud Spanner/Create seed/Read(CreateSource)
Aug 21, 2018 1:19:57 AM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-08-21T01:19:50.091Z: Fusing consumer SpannerIO.Write/Write 
mutations to Cloud Spanner/Read information schema into SpannerIO.Write/Write 
mutations to Cloud Spanner/Wait.OnSignal/Wait/Map
Aug 21, 2018 1:19:57 AM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-08-21T01:19:50.141Z: Fusing consumer SpannerIO.Write/Write 
mutations to Cloud Spanner/Schema 
View/Combine.GloballyAsSingletonView/BatchViewOverrides.GroupByWindowHashAsKeyAndWindowAsSortKey/BatchViewOverrides.GroupByKeyAndSortValuesOnly/Write
 into SpannerIO.Write/Write mutations to Cloud Spanner/Schema 
View/Combine.GloballyAsSingletonView/BatchViewOverrides.GroupByWindowHashAsKeyAndWindowAsSortKey/ParDo(UseWindowHashAsKeyAndWindowAsSortKey)
Aug 21, 2018 1:19:57 AM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-08-21T01:19:50.191Z: Fusing consumer SpannerIO.Write/Write 
mutations to Cloud Spanner/Schema 
View/Combine.GloballyAsSingletonView/ParDo(IsmRecordForSingularValuePerWindow) 
into SpannerIO.Write/Write mutations to Cloud Spanner/Schema 
View/Combine.GloballyAsSingletonView/BatchViewOverrides.GroupByWindowHashAsKeyAndWindowAsSortKey/BatchViewOverrides.GroupByKeyAndSortValuesOnly/Read
Aug 21, 2018 1:19:57 AM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-08-21T01:19:50.231Z: Fusing consumer SpannerIO.Write/Write 
mutations to Cloud Spanner/Schema 
View/Combine.GloballyAsSingletonView/Combine.globally(Singleton)/WithKeys/AddKeys/Map
 into SpannerIO.Write/Write mutations to Cloud Spanner/Read information schema
Aug 21, 2018 1:19:57 AM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-08-21T01:19:50.270Z: Fusing consumer SpannerIO.Write/Write 
mutations to Cloud Spanner/Schema 
View/Combine.GloballyAsSingletonView/Combine.globally(Singleton)/Combine.perKey(Singleton)/Combine.GroupedValues
 into SpannerIO.Write/Write mutations to Cloud Spanner/Schema 
View/Combine.GloballyAsSingletonView/Combine.globally(Singleton)/Combine.perKey(Singleton)/GroupByKey/Read
Aug 21, 2018 1:19:57 AM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-08-21T01:19:50.321Z: Fusing consumer SpannerIO.Write/Write 
mutations to Cloud Spanner/Schema 
View/Combine.GloballyAsSingletonView/Combine.globally(Singleton)/Values/Values/Map
 into SpannerIO.Write/Write mutations to Cloud Spanner/Schema 
View/Combine.GloballyAsSingletonView/Combine.globally(Singleton)/Combine.perKey(Singleton)/Combine.GroupedValues/Extract
Aug 21, 2018 1:19:57 AM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-08-21T01:19:50.367Z: Fusing consumer SpannerIO.Write/Write 
mutations to Cloud Spanner/Schema 
View/Combine.GloballyAsSingletonView/BatchViewOverrides.GroupByWindowHashAsKeyAndWindowAsSortKey/ParDo(UseWindowHashAsKeyAndWindowAsSortKey)
 into SpannerIO.Write/Write mutations to Cloud Spanner/Schema 
View/Combine.GloballyAsSingletonView/Combine.globally(Singleton)/Values/Values/Map
Aug 21, 2018 1:19:57 AM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-08-21T01:19:50.406Z: Fusing consumer SpannerIO.Write/Write 
mutations to Cloud Spanner/Schema 
View/Combine.GloballyAsSingletonView/Combine.globally(Singleton)/Combine.perKey(Singleton)/Combine.GroupedValues/Extract
 into SpannerIO.Write/Write mutations to Cloud Spanner/Schema 
View/Combine.GloballyAsSingletonView/Combine.globally(Singleton)/Combine.perKey(Singleton)/Combine.GroupedValues
Aug 21, 2018 1:19:57 AM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-08-21T01:19:50.452Z: Fusing consumer SpannerIO.Write/Write 
mutations to Cloud Spanner/Schema 
View/Combine.GloballyAsSingletonView/Combine.globally(Singleton)/Combine.perKey(Singleton)/GroupByKey+SpannerIO.Write/Write
 mutations to Cloud Spanner/Schema 
View/Combine.GloballyAsSingletonView/Combine.globally(Singleton)/Combine.perKey(Singleton)/Combine.GroupedValues/Partial
 

Jenkins build is back to normal : beam_PostCommit_Java_GradleBuild #1305

2018-08-20 Thread Apache Jenkins Server
See 




Jenkins build is back to normal : beam_PreCommit_Java_Cron #251

2018-08-20 Thread Apache Jenkins Server
See 




[jira] [Work logged] (BEAM-5186) Support for NUMERIC data type in BQ

2018-08-20 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5186?focusedWorklogId=136363&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-136363
 ]

ASF GitHub Bot logged work on BEAM-5186:


Author: ASF GitHub Bot
Created on: 21/Aug/18 00:16
Start Date: 21/Aug/18 00:16
Worklog Time Spent: 10m 
  Work Description: chamikaramj commented on a change in pull request 
#6255: [BEAM-5186] Adding numeric support to BQ Sink
URL: https://github.com/apache/beam/pull/6255#discussion_r211445270
 
 

 ##
 File path: sdks/python/apache_beam/io/gcp/bigquery_test.py
 ##
 @@ -57,6 +58,21 @@ def test_row_as_dict(self):
     test_value = {'s': 'abc', 'i': 123, 'f': 123.456, 'b': True}
     self.assertEqual(test_value, coder.decode(coder.encode(test_value)))
 
+  def test_decimal_in_row_as_dict(self):
 
 Review comment:
   Please also try out BQ jobs that read from/write to a BQ column of NUMERIC 
type with the Dataflow and Direct runners (the two runners operate on slightly 
different paths).




Issue Time Tracking
---

Worklog Id: (was: 136363)
Time Spent: 0.5h  (was: 20m)

> Support for NUMERIC data type in BQ
> ---
>
> Key: BEAM-5186
> URL: https://issues.apache.org/jira/browse/BEAM-5186
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Pablo Estrada
>Assignee: Pablo Estrada
>Priority: Major
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>






[jira] [Work logged] (BEAM-5186) Support for NUMERIC data type in BQ

2018-08-20 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5186?focusedWorklogId=136359&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-136359
 ]

ASF GitHub Bot logged work on BEAM-5186:


Author: ASF GitHub Bot
Created on: 21/Aug/18 00:16
Start Date: 21/Aug/18 00:16
Worklog Time Spent: 10m 
  Work Description: chamikaramj commented on a change in pull request 
#6255: [BEAM-5186] Adding numeric support to BQ Sink
URL: https://github.com/apache/beam/pull/6255#discussion_r211438844
 
 

 ##
 File path: sdks/python/apache_beam/io/gcp/bigquery.py
 ##
 @@ -160,6 +161,13 @@
 MAX_RETRIES = 3
 
 
+def decimal_default_encoder(obj):
 
 Review comment:
   Just call this 'default_encoder'




Issue Time Tracking
---

Worklog Id: (was: 136359)

> Support for NUMERIC data type in BQ
> ---
>
> Key: BEAM-5186
> URL: https://issues.apache.org/jira/browse/BEAM-5186
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Pablo Estrada
>Assignee: Pablo Estrada
>Priority: Major
>  Time Spent: 20m
>  Remaining Estimate: 0h
>






[jira] [Work logged] (BEAM-5186) Support for NUMERIC data type in BQ

2018-08-20 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5186?focusedWorklogId=136360&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-136360
 ]

ASF GitHub Bot logged work on BEAM-5186:


Author: ASF GitHub Bot
Created on: 21/Aug/18 00:16
Start Date: 21/Aug/18 00:16
Worklog Time Spent: 10m 
  Work Description: chamikaramj commented on a change in pull request 
#6255: [BEAM-5186] Adding numeric support to BQ Sink
URL: https://github.com/apache/beam/pull/6255#discussion_r211445805
 
 

 ##
 File path: sdks/python/apache_beam/io/gcp/bigquery.py
 ##
 @@ -160,6 +161,13 @@
 MAX_RETRIES = 3
 
 
+def decimal_default_encoder(obj):
+  if isinstance(obj, decimal.Decimal):
 
 Review comment:
   I think you also need to update the following 
(bigquery.BigQueryWrapper._convert_cell_value_to_dict) to support DirectRunner.
   
https://github.com/apache/beam/blob/master/sdks/python/apache_beam/io/gcp/bigquery.py#L1122




Issue Time Tracking
---

Worklog Id: (was: 136360)
Time Spent: 20m  (was: 10m)

> Support for NUMERIC data type in BQ
> ---
>
> Key: BEAM-5186
> URL: https://issues.apache.org/jira/browse/BEAM-5186
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Pablo Estrada
>Assignee: Pablo Estrada
>Priority: Major
>  Time Spent: 20m
>  Remaining Estimate: 0h
>






[jira] [Work logged] (BEAM-5186) Support for NUMERIC data type in BQ

2018-08-20 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5186?focusedWorklogId=136362&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-136362
 ]

ASF GitHub Bot logged work on BEAM-5186:


Author: ASF GitHub Bot
Created on: 21/Aug/18 00:16
Start Date: 21/Aug/18 00:16
Worklog Time Spent: 10m 
  Work Description: chamikaramj commented on a change in pull request 
#6255: [BEAM-5186] Adding numeric support to BQ Sink
URL: https://github.com/apache/beam/pull/6255#discussion_r211439858
 
 

 ##
 File path: sdks/python/apache_beam/io/gcp/bigquery.py
 ##
 @@ -1156,6 +1167,8 @@ def _convert_cell_value_to_dict(self, value, field):
       # when querying, the repeated and/or record fields are flattened
       # unless we pass the flatten_results flag as False to the source
       return self.convert_row_to_dict(value, field)
+    elif field.type == 'NUMERIC':
+      return decimal.Decimal(value)
 
 Review comment:
   Please update the documentation in bigquery.py to include information 
regarding the NUMERIC type (for both sink and source).
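For reference, the decode path under review parses the string BigQuery returns for a NUMERIC cell back into decimal.Decimal, which round-trips without floating-point precision loss. A standalone sketch (the helper name here is illustrative, not Beam's API):

```python
import decimal

# BigQuery returns NUMERIC cells as strings; parsing with Decimal preserves
# every digit, which converting through float() would not.
def convert_numeric_cell(value):
  return decimal.Decimal(value)

v = convert_numeric_cell('123456789.123456789')
print(v)  # 123456789.123456789
```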




Issue Time Tracking
---

Worklog Id: (was: 136362)
Time Spent: 0.5h  (was: 20m)

> Support for NUMERIC data type in BQ
> ---
>
> Key: BEAM-5186
> URL: https://issues.apache.org/jira/browse/BEAM-5186
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Pablo Estrada
>Assignee: Pablo Estrada
>Priority: Major
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>






[jira] [Work logged] (BEAM-5186) Support for NUMERIC data type in BQ

2018-08-20 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5186?focusedWorklogId=136361&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-136361
 ]

ASF GitHub Bot logged work on BEAM-5186:


Author: ASF GitHub Bot
Created on: 21/Aug/18 00:16
Start Date: 21/Aug/18 00:16
Worklog Time Spent: 10m 
  Work Description: chamikaramj commented on a change in pull request 
#6255: [BEAM-5186] Adding numeric support to BQ Sink
URL: https://github.com/apache/beam/pull/6255#discussion_r211439247
 
 

 ##
 File path: sdks/python/apache_beam/io/gcp/bigquery.py
 ##
 @@ -160,6 +161,13 @@
 MAX_RETRIES = 3
 
 
+def decimal_default_encoder(obj):
+  if isinstance(obj, decimal.Decimal):
+    return str(obj)
 
 Review comment:
   Do these string values properly get written to a BQ column of NUMERIC type?
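The encoder in question hands json.dumps a string representation of the Decimal; BigQuery's load path accepts decimal strings for NUMERIC columns. A minimal self-contained sketch of how the default hook plugs in (illustrative, not the PR's exact code path):

```python
import decimal
import json

def decimal_default_encoder(obj):
  # json.dumps calls this hook only for objects it cannot serialize itself.
  if isinstance(obj, decimal.Decimal):
    return str(obj)
  raise TypeError('%r is not JSON serializable' % (obj,))

row = {'amount': decimal.Decimal('19.99'), 'qty': 3}
encoded = json.dumps(row, default=decimal_default_encoder, sort_keys=True)
print(encoded)  # {"amount": "19.99", "qty": 3}
```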




Issue Time Tracking
---

Worklog Id: (was: 136361)
Time Spent: 0.5h  (was: 20m)

> Support for NUMERIC data type in BQ
> ---
>
> Key: BEAM-5186
> URL: https://issues.apache.org/jira/browse/BEAM-5186
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Pablo Estrada
>Assignee: Pablo Estrada
>Priority: Major
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>






Build failed in Jenkins: beam_PostCommit_Java_GradleBuild #1304

2018-08-20 Thread Apache Jenkins Server
See 


Changes:

[thw] [BEAM-5168] Redirect Flink jobserver commons-logging to slf4j.

--
[...truncated 20.22 MB...]
INFO: 2018-08-20T23:45:57.461Z: Fusing adjacent ParDo, Read, Write, and 
Flatten operations
Aug 20, 2018 11:45:59 PM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-08-20T23:45:57.497Z: Elided trivial flatten 
Aug 20, 2018 11:45:59 PM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-08-20T23:45:57.544Z: Fusing consumer SpannerIO.Write/Write 
mutations to Cloud Spanner/Wait.OnSignal/Wait/Map into SpannerIO.Write/Write 
mutations to Cloud Spanner/Create seed/Read(CreateSource)
Aug 20, 2018 11:45:59 PM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-08-20T23:45:57.579Z: Fusing consumer SpannerIO.Write/Write 
mutations to Cloud Spanner/Read information schema into SpannerIO.Write/Write 
mutations to Cloud Spanner/Wait.OnSignal/Wait/Map
Aug 20, 2018 11:45:59 PM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-08-20T23:45:57.625Z: Fusing consumer SpannerIO.Write/Write 
mutations to Cloud Spanner/Schema 
View/Combine.GloballyAsSingletonView/BatchViewOverrides.GroupByWindowHashAsKeyAndWindowAsSortKey/BatchViewOverrides.GroupByKeyAndSortValuesOnly/Write
 into SpannerIO.Write/Write mutations to Cloud Spanner/Schema 
View/Combine.GloballyAsSingletonView/BatchViewOverrides.GroupByWindowHashAsKeyAndWindowAsSortKey/ParDo(UseWindowHashAsKeyAndWindowAsSortKey)
Aug 20, 2018 11:45:59 PM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-08-20T23:45:57.669Z: Fusing consumer SpannerIO.Write/Write 
mutations to Cloud Spanner/Schema 
View/Combine.GloballyAsSingletonView/ParDo(IsmRecordForSingularValuePerWindow) 
into SpannerIO.Write/Write mutations to Cloud Spanner/Schema 
View/Combine.GloballyAsSingletonView/BatchViewOverrides.GroupByWindowHashAsKeyAndWindowAsSortKey/BatchViewOverrides.GroupByKeyAndSortValuesOnly/Read
Aug 20, 2018 11:45:59 PM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-08-20T23:45:57.706Z: Fusing consumer SpannerIO.Write/Write 
mutations to Cloud Spanner/Schema 
View/Combine.GloballyAsSingletonView/Combine.globally(Singleton)/WithKeys/AddKeys/Map
 into SpannerIO.Write/Write mutations to Cloud Spanner/Read information schema
Aug 20, 2018 11:45:59 PM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-08-20T23:45:57.753Z: Fusing consumer SpannerIO.Write/Write 
mutations to Cloud Spanner/Schema 
View/Combine.GloballyAsSingletonView/Combine.globally(Singleton)/Combine.perKey(Singleton)/Combine.GroupedValues
 into SpannerIO.Write/Write mutations to Cloud Spanner/Schema 
View/Combine.GloballyAsSingletonView/Combine.globally(Singleton)/Combine.perKey(Singleton)/GroupByKey/Read
Aug 20, 2018 11:45:59 PM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-08-20T23:45:57.798Z: Fusing consumer SpannerIO.Write/Write 
mutations to Cloud Spanner/Schema 
View/Combine.GloballyAsSingletonView/Combine.globally(Singleton)/Values/Values/Map
 into SpannerIO.Write/Write mutations to Cloud Spanner/Schema 
View/Combine.GloballyAsSingletonView/Combine.globally(Singleton)/Combine.perKey(Singleton)/Combine.GroupedValues/Extract
Aug 20, 2018 11:45:59 PM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-08-20T23:45:57.836Z: Fusing consumer SpannerIO.Write/Write 
mutations to Cloud Spanner/Schema 
View/Combine.GloballyAsSingletonView/BatchViewOverrides.GroupByWindowHashAsKeyAndWindowAsSortKey/ParDo(UseWindowHashAsKeyAndWindowAsSortKey)
 into SpannerIO.Write/Write mutations to Cloud Spanner/Schema 
View/Combine.GloballyAsSingletonView/Combine.globally(Singleton)/Values/Values/Map
Aug 20, 2018 11:45:59 PM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-08-20T23:45:57.875Z: Fusing consumer SpannerIO.Write/Write 
mutations to Cloud Spanner/Schema 
View/Combine.GloballyAsSingletonView/Combine.globally(Singleton)/Combine.perKey(Singleton)/Combine.GroupedValues/Extract
 into SpannerIO.Write/Write mutations to Cloud Spanner/Schema 
View/Combine.GloballyAsSingletonView/Combine.globally(Singleton)/Combine.perKey(Singleton)/Combine.GroupedValues
Aug 20, 2018 11:45:59 PM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-08-20T23:45:57.907Z: Fusing consumer SpannerIO.Write/Write 
mutations to Cloud Spanner/Schema 
View/Combine.GloballyAsSingletonView/Combine.globally(Singleton)/Combine.perKey(Singleton)/GroupByKey+SpannerIO.Write/Write
 mutations to Cloud Spanner/Schema 

[jira] [Work logged] (BEAM-3954) Get Jenkins agents dockerized

2018-08-20 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-3954?focusedWorklogId=136356&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-136356
 ]

ASF GitHub Bot logged work on BEAM-3954:


Author: ASF GitHub Bot
Created on: 20/Aug/18 23:47
Start Date: 20/Aug/18 23:47
Worklog Time Spent: 10m 
  Work Description: yifanzou commented on issue #6254: Do Not Merge 
[BEAM-3954] Reproducible environment for Beam Jenkins tests
URL: https://github.com/apache/beam/pull/6254#issuecomment-414500264
 
 
   Run Docker Image Cleanup beam6




Issue Time Tracking
---

Worklog Id: (was: 136356)
Time Spent: 3h 40m  (was: 3.5h)

> Get Jenkins agents dockerized 
> --
>
> Key: BEAM-3954
> URL: https://issues.apache.org/jira/browse/BEAM-3954
> Project: Beam
>  Issue Type: Sub-task
>  Components: build-system
>Reporter: yifan zou
>Assignee: yifan zou
>Priority: Major
>  Time Spent: 3h 40m
>  Remaining Estimate: 0h
>






[jira] [Work logged] (BEAM-5151) Add EXTERNAL to CREATE TABLE statement

2018-08-20 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5151?focusedWorklogId=136355&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-136355
 ]

ASF GitHub Bot logged work on BEAM-5151:


Author: ASF GitHub Bot
Created on: 20/Aug/18 23:46
Start Date: 20/Aug/18 23:46
Worklog Time Spent: 10m 
  Work Description: amaliujia commented on issue #6252: [BEAM-5151][SQL] 
create external table
URL: https://github.com/apache/beam/pull/6252#issuecomment-414500091
 
 
   run java postcommit




Issue Time Tracking
---

Worklog Id: (was: 136355)
Time Spent: 0.5h  (was: 20m)

> Add EXTERNAL to CREATE TABLE statement
> --
>
> Key: BEAM-5151
> URL: https://issues.apache.org/jira/browse/BEAM-5151
> Project: Beam
>  Issue Type: Improvement
>  Components: dsl-sql
>Reporter: Rui Wang
>Assignee: Rui Wang
>Priority: Major
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> BeamSQL allows [CREATE 
> TABLE|https://beam.apache.org/documentation/dsls/sql/create-table/] 
> statements to register virtual tables from external storage systems (e.g. 
> BigQuery). 
>  
> BeamSQL is not a storage system, so any table registered by a "CREATE TABLE" 
> statement is essentially equivalent to one registered by "CREATE EXTERNAL 
> TABLE": the user provides a LOCATION, and BeamSQL registers the table outside 
> of the current execution environment based on that LOCATION.
>  
> So I propose adding the EXTERNAL keyword to "CREATE TABLE" in BeamSQL, to 
> help users understand that they are registering tables and that BeamSQL does 
> not create non-existing tables when running CREATE TABLE (at least on some 
> storage systems, if not all). 
>  
> We can make the EXTERNAL keyword either required or optional.
>  
> If we make the EXTERNAL keyword required:
>  
> Pros:
> a. We get rid of the table-registering semantic of CREATE TABLE. 
> b. We keep room to add CREATE TABLE back in the future, if we want CREATE 
> TABLE to actually create tables rather than only register them in BeamSQL. 
>  
> Cons:
> 1. The CREATE TABLE syntax will no longer be supported, so existing BeamSQL 
> pipelines that use CREATE TABLE will require changes.
> 2. The tedious EXTERNAL keyword must be typed every time, especially in the 
> SQL Shell.
>  
> If we make the EXTERNAL keyword optional, the pros and cons above are 
> reversed.





[jira] [Work logged] (BEAM-5151) Add EXTERNAL to CREATE TABLE statement

2018-08-20 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5151?focusedWorklogId=136354&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-136354
 ]

ASF GitHub Bot logged work on BEAM-5151:


Author: ASF GitHub Bot
Created on: 20/Aug/18 23:44
Start Date: 20/Aug/18 23:44
Worklog Time Spent: 10m 
  Work Description: amaliujia commented on issue #6252: [BEAM-5151][SQL] 
create external table
URL: https://github.com/apache/beam/pull/6252#issuecomment-414499797
 
 
   updated tests to reflect this change.




Issue Time Tracking
---

Worklog Id: (was: 136354)
Time Spent: 20m  (was: 10m)

> Add EXTERNAL to CREATE TABLE statement
> --
>
> Key: BEAM-5151
> URL: https://issues.apache.org/jira/browse/BEAM-5151
> Project: Beam
>  Issue Type: Improvement
>  Components: dsl-sql
>Reporter: Rui Wang
>Assignee: Rui Wang
>Priority: Major
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> BeamSQL allows [CREATE 
> TABLE|https://beam.apache.org/documentation/dsls/sql/create-table/] 
> statements to register virtual tables from external storage systems (e.g. 
> BigQuery). 
>  
> BeamSQL is not a storage system, so any table registered by a "CREATE TABLE" 
> statement is essentially equivalent to one registered by "CREATE EXTERNAL 
> TABLE": the user provides a LOCATION, and BeamSQL registers the table outside 
> of the current execution environment based on that LOCATION.
>  
> So I propose adding the EXTERNAL keyword to "CREATE TABLE" in BeamSQL, to 
> help users understand that they are registering tables and that BeamSQL does 
> not create non-existing tables when running CREATE TABLE (at least on some 
> storage systems, if not all). 
>  
> We can make the EXTERNAL keyword either required or optional.
>  
> If we make the EXTERNAL keyword required:
>  
> Pros:
> a. We get rid of the table-registering semantic of CREATE TABLE. 
> b. We keep room to add CREATE TABLE back in the future, if we want CREATE 
> TABLE to actually create tables rather than only register them in BeamSQL. 
>  
> Cons:
> 1. The CREATE TABLE syntax will no longer be supported, so existing BeamSQL 
> pipelines that use CREATE TABLE will require changes.
> 2. The tedious EXTERNAL keyword must be typed every time, especially in the 
> SQL Shell.
>  
> If we make the EXTERNAL keyword optional, the pros and cons above are 
> reversed.





[jira] [Work logged] (BEAM-5141) Improve error message when SET unregistered options

2018-08-20 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5141?focusedWorklogId=136353&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-136353
 ]

ASF GitHub Bot logged work on BEAM-5141:


Author: ASF GitHub Bot
Created on: 20/Aug/18 23:43
Start Date: 20/Aug/18 23:43
Worklog Time Spent: 10m 
  Work Description: amaliujia edited a comment on issue #6216: [BEAM-5141] 
Improve error message when SET unregistered options
URL: https://github.com/apache/beam/pull/6216#issuecomment-414495185
 
 
   Did some exploration work on validating options at the time `SET` happens.
   
   Doing the validation at `SET` could fail a `SET` with an inappropriate 
option using the existing error message. The problem that approach faces is 
when and how we validate the set of options that need to change when the user 
changes `runner` via `SET`. It is a design choice that right now BeamSQL uses 
lazy validation (i.e. validating at `SELECT`) to handle these cases, and 
another design choice is to use RESET to clear inappropriate options.
   
   With that context in mind, we need to remind users to `RESET` options when 
we detect wrong options. It is possible to return `PropertyDescriptors` after 
the majority of validation is done in `PipelineOptionsFactory`. The concern 
here is we have to remove [checks related to 
`PropertyDescriptors`](https://github.com/apache/beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/options/PipelineOptionsFactory.java#L1509)
 in `PipelineOptionsFactory` (otherwise it will fail as it does now, and it 
will fail before we get the `PropertyDescriptors`). Then where would we 
implement the validation for the other places that use `PipelineOptionsFactory`?
   
   What could be done is to keep everything as it is in the Beam repo now. 
Then, once BeamSQL catches `IllegalArgumentException`, it fetches 
`PropertyDescriptors` and checks whether it needs to customize the error 
messages, which is already very similar to the special-exception approach 
(catch the exception and customize the error messages).
   
   I think catching a special exception is better than catching 
`IllegalArgumentException` and checking `PropertyDescriptors`.
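The special-exception approach preferred above can be sketched generically in Python (all names here are hypothetical; Beam's actual PipelineOptionsFactory is Java, so this shows only the shape of the idea):

```python
# Hypothetical sketch: raise a specific exception type for unregistered
# options so the SQL shell can customize the message instead of
# pattern-matching a generic error.
class UnknownPipelineOptionError(ValueError):
  def __init__(self, option):
    super(UnknownPipelineOptionError, self).__init__(
        'unknown option: %s' % option)
    self.option = option

REGISTERED_OPTIONS = {'runner', 'project', 'region'}  # assumed registry

def validate(options):
  for name in options:
    if name not in REGISTERED_OPTIONS:
      raise UnknownPipelineOptionError(name)

def set_option(name, value, store):
  # Shell layer: catch the specific exception and suggest RESET.
  store[name] = value
  try:
    validate(store)
  except UnknownPipelineOptionError as e:
    return "Unknown option '%s'. Use RESET to clear it." % e.option
  return 'OK'

store = {}
print(set_option('runner', 'DirectRunner', store))  # OK
print(set_option('bogus', '1', store))
# Unknown option 'bogus'. Use RESET to clear it.
```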




Issue Time Tracking
---

Worklog Id: (was: 136353)
Time Spent: 7h 40m  (was: 7.5h)

> Improve error message when SET unregistered options 
> 
>
> Key: BEAM-5141
> URL: https://issues.apache.org/jira/browse/BEAM-5141
> Project: Beam
>  Issue Type: Improvement
>  Components: dsl-sql
>Reporter: Rui Wang
>Assignee: Rui Wang
>Priority: Major
> Fix For: Not applicable
>
>  Time Spent: 7h 40m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-5141) Improve error message when SET unregistered options

2018-08-20 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5141?focusedWorklogId=136350&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-136350
 ]

ASF GitHub Bot logged work on BEAM-5141:


Author: ASF GitHub Bot
Created on: 20/Aug/18 23:25
Start Date: 20/Aug/18 23:25
Worklog Time Spent: 10m 
  Work Description: amaliujia edited a comment on issue #6216: [BEAM-5141] 
Improve error message when SET unregistered options
URL: https://github.com/apache/beam/pull/6216#issuecomment-414495185
 
 
   Did some exploration work for that validation when SET happens.
   
   Doing the validation at `SET` time would let a `SET` with an inappropriate 
option fail with the existing error message. The problem that approach faces is 
when and how we validate the set of options that needs to change when the user 
changes `runner` via `SET`. It is a design choice that right now BeamSQL uses 
lazy validation (i.e. validating at `SELECT`) to handle these cases, and 
another design choice is to use `RESET` to clear inappropriate options.
   
   With that context in mind, we need to remind users to `RESET` options when 
we detect the wrong ones. It is possible to return `PropertyDescriptors` after 
the majority of validation is done in `PipelineOptionsFactory`. The concern 
here is that we would have to remove [checks related to 
`PropertyDescriptors`](https://github.com/apache/beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/options/PipelineOptionsFactory.java#L1509)
 in `PipelineOptionsFactory` (otherwise it would fail as it does now, before we 
get the `PropertyDescriptors`). Then where would we implement that validation 
for the other places that use `PipelineOptionsFactory`?
   
   What could be done is to keep everything as it is now. Once BeamSQL catches 
an `IllegalArgumentException`, it fetches the `PropertyDescriptors` to check 
whether it needs to customize the error messages, which is already very 
similar to the special-exception approach (catch the exception and customize 
the error message).
   
   I think catching the special exception is better than catching 
`IllegalArgumentException` and checking `PropertyDescriptors`.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 136350)
Time Spent: 7.5h  (was: 7h 20m)

> Improve error message when SET unregistered options 
> 
>
> Key: BEAM-5141
> URL: https://issues.apache.org/jira/browse/BEAM-5141
> Project: Beam
>  Issue Type: Improvement
>  Components: dsl-sql
>Reporter: Rui Wang
>Assignee: Rui Wang
>Priority: Major
> Fix For: Not applicable
>
>  Time Spent: 7.5h
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-3954) Get Jenkins agents dockerized

2018-08-20 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-3954?focusedWorklogId=136351&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-136351
 ]

ASF GitHub Bot logged work on BEAM-3954:


Author: ASF GitHub Bot
Created on: 20/Aug/18 23:25
Start Date: 20/Aug/18 23:25
Worklog Time Spent: 10m 
  Work Description: yifanzou commented on issue #6254: Do Not Merge 
[BEAM-3954] Reproducible environment for Beam Jenkins tests
URL: https://github.com/apache/beam/pull/6254#issuecomment-414496176
 
 
   Run Seed Job


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 136351)
Time Spent: 3.5h  (was: 3h 20m)

> Get Jenkins agents dockerized 
> --
>
> Key: BEAM-3954
> URL: https://issues.apache.org/jira/browse/BEAM-3954
> Project: Beam
>  Issue Type: Sub-task
>  Components: build-system
>Reporter: yifan zou
>Assignee: yifan zou
>Priority: Major
>  Time Spent: 3.5h
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-5141) Improve error message when SET unregistered options

2018-08-20 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5141?focusedWorklogId=136349&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-136349
 ]

ASF GitHub Bot logged work on BEAM-5141:


Author: ASF GitHub Bot
Created on: 20/Aug/18 23:24
Start Date: 20/Aug/18 23:24
Worklog Time Spent: 10m 
  Work Description: amaliujia edited a comment on issue #6216: [BEAM-5141] 
Improve error message when SET unregistered options
URL: https://github.com/apache/beam/pull/6216#issuecomment-414495185
 
 
   Did some exploration work for that validation when SET happens.
   
   Doing the validation at `SET` time would let a `SET` with an inappropriate 
option fail with the existing error message. The problem that approach faces is 
when and how we validate the set of options that needs to change when the user 
changes `runner` via `SET`. It is a design choice that right now BeamSQL uses 
lazy validation (i.e. validating at `SELECT`) to handle these cases, and 
another design choice is to use `RESET` to clear inappropriate options.
   
   With that context in mind, we need to remind users to `RESET` options when 
we detect the wrong ones. It is possible to return `PropertyDescriptors` after 
the majority of validation is done in `PipelineOptionsFactory`. The concern 
here is that we would have to remove [checks related to 
`PropertyDescriptors`](https://github.com/apache/beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/options/PipelineOptionsFactory.java#L1509)
 in `PipelineOptionsFactory` (otherwise it would fail as it does now, before we 
get the `PropertyDescriptors`). Then where would we implement that validation 
for the other places that use `PipelineOptionsFactory`?
   
   What could be done is to keep everything as it is now. Once BeamSQL catches 
an `IllegalArgumentException`, it fetches the `PropertyDescriptors` to check 
whether it needs to customize the error messages, which is already very 
similar to the special-exception approach (catch the exception and customize 
the error message).
   
   I think catching the special exception is better than catching 
`IllegalArgumentException` and checking `PropertyDescriptors`.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 136349)
Time Spent: 7h 20m  (was: 7h 10m)

> Improve error message when SET unregistered options 
> 
>
> Key: BEAM-5141
> URL: https://issues.apache.org/jira/browse/BEAM-5141
> Project: Beam
>  Issue Type: Improvement
>  Components: dsl-sql
>Reporter: Rui Wang
>Assignee: Rui Wang
>Priority: Major
> Fix For: Not applicable
>
>  Time Spent: 7h 20m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-5141) Improve error message when SET unregistered options

2018-08-20 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5141?focusedWorklogId=136348&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-136348
 ]

ASF GitHub Bot logged work on BEAM-5141:


Author: ASF GitHub Bot
Created on: 20/Aug/18 23:24
Start Date: 20/Aug/18 23:24
Worklog Time Spent: 10m 
  Work Description: amaliujia edited a comment on issue #6216: [BEAM-5141] 
Improve error message when SET unregistered options
URL: https://github.com/apache/beam/pull/6216#issuecomment-414495185
 
 
   Did some exploration work for that validation when SET happens.
   
   Doing the validation at `SET` time would let a `SET` with an inappropriate 
option fail with the existing error message. The problem that approach faces is 
when and how we validate the set of options that needs to change when the user 
changes `runner` via `SET`. It is a design choice that right now BeamSQL uses 
lazy validation (i.e. validating at `SELECT`) to handle these cases, and 
another design choice is to use `RESET` to clear inappropriate options.
   
   With that context in mind, we need to remind users to `RESET` options when 
we detect the wrong ones. It is possible to return `PropertyDescriptors` after 
the majority of validation is done in `PipelineOptionsFactory`. The concern 
here is that we would have to remove [checks related to 
`PropertyDescriptors`](https://github.com/apache/beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/options/PipelineOptionsFactory.java#L1509)
 in `PipelineOptionsFactory` (otherwise it would fail as it does now, before we 
get the `PropertyDescriptors`). Then where would we implement that validation 
for the other places that use `PipelineOptionsFactory`?
   
   What could be done is to keep everything as it is now. Once BeamSQL catches 
an `IllegalArgumentException`, it fetches the `PropertyDescriptors` to check 
whether it needs to customize the error messages, which is already very 
similar to the special-exception approach.
   
   I think catching the special exception is better than catching 
`IllegalArgumentException` and checking `PropertyDescriptors`.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 136348)
Time Spent: 7h 10m  (was: 7h)

> Improve error message when SET unregistered options 
> 
>
> Key: BEAM-5141
> URL: https://issues.apache.org/jira/browse/BEAM-5141
> Project: Beam
>  Issue Type: Improvement
>  Components: dsl-sql
>Reporter: Rui Wang
>Assignee: Rui Wang
>Priority: Major
> Fix For: Not applicable
>
>  Time Spent: 7h 10m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-5141) Improve error message when SET unregistered options

2018-08-20 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5141?focusedWorklogId=136346&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-136346
 ]

ASF GitHub Bot logged work on BEAM-5141:


Author: ASF GitHub Bot
Created on: 20/Aug/18 23:23
Start Date: 20/Aug/18 23:23
Worklog Time Spent: 10m 
  Work Description: amaliujia edited a comment on issue #6216: [BEAM-5141] 
Improve error message when SET unregistered options
URL: https://github.com/apache/beam/pull/6216#issuecomment-414495185
 
 
   Did some exploration work for that validation when SET happens.
   
   Doing the validation at `SET` time would let a `SET` with an inappropriate 
option fail with the existing error message. The problem that approach faces is 
when and how we validate the set of options that needs to change when the user 
changes `runner` via `SET`. It is a design choice that right now BeamSQL uses 
lazy validation (i.e. validating at `SELECT`) to handle these cases, and 
another design choice is to use `RESET` to clear inappropriate options.
   
   With that context in mind, we need to remind users to `RESET` options when 
we detect the wrong ones. It is possible to return `PropertyDescriptors` after 
the majority of validation is done in `PipelineOptionsFactory`. The concern 
here is that we would have to remove [checks related to 
`PropertyDescriptors`](https://github.com/apache/beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/options/PipelineOptionsFactory.java#L1509)
 in `PipelineOptionsFactory` (otherwise it would fail as it does now, before we 
get the `PropertyDescriptors`). Then where would we implement that validation 
for the other places that use `PipelineOptionsFactory`?
   
   What could be done is to keep everything as it is now. Once BeamSQL catches 
an `IllegalArgumentException`, it fetches the `PropertyDescriptors` to check 
whether it needs to customize the error messages, which is already very 
similar to the special-exception approach.
   
   I think catching the special exception is better than catching 
`IllegalArgumentException` and checking `PropertyDescriptors`.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 136346)
Time Spent: 7h  (was: 6h 50m)

> Improve error message when SET unregistered options 
> 
>
> Key: BEAM-5141
> URL: https://issues.apache.org/jira/browse/BEAM-5141
> Project: Beam
>  Issue Type: Improvement
>  Components: dsl-sql
>Reporter: Rui Wang
>Assignee: Rui Wang
>Priority: Major
> Fix For: Not applicable
>
>  Time Spent: 7h
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-5141) Improve error message when SET unregistered options

2018-08-20 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5141?focusedWorklogId=136345&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-136345
 ]

ASF GitHub Bot logged work on BEAM-5141:


Author: ASF GitHub Bot
Created on: 20/Aug/18 23:23
Start Date: 20/Aug/18 23:23
Worklog Time Spent: 10m 
  Work Description: amaliujia edited a comment on issue #6216: [BEAM-5141] 
Improve error message when SET unregistered options
URL: https://github.com/apache/beam/pull/6216#issuecomment-414495185
 
 
   Did some exploration work for that validation when SET happens.
   
   Doing the validation at `SET` time would let a `SET` with an inappropriate 
option fail with the existing error message. The problem that approach faces is 
when and how we validate the set of options that needs to change when the user 
changes `runner` via `SET`. It is a design choice, and right now BeamSQL uses 
lazy validation (i.e. validating at `SELECT`) to handle these cases; another 
design choice is to use `RESET` to clear inappropriate options.
   
   With that context in mind, we need to remind users to `RESET` options when 
we detect the wrong ones. It is possible to return `PropertyDescriptors` after 
the majority of validation is done in `PipelineOptionsFactory`. The concern 
here is that we would have to remove [checks related to 
`PropertyDescriptors`](https://github.com/apache/beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/options/PipelineOptionsFactory.java#L1509)
 in `PipelineOptionsFactory` (otherwise it would fail as it does now, before we 
get the `PropertyDescriptors`). Then where would we implement that validation 
for the other places that use `PipelineOptionsFactory`?
   
   What could be done is to keep everything as it is now. Once BeamSQL catches 
an `IllegalArgumentException`, it fetches the `PropertyDescriptors` to check 
whether it needs to customize the error messages, which is already very 
similar to the special-exception approach.
   
   I think catching the special exception is better than catching 
`IllegalArgumentException` and checking `PropertyDescriptors`.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 136345)
Time Spent: 6h 50m  (was: 6h 40m)

> Improve error message when SET unregistered options 
> 
>
> Key: BEAM-5141
> URL: https://issues.apache.org/jira/browse/BEAM-5141
> Project: Beam
>  Issue Type: Improvement
>  Components: dsl-sql
>Reporter: Rui Wang
>Assignee: Rui Wang
>Priority: Major
> Fix For: Not applicable
>
>  Time Spent: 6h 50m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-5186) Support for NUMERIC data type in BQ

2018-08-20 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5186?focusedWorklogId=136343&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-136343
 ]

ASF GitHub Bot logged work on BEAM-5186:


Author: ASF GitHub Bot
Created on: 20/Aug/18 23:21
Start Date: 20/Aug/18 23:21
Worklog Time Spent: 10m 
  Work Description: pabloem opened a new pull request #6255: [BEAM-5186] 
Adding numeric support to BQ Sink
URL: https://github.com/apache/beam/pull/6255
 
 
   r: @chamikaramj 
   

   
   
   
   
   
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 136343)
Time Spent: 10m
Remaining Estimate: 0h

> Support for NUMERIC data type in BQ
> ---
>
> Key: BEAM-5186
> URL: https://issues.apache.org/jira/browse/BEAM-5186
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Pablo Estrada
>Assignee: Pablo Estrada
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (BEAM-5168) Flink jobserver logging should be redirected to slf4j

2018-08-20 Thread Thomas Weise (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5168?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Weise resolved BEAM-5168.

   Resolution: Fixed
Fix Version/s: 2.7.0

> Flink jobserver logging should be redirected to slf4j
> -
>
> Key: BEAM-5168
> URL: https://issues.apache.org/jira/browse/BEAM-5168
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-flink
>Affects Versions: 2.6.0
>Reporter: Thomas Weise
>Assignee: Thomas Weise
>Priority: Major
>  Labels: portability
> Fix For: 2.7.0
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Currently commons-logging goes nowhere, which makes certain issues really 
> hard to debug (http client uses it, for example). This can be fixed by using 
> the slf4j bridge.
>  
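The slf4j bridge mentioned in the description is typically wired up by excluding the real commons-logging and substituting `jcl-over-slf4j`. The Gradle fragment below is a sketch of that usual setup (version number illustrative), not necessarily the exact change made for this issue:

```groovy
// Route commons-logging (JCL) calls into slf4j: drop the real
// commons-logging jar and substitute the jcl-over-slf4j bridge.
configurations.all {
  exclude group: 'commons-logging', module: 'commons-logging'
}

dependencies {
  runtimeOnly 'org.slf4j:jcl-over-slf4j:1.7.25'  // version illustrative
}
```

With the bridge on the classpath, libraries that log through commons-logging (such as the http client mentioned above) emit through whatever slf4j backend the job server configures.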



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-5110) Reconile Flink JVM singleton management with deployment

2018-08-20 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5110?focusedWorklogId=136341&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-136341
 ]

ASF GitHub Bot logged work on BEAM-5110:


Author: ASF GitHub Bot
Created on: 20/Aug/18 23:20
Start Date: 20/Aug/18 23:20
Worklog Time Spent: 10m 
  Work Description: angoenka commented on issue #6189: [BEAM-5110] 
Explicitly count the references for BatchFlinkExecutableStageContext …
URL: https://github.com/apache/beam/pull/6189#issuecomment-414495156
 
 
   Shall I take it in a separate PR to limit the scope?
   I also think we need more discussion on how to decide the number of 
SDKHarness instances in general. 
   I worked on an internal feature to do this for Python streaming, and I 
totally agree with having multiple Python SDK harnesses for the streaming and 
batch use cases.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 136341)
Time Spent: 2h 40m  (was: 2.5h)

> Reconile Flink JVM singleton management with deployment
> ---
>
> Key: BEAM-5110
> URL: https://issues.apache.org/jira/browse/BEAM-5110
> Project: Beam
>  Issue Type: Bug
>  Components: runner-flink
>Reporter: Ben Sidhom
>Assignee: Ben Sidhom
>Priority: Major
>  Time Spent: 2h 40m
>  Remaining Estimate: 0h
>
> [~angoenka] noticed through debugging that multiple instances of 
> BatchFlinkExecutableStageContext.BatchFactory are loaded for a given job when 
> executing in standalone cluster mode. This context factory is responsible for 
> maintaining singleton state across a TaskManager (JVM) in order to share SDK 
> Environments across workers in a given job. The multiple-loading breaks 
> singleton semantics and results in an indeterminate number of Environments 
> being created.
> It turns out that the [Flink classloading 
> mechanism|https://ci.apache.org/projects/flink/flink-docs-release-1.5/monitoring/debugging_classloading.html]
>  is determined by deployment mode. Note that "user code" as referenced by 
> this link is actually the Flink job server jar. Actual end-user code lives 
> inside of the SDK Environment and uploaded artifacts.
> In order to maintain singletons without resorting to IPC (for example, using 
> file locks and/or additional gRPC servers), we need to force non-dynamic 
> classloading. For example, this happens when jobs are submitted to YARN for 
> one-off deployments via `flink run`. However, connecting to an existing 
> (Flink standalone) deployment results in dynamic classloading.
> We should investigate this behavior and either document (and attempt to 
> enforce) deployment modes that are consistent with our requirements, or (if 
> possible) create a custom classloader that enforces singleton loading.
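One way to avoid depending on classloader singleton semantics, suggested by the linked PR's title ("Explicitly count the references"), is reference counting: the per-JVM context is created on first acquire and torn down only when the last holder releases it. The sketch below is illustrative only; class names are hypothetical stand-ins, not Beam's actual classes:

```java
// Hedged sketch of explicit reference counting for a shared per-JVM context.
// "SharedContext" stands in for something expensive like an SDK Environment.
class SharedContextDemo {

  static final class SharedContext {
    private static SharedContext instance;
    private static int refCount = 0;
    static boolean closed = false;

    // Acquire the shared context, creating it on first use.
    static synchronized SharedContext acquire() {
      if (instance == null) {
        instance = new SharedContext(); // expensive setup would happen here
        closed = false;
      }
      refCount++;
      return instance;
    }

    // Release one reference; tear down when the count reaches zero.
    static synchronized void release() {
      if (--refCount == 0) {
        closed = true;   // stand-in for closing the SDK environment
        instance = null;
      }
    }

    static synchronized int activeReferences() {
      return refCount;
    }
  }

  public static void main(String[] args) {
    SharedContext a = SharedContext.acquire();
    SharedContext b = SharedContext.acquire();
    System.out.println(a == b);               // one shared instance
    SharedContext.release();                  // one holder remains
    SharedContext.release();                  // last holder: torn down
    System.out.println(SharedContext.closed);
  }
}
```

Unlike a classloader-scoped singleton, the count is explicit, so correctness no longer hinges on which deployment mode Flink uses to load the job server jar.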



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-5141) Improve error message when SET unregistered options

2018-08-20 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5141?focusedWorklogId=136342&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-136342
 ]

ASF GitHub Bot logged work on BEAM-5141:


Author: ASF GitHub Bot
Created on: 20/Aug/18 23:20
Start Date: 20/Aug/18 23:20
Worklog Time Spent: 10m 
  Work Description: amaliujia commented on issue #6216: [BEAM-5141] Improve 
error message when SET unregistered options
URL: https://github.com/apache/beam/pull/6216#issuecomment-414495185
 
 
   Did some exploration work for that validation when SET happens.
   
   Doing the validation at `SET` time would let a `SET` with an inappropriate 
option fail with the existing error message. The problem that approach faces is 
when and how we validate the set of options that needs to change when the user 
changes `runner` via `SET`. It is a design choice, and right now BeamSQL uses 
lazy validation (i.e. validating at `SELECT`) to handle these cases; another 
design choice is to use `RESET` to clear inappropriate options.
   
   With that context in mind, we need to remind users to `RESET` options when 
we detect the wrong ones. It is possible to return `PropertyDescriptors` after 
the majority of validation is done. The concern here is that we would have to 
remove the checks related to `PropertyDescriptors` in `PipelineOptionsFactory` 
(otherwise it would fail as it does now, before we get the 
`PropertyDescriptors`). Then where would we implement that validation for the 
other places that use `PipelineOptionsFactory`?
   
   What could be done is to keep everything as it is now. Once BeamSQL catches 
an `IllegalArgumentException`, it fetches the `PropertyDescriptors` to check 
whether it needs to customize the error messages, which is already very 
similar to the special-exception approach.
   
   I think catching the special exception is better than catching 
`IllegalArgumentException` and checking `PropertyDescriptors`.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 136342)
Time Spent: 6h 40m  (was: 6.5h)

> Improve error message when SET unregistered options 
> 
>
> Key: BEAM-5141
> URL: https://issues.apache.org/jira/browse/BEAM-5141
> Project: Beam
>  Issue Type: Improvement
>  Components: dsl-sql
>Reporter: Rui Wang
>Assignee: Rui Wang
>Priority: Major
> Fix For: Not applicable
>
>  Time Spent: 6h 40m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (BEAM-5169) Add options for master URL and log level to Flink jobserver runShadow task

2018-08-20 Thread Thomas Weise (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5169?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Weise resolved BEAM-5169.

   Resolution: Fixed
Fix Version/s: 2.7.0

> Add options for master URL and log level to Flink jobserver runShadow task
> --
>
> Key: BEAM-5169
> URL: https://issues.apache.org/jira/browse/BEAM-5169
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-flink
>Affects Versions: 2.6.0
>Reporter: Thomas Weise
>Assignee: Thomas Weise
>Priority: Major
>  Labels: portability
> Fix For: 2.7.0
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (BEAM-5186) Support for NUMERIC data type in BQ

2018-08-20 Thread Pablo Estrada (JIRA)
Pablo Estrada created BEAM-5186:
---

 Summary: Support for NUMERIC data type in BQ
 Key: BEAM-5186
 URL: https://issues.apache.org/jira/browse/BEAM-5186
 Project: Beam
  Issue Type: Bug
  Components: sdk-py-core
Reporter: Pablo Estrada
Assignee: Pablo Estrada






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-5110) Reconile Flink JVM singleton management with deployment

2018-08-20 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5110?focusedWorklogId=136339&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-136339
 ]

ASF GitHub Bot logged work on BEAM-5110:


Author: ASF GitHub Bot
Created on: 20/Aug/18 23:12
Start Date: 20/Aug/18 23:12
Worklog Time Spent: 10m 
  Work Description: tweise commented on issue #6189: [BEAM-5110] Explicitly 
count the references for BatchFlinkExecutableStageContext …
URL: https://github.com/apache/beam/pull/6189#issuecomment-414493523
 
 
   Please resolve the merge conflict. Do you think we should incorporate the 
count configuration option, since the portable Flink runner is effectively 
broken right now when used with the Python SDK?


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 136339)
Time Spent: 2.5h  (was: 2h 20m)

> Reconile Flink JVM singleton management with deployment
> ---
>
> Key: BEAM-5110
> URL: https://issues.apache.org/jira/browse/BEAM-5110
> Project: Beam
>  Issue Type: Bug
>  Components: runner-flink
>Reporter: Ben Sidhom
>Assignee: Ben Sidhom
>Priority: Major
>  Time Spent: 2.5h
>  Remaining Estimate: 0h
>
> [~angoenka] noticed through debugging that multiple instances of 
> BatchFlinkExecutableStageContext.BatchFactory are loaded for a given job when 
> executing in standalone cluster mode. This context factory is responsible for 
> maintaining singleton state across a TaskManager (JVM) in order to share SDK 
> Environments across workers in a given job. The multiple-loading breaks 
> singleton semantics and results in an indeterminate number of Environments 
> being created.
> It turns out that the [Flink classloading 
> mechanism|https://ci.apache.org/projects/flink/flink-docs-release-1.5/monitoring/debugging_classloading.html]
>  is determined by deployment mode. Note that "user code" as referenced by 
> this link is actually the Flink job server jar. Actual end-user code lives 
> inside of the SDK Environment and uploaded artifacts.
> In order to maintain singletons without resorting to IPC (for example, using 
> file locks and/or additional gRPC servers), we need to force non-dynamic 
> classloading. For example, this happens when jobs are submitted to YARN for 
> one-off deployments via `flink run`. However, connecting to an existing 
> (Flink standalone) deployment results in dynamic classloading.
> We should investigate this behavior and either document (and attempt to 
> enforce) deployment modes that are consistent with our requirements, or (if 
> possible) create a custom classloader that enforces singleton loading.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-5169) Add options for master URL and log level to Flink jobserver runShadow task

2018-08-20 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5169?focusedWorklogId=136337&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-136337
 ]

ASF GitHub Bot logged work on BEAM-5169:


Author: ASF GitHub Bot
Created on: 20/Aug/18 23:08
Start Date: 20/Aug/18 23:08
Worklog Time Spent: 10m 
  Work Description: tweise closed pull request #6249: [BEAM-5169] Add 
options for master URL and log level to Flink jobserver runShadow task.
URL: https://github.com/apache/beam/pull/6249
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git a/runners/flink/job-server/build.gradle b/runners/flink/job-server/build.gradle
index 4059d7303fb..b38b209762e 100644
--- a/runners/flink/job-server/build.gradle
+++ b/runners/flink/job-server/build.gradle
@@ -55,9 +55,13 @@ runShadow {
   args = ["--job-host=${jobHost}", "--artifacts-dir=${artifactsDir}"]
   if (cleanArtifactsPerJob)
     args += ["--clean-artifacts-per-job"]
+  if (project.hasProperty("flinkMasterUrl"))
+    args += ["--flink-master-url=${project.property('flinkMasterUrl')}"]
 
   // Enable remote debugging.
   jvmArgs = ["-Xdebug", "-Xrunjdwp:transport=dt_socket,server=y,suspend=n,address=5005"]
+  if (project.hasProperty("logLevel"))
+    jvmArgs += ["-Dorg.slf4j.simpleLogger.defaultLogLevel=${project.property('logLevel')}"]
 }
 
 createPortableValidatesRunnerTask(jobServerDriver: "org.apache.beam.runners.flink.FlinkJobServerDriver", testClasspathConfiguration: configurations.validatesPortableRunner)


 


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 136337)
Time Spent: 40m  (was: 0.5h)

> Add options for master URL and log level to Flink jobserver runShadow task
> --
>
> Key: BEAM-5169
> URL: https://issues.apache.org/jira/browse/BEAM-5169
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-flink
>Affects Versions: 2.6.0
>Reporter: Thomas Weise
>Assignee: Thomas Weise
>Priority: Major
>  Labels: portability
>  Time Spent: 40m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[beam] branch master updated (31314f0 -> 20aff13)

2018-08-20 Thread thw
This is an automated email from the ASF dual-hosted git repository.

thw pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/beam.git.


from 31314f0  Merge pull request #6248: [BEAM-5168] Redirect Flink 
jobserver commons-logging to slf4j.
 add eb98549  [BEAM-5169] Add options for master URL and log level to Flink 
jobserver runShadow task.
 new 20aff13  Merge pull request #6249: [BEAM-5169] Add options for master 
URL and log level to Flink jobserver runShadow task.

The 1 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:
 runners/flink/job-server/build.gradle | 4 ++++
 1 file changed, 4 insertions(+)



[beam] 01/01: Merge pull request #6249: [BEAM-5169] Add options for master URL and log level to Flink jobserver runShadow task.

2018-08-20 Thread thw
This is an automated email from the ASF dual-hosted git repository.

thw pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/beam.git

commit 20aff13d36b4ac7d9b9a29fd9f6fb3e80a84214a
Merge: 31314f0 eb98549
Author: Thomas Weise 
AuthorDate: Mon Aug 20 16:07:58 2018 -0700

Merge pull request #6249: [BEAM-5169] Add options for master URL and log 
level to Flink jobserver runShadow task.

 runners/flink/job-server/build.gradle | 4 ++++
 1 file changed, 4 insertions(+)

diff --cc runners/flink/job-server/build.gradle
index 0da1b89,b38b209..5e1e01a
--- a/runners/flink/job-server/build.gradle
+++ b/runners/flink/job-server/build.gradle
@@@ -64,10 -60,8 +66,12 @@@ runShadow 
  
    // Enable remote debugging.
    jvmArgs = ["-Xdebug", "-Xrunjdwp:transport=dt_socket,server=y,suspend=n,address=5005"]
+   if (project.hasProperty("logLevel"))
+     jvmArgs += ["-Dorg.slf4j.simpleLogger.defaultLogLevel=${project.property('logLevel')}"]
  }
  
 -createPortableValidatesRunnerTask(jobServerDriver: "org.apache.beam.runners.flink.FlinkJobServerDriver", testClasspathConfiguration: configurations.validatesPortableRunner)
 +createPortableValidatesRunnerTask(
 +    jobServerDriver: "org.apache.beam.runners.flink.FlinkJobServerDriver",
 +    jobServerConfig: "--clean-artifacts-per-job",
 +    testClasspathConfiguration: configurations.validatesPortableRunner
 +)



[jira] [Work logged] (BEAM-5168) Flink jobserver logging should be redirected to slf4j

2018-08-20 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5168?focusedWorklogId=136335&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-136335
 ]

ASF GitHub Bot logged work on BEAM-5168:


Author: ASF GitHub Bot
Created on: 20/Aug/18 23:03
Start Date: 20/Aug/18 23:03
Worklog Time Spent: 10m 
  Work Description: tweise closed pull request #6248: [BEAM-5168] Redirect 
Flink jobserver commons-logging to slf4j.
URL: https://github.com/apache/beam/pull/6248
 
 
   


diff --git a/runners/flink/job-server/build.gradle b/runners/flink/job-server/build.gradle
index 4059d7303fb..a6ee49c1c42 100644
--- a/runners/flink/job-server/build.gradle
+++ b/runners/flink/job-server/build.gradle
@@ -35,8 +35,14 @@ configurations {
   validatesPortableRunner
 }
 
+configurations.all {
+  // replace commons logging with the jcl-over-slf4j bridge
+  exclude group: "commons-logging", module: "commons-logging"
+}
+
 dependencies {
   compile project(path: ":beam-runners-flink_2.11", configuration: "shadow")
+  compile group: "org.slf4j", name: "jcl-over-slf4j", version: dependencies.create(project.library.java.slf4j_api).getVersion()
   validatesPortableRunner project(path: ":beam-runners-flink_2.11", configuration: "shadowTest")
   validatesPortableRunner project(path: ":beam-sdks-java-core", configuration: "shadowTest")
   validatesPortableRunner project(path: ":beam-runners-core-java", configuration: "shadowTest")


 




Issue Time Tracking
---

Worklog Id: (was: 136335)
Time Spent: 40m  (was: 0.5h)

> Flink jobserver logging should be redirected to slf4j
> -
>
> Key: BEAM-5168
> URL: https://issues.apache.org/jira/browse/BEAM-5168
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-flink
>Affects Versions: 2.6.0
>Reporter: Thomas Weise
>Assignee: Thomas Weise
>Priority: Major
>  Labels: portability
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Currently commons-logging output goes nowhere, which makes certain issues 
> really hard to debug (the HTTP client uses it, for example). This can be fixed 
> by using the slf4j bridge.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Build failed in Jenkins: beam_PostCommit_Java_GradleBuild #1303

2018-08-20 Thread Apache Jenkins Server
See 


Changes:

[ryan.blake.williams] remove outdated FileSystem deletion docs

--
[...truncated 19.61 MB...]
Aug 20, 2018 10:59:51 PM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-08-20T22:59:46.413Z: Fusing consumer SpannerIO.Write/Write 
mutations to Cloud Spanner/Keys sample as 
view/GBKaSVForData/ParDo(GroupByKeyHashAndSortByKeyAndWindow) into 
SpannerIO.Write/Write mutations to Cloud Spanner/Sample 
keys/Combine.GroupedValues/Extract
Aug 20, 2018 10:59:51 PM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-08-20T22:59:46.461Z: Fusing consumer SpannerIO.Write/Write 
mutations to Cloud Spanner/Sample keys/GroupByKey/Write into 
SpannerIO.Write/Write mutations to Cloud Spanner/Sample keys/GroupByKey/Reify
Aug 20, 2018 10:59:51 PM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-08-20T22:59:46.510Z: Fusing consumer SpannerIO.Write/Write 
mutations to Cloud Spanner/Sample keys/Combine.GroupedValues/Extract into 
SpannerIO.Write/Write mutations to Cloud Spanner/Sample 
keys/Combine.GroupedValues
Aug 20, 2018 10:59:51 PM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-08-20T22:59:46.557Z: Fusing consumer SpannerIO.Write/Write 
mutations to Cloud Spanner/Sample keys/GroupByKey/Reify into 
SpannerIO.Write/Write mutations to Cloud Spanner/Sample 
keys/GroupByKey+SpannerIO.Write/Write mutations to Cloud Spanner/Sample 
keys/Combine.GroupedValues/Partial
Aug 20, 2018 10:59:51 PM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-08-20T22:59:46.606Z: Fusing consumer SpannerIO.Write/Write 
mutations to Cloud Spanner/Sample keys/GroupByKey+SpannerIO.Write/Write 
mutations to Cloud Spanner/Sample keys/Combine.GroupedValues/Partial into 
SpannerIO.Write/Write mutations to Cloud Spanner/Extract keys
Aug 20, 2018 10:59:51 PM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-08-20T22:59:46.649Z: Fusing consumer SpannerIO.Write/Write 
mutations to Cloud Spanner/Wait.OnSignal/To wait view 
0/Sample.Any/Combine.globally(SampleAny)/Combine.perKey(SampleAny)/GroupByKey/Write
 into SpannerIO.Write/Write mutations to Cloud Spanner/Wait.OnSignal/To wait 
view 
0/Sample.Any/Combine.globally(SampleAny)/Combine.perKey(SampleAny)/GroupByKey/Reify
Aug 20, 2018 10:59:51 PM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-08-20T22:59:46.684Z: Fusing consumer SpannerIO.Write/Write 
mutations to Cloud Spanner/Keys sample as view/GBKaSVForSize/Write into 
SpannerIO.Write/Write mutations to Cloud Spanner/Keys sample as 
view/ParMultiDo(ToIsmRecordForMapLike)
Aug 20, 2018 10:59:51 PM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-08-20T22:59:46.721Z: Fusing consumer SpannerIO.Write/Write 
mutations to Cloud Spanner/Wait.OnSignal/To wait view 
0/View.AsList/ParDo(ToIsmRecordForGlobalWindow) into SpannerIO.Write/Write 
mutations to Cloud Spanner/Wait.OnSignal/To wait view 
0/Sample.Any/Flatten.Iterables/FlattenIterables/FlatMap
Aug 20, 2018 10:59:51 PM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-08-20T22:59:46.770Z: Fusing consumer SpannerIO.Write/Write 
mutations to Cloud Spanner/Keys sample as view/GBKaSVForKeys/Write into 
SpannerIO.Write/Write mutations to Cloud Spanner/Keys sample as 
view/ParMultiDo(ToIsmRecordForMapLike)
Aug 20, 2018 10:59:51 PM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-08-20T22:59:46.816Z: Fusing consumer SpannerIO.Write/Write 
mutations to Cloud Spanner/Wait.OnSignal/To wait view 
0/Sample.Any/Flatten.Iterables/FlattenIterables/FlatMap into 
SpannerIO.Write/Write mutations to Cloud Spanner/Wait.OnSignal/To wait view 
0/Sample.Any/Combine.globally(SampleAny)/Values/Values/Map
Aug 20, 2018 10:59:51 PM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-08-20T22:59:46.859Z: Fusing consumer SpannerIO.Write/Write 
mutations to Cloud Spanner/Wait.OnSignal/To wait view 
0/Sample.Any/Combine.globally(SampleAny)/Values/Values/Map into 
SpannerIO.Write/Write mutations to Cloud Spanner/Wait.OnSignal/To wait view 
0/Sample.Any/Combine.globally(SampleAny)/Combine.perKey(SampleAny)/Combine.GroupedValues/Extract
Aug 20, 2018 10:59:51 PM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-08-20T22:59:46.901Z: Fusing consumer SpannerIO.Write/Write 
mutations to Cloud Spanner/Keys sample as view/ParDo(ToIsmMetadataRecordForKey) 
into SpannerIO.Write/Write mutations to Cloud Spanner/Keys sample as 
view/GBKaSVForKeys/Read
Aug 20, 

[beam] branch master updated (9731467 -> 31314f0)

2018-08-20 Thread thw
This is an automated email from the ASF dual-hosted git repository.

thw pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/beam.git.


from 9731467  Merge pull request #6071 from ryan-williams/fs-docs
 add e51045e  [BEAM-5168] Redirect Flink jobserver commons-logging to slf4j.
 new 31314f0  Merge pull request #6248: [BEAM-5168] Redirect Flink 
jobserver commons-logging to slf4j.



Summary of changes:
 runners/flink/job-server/build.gradle | 6 ++++++
 1 file changed, 6 insertions(+)



[beam] 01/01: Merge pull request #6248: [BEAM-5168] Redirect Flink jobserver commons-logging to slf4j.

2018-08-20 Thread thw
This is an automated email from the ASF dual-hosted git repository.

thw pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/beam.git

commit 31314f09b9e4555b28b5b139621a7bc94a9f9873
Merge: 9731467 e51045e
Author: Thomas Weise 
AuthorDate: Mon Aug 20 16:03:18 2018 -0700

Merge pull request #6248: [BEAM-5168] Redirect Flink jobserver 
commons-logging to slf4j.

 runners/flink/job-server/build.gradle | 6 ++++++
 1 file changed, 6 insertions(+)




[jira] [Work logged] (BEAM-3921) Scripting extension based on Java Scripting API (JSR-223)

2018-08-20 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-3921?focusedWorklogId=136332&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-136332
 ]

ASF GitHub Bot logged work on BEAM-3921:


Author: ASF GitHub Bot
Created on: 20/Aug/18 22:57
Start Date: 20/Aug/18 22:57
Worklog Time Spent: 10m 
  Work Description: stale[bot] commented on issue #4944:  [BEAM-3921] 
Scripting extension based on Java Scripting API (JSR-223)
URL: https://github.com/apache/beam/pull/4944#issuecomment-414490387
 
 
   This pull request has been marked as stale due to 60 days of inactivity. It 
will be closed in 1 week if no further activity occurs. If you think that’s 
incorrect or this pull request requires a review, please simply write any 
comment. If closed, you can revive the PR at any time and @mention a reviewer 
or discuss it on the d...@beam.apache.org list. Thank you for your 
contributions.
   




Issue Time Tracking
---

Worklog Id: (was: 136332)
Time Spent: 1h 50m  (was: 1h 40m)

> Scripting extension based on Java Scripting API (JSR-223)
> -
>
> Key: BEAM-3921
> URL: https://issues.apache.org/jira/browse/BEAM-3921
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-ideas
>Affects Versions: 2.5.0
>Reporter: Ismaël Mejía
>Assignee: Ismaël Mejía
>Priority: Minor
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> An extension with transforms that package the Java Scripting API (JSR-223) 
> [1] to allow users to specialize some transforms via a scripting language. It 
> supports ValueProviders so users can template their scripts also in Dataflow.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-5180) Broken FileResultCoder via parseSchema change

2018-08-20 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5180?focusedWorklogId=136327&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-136327
 ]

ASF GitHub Bot logged work on BEAM-5180:


Author: ASF GitHub Bot
Created on: 20/Aug/18 22:48
Start Date: 20/Aug/18 22:48
Worklog Time Spent: 10m 
  Work Description: angoenka commented on issue #6251: [BEAM-5180] Relax 
back restriction on parsing file scheme
URL: https://github.com/apache/beam/pull/6251#issuecomment-414488403
 
 
   HDFS file names are expected to start with "hdfs://", so I would fix the file 
name rather than the regex.




Issue Time Tracking
---

Worklog Id: (was: 136327)
Time Spent: 40m  (was: 0.5h)

> Broken FileResultCoder via parseSchema change
> -
>
> Key: BEAM-5180
> URL: https://issues.apache.org/jira/browse/BEAM-5180
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-core
>Affects Versions: 2.6.0
>Reporter: Jozef Vilcek
>Assignee: Kenneth Knowles
>Priority: Blocker
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Recently this commit
> [https://github.com/apache/beam/commit/3fff58c21f94415f3397e185377e36d3df662384]
> introduced more strict schema parsing which is breaking the contract between 
> _FileResultCoder_ and _FileSystems.matchNewResource()_.
> Coder takes _ResourceId_ and serialize it via `_toString_` methods and then 
> relies on filesystem being able to parse it back again. Having strict 
> _scheme://_ breaks this at least for Hadoop filesystem which use _URI_ for 
> _ResourceId_ and produce _toString()_ in form of `_hdfs:/some/path_`
> I guess the _ResourceIdCoder_ is suffering the same problem.
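The single-slash round trip described above is standard `java.net.URI` behavior and can be reproduced with the JDK alone. This sketch is independent of Beam's coder code, and the regex is illustrative rather than the one from the offending commit:

```java
import java.net.URI;
import java.net.URISyntaxException;

public class SchemeDemo {
  public static void main(String[] args) throws URISyntaxException {
    // A URI built without an authority component renders with a single slash,
    // i.e. "hdfs:/some/path" rather than "hdfs://some/path".
    URI noAuthority = new URI("hdfs", null, "/some/path", null);
    System.out.println(noAuthority); // prints "hdfs:/some/path"

    // A strict "scheme://" parser therefore rejects the round-tripped string.
    System.out.println(noAuthority.toString().matches("[a-z]+://.*")); // prints "false"
  }
}
```

This is why a coder that serializes a `ResourceId` via `toString()` and reparses it must tolerate the authority-less form.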



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-5033) Define and publish support story for Beam releases

2018-08-20 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5033?focusedWorklogId=136308&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-136308
 ]

ASF GitHub Bot logged work on BEAM-5033:


Author: ASF GitHub Bot
Created on: 20/Aug/18 22:05
Start Date: 20/Aug/18 22:05
Worklog Time Spent: 10m 
  Work Description: asfgit closed pull request #539: [BEAM-5033] Update to 
LTS wording to remove every Nth release clause.
URL: https://github.com/apache/beam-site/pull/539
 
 
   


diff --git a/src/community/policies.md b/src/community/policies.md
index a553e76cc4..2b96a7f6d9 100644
--- a/src/community/policies.md
+++ b/src/community/policies.md
@@ -26,7 +26,7 @@ This page contains a list of major policies agreed upon by the Apache Beam commu
 
 Apache Beam makes minor releases every 6 weeks. Apache Beam has a 
[calendar](https://calendar.google.com/calendar/embed?src=0p73sl034k80oob7seouanigd0%40group.calendar.google.com)
 for cutting the next release branch. After a release branch is cut, the 
community works quickly to finalize that release.
 
-Apache Beam aims to make 8 releases in a 12 month period. To accommodate users 
with longer upgrade cycles, some of these releases will be tagged as long term 
support (LTS) releases. Starting with the 2.7.0 release, every fourth release 
will be a LTS release. LTS releases receive patches to fix major issues for 12 
months, starting from the release's initial release date. LTS releases are 
considered deprecated after 12 months. Non-LTS releases do not receive patches 
and are considered deprecated immediately after the next following minor 
release. We encourage you to update early and often; do not wait until the 
deprecation date of the version you are using.
+Apache Beam aims to make 8 releases in a 12 month period. To accommodate users 
with longer upgrade cycles, some of these releases will be tagged as long term 
support (LTS) releases. LTS releases receive patches to fix major issues for 12 
months, starting from the release's initial release date. There will be at 
least one new LTS release in a 12 month period, and LTS releases are considered 
deprecated after 12 months. The community will mark a release as a LTS release 
based on various factors, such as the number of LTS releases currently in 
flight and whether the accumulated feature set since the last LTS provides 
significant upgrade value. Non-LTS releases do not receive patches and are 
considered deprecated immediately after the next following minor release. We 
encourage you to update early and often; do not wait until the deprecation date 
of the version you are using.
 
 It is up to the Apache Beam community to decide whether an identified issue is 
a major issue that warrants a patch release. Some examples of major issues are 
high severity security issues and high risk data integrity issues.
 


 




Issue Time Tracking
---

Worklog Id: (was: 136308)
Time Spent: 1.5h  (was: 1h 20m)

> Define and publish support story for Beam releases
> --
>
> Key: BEAM-5033
> URL: https://issues.apache.org/jira/browse/BEAM-5033
> Project: Beam
>  Issue Type: Task
>  Components: website
>Reporter: Rafael Fernandez
>Assignee: Ahmet Altay
>Priority: Major
> Fix For: Not applicable
>
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> Propose, define, and publish a support story for Beam releases, clarifying 
> how long a specific release is expected to be eligible for security or 
> correctness patches.
>  
> A user could naturally infer today that only the latest released version is 
> eligible for such patches. Since we release every six weeks, this could be an 
> opportunity to establish long-term support policies for selected releases.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (BEAM-5157) Improve user friendliness of using SET

2018-08-20 Thread Rui Wang (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-5157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16586594#comment-16586594
 ] 

Rui Wang commented on BEAM-5157:


One possible way is to retrieve all available options and return them to users.

> Improve user friendliness of using SET
> --
>
> Key: BEAM-5157
> URL: https://issues.apache.org/jira/browse/BEAM-5157
> Project: Beam
>  Issue Type: Sub-task
>  Components: dsl-sql
>Reporter: Rui Wang
>Priority: Major
>
> Right now, SET only supports a few defined properties in Beam. When users hit 
> exceptions caused by missing properties, they need to set those properties but 
> may have no clue what the property names are. We should provide a way to list 
> the properties a user needs to set.
>  
> One example is that users should set tempLocation when using BigQuery.
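The suggested improvement can be sketched as follows. Everything here is hypothetical, including the property catalog; Beam SQL's real option set and SET handling are not shown:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: when SET receives an unknown property, answer with the
// list of settable property names instead of failing silently.
public class SetPropertyHelper {
  private static final Map<String, String> KNOWN = new LinkedHashMap<>();
  static {
    // Illustrative entries only, not Beam SQL's actual properties.
    KNOWN.put("runner", "pipeline runner to use");
    KNOWN.put("tempLocation", "temp path, required e.g. by BigQuery");
  }

  public static String set(String name, String value) {
    if (!KNOWN.containsKey(name)) {
      return "Unknown property '" + name + "'. Settable properties: " + KNOWN.keySet();
    }
    return name + " = " + value;
  }

  public static void main(String[] args) {
    System.out.println(set("tempLocation", "gs://bucket/tmp")); // prints "tempLocation = gs://bucket/tmp"
    System.out.println(set("bogus", "x"));
  }
}
```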



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Build failed in Jenkins: beam_PostCommit_Java_GradleBuild #1302

2018-08-20 Thread Apache Jenkins Server
See 


Changes:

[kyle.winkelman] Make CompressedSource honor sourceDelegates 
emptyMatchTreatment.

--
[...truncated 19.76 MB...]
Aug 20, 2018 10:10:39 PM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-08-20T22:10:32.603Z: Fusing consumer SpannerIO.Write/Write 
mutations to Cloud Spanner/Batch mutations together into SpannerIO.Write/Write 
mutations to Cloud Spanner/Group by partition/GroupByWindow
Aug 20, 2018 10:10:39 PM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-08-20T22:10:32.650Z: Fusing consumer SpannerIO.Write/Write 
mutations to Cloud Spanner/Keys sample as 
view/ParMultiDo(ToIsmRecordForMapLike) into SpannerIO.Write/Write mutations to 
Cloud Spanner/Keys sample as 
view/GBKaSVForData/BatchViewOverrides.GroupByKeyAndSortValuesOnly/Read
Aug 20, 2018 10:10:39 PM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-08-20T22:10:32.721Z: Fusing consumer SpannerIO.Write/Write 
mutations to Cloud Spanner/Keys sample as 
view/GBKaSVForData/BatchViewOverrides.GroupByKeyAndSortValuesOnly/Write into 
SpannerIO.Write/Write mutations to Cloud Spanner/Keys sample as 
view/GBKaSVForData/ParDo(GroupByKeyHashAndSortByKeyAndWindow)
Aug 20, 2018 10:10:39 PM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-08-20T22:10:32.788Z: Fusing consumer SpannerIO.Write/Write 
mutations to Cloud Spanner/Keys sample as 
view/GBKaSVForData/ParDo(GroupByKeyHashAndSortByKeyAndWindow) into 
SpannerIO.Write/Write mutations to Cloud Spanner/Sample 
keys/Combine.GroupedValues/Extract
Aug 20, 2018 10:10:39 PM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-08-20T22:10:32.835Z: Fusing consumer SpannerIO.Write/Write 
mutations to Cloud Spanner/Sample keys/GroupByKey/Write into 
SpannerIO.Write/Write mutations to Cloud Spanner/Sample keys/GroupByKey/Reify
Aug 20, 2018 10:10:39 PM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-08-20T22:10:32.875Z: Fusing consumer SpannerIO.Write/Write 
mutations to Cloud Spanner/Sample keys/Combine.GroupedValues/Extract into 
SpannerIO.Write/Write mutations to Cloud Spanner/Sample 
keys/Combine.GroupedValues
Aug 20, 2018 10:10:39 PM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-08-20T22:10:32.917Z: Fusing consumer SpannerIO.Write/Write 
mutations to Cloud Spanner/Sample keys/GroupByKey/Reify into 
SpannerIO.Write/Write mutations to Cloud Spanner/Sample 
keys/GroupByKey+SpannerIO.Write/Write mutations to Cloud Spanner/Sample 
keys/Combine.GroupedValues/Partial
Aug 20, 2018 10:10:39 PM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-08-20T22:10:32.962Z: Fusing consumer SpannerIO.Write/Write 
mutations to Cloud Spanner/Sample keys/GroupByKey+SpannerIO.Write/Write 
mutations to Cloud Spanner/Sample keys/Combine.GroupedValues/Partial into 
SpannerIO.Write/Write mutations to Cloud Spanner/Extract keys
Aug 20, 2018 10:10:39 PM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-08-20T22:10:33.008Z: Fusing consumer SpannerIO.Write/Write 
mutations to Cloud Spanner/Wait.OnSignal/To wait view 
0/Sample.Any/Combine.globally(SampleAny)/Combine.perKey(SampleAny)/GroupByKey/Write
 into SpannerIO.Write/Write mutations to Cloud Spanner/Wait.OnSignal/To wait 
view 
0/Sample.Any/Combine.globally(SampleAny)/Combine.perKey(SampleAny)/GroupByKey/Reify
Aug 20, 2018 10:10:39 PM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-08-20T22:10:33.051Z: Fusing consumer SpannerIO.Write/Write 
mutations to Cloud Spanner/Keys sample as view/GBKaSVForSize/Write into 
SpannerIO.Write/Write mutations to Cloud Spanner/Keys sample as 
view/ParMultiDo(ToIsmRecordForMapLike)
Aug 20, 2018 10:10:39 PM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-08-20T22:10:33.097Z: Fusing consumer SpannerIO.Write/Write 
mutations to Cloud Spanner/Wait.OnSignal/To wait view 
0/View.AsList/ParDo(ToIsmRecordForGlobalWindow) into SpannerIO.Write/Write 
mutations to Cloud Spanner/Wait.OnSignal/To wait view 
0/Sample.Any/Flatten.Iterables/FlattenIterables/FlatMap
Aug 20, 2018 10:10:39 PM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-08-20T22:10:33.141Z: Fusing consumer SpannerIO.Write/Write 
mutations to Cloud Spanner/Keys sample as view/GBKaSVForKeys/Write into 
SpannerIO.Write/Write mutations to Cloud Spanner/Keys sample as 
view/ParMultiDo(ToIsmRecordForMapLike)
Aug 20, 2018 10:10:39 PM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler 

[beam-site] branch asf-site updated (7bb066d -> d3a72da)

2018-08-20 Thread mergebot-role
This is an automated email from the ASF dual-hosted git repository.

mergebot-role pushed a change to branch asf-site
in repository https://gitbox.apache.org/repos/asf/beam-site.git.


from 7bb066d  Prepare repository for deployment.
 add a4a7929  Update to LTS wording to remove every Nth release clause.
 add 2a4d399  Edit suggestions
 add 3604bc0  This closes #539
 new d3a72da  Prepare repository for deployment.



Summary of changes:
 content/community/policies/index.html | 2 +-
 src/community/policies.md | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)



[beam-site] 01/01: Prepare repository for deployment.

2018-08-20 Thread mergebot-role
This is an automated email from the ASF dual-hosted git repository.

mergebot-role pushed a commit to branch asf-site
in repository https://gitbox.apache.org/repos/asf/beam-site.git

commit d3a72da435f1e18b8b4bc5546263d35e956e5942
Author: Mergebot 
AuthorDate: Mon Aug 20 22:05:32 2018 +

Prepare repository for deployment.
---
 content/community/policies/index.html | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/content/community/policies/index.html 
b/content/community/policies/index.html
index 16bd65e..bf74714 100644
--- a/content/community/policies/index.html
+++ b/content/community/policies/index.html
@@ -202,7 +202,7 @@ limitations under the License.
 
Apache Beam makes minor releases every 6 weeks. Apache Beam has a <a href="https://calendar.google.com/calendar/embed?src=0p73sl034k80oob7seouanigd0%40group.calendar.google.com">calendar</a> 
for cutting the next release branch. After a release branch is cut, the 
community works quickly to finalize that release.
 
-Apache Beam aims to make 8 releases in a 12 month period. To accommodate 
users with longer upgrade cycles, some of these releases will be tagged as long 
term support (LTS) releases. Starting with the 2.7.0 release, every fourth 
release will be a LTS release. LTS releases receive patches to fix major issues 
for 12 months, starting from the release’s initial release date. LTS releases 
are considered deprecated after 12 months. Non-LTS releases do not receive 
patches and are considered d [...]
+Apache Beam aims to make 8 releases in a 12 month period. To accommodate 
users with longer upgrade cycles, some of these releases will be tagged as long 
term support (LTS) releases. LTS releases receive patches to fix major issues 
for 12 months, starting from the release’s initial release date. There will be 
at least one new LTS release in a 12 month period, and LTS releases are 
considered deprecated after 12 months. The community will mark a release as a 
LTS release based on various  [...]
 
 It is up to the Apache Beam community to decide whether an identified issue 
is a major issue that warrants a patch release. Some examples of major issues 
are high severity security issues and high risk data integrity issues.
 



[beam-site] 01/03: Update to LTS wording to remove every Nth release clause.

2018-08-20 Thread mergebot-role
This is an automated email from the ASF dual-hosted git repository.

mergebot-role pushed a commit to branch mergebot
in repository https://gitbox.apache.org/repos/asf/beam-site.git

commit a4a79299ff3ed1f881d080feeb021a0026ff3337
Author: Ahmet Altay 
AuthorDate: Thu Aug 16 10:52:42 2018 -0700

Update to LTS wording to remove every Nth release clause.
---
 src/community/policies.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/community/policies.md b/src/community/policies.md
index a553e76..9674c31 100644
--- a/src/community/policies.md
+++ b/src/community/policies.md
@@ -26,7 +26,7 @@ This page contains a list of major policies agreed upon by 
the Apache Beam commu
 
 Apache Beam makes minor releases every 6 weeks. Apache Beam has a 
[calendar](https://calendar.google.com/calendar/embed?src=0p73sl034k80oob7seouanigd0%40group.calendar.google.com)
 for cutting the next release branch. After a release branch is cut, the 
community works quickly to finalize that release.
 
-Apache Beam aims to make 8 releases in a 12 month period. To accommodate users 
with longer upgrade cycles, some of these releases will be tagged as long term 
support (LTS) releases. Starting with the 2.7.0 release, every fourth release 
will be a LTS release. LTS releases receive patches to fix major issues for 12 
months, starting from the release's initial release date. LTS releases are 
considered deprecated after 12 months. Non-LTS releases do not receive patches 
and are considered depr [...]
+Apache Beam aims to make 8 releases in a 12 month period. To accommodate users 
with longer upgrade cycles, some of these releases will be tagged as long term 
support (LTS) releases. The community will mark some releases as LTS releases 
(based on the factors such as the number of LTS releases currently in flight, 
and whether the accumulated feature set from the last LTS provides significant 
value to upgrade). There will be at least one new LTS release in a 12 month 
period. LTS releases re [...]
 
 It is up to the Apache Beam community to decide whether an identified issue is 
a major issue that warrants a patch release. Some examples of major issues are 
high severity security issues and high risk data integrity issues.
 



[beam-site] 03/03: This closes #539

2018-08-20 Thread mergebot-role
This is an automated email from the ASF dual-hosted git repository.

mergebot-role pushed a commit to branch mergebot
in repository https://gitbox.apache.org/repos/asf/beam-site.git

commit 3604bc09edc9ca646ce3ac3d89626698e7fa30c2
Merge: 7bb066d 2a4d399
Author: Mergebot 
AuthorDate: Mon Aug 20 22:02:27 2018 +

This closes #539

 src/community/policies.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)



[beam-site] 02/03: Edit suggestions

2018-08-20 Thread mergebot-role
This is an automated email from the ASF dual-hosted git repository.

mergebot-role pushed a commit to branch mergebot
in repository https://gitbox.apache.org/repos/asf/beam-site.git

commit 2a4d3991a1dd320856889e2ab5ad23fe642e17dc
Author: Melissa Pashniak 
AuthorDate: Thu Aug 16 13:23:35 2018 -0700

Edit suggestions
---
 src/community/policies.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/community/policies.md b/src/community/policies.md
index 9674c31..2b96a7f 100644
--- a/src/community/policies.md
+++ b/src/community/policies.md
@@ -26,7 +26,7 @@ This page contains a list of major policies agreed upon by 
the Apache Beam commu
 
 Apache Beam makes minor releases every 6 weeks. Apache Beam has a 
[calendar](https://calendar.google.com/calendar/embed?src=0p73sl034k80oob7seouanigd0%40group.calendar.google.com)
 for cutting the next release branch. After a release branch is cut, the 
community works quickly to finalize that release.
 
-Apache Beam aims to make 8 releases in a 12 month period. To accommodate users 
with longer upgrade cycles, some of these releases will be tagged as long term 
support (LTS) releases. The community will mark some releases as LTS releases 
(based on the factors such as the number of LTS releases currently in flight, 
and whether the accumulated feature set from the last LTS provides significant 
value to upgrade). There will be at least one new LTS release in a 12 month 
period. LTS releases re [...]
+Apache Beam aims to make 8 releases in a 12 month period. To accommodate users 
with longer upgrade cycles, some of these releases will be tagged as long term 
support (LTS) releases. LTS releases receive patches to fix major issues for 12 
months, starting from the release's initial release date. There will be at 
least one new LTS release in a 12 month period, and LTS releases are considered 
deprecated after 12 months. The community will mark a release as a LTS release 
based on various fac [...]
 
 It is up to the Apache Beam community to decide whether an identified issue is 
a major issue that warrants a patch release. Some examples of major issues are 
high severity security issues and high risk data integrity issues.
 



[beam-site] branch mergebot updated (67d7fba -> 3604bc0)

2018-08-20 Thread mergebot-role
This is an automated email from the ASF dual-hosted git repository.

mergebot-role pushed a change to branch mergebot
in repository https://gitbox.apache.org/repos/asf/beam-site.git.


from 67d7fba  This closes #521
 add 7bb066d  Prepare repository for deployment.
 new a4a7929  Update to LTS wording to remove every Nth release clause.
 new 2a4d399  Edit suggestions
 new 3604bc0  This closes #539

The 3 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:
 .../08/20/review-input-streaming-connectors.html   | 436 +
 content/blog/index.html|  17 +
 content/feed.xml   | 300 ++
 content/index.html |  10 +-
 src/community/policies.md  |   2 +-
 5 files changed, 687 insertions(+), 78 deletions(-)
 create mode 100644 
content/blog/2018/08/20/review-input-streaming-connectors.html



[jira] [Work logged] (BEAM-5033) Define and publish support story for Beam releases

2018-08-20 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5033?focusedWorklogId=136306&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-136306
 ]

ASF GitHub Bot logged work on BEAM-5033:


Author: ASF GitHub Bot
Created on: 20/Aug/18 22:01
Start Date: 20/Aug/18 22:01
Worklog Time Spent: 10m 
  Work Description: melap commented on issue #539: [BEAM-5033] Update to 
LTS wording to remove every Nth release clause.
URL: https://github.com/apache/beam-site/pull/539#issuecomment-414477833
 
 
   @asfgit merge


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 136306)
Time Spent: 1h 20m  (was: 1h 10m)

> Define and publish support story for Beam releases
> --
>
> Key: BEAM-5033
> URL: https://issues.apache.org/jira/browse/BEAM-5033
> Project: Beam
>  Issue Type: Task
>  Components: website
>Reporter: Rafael Fernandez
>Assignee: Ahmet Altay
>Priority: Major
> Fix For: Not applicable
>
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> Propose, define, and publish a support story for Beam releases, clarifying 
> how long a specific release is expected to be eligible for security or 
> correctness patches.
>  
> A user could naturally infer today that only the latest released version is 
> eligible for such patches. Since we release every six weeks, this could be an 
> opportunity to establish long-term support policies for selected releases.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[beam] branch master updated (fa68ae1 -> 9731467)

2018-08-20 Thread apilloud
This is an automated email from the ASF dual-hosted git repository.

apilloud pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/beam.git.


from fa68ae1  Merge pull request #6219 from 
kyle-winkelman/textio-read-empty-match-treatment
 add 48455c3  remove outdated FileSystem deletion docs
 new 9731467  Merge pull request #6071 from ryan-williams/fs-docs

The 1 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:
 sdks/java/core/src/main/java/org/apache/beam/sdk/io/FileSystem.java  | 3 ---
 sdks/java/core/src/main/java/org/apache/beam/sdk/io/FileSystems.java | 3 ---
 2 files changed, 6 deletions(-)



[jira] [Work logged] (BEAM-4843) Incorrect docs on FileSystems.delete

2018-08-20 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4843?focusedWorklogId=136297&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-136297
 ]

ASF GitHub Bot logged work on BEAM-4843:


Author: ASF GitHub Bot
Created on: 20/Aug/18 21:48
Start Date: 20/Aug/18 21:48
Worklog Time Spent: 10m 
  Work Description: apilloud commented on issue #6071: [BEAM-4843] remove 
outdated FileSystem deletion docs
URL: https://github.com/apache/beam/pull/6071#issuecomment-414474493
 
 
   I can't find `DeleteOptions` in Beam either. LGTM.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 136297)
Time Spent: 20m  (was: 10m)

> Incorrect docs on FileSystems.delete
> 
>
> Key: BEAM-4843
> URL: https://issues.apache.org/jira/browse/BEAM-4843
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-core
>Affects Versions: 2.6.0
>Reporter: Ryan Williams
>Assignee: Ryan Williams
>Priority: Minor
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> [The docs on {{FileSystems.delete}} 
> say|https://github.com/apache/beam/blob/b5e8335d982ee69d9f788f65f27356cddd5293d1/sdks/java/core/src/main/java/org/apache/beam/sdk/io/FileSystems.java#L332-L333]:
> bq. It is allowed but not recommended to delete directories recursively. 
> Callers depends on {@link FileSystems} and uses {@code DeleteOptions}.
> However, the function actually takes a {{MoveOptions...}} param, there's 
> never been a {{DeleteOptions}} afaict, and there is no way to recursively 
> delete a {{ResourceId}}.
> The docs should be fixed, at a minimum; actually supporting recursive delete 
> would also be nice.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[beam] 01/01: Merge pull request #6071 from ryan-williams/fs-docs

2018-08-20 Thread apilloud
This is an automated email from the ASF dual-hosted git repository.

apilloud pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/beam.git

commit 97314672ef840fd9dade78f96c15db6b3ed9a3ce
Merge: fa68ae1 48455c3
Author: Andrew Pilloud 
AuthorDate: Mon Aug 20 14:48:56 2018 -0700

Merge pull request #6071 from ryan-williams/fs-docs

[BEAM-4843] remove outdated FileSystem deletion docs

 sdks/java/core/src/main/java/org/apache/beam/sdk/io/FileSystem.java  | 3 ---
 sdks/java/core/src/main/java/org/apache/beam/sdk/io/FileSystems.java | 3 ---
 2 files changed, 6 deletions(-)



[jira] [Work logged] (BEAM-4843) Incorrect docs on FileSystems.delete

2018-08-20 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4843?focusedWorklogId=136298&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-136298
 ]

ASF GitHub Bot logged work on BEAM-4843:


Author: ASF GitHub Bot
Created on: 20/Aug/18 21:48
Start Date: 20/Aug/18 21:48
Worklog Time Spent: 10m 
  Work Description: apilloud closed pull request #6071: [BEAM-4843] remove 
outdated FileSystem deletion docs
URL: https://github.com/apache/beam/pull/6071
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git 
a/sdks/java/core/src/main/java/org/apache/beam/sdk/io/FileSystem.java 
b/sdks/java/core/src/main/java/org/apache/beam/sdk/io/FileSystem.java
index 45a1736f491..71948c2d7ab 100644
--- a/sdks/java/core/src/main/java/org/apache/beam/sdk/io/FileSystem.java
+++ b/sdks/java/core/src/main/java/org/apache/beam/sdk/io/FileSystem.java
@@ -126,9 +126,6 @@ protected abstract void rename(
   /**
* Deletes a collection of resources.
*
-   * It is allowed but not recommended to delete directories recursively. 
Callers depends on
-   * {@link FileSystems} and uses {@code DeleteOptions}.
-   *
* @param resourceIds the references of the resources to delete.
* @throws FileNotFoundException if resources are missing. When delete 
throws, each resource might
* or might not be deleted. In such scenarios, callers can use {@code 
match()} to determine
diff --git 
a/sdks/java/core/src/main/java/org/apache/beam/sdk/io/FileSystems.java 
b/sdks/java/core/src/main/java/org/apache/beam/sdk/io/FileSystems.java
index be89c9ec099..fd7a55be562 100644
--- a/sdks/java/core/src/main/java/org/apache/beam/sdk/io/FileSystems.java
+++ b/sdks/java/core/src/main/java/org/apache/beam/sdk/io/FileSystems.java
@@ -329,9 +329,6 @@ public static void rename(
   /**
* Deletes a collection of resources.
*
-   * It is allowed but not recommended to delete directories recursively. 
Callers depends on
-   * {@link FileSystems} and uses {@code DeleteOptions}.
-   *
* {@code resourceIds} must have the same scheme.
*
* @param resourceIds the references of the resources to delete.


 


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 136298)
Time Spent: 0.5h  (was: 20m)

> Incorrect docs on FileSystems.delete
> 
>
> Key: BEAM-4843
> URL: https://issues.apache.org/jira/browse/BEAM-4843
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-core
>Affects Versions: 2.6.0
>Reporter: Ryan Williams
>Assignee: Ryan Williams
>Priority: Minor
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> [The docs on {{FileSystems.delete}} 
> say|https://github.com/apache/beam/blob/b5e8335d982ee69d9f788f65f27356cddd5293d1/sdks/java/core/src/main/java/org/apache/beam/sdk/io/FileSystems.java#L332-L333]:
> bq. It is allowed but not recommended to delete directories recursively. 
> Callers depends on {@link FileSystems} and uses {@code DeleteOptions}.
> However, the function actually takes a {{MoveOptions...}} param, there's 
> never been a {{DeleteOptions}} afaict, and there is no way to recursively 
> delete a {{ResourceId}}.
> The docs should be fixed, at a minimum; actually supporting recursive delete 
> would also be nice.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
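
As the issue and diff above note, `FileSystems.delete` takes a `MoveOptions...` parameter and there is no recursive directory delete. A caller who wants one must enumerate a directory's contents and delete the deepest entries first, so that every directory is empty by the time it is removed. The sketch below illustrates that ordering with plain `java.nio.file` only — it deliberately does not use Beam's `FileSystems` API, whose `ResourceId`s cannot be deleted recursively:

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.stream.Stream;

public class RecursiveDelete {
    // Delete a directory tree by removing the deepest paths first, so every
    // directory is empty by the time it is deleted. This mirrors what a caller
    // of a non-recursive delete API has to do by hand.
    static void deleteRecursively(Path root) throws IOException {
        try (Stream<Path> walk = Files.walk(root)) {
            walk.sorted(Comparator.reverseOrder()) // children before parents
                .forEach(p -> {
                    try {
                        Files.delete(p);
                    } catch (IOException e) {
                        throw new UncheckedIOException(e);
                    }
                });
        }
    }

    public static void main(String[] args) throws IOException {
        Path root = Files.createTempDirectory("recursive-delete-demo");
        Files.createDirectories(root.resolve("a/b"));
        Files.write(root.resolve("a/b/file.txt"), "hello".getBytes());
        deleteRecursively(root);
        System.out.println("exists after delete: " + Files.exists(root));
    }
}
```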


[jira] [Commented] (BEAM-5035) beam_PostCommit_Java_GradleBuild/1105 :beam-examples-java:compileTestJava FAILED

2018-08-20 Thread Alan Myrvold (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-5035?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16586559#comment-16586559
 ] 

Alan Myrvold commented on BEAM-5035:


Another example:

[https://builds.apache.org/job/beam_PostCommit_Java_GradleBuild/1301]

 
 bad class file: 
/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Java_GradleBuild/src/runners/core-java/build/libs/beam-runners-core-java-2.7.0-SNAPSHOT-tests.jar(/org/apache/beam/runners/core/StateInternalsTest.class)
 unable to access file: java.util.zip.ZipException: invalid distance too far 
back
 Please remove or make sure it appears in the correct subdirectory of the 
classpath.

> beam_PostCommit_Java_GradleBuild/1105 :beam-examples-java:compileTestJava 
> FAILED
> 
>
> Key: BEAM-5035
> URL: https://issues.apache.org/jira/browse/BEAM-5035
> Project: Beam
>  Issue Type: Improvement
>  Components: test-failures
>Reporter: Mikhail Gryzykhin
>Assignee: Ahmet Altay
>Priority: Critical
>
> Compilation failed for
> [https://builds.apache.org/view/A-D/view/Beam/job/beam_PostCommit_Java_GradleBuild/1105/]
>  > Task :beam-examples-java:compileTestJava FAILED
>  
> /home/jenkins/jenkins-slave/workspace/beam_PostCommit_Java_GradleBuild/src/examples/java/src/test/java/org/apache/beam/examples/cookbook/BigQueryTornadoesIT.java:22:
>  error: cannot access BigqueryMatcher
>  import org.apache.beam.sdk.io.gcp.testing.BigqueryMatcher;
>  ^
>  bad class file: 
> /home/jenkins/jenkins-slave/workspace/beam_PostCommit_Java_GradleBuild/src/sdks/java/io/google-cloud-platform/build/libs/beam-sdks-java-io-google-cloud-platform-2.7.0-SNAPSHOT-tests.jar(/org/apache/beam/sdk/io/gcp/testing/BigqueryMatcher.class)
>  unable to access file: java.util.zip.ZipException: invalid stored block 
> lengths
>  Please remove or make sure it appears in the correct subdirectory of the 
> classpath.
>  1 error
>  
> https://github.com/apache/beam/blame/328129bf033bc6be16bc8e09af905f37b7516412/examples/java/src/test/java/org/apache/beam/examples/cookbook/BigQueryTornadoesIT.java
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
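
The `ZipException`s quoted above ("invalid distance too far back", "invalid stored block lengths") indicate a corrupted jar on the classpath. Such corruption can be detected up front by streaming every entry of the archive and letting the inflater fail. A minimal sketch with `java.util.zip` — this is a generic integrity check, not part of the Beam build:

```java
import java.io.File;
import java.io.IOException;
import java.io.InputStream;
import java.util.Enumeration;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

public class JarIntegrityCheck {
    // Returns true if every entry in the archive decompresses cleanly.
    // A truncated or corrupted jar typically throws ZipException (a subclass
    // of IOException) while entry bytes are being inflated, e.g.
    // "invalid distance too far back".
    static boolean isReadable(File jar) {
        byte[] buf = new byte[8192];
        try (ZipFile zf = new ZipFile(jar)) {
            Enumeration<? extends ZipEntry> entries = zf.entries();
            while (entries.hasMoreElements()) {
                ZipEntry e = entries.nextElement();
                try (InputStream in = zf.getInputStream(e)) {
                    while (in.read(buf) != -1) { /* drain to force inflation */ }
                }
            }
            return true;
        } catch (IOException e) {
            return false;
        }
    }
}
```

A build could run such a check on freshly produced test jars and remove them from the classpath when it fails, which is what the error message's "Please remove..." hint asks the developer to do manually.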


[beam-site] branch asf-site updated (2bf7d7d -> 7bb066d)

2018-08-20 Thread mergebot-role
This is an automated email from the ASF dual-hosted git repository.

mergebot-role pushed a change to branch asf-site
in repository https://gitbox.apache.org/repos/asf/beam-site.git.


from 2bf7d7d  Prepare repository for deployment.
 add 97c2ac5  Add blog post "A review of input streaming connectors"
 add bf7240b  Add authors for blog post in #521
 add 11c9c29  Fix typo for author's name in blog post #521
 add d2cf4a7  Fix other typo in author's name for blog post #521
 add c5037a2  Blog post updates based on @iemejia's feedback
 add 3cd63cd  Updates to streaming connectors blog post
 add cc68b49  Set publication date for streaming connectors blog post
 add 645574c  Update doc links in blog post to point to latest release
 add d23c996  Fix extraneous p tag and add table borders
 add 15c765f  Update streaming connectors blog post's publication date
 add 67d7fba  This closes #521
 new 7bb066d  Prepare repository for deployment.

The 1 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:
 .../08/20/review-input-streaming-connectors.html   | 436 +
 content/blog/index.html|  17 +
 content/feed.xml   | 300 ++
 content/index.html |  10 +-
 src/_data/authors.yml  |   8 +
 ...2018-08-20-review-input-streaming-connectors.md | 225 +++
 6 files changed, 919 insertions(+), 77 deletions(-)
 create mode 100644 
content/blog/2018/08/20/review-input-streaming-connectors.html
 create mode 100644 src/_posts/2018-08-20-review-input-streaming-connectors.md



[beam-site] 01/01: Prepare repository for deployment.

2018-08-20 Thread mergebot-role
This is an automated email from the ASF dual-hosted git repository.

mergebot-role pushed a commit to branch asf-site
in repository https://gitbox.apache.org/repos/asf/beam-site.git

commit 7bb066d2985e4267552c0463109c06c9a7d1ab2e
Author: Mergebot 
AuthorDate: Mon Aug 20 21:42:55 2018 +

Prepare repository for deployment.
---
 .../08/20/review-input-streaming-connectors.html   | 436 +
 content/blog/index.html|  17 +
 content/feed.xml   | 300 ++
 content/index.html |  10 +-
 4 files changed, 686 insertions(+), 77 deletions(-)

diff --git a/content/blog/2018/08/20/review-input-streaming-connectors.html 
b/content/blog/2018/08/20/review-input-streaming-connectors.html
new file mode 100644
index 0000000..1bdde6d
--- /dev/null
+++ b/content/blog/2018/08/20/review-input-streaming-connectors.html
@@ -0,0 +1,436 @@
+
+
+
+
+  
+
+
+  
+  
+  
+  A review of input streaming connectors
+  
+  https://fonts.googleapis.com/css?family=Roboto:100,300,400; 
rel="stylesheet">
+  
+  https://ajax.googleapis.com/ajax/libs/jquery/2.2.0/jquery.min.js";>
+  
+  
+  
+  
+  
+  https://beam.apache.org/blog/2018/08/20/review-input-streaming-connectors.html;
 data-proofer-ignore>
+  
+  https://beam.apache.org/feed.xml;>
+  
+
(function(i,s,o,g,r,a,m){i['GoogleAnalyticsObject']=r;i[r]=i[r]||function(){
+(i[r].q=i[r].q||[]).push(arguments)},i[r].l=1*new 
Date();a=s.createElement(o),
+
m=s.getElementsByTagName(o)[0];a.async=1;a.src=g;m.parentNode.insertBefore(a,m)
+
})(window,document,'script','//www.google-analytics.com/analytics.js','ga');
+ga('create', 'UA-73650088-1', 'auto');
+ga('send', 'pageview');
+  
+
+
+  
+
+
+
+
+  
+Toggle navigation
+
+
+
+  
+
+  
+
+  
+
+
+
+
+
+  
+
+  Get Started
+
+
+  Documentation
+
+
+  SDKS
+
+
+  RUNNERS
+
+
+  Contribute
+
+
+  Community
+
+Blog
+  
+  
+
+  https://www.apache.org/foundation/press/kit/feather_small.png; alt="Apache 
Logo" style="height:20px;">
+  
+http://www.apache.org/;>ASF Homepage
+http://www.apache.org/licenses/;>License
+http://www.apache.org/security/;>Security
+http://www.apache.org/foundation/thanks.html;>Thanks
+http://www.apache.org/foundation/sponsorship.html;>Sponsorship
+https://www.apache.org/foundation/policies/conduct;>Code of 
Conduct
+  
+
+  
+
+
+
+
+  
+
+
+
+http://schema.org/BlogPosting;>
+
+  
+A review of input 
streaming connectors
+Aug 20, 2018 •
+   Leonid Kuligin [https://twitter.com/lkulighin;>@lkulighin] 
 Julien Phalip [https://twitter.com/julienphalip;>@julienphalip]
+  
+
+  
+
+  
+In this post, you’ll learn about the current state of support for input 
streaming connectors in Apache Beam. For more context, you’ll 
also learn about the corresponding state of support in https://spark.apache.org/;>Apache Spark.
+
+With batch processing, you might load data from any source, including a 
database system. Even if there are no specific SDKs available for those 
database systems, you can often resort to using a https://en.wikipedia.org/wiki/Java_Database_Connectivity;>JDBC 
driver. With streaming, implementing a proper data pipeline is arguably more 
challenging as generally fewer source types are available. For that reason, 
this article particularly focuses on the streaming use case.
+
+Connectors for Java
+
+Beam has an official Java SDK and 
has several execution engines, called runners. In most cases it 
is fairly easy to transfer existing Beam pipelines written in Java or Scala to 
a Spark environment by using the Spark 
Runner.
+
+Spark is written in Scala and has a https://spark.apache.org/docs/latest/api/java/;>Java API. Spark’s 
source code compiles to https://en.wikipedia.org/wiki/Java_(programming_language)#Java_JVM_and_Bytecode">Java
 bytecode and the binaries are run by a https://en.wikipedia.org/wiki/Java_virtual_machine;>Java Virtual 
Machine. Scala code is interoperable with Java and therefore has native 
compatibility with Java libraries (and vice versa).
+
+Spark offers two approaches to streaming: https://spark.apache.org/docs/latest/streaming-programming-guide.html;>Discretized
 Streaming (or DStreams) and https://spark.apache.org/docs/latest/structured-streaming-programming-guide.html;>Structured
 Streaming. DStreams are a basic abstraction that represents a continuous 
series of https://spark.apache.org/docs/latest/rdd-programming-guide.html;>Resilient
 Distributed Datasets (or RDDs). Structured Str [...]
+
+Spark Structured Streaming supports 

[jira] [Work logged] (BEAM-4723) Enhance Datetime*Expression Datetime Type

2018-08-20 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4723?focusedWorklogId=136295&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-136295
 ]

ASF GitHub Bot logged work on BEAM-4723:


Author: ASF GitHub Bot
Created on: 20/Aug/18 21:41
Start Date: 20/Aug/18 21:41
Worklog Time Spent: 10m 
  Work Description: vectorijk commented on issue #5926: [BEAM-4723] [SQL] 
Support datetime type minus time interval
URL: https://github.com/apache/beam/pull/5926#issuecomment-414472827
 
 
   Run java precommit


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 136295)
Time Spent: 6.5h  (was: 6h 20m)

> Enhance Datetime*Expression Datetime Type
> -
>
> Key: BEAM-4723
> URL: https://issues.apache.org/jira/browse/BEAM-4723
> Project: Beam
>  Issue Type: Bug
>  Components: dsl-sql
>Reporter: Kai Jiang
>Assignee: Kai Jiang
>Priority: Major
>  Time Spent: 6.5h
>  Remaining Estimate: 0h
>
> Datetime*Expression only supports timestamp type for first operand now. We 
> should let it accept all Datetime_Types



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
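
The pull request above extends Beam SQL so that any datetime-typed first operand (not only TIMESTAMP) can have a time interval subtracted. The semantics are ordinary temporal arithmetic; the sketch below shows the equivalent operations with plain `java.time`, purely as an illustration — it does not use Beam SQL's expression classes:

```java
import java.time.Duration;
import java.time.LocalDate;
import java.time.LocalDateTime;
import java.time.Period;

public class DatetimeMinusInterval {
    public static void main(String[] args) {
        // Equivalent of: TIMESTAMP '2018-08-20 12:00:00' - INTERVAL '1' HOUR
        LocalDateTime ts = LocalDateTime.of(2018, 8, 20, 12, 0, 0);
        System.out.println(ts.minus(Duration.ofHours(1))); // 2018-08-20T11:00

        // Equivalent of: DATE '2018-08-20' - INTERVAL '1' MONTH, i.e. a
        // non-TIMESTAMP datetime operand, the case the issue asks the
        // expression to accept.
        LocalDate d = LocalDate.of(2018, 8, 20);
        System.out.println(d.minus(Period.ofMonths(1))); // 2018-07-20
    }
}
```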


[beam-site] 03/11: Fix typo for author's name in blog post #521

2018-08-20 Thread mergebot-role
This is an automated email from the ASF dual-hosted git repository.

mergebot-role pushed a commit to branch mergebot
in repository https://gitbox.apache.org/repos/asf/beam-site.git

commit 11c9c29ed30b331674afc41f2289ca8619c0ca8e
Author: Julien Phalip 
AuthorDate: Mon Aug 6 00:38:10 2018 -0700

Fix typo for author's name in blog post #521
---
 src/_posts/2018-08-XX-review-input-streaming-connectors.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/_posts/2018-08-XX-review-input-streaming-connectors.md 
b/src/_posts/2018-08-XX-review-input-streaming-connectors.md
index 7591ba2..c324d80 100644
--- a/src/_posts/2018-08-XX-review-input-streaming-connectors.md
+++ b/src/_posts/2018-08-XX-review-input-streaming-connectors.md
@@ -5,7 +5,7 @@ date:   2018-08-XX 00:00:01 -0800
 excerpt_separator: 
 categories: blog
 authors:
-  - lkulighin
+  - lkuligin
   - julienphalip
 ---
 



[beam-site] 02/11: Add authors for blog post in #521

2018-08-20 Thread mergebot-role
This is an automated email from the ASF dual-hosted git repository.

mergebot-role pushed a commit to branch mergebot
in repository https://gitbox.apache.org/repos/asf/beam-site.git

commit bf7240be8cc46dbc5529ae341d851e26163976da
Author: Julien Phalip 
AuthorDate: Mon Aug 6 00:37:17 2018 -0700

Add authors for blog post in #521
---
 src/_data/authors.yml | 8 
 1 file changed, 8 insertions(+)

diff --git a/src/_data/authors.yml b/src/_data/authors.yml
index aa22d5d..4ba47be 100644
--- a/src/_data/authors.yml
+++ b/src/_data/authors.yml
@@ -41,10 +41,18 @@ jamesmalone:
 jesseanderson:
 name: Jesse Anderson
 twitter: jessetanderson
+jphalip:
+name: Julien Phalip
+email: jpha...@google.com
+twitter: julienphalip
 klk:
 name: Kenneth Knowles
 email: k...@apache.org
 twitter: KennKnowles
+lkuligin:
+name: Leonid Kuligin
+email: kuli...@google.com
+twitter: lkulighin
 robertwb:
 name: Robert Bradshaw
 email: rober...@apache.org



[beam-site] 10/11: Update streaming connectors blog post's publication date

2018-08-20 Thread mergebot-role
This is an automated email from the ASF dual-hosted git repository.

mergebot-role pushed a commit to branch mergebot
in repository https://gitbox.apache.org/repos/asf/beam-site.git

commit 15c765f1bddebce303588f46a7f9fa59cf9b7e03
Author: Julien Phalip 
AuthorDate: Mon Aug 20 14:18:37 2018 -0700

Update streaming connectors blog post's publication date
---
 ...ng-connectors.md => 2018-08-20-review-input-streaming-connectors.md} | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/_posts/2018-08-16-review-input-streaming-connectors.md 
b/src/_posts/2018-08-20-review-input-streaming-connectors.md
similarity index 99%
rename from src/_posts/2018-08-16-review-input-streaming-connectors.md
rename to src/_posts/2018-08-20-review-input-streaming-connectors.md
index 1edbc9a..4d6f104 100644
--- a/src/_posts/2018-08-16-review-input-streaming-connectors.md
+++ b/src/_posts/2018-08-20-review-input-streaming-connectors.md
@@ -1,7 +1,7 @@
 ---
 layout: post
 title:  "A review of input streaming connectors"
-date:   2018-08-16 00:00:01 -0800
+date:   2018-08-20 00:00:01 -0800
 excerpt_separator: 
 categories: blog
 authors:



[beam-site] 05/11: Blog post updates based on @iemejia's feedback

2018-08-20 Thread mergebot-role
This is an automated email from the ASF dual-hosted git repository.

mergebot-role pushed a commit to branch mergebot
in repository https://gitbox.apache.org/repos/asf/beam-site.git

commit c5037a277bc347971635bc04d5d05e65e2acbd68
Author: Julien Phalip 
AuthorDate: Mon Aug 13 13:17:54 2018 -0400

Blog post updates based on @iemejia's feedback
---
 src/_posts/2018-08-XX-review-input-streaming-connectors.md | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/src/_posts/2018-08-XX-review-input-streaming-connectors.md 
b/src/_posts/2018-08-XX-review-input-streaming-connectors.md
index aa19675..fded813 100644
--- a/src/_posts/2018-08-XX-review-input-streaming-connectors.md
+++ b/src/_posts/2018-08-XX-review-input-streaming-connectors.md
@@ -21,7 +21,7 @@ Spark is written in Scala and has a [Java 
API](https://spark.apache.org/docs/lat
 
 Spark offers two approaches to streaming: [Discretized 
Streaming](https://spark.apache.org/docs/latest/streaming-programming-guide.html)
 (or DStreams) and [Structured 
Streaming](https://spark.apache.org/docs/latest/structured-streaming-programming-guide.html).
 DStreams are a basic abstraction that represents a continuous series of 
[Resilient Distributed 
Datasets](https://spark.apache.org/docs/latest/rdd-programming-guide.html) (or 
RDDs). Structured Streaming was introduced more recently  [...]
 
-Spark Structured Streaming supports [file 
sources](https://spark.apache.org/docs/latest/api/java/org/apache/spark/sql/streaming/DataStreamReader.html)
 (local filesystems and HDFS-compatible systems like Cloud Storage or S3) and 
[Kafka](https://spark.apache.org/docs/latest/structured-streaming-kafka-integration.html)
 as streaming inputs. Spark maintains built-in connectors for DStreams aimed at 
third-party services, such as Kafka or Flume, while other connectors are 
available through link [...]
+Spark Structured Streaming supports [file 
sources](https://spark.apache.org/docs/latest/api/java/org/apache/spark/sql/streaming/DataStreamReader.html)
 (local filesystems and HDFS-compatible systems like Cloud Storage or S3) and 
[Kafka](https://spark.apache.org/docs/latest/structured-streaming-kafka-integration.html)
 as streaming 
[inputs](https://spark.apache.org/docs/latest/structured-streaming-programming-guide.html#input-sources).
 Spark maintains built-in connectors for DStreams aimed  [...]
 
 Below are the main streaming input connectors for available for Beam and Spark 
DStreams in Java:
 
@@ -49,7 +49,7 @@ Below are the main streaming input connectors for available 
for Beam and Spark D
   
HDFS(Using the hdfs:// URI)

-   https://beam.apache.org/documentation/sdks/javadoc/2.5.0/org/apache/beam/sdk/io/hdfs/HadoopFileSystemOptions.html;>HadoopFileSystemOptions
+https://beam.apache.org/documentation/sdks/javadoc/2.5.0/org/apache/beam/sdk/io/FileIO.html;>FileIO
 + https://beam.apache.org/documentation/sdks/javadoc/2.5.0/org/apache/beam/sdk/io/hdfs/HadoopFileSystemOptions.html;>HadoopFileSystemOptions

https://spark.apache.org/docs/latest/api/java/org/apache/spark/streaming/util/HdfsUtils.html;>HdfsUtils

@@ -93,7 +93,7 @@ and https://spark.apache.org/docs/latest/api/java/org/apache/spark/stre

https://beam.apache.org/documentation/sdks/javadoc/2.5.0/org/apache/beam/sdk/io/gcp/pubsub/PubsubIO.html;>PubsubIO

-   https://github.com/apache/bahir/tree/master/streaming-pubsub;>Spark-streaming-pubsub
 from http://bahir.apache.org;>Apache Bahir
+   https://github.com/apache/bahir/tree/master/streaming-pubsub;>spark-streaming-pubsub
 from http://bahir.apache.org;>Apache Bahir

   
   
@@ -204,11 +204,11 @@ and http://spark.apache.org/docs/latest/api/python/pyspark.streaming.ht
 
 ### **Scala**
 
-Since Scala code is interoperable with Java and therefore has native 
compatibility with Java libraries (and vice versa), you can use the same Java 
connectors described above in your Scala programs. Apache Beam also has a 
[Scala SDK](https://github.com/spotify/scio) open-sourced [by 
Spotify](https://labs.spotify.com/2017/10/16/big-data-processing-at-spotify-the-road-to-scio-part-1/).
+Since Scala code is interoperable with Java and therefore has native 
compatibility with Java libraries (and vice versa), you can use the same Java 
connectors described above in your Scala programs. Apache Beam also has a 
[Scala API](https://github.com/spotify/scio) open-sourced [by 
Spotify](https://labs.spotify.com/2017/10/16/big-data-processing-at-spotify-the-road-to-scio-part-1/).
 
 ### **Go**
 
-A [Go SDK](https://beam.apache.org/documentation/sdks/go/) for Apache Beam is 
under active development. It is currently experimental and is not recommended 
for production.
+A [Go SDK](https://beam.apache.org/documentation/sdks/go/) for Apache Beam is 
under active development. It is currently experimental and is not recommended 
for production. Spark does not have an official Go SDK.
 
 ### **R**
 



[beam-site] 07/11: Set publication date for streaming connectors blog post

2018-08-20 Thread mergebot-role
This is an automated email from the ASF dual-hosted git repository.

mergebot-role pushed a commit to branch mergebot
in repository https://gitbox.apache.org/repos/asf/beam-site.git

commit cc68b49a8dbba424274012421e2fe9ee9db09a00
Author: Julien Phalip 
AuthorDate: Tue Aug 14 11:23:55 2018 -0400

Set publication date for streaming connectors blog post
---
 ...ng-connectors.md => 2018-08-16-review-input-streaming-connectors.md} | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/_posts/2018-08-XX-review-input-streaming-connectors.md 
b/src/_posts/2018-08-16-review-input-streaming-connectors.md
similarity index 99%
rename from src/_posts/2018-08-XX-review-input-streaming-connectors.md
rename to src/_posts/2018-08-16-review-input-streaming-connectors.md
index 5816292..2b69a41 100644
--- a/src/_posts/2018-08-XX-review-input-streaming-connectors.md
+++ b/src/_posts/2018-08-16-review-input-streaming-connectors.md
@@ -1,7 +1,7 @@
 ---
 layout: post
 title:  "A review of input streaming connectors"
-date:   2018-08-XX 00:00:01 -0800
+date:   2018-08-16 00:00:01 -0800
 excerpt_separator: 
 categories: blog
 authors:



[beam-site] 09/11: Fix extraneous p tag and add table borders

2018-08-20 Thread mergebot-role

mergebot-role pushed a commit to branch mergebot
in repository https://gitbox.apache.org/repos/asf/beam-site.git

commit d23c9960cd415e77724b3a6878e3aafae7d1370a
Author: Melissa Pashniak 
AuthorDate: Mon Aug 20 14:09:06 2018 -0700

Fix extraneous p tag and add table borders
---
 src/_posts/2018-08-16-review-input-streaming-connectors.md | 11 +--
 1 file changed, 5 insertions(+), 6 deletions(-)

diff --git a/src/_posts/2018-08-16-review-input-streaming-connectors.md 
b/src/_posts/2018-08-16-review-input-streaming-connectors.md
index 72983b8..1edbc9a 100644
--- a/src/_posts/2018-08-16-review-input-streaming-connectors.md
+++ b/src/_posts/2018-08-16-review-input-streaming-connectors.md
@@ -25,7 +25,7 @@ Spark Structured Streaming supports [file 
sources](https://spark.apache.org/docs
 
Below are the main streaming input connectors available for Beam and Spark 
DStreams in Java:
 
-
+
   


@@ -62,7 +62,6 @@ Below are the main streaming input connectors for available 
for Beam and Spark D
FileIO + GcsOptions

https://spark.apache.org/docs/latest/api/java/org/apache/spark/SparkContext.html#hadoopConfiguration--;>hadoopConfiguration
-
 and https://spark.apache.org/docs/latest/api/java/org/apache/spark/streaming/StreamingContext.html#textFileStream-java.lang.String-;>textFileStream

   
@@ -118,7 +117,7 @@ Spark also has a Python SDK called 
[PySpark](http://spark.apache.org/docs/latest
 
Below are the main streaming input connectors available for Beam and Spark 
DStreams in Python:
 
-
+
   


@@ -204,15 +203,15 @@ and http://spark.apache.org/docs/latest/api/python/pyspark.streaming.ht
 
 ## Connectors for other languages
 
-### **Scala**
+### Scala
 
 Since Scala code is interoperable with Java and therefore has native 
compatibility with Java libraries (and vice versa), you can use the same Java 
connectors described above in your Scala programs. Apache Beam also has a 
[Scala API](https://github.com/spotify/scio) open-sourced [by 
Spotify](https://labs.spotify.com/2017/10/16/big-data-processing-at-spotify-the-road-to-scio-part-1/).
 
-### **Go**
+### Go
 
 A [Go SDK]({{ site.baseurl }}/documentation/sdks/go/) for Apache Beam is under 
active development. It is currently experimental and is not recommended for 
production. Spark does not have an official Go SDK.
 
-### **R**
+### R
 
 Apache Beam does not have an official R SDK. Spark Structured Streaming is 
supported by an [R 
SDK](https://spark.apache.org/docs/latest/sparkr.html#structured-streaming), 
but only for [file 
sources](https://spark.apache.org/docs/latest/structured-streaming-programming-guide.html#input-sources)
 as a streaming input.
 



[beam-site] 08/11: Update doc links in blog post to point to latest release

2018-08-20 Thread mergebot-role

mergebot-role pushed a commit to branch mergebot
in repository https://gitbox.apache.org/repos/asf/beam-site.git

commit 645574c9dcb59afe503702381781d6afdfc2b673
Author: Julien Phalip 
AuthorDate: Wed Aug 15 10:17:50 2018 -0700

Update doc links in blog post to point to latest release
---
 ...2018-08-16-review-input-streaming-connectors.md | 38 +++---
 1 file changed, 19 insertions(+), 19 deletions(-)

diff --git a/src/_posts/2018-08-16-review-input-streaming-connectors.md 
b/src/_posts/2018-08-16-review-input-streaming-connectors.md
index 2b69a41..72983b8 100644
--- a/src/_posts/2018-08-16-review-input-streaming-connectors.md
+++ b/src/_posts/2018-08-16-review-input-streaming-connectors.md
@@ -9,13 +9,13 @@ authors:
   - jphalip
 ---
 
-In this post, you'll learn about the current state of support for input 
streaming connectors in [Apache Beam](https://beam.apache.org/). For more 
context, you'll also learn about the corresponding state of support in [Apache 
Spark](https://spark.apache.org/).
+In this post, you'll learn about the current state of support for input 
streaming connectors in [Apache Beam]({{ site.baseurl }}/). For more context, 
you'll also learn about the corresponding state of support in [Apache 
Spark](https://spark.apache.org/).
 
 With batch processing, you might load data from any source, including a 
database system. Even if there are no specific SDKs available for those 
database systems, you can often resort to using a 
[JDBC](https://en.wikipedia.org/wiki/Java_Database_Connectivity) driver. With 
streaming, implementing a proper data pipeline is arguably more challenging as 
generally fewer source types are available. For that reason, this article 
particularly focuses on the streaming use case.
 
 ## Connectors for Java
 
-Beam has an official [Java 
SDK](https://beam.apache.org/documentation/sdks/java/) and has several 
execution engines, called 
[runners](https://beam.apache.org/documentation/runners/capability-matrix/). In 
most cases it is fairly easy to transfer existing Beam pipelines written in 
Java or Scala to a Spark environment by using the [Spark 
Runner](https://beam.apache.org/documentation/runners/spark/).
+Beam has an official [Java SDK]({{ site.baseurl }}/documentation/sdks/java/) 
and has several execution engines, called [runners]({{ site.baseurl 
}}/documentation/runners/capability-matrix/). In most cases it is fairly easy 
to transfer existing Beam pipelines written in Java or Scala to a Spark 
environment by using the [Spark Runner]({{ site.baseurl 
}}/documentation/runners/spark/).
 
 Spark is written in Scala and has a [Java 
API](https://spark.apache.org/docs/latest/api/java/). Spark's source code 
compiles to [Java 
bytecode](https://en.wikipedia.org/wiki/Java_(programming_language)#Java_JVM_and_Bytecode)
 and the binaries are run by a [Java Virtual 
Machine](https://en.wikipedia.org/wiki/Java_virtual_machine). Scala code is 
interoperable with Java and therefore has native compatibility with Java 
libraries (and vice versa).
 
@@ -41,7 +41,7 @@ Below are the main streaming input connectors for available 
for Beam and Spark D

Local(Using the file:// URI)

-   https://beam.apache.org/documentation/sdks/javadoc/2.6.0/org/apache/beam/sdk/io/TextIO.html;>TextIO
+   TextIO

https://spark.apache.org/docs/latest/api/java/org/apache/spark/streaming/StreamingContext.html#textFileStream-java.lang.String-;>textFileStream(Spark
 treats most Unix systems as HDFS-compatible, but the location should be 
accessible from all nodes)

@@ -49,7 +49,7 @@ Below are the main streaming input connectors for available 
for Beam and Spark D
   
HDFS(Using the hdfs:// URI)

-https://beam.apache.org/documentation/sdks/javadoc/2.6.0/org/apache/beam/sdk/io/FileIO.html;>FileIO
 + https://beam.apache.org/documentation/sdks/javadoc/2.6.0/org/apache/beam/sdk/io/hdfs/HadoopFileSystemOptions.html;>HadoopFileSystemOptions
+FileIO + HadoopFileSystemOptions

https://spark.apache.org/docs/latest/api/java/org/apache/spark/streaming/util/HdfsUtils.html;>HdfsUtils

@@ -59,7 +59,7 @@ Below are the main streaming input connectors for available 
for Beam and Spark D

Cloud Storage(Using the gs:// URI)

-   https://beam.apache.org/documentation/sdks/javadoc/2.6.0/org/apache/beam/sdk/io/FileIO.html;>FileIO
 + https://beam.apache.org/documentation/sdks/javadoc/2.6.0/org/apache/beam/sdk/extensions/gcp/options/GcsOptions.html;>GcsOptions
+   FileIO + GcsOptions

https://spark.apache.org/docs/latest/api/java/org/apache/spark/SparkContext.html#hadoopConfiguration--;>hadoopConfiguration
 
@@ -69,7 +69,7 @@ and https://spark.apache.org/docs/latest/api/java/org/apache/spark/stre
   
S3(Using the s3:// URI)

-https://beam.apache.org/documentation/sdks/javadoc/2.6.0/org/apache/beam/sdk/io/FileIO.html;>FileIO
 + 

[beam-site] 01/11: Add blog post "A review of input streaming connectors"

2018-08-20 Thread mergebot-role

mergebot-role pushed a commit to branch mergebot
in repository https://gitbox.apache.org/repos/asf/beam-site.git

commit 97c2ac51e3eece847dab1323b144386f1d0c89ab
Author: Julien Phalip 
AuthorDate: Thu Aug 2 21:04:18 2018 -0700

Add blog post "A review of input streaming connectors"
---
 ...2018-08-XX-review-input-streaming-connectors.md | 224 +
 1 file changed, 224 insertions(+)

diff --git a/src/_posts/2018-08-XX-review-input-streaming-connectors.md 
b/src/_posts/2018-08-XX-review-input-streaming-connectors.md
new file mode 100644
index 000..7591ba2
--- /dev/null
+++ b/src/_posts/2018-08-XX-review-input-streaming-connectors.md
@@ -0,0 +1,224 @@
+---
+layout: post
+title:  "A review of input streaming connectors"
+date:   2018-08-XX 00:00:01 -0800
+excerpt_separator: 
+categories: blog
+authors:
+  - lkulighin
+  - julienphalip
+---
+
+In this post, you'll learn about the current state of support for input 
streaming connectors in [Apache Beam](https://beam.apache.org/). For more 
context, you'll also learn about the corresponding state of support in [Apache 
Spark](https://spark.apache.org/).
+
+With batch processing, you might load data from any source, including a 
database system. Even if there are no specific SDKs available for those 
database systems, you can often resort to using a 
[JDBC](https://en.wikipedia.org/wiki/Java_Database_Connectivity) driver. With 
streaming, implementing a proper data pipeline is arguably more challenging as 
generally fewer source types are available. For that reason, this article 
particularly focuses on the streaming use case.
+
+## Connectors for Java
+
+Beam has an official [Java 
SDK](https://beam.apache.org/documentation/sdks/java/) and has several 
execution engines, called 
[runners](https://beam.apache.org/documentation/runners/capability-matrix/). In 
most cases it is fairly easy to transfer existing Beam pipelines written in 
Java or Scala to a Spark environment by using the [Spark 
Runner](https://beam.apache.org/documentation/runners/spark/).
+
+Spark is written in Scala and has a [Java 
API](https://spark.apache.org/docs/latest/api/java/). Spark's source code 
compiles to [Java 
bytecode](https://en.wikipedia.org/wiki/Java_(programming_language)#Java_JVM_and_Bytecode)
 and the binaries are run by a [Java Virtual 
Machine](https://en.wikipedia.org/wiki/Java_virtual_machine). Scala code is 
interoperable with Java and therefore has native compatibility with Java 
libraries (and vice versa).
+
+Spark offers two approaches to streaming: [Discretized 
Streaming](https://spark.apache.org/docs/latest/streaming-programming-guide.html)
 (or DStreams) and [Structured 
Streaming](https://spark.apache.org/docs/latest/structured-streaming-programming-guide.html).
 DStreams are a basic abstraction that represents a continuous series of 
[Resilient Distributed 
Datasets](https://spark.apache.org/docs/latest/rdd-programming-guide.html) (or 
RDDs). Structured Streaming was introduced more recently  [...]
+
+Spark Structured Streaming supports [file 
sources](https://spark.apache.org/docs/latest/api/java/org/apache/spark/sql/streaming/DataStreamReader.html)
 (local filesystems and HDFS-compatible systems like Cloud Storage or S3) and 
[Kafka](https://spark.apache.org/docs/latest/structured-streaming-kafka-integration.html)
 as streaming inputs. Spark maintains built-in connectors for DStreams aimed at 
third-party services, such as Kafka or Flume, while other connectors are 
available through link [...]
+
+Below are the main streaming input connectors available for Beam and Spark 
DStreams in Java:
+
+
+  
+   
+   
+   
+   
+   Apache Beam
+   
+   Apache Spark DStreams
+   
+  
+  
+   File Systems
+   
+   Local(Using the file:// URI)
+   
+   https://beam.apache.org/documentation/sdks/javadoc/2.5.0/org/apache/beam/sdk/io/TextIO.html;>TextIO
+   
+   https://spark.apache.org/docs/latest/api/java/org/apache/spark/streaming/StreamingContext.html#textFileStream-java.lang.String-;>textFileStream(Spark
 treats most Unix systems as HDFS-compatible, but the location should be 
accessible from all nodes)
+   
+  
+  
+   HDFS(Using the hdfs:// URI)
+   
+   https://beam.apache.org/documentation/sdks/javadoc/2.5.0/org/apache/beam/sdk/io/hdfs/HadoopFileSystemOptions.html;>HadoopFileSystemOptions
+   
+   https://spark.apache.org/docs/latest/api/java/org/apache/spark/streaming/util/HdfsUtils.html;>HdfsUtils
+   
+  
+  
+   Object Stores
+   
+   Cloud Storage(Using the gs:// URI)
+   
+   https://beam.apache.org/documentation/sdks/javadoc/2.5.0/org/apache/beam/sdk/io/hdfs/HadoopFileSystemOptions.html;>HadoopFileSystemOptions
+   
+   https://spark.apache.org/docs/latest/api/java/org/apache/spark/SparkContext.html#hadoopConfiguration--;>hadoopConfiguration
+
+and 

[beam-site] 04/11: Fix other typo in author's name for blog post #521

2018-08-20 Thread mergebot-role

mergebot-role pushed a commit to branch mergebot
in repository https://gitbox.apache.org/repos/asf/beam-site.git

commit d2cf4a797653a1e259de2d7c0a7ea00aec4fecc1
Author: Julien Phalip 
AuthorDate: Mon Aug 6 00:39:07 2018 -0700

Fix other typo in author's name for blog post #521
---
 src/_posts/2018-08-XX-review-input-streaming-connectors.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/_posts/2018-08-XX-review-input-streaming-connectors.md 
b/src/_posts/2018-08-XX-review-input-streaming-connectors.md
index c324d80..aa19675 100644
--- a/src/_posts/2018-08-XX-review-input-streaming-connectors.md
+++ b/src/_posts/2018-08-XX-review-input-streaming-connectors.md
@@ -6,7 +6,7 @@ excerpt_separator: 
 categories: blog
 authors:
   - lkuligin
-  - julienphalip
+  - jphalip
 ---
 
 In this post, you'll learn about the current state of support for input 
streaming connectors in [Apache Beam](https://beam.apache.org/). For more 
context, you'll also learn about the corresponding state of support in [Apache 
Spark](https://spark.apache.org/).



[beam-site] 06/11: Updates to streaming connectors blog post

2018-08-20 Thread mergebot-role

mergebot-role pushed a commit to branch mergebot
in repository https://gitbox.apache.org/repos/asf/beam-site.git

commit 3cd63cdd4666bf38dfbd5448dd40155b4d6f6015
Author: Julien Phalip 
AuthorDate: Tue Aug 14 11:23:28 2018 -0400

Updates to streaming connectors blog post
---
 ...2018-08-XX-review-input-streaming-connectors.md | 22 --
 1 file changed, 12 insertions(+), 10 deletions(-)

diff --git a/src/_posts/2018-08-XX-review-input-streaming-connectors.md 
b/src/_posts/2018-08-XX-review-input-streaming-connectors.md
index fded813..5816292 100644
--- a/src/_posts/2018-08-XX-review-input-streaming-connectors.md
+++ b/src/_posts/2018-08-XX-review-input-streaming-connectors.md
@@ -41,7 +41,7 @@ Below are the main streaming input connectors for available 
for Beam and Spark D

Local(Using the file:// URI)

-   https://beam.apache.org/documentation/sdks/javadoc/2.5.0/org/apache/beam/sdk/io/TextIO.html;>TextIO
+   https://beam.apache.org/documentation/sdks/javadoc/2.6.0/org/apache/beam/sdk/io/TextIO.html;>TextIO

https://spark.apache.org/docs/latest/api/java/org/apache/spark/streaming/StreamingContext.html#textFileStream-java.lang.String-;>textFileStream(Spark
 treats most Unix systems as HDFS-compatible, but the location should be 
accessible from all nodes)

@@ -49,7 +49,7 @@ Below are the main streaming input connectors for available 
for Beam and Spark D
   
HDFS(Using the hdfs:// URI)

-https://beam.apache.org/documentation/sdks/javadoc/2.5.0/org/apache/beam/sdk/io/FileIO.html;>FileIO
 + https://beam.apache.org/documentation/sdks/javadoc/2.5.0/org/apache/beam/sdk/io/hdfs/HadoopFileSystemOptions.html;>HadoopFileSystemOptions
+https://beam.apache.org/documentation/sdks/javadoc/2.6.0/org/apache/beam/sdk/io/FileIO.html;>FileIO
 + https://beam.apache.org/documentation/sdks/javadoc/2.6.0/org/apache/beam/sdk/io/hdfs/HadoopFileSystemOptions.html;>HadoopFileSystemOptions

https://spark.apache.org/docs/latest/api/java/org/apache/spark/streaming/util/HdfsUtils.html;>HdfsUtils

@@ -59,7 +59,7 @@ Below are the main streaming input connectors for available 
for Beam and Spark D

Cloud Storage(Using the gs:// URI)

-   https://beam.apache.org/documentation/sdks/javadoc/2.5.0/org/apache/beam/sdk/io/hdfs/HadoopFileSystemOptions.html;>HadoopFileSystemOptions
+   https://beam.apache.org/documentation/sdks/javadoc/2.6.0/org/apache/beam/sdk/io/FileIO.html;>FileIO
 + https://beam.apache.org/documentation/sdks/javadoc/2.6.0/org/apache/beam/sdk/extensions/gcp/options/GcsOptions.html;>GcsOptions

https://spark.apache.org/docs/latest/api/java/org/apache/spark/SparkContext.html#hadoopConfiguration--;>hadoopConfiguration
 
@@ -69,13 +69,15 @@ and https://spark.apache.org/docs/latest/api/java/org/apache/spark/stre
   
S3(Using the s3:// URI)

+https://beam.apache.org/documentation/sdks/javadoc/2.6.0/org/apache/beam/sdk/io/FileIO.html;>FileIO
 + https://beam.apache.org/documentation/sdks/javadoc/2.6.0/org/apache/beam/sdk/io/aws/options/S3Options.html;>S3Options
+   
   
   
Messaging Queues

Kafka

-   https://beam.apache.org/documentation/sdks/javadoc/2.5.0/org/apache/beam/sdk/io/kafka/KafkaIO.html;>KafkaIO
+   https://beam.apache.org/documentation/sdks/javadoc/2.6.0/org/apache/beam/sdk/io/kafka/KafkaIO.html;>KafkaIO

https://spark.apache.org/docs/latest/streaming-kafka-0-10-integration.html;>spark-streaming-kafka

@@ -83,7 +85,7 @@ and https://spark.apache.org/docs/latest/api/java/org/apache/spark/stre
   
Kinesis

-   https://beam.apache.org/documentation/sdks/javadoc/2.5.0/org/apache/beam/sdk/io/kinesis/KinesisIO.html;>KinesisIO
+   https://beam.apache.org/documentation/sdks/javadoc/2.6.0/org/apache/beam/sdk/io/kinesis/KinesisIO.html;>KinesisIO

https://spark.apache.org/docs/latest/streaming-kinesis-integration.html;>spark-streaming-kinesis

@@ -91,7 +93,7 @@ and https://spark.apache.org/docs/latest/api/java/org/apache/spark/stre
   
Cloud Pub/Sub

-   https://beam.apache.org/documentation/sdks/javadoc/2.5.0/org/apache/beam/sdk/io/gcp/pubsub/PubsubIO.html;>PubsubIO
+   https://beam.apache.org/documentation/sdks/javadoc/2.6.0/org/apache/beam/sdk/io/gcp/pubsub/PubsubIO.html;>PubsubIO

https://github.com/apache/bahir/tree/master/streaming-pubsub;>spark-streaming-pubsub
 from http://bahir.apache.org;>Apache Bahir

@@ -132,7 +134,7 @@ Below are the main streaming input connectors for available 
for Beam and Spark D

Local

-   https://beam.apache.org/documentation/sdks/pydoc/2.5.0/apache_beam.io.textio.html;>io.textio
+   https://beam.apache.org/documentation/sdks/pydoc/2.6.0/apache_beam.io.textio.html;>io.textio

http://spark.apache.org/docs/latest/api/python/pyspark.streaming.html#pyspark.streaming.StreamingContext.textFileStream;>textFileStream

@@ -140,7 +142,7 @@ Below 

[beam-site] 11/11: This closes #521

2018-08-20 Thread mergebot-role

mergebot-role pushed a commit to branch mergebot
in repository https://gitbox.apache.org/repos/asf/beam-site.git

commit 67d7fba416419e527bd8bb8ff5e4d744a25828b9
Merge: 2bf7d7d 15c765f
Author: Mergebot 
AuthorDate: Mon Aug 20 21:39:57 2018 +

This closes #521

 src/_data/authors.yml  |   8 +
 ...2018-08-20-review-input-streaming-connectors.md | 225 +
 2 files changed, 233 insertions(+)



[beam-site] branch mergebot updated (0ea9291 -> 67d7fba)

2018-08-20 Thread mergebot-role

mergebot-role pushed a change to branch mergebot
in repository https://gitbox.apache.org/repos/asf/beam-site.git.


from 0ea9291  This closes #458
 add 2bf7d7d  Prepare repository for deployment.
 new 97c2ac5  Add blog post "A review of input streaming connectors"
 new bf7240b  Add authors for blog post in #521
 new 11c9c29  Fix typo for author's name in blog post #521
 new d2cf4a7  Fix other typo in author's name for blog post #521
 new c5037a2  Blog post updates based on @iemejia's feedback
 new 3cd63cd  Updates to streaming connectors blog post
 new cc68b49  Set publication date for streaming connectors blog post
 new 645574c  Update doc links in blog post to point to latest release
 new d23c996  Fix extraneous p tag and add table borders
 new 15c765f  Update streaming connectors blog post's publication date
 new 67d7fba  This closes #521

The 11 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:
 content/documentation/programming-guide/index.html | 125 
 src/_data/authors.yml  |   8 +
 ...2018-08-20-review-input-streaming-connectors.md | 225 +
 3 files changed, 321 insertions(+), 37 deletions(-)
 create mode 100644 src/_posts/2018-08-20-review-input-streaming-connectors.md



[jira] [Work logged] (BEAM-3954) Get Jenkins agents dockerized

2018-08-20 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-3954?focusedWorklogId=136294=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-136294
 ]

ASF GitHub Bot logged work on BEAM-3954:


Author: ASF GitHub Bot
Created on: 20/Aug/18 21:36
Start Date: 20/Aug/18 21:36
Worklog Time Spent: 10m 
  Work Description: yifanzou commented on issue #6254: Do Not Merge 
[BEAM-3954] Reproducible environment for Beam Jenkins tests
URL: https://github.com/apache/beam/pull/6254#issuecomment-414471370
 
 
   Run Reproducible Env Test


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 136294)
Time Spent: 3h 20m  (was: 3h 10m)

> Get Jenkins agents dockerized 
> --
>
> Key: BEAM-3954
> URL: https://issues.apache.org/jira/browse/BEAM-3954
> Project: Beam
>  Issue Type: Sub-task
>  Components: build-system
>Reporter: yifan zou
>Assignee: yifan zou
>Priority: Major
>  Time Spent: 3h 20m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Closed] (BEAM-4839) EOF Exception writing non-english Characters to Spanner

2018-08-20 Thread Tom (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4839?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tom closed BEAM-4839.
-
   Resolution: Fixed
Fix Version/s: (was: Not applicable)
   2.6.0

> EOF Exception writing non-english Characters to Spanner
> ---
>
> Key: BEAM-4839
> URL: https://issues.apache.org/jira/browse/BEAM-4839
> Project: Beam
>  Issue Type: Bug
>  Components: runner-core, runner-dataflow
>Affects Versions: 2.3.0, 2.4.0, 2.5.0
> Environment: GCP and Local (High Sierra)
>Reporter: Tom
>Assignee: Kenneth Knowles
>Priority: Minor
> Fix For: 2.6.0
>
>
> I am having an issue with Apache Beam ^2.3 and Google Cloud Platform Spanner.
> In short, I'm trying to write data into Spanner. Some of this data contains 
> non-English characters, which blows up the dataflow job when using Beam 2.3 
> or higher.
> Currently, I'm trying to use Apache Beam 2.5, google-api-client 1.23 and Java 
> 1.8.
> This error occurs using Apache Beam 2.3, and not when using 2.2. When using 
> Apache Beam 2.2, we have to drop the google-http/api-client to 1.22 from 1.23.
> {quote}<beam.version>2.5.0</beam.version>{quote}
> {quote}
> <dependency>
>  <groupId>org.apache.beam</groupId>
>  <artifactId>beam-sdks-java-core</artifactId>
>  <version>${beam.version}</version>
> </dependency>
> <dependency>
>  <groupId>org.apache.beam</groupId>
>  <artifactId>beam-runners-google-cloud-dataflow-java</artifactId>
>  <version>${beam.version}</version>
> </dependency>
> <dependency>
>  <groupId>org.apache.beam</groupId>
>  <artifactId>beam-sdks-java-io-google-cloud-platform</artifactId>
>  <version>${beam.version}</version>
> </dependency>
> {quote}
> The google libraries
> {quote}
> <google-clients.version>1.23.0</google-clients.version>
> <google-package.version>0.26.0-alpha</google-package.version>
> {quote}
> {quote}
> <dependency>
>  <groupId>com.google.cloud</groupId>
>  <artifactId>google-cloud</artifactId>
>  <version>${google-package.version}</version>
> </dependency>
> <dependency>
>  <groupId>com.google.http-client</groupId>
>  <artifactId>google-http-client</artifactId>
>  <version>${google-clients.version}</version>
>  <exclusions>
>   <exclusion>
>    <groupId>com.google.guava</groupId>
>    <artifactId>guava-jdk5</artifactId>
>   </exclusion>
>  </exclusions>
> </dependency>
> {quote}
> Here's the stack trace. You'll see that in this quick sample runner we tried 
> inserting 4 rows. Everything runs fine until I attempt to write mutations to 
> the table.
> {quote}
> [INFO] Scanning for projects...
> [INFO]
> [INFO] < com.testing:dataflowtwofive 
> >-
> [INFO] Building dataflowtwofive 0.1
> [INFO] [ jar 
> ]-
> [WARNING] The POM for com.google.oauth-client:google-oauth-client:jar:1.23.0 
> is invalid, transitive dependencies (if any) will not be available, enable 
> debug logging for more details
> [WARNING] The POM for 
> com.google.http-client:google-http-client-jackson:jar:1.23.0 is invalid, 
> transitive dependencies (if any) will not be available, enable debug logging 
> for more details
> [WARNING] The POM for 
> com.google.apis:google-api-services-storage:jar:v1-rev114-1.23.0 is invalid, 
> transitive dependencies (if any) will not be available, enable debug logging 
> for more details
> [INFO]
> [INFO] --- maven-resources-plugin:2.6:resources (default-resources) @ 
> dataflowtwofive ---
> [INFO] Using 'UTF-8' encoding to copy filtered resources.
> [INFO] Copying 0 resource
> [INFO]
> [INFO] --- maven-compiler-plugin:3.6.1:compile (default-compile) @ 
> dataflowtwofive ---
> [INFO] Nothing to compile - all classes are up to date
> [INFO]
> [INFO] --- exec-maven-plugin:1.4.0:java (default-cli) @ dataflowtwofive ---
> Jul 19, 2018 3:22:41 PM runners.SimpleTestRunner main
> WARNING: ATTN: Coder for mutations: [class com.google.cloud.spanner.Mutation]
> Jul 19, 2018 3:22:42 PM transforms.TestMutationBuilder processElement
> WARNING: ATTN: mutation: 
> [\{ROW_INSERT_TIMESTAMP=2018-07-19T20:22:42.76500Z, 
> ROW_UPDATE_TIMESTAMP=2018-07-19T20:22:42.80900Z, 
> GUID=f2360671-557d-4406-b693-c0d66..., LAHQ=a, LAISO=a2, LASPEZ=a3, SPRAS=a4}]
> Jul 19, 2018 3:22:42 PM transforms.TestMutationBuilder processElement
> WARNING: ATTN: mutation: 
> [\{ROW_INSERT_TIMESTAMP=2018-07-19T20:22:42.76500Z, 
> ROW_UPDATE_TIMESTAMP=2018-07-19T20:22:42.80900Z, 
> GUID=cba60659-6986-4dc3-a523-116e2..., LAHQ=á, LAISO=c2, LASPEZ=c3, SPRAS=c4}]
> Jul 19, 2018 3:22:42 PM transforms.TestMutationBuilder processElement
> WARNING: ATTN: mutation: 
> [\{ROW_INSERT_TIMESTAMP=2018-07-19T20:22:42.76500Z, 
> ROW_UPDATE_TIMESTAMP=2018-07-19T20:22:42.80900Z, 
> GUID=889d9d14-df57-4d72-8748-8de0f..., LAHQ=d, LAISO=d2, LASPEZ=d3, SPRAS=d4}]
> Jul 19, 2018 3:22:42 PM transforms.TestMutationBuilder processElement
> WARNING: ATTN: mutation: 
> [\{ROW_INSERT_TIMESTAMP=2018-07-19T20:22:42.76500Z, 
> ROW_UPDATE_TIMESTAMP=2018-07-19T20:22:42.80900Z, 
> GUID=888a2a76-aff7-4f64-9e24-2c231..., LAHQ=b, LAISO=b2, LASPEZ=b3, SPRAS=b4}]
> [WARNING]
> java.lang.reflect.InvocationTargetException
>  at sun.reflect.NativeMethodAccessorImpl.invoke0 (Native Method)
>  at sun.reflect.NativeMethodAccessorImpl.invoke 
> (NativeMethodAccessorImpl.java:62)
>  at sun.reflect.DelegatingMethodAccessorImpl.invoke 
> (DelegatingMethodAccessorImpl.java:43)
>  at 
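The stack trace above is cut off in the archive, but the report boils down to non-English characters (such as "á" in the logged mutations) being corrupted somewhere between the Beam job and Spanner. Before blaming the client-library versions, it can help to confirm that the application's own strings survive a UTF-8 encode/decode round trip. The sketch below is a minimal, self-contained check; it does not use Beam or Spanner, and the class and method names are illustrative only:

```java
import java.nio.charset.StandardCharsets;

public class Utf8RoundTrip {
    // Encodes a string to UTF-8 bytes and decodes it back.
    static String roundTrip(String s) {
        byte[] bytes = s.getBytes(StandardCharsets.UTF_8);
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // "á" is one of the characters from the failing mutations logged above.
        String value = "á";
        if (!value.equals(roundTrip(value))) {
            throw new AssertionError("UTF-8 round trip corrupted the value");
        }
        System.out.println("round trip ok: " + value);
    }
}
```

If this check passes but the data still arrives garbled, the corruption is happening inside the transport or client libraries, which is consistent with the version-pinning workaround (Beam 2.2 with google-http/api-client 1.22) described in the report.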

[jira] [Work logged] (BEAM-3954) Get Jenkins agents dockerized

2018-08-20 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-3954?focusedWorklogId=136291=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-136291
 ]

ASF GitHub Bot logged work on BEAM-3954:


Author: ASF GitHub Bot
Created on: 20/Aug/18 21:32
Start Date: 20/Aug/18 21:32
Worklog Time Spent: 10m 
  Work Description: yifanzou commented on issue #6254: Do Not Merge 
[BEAM-3954] Reproducible environment for Beam Jenkins tests
URL: https://github.com/apache/beam/pull/6254#issuecomment-414470333
 
 
   Run Seed Job




Issue Time Tracking
---

Worklog Id: (was: 136291)
Time Spent: 3h 10m  (was: 3h)

> Get Jenkins agents dockerized 
> --
>
> Key: BEAM-3954
> URL: https://issues.apache.org/jira/browse/BEAM-3954
> Project: Beam
>  Issue Type: Sub-task
>  Components: build-system
>Reporter: yifan zou
>Assignee: yifan zou
>Priority: Major
>  Time Spent: 3h 10m
>  Remaining Estimate: 0h
>






[jira] [Resolved] (BEAM-4772) TextIO.read transform does not respect .withEmptyMatchTreatment

2018-08-20 Thread Andrew Pilloud (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4772?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Pilloud resolved BEAM-4772.
--
   Resolution: Fixed
Fix Version/s: 2.7.0

> TextIO.read transform does not respect .withEmptyMatchTreatment
> ---
>
> Key: BEAM-4772
> URL: https://issues.apache.org/jira/browse/BEAM-4772
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-core
>Affects Versions: 2.5.0
>Reporter: Samuel Waggoner
>Assignee: Kyle Winkelman
>Priority: Major
> Fix For: 2.7.0
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> I modified the MinimalWordCount example to reproduce. I expect the read 
> transform to read 0 lines rather than give an exception, since I used 
> EmptyMatchTreatment.ALLOW. I see the same behavior with ALLOW_IF_WILDCARD. 
> The EmptyMatchTreatment value seems to be ignored.
> {code:java}
> public class MinimalWordCount {
>   public static void main(String[] args) {
>     PipelineOptions options = PipelineOptionsFactory.create();
>     Pipeline p = Pipeline.create(options);
>     p.apply(TextIO.read()
>             .from("gs://apache-beam-samples/doesnotexist/*")
>             .withEmptyMatchTreatment(EmptyMatchTreatment.ALLOW))
>      .apply(TextIO.write().to("wordcounts"));
>     p.run().waitUntilFinish();
>   }
> }
> {code}
> {code:java}
> Exception in thread "main" 
> org.apache.beam.sdk.Pipeline$PipelineExecutionException: 
> java.io.FileNotFoundException: No files matched spec: 
> gs://apache-beam-samples/doesnotexist/*
>  at org.apache.beam.sdk.Pipeline.run(Pipeline.java:317)
>  at org.apache.beam.sdk.Pipeline.run(Pipeline.java:297)
>  at org.apache.beam.examples.MinimalWordCount.main(MinimalWordCount.java:124)
> Caused by: java.io.FileNotFoundException: No files matched spec: 
> gs://apache-beam-samples/doesnotexist/*
>  at 
> org.apache.beam.sdk.io.FileSystems.maybeAdjustEmptyMatchResult(FileSystems.java:172)
>  at org.apache.beam.sdk.io.FileSystems.match(FileSystems.java:158)
>  at 
> org.apache.beam.sdk.io.FileBasedSource.getEstimatedSizeBytes(FileBasedSource.java:222)
>  at 
> org.apache.beam.runners.direct.BoundedReadEvaluatorFactory$InputProvider.getInitialInputs(BoundedReadEvaluatorFactory.java:212)
>  at 
> org.apache.beam.runners.direct.ReadEvaluatorFactory$InputProvider.getInitialInputs(ReadEvaluatorFactory.java:91)
>  at 
> org.apache.beam.runners.direct.RootProviderRegistry.getInitialInputs(RootProviderRegistry.java:81){code}
> We see this behavior with both DirectRunner and DataflowRunner.





[beam] branch master updated (5655acc -> fa68ae1)

2018-08-20 Thread apilloud
This is an automated email from the ASF dual-hosted git repository.

apilloud pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/beam.git.


from 5655acc  minor : better if condition
 add 549bac3  Make CompressedSource honor sourceDelegates 
emptyMatchTreatment.
 new fa68ae1  Merge pull request #6219 from 
kyle-winkelman/textio-read-empty-match-treatment

The 1 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:
 .../core/src/main/java/org/apache/beam/sdk/io/CompressedSource.java  | 5 +++-
 1 file changed, 4 insertions(+), 1 deletion(-)



[jira] [Work logged] (BEAM-4772) TextIO.read transform does not respect .withEmptyMatchTreatment

2018-08-20 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4772?focusedWorklogId=136290&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-136290
 ]

ASF GitHub Bot logged work on BEAM-4772:


Author: ASF GitHub Bot
Created on: 20/Aug/18 21:27
Start Date: 20/Aug/18 21:27
Worklog Time Spent: 10m 
  Work Description: apilloud closed pull request #6219: [BEAM-4772] Make 
CompressedSource honor sourceDelegates emptyMatchTreatment.
URL: https://github.com/apache/beam/pull/6219
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:


diff --git 
a/sdks/java/core/src/main/java/org/apache/beam/sdk/io/CompressedSource.java 
b/sdks/java/core/src/main/java/org/apache/beam/sdk/io/CompressedSource.java
index ae00de82275..dcb1be1ba99 100644
--- a/sdks/java/core/src/main/java/org/apache/beam/sdk/io/CompressedSource.java
+++ b/sdks/java/core/src/main/java/org/apache/beam/sdk/io/CompressedSource.java
@@ -172,7 +172,10 @@ static DecompressingChannelFactory fromCanonical(Compression compression) {
*/
   private CompressedSource(
       FileBasedSource<T> sourceDelegate, DecompressingChannelFactory channelFactory) {
-super(sourceDelegate.getFileOrPatternSpecProvider(), Long.MAX_VALUE);
+super(
+sourceDelegate.getFileOrPatternSpecProvider(),
+sourceDelegate.getEmptyMatchTreatment(),
+Long.MAX_VALUE);
 this.sourceDelegate = sourceDelegate;
 this.channelFactory = channelFactory;
   }


 




Issue Time Tracking
---

Worklog Id: (was: 136290)
Time Spent: 0.5h  (was: 20m)

> TextIO.read transform does not respect .withEmptyMatchTreatment
> ---
>
> Key: BEAM-4772
> URL: https://issues.apache.org/jira/browse/BEAM-4772
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-core
>Affects Versions: 2.5.0
>Reporter: Samuel Waggoner
>Assignee: Kyle Winkelman
>Priority: Major
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>





[beam] 01/01: Merge pull request #6219 from kyle-winkelman/textio-read-empty-match-treatment

2018-08-20 Thread apilloud
This is an automated email from the ASF dual-hosted git repository.

apilloud pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/beam.git

commit fa68ae1c6a63a3a59b5c6ccda4684b562e54e357
Merge: 5655acc 549bac3
Author: Andrew Pilloud 
AuthorDate: Mon Aug 20 14:27:29 2018 -0700

Merge pull request #6219 from 
kyle-winkelman/textio-read-empty-match-treatment

[BEAM-4772] Make CompressedSource honor sourceDelegates emptyMatchTreatment.

 .../core/src/main/java/org/apache/beam/sdk/io/CompressedSource.java  | 5 +++-
 1 file changed, 4 insertions(+), 1 deletion(-)



[jira] [Work logged] (BEAM-4772) TextIO.read transform does not respect .withEmptyMatchTreatment

2018-08-20 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4772?focusedWorklogId=136288&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-136288
 ]

ASF GitHub Bot logged work on BEAM-4772:


Author: ASF GitHub Bot
Created on: 20/Aug/18 21:26
Start Date: 20/Aug/18 21:26
Worklog Time Spent: 10m 
  Work Description: apilloud commented on issue #6219: [BEAM-4772] Make 
CompressedSource honor sourceDelegates emptyMatchTreatment.
URL: https://github.com/apache/beam/pull/6219#issuecomment-414468672
 
 
   Looks like an oversight caused by keeping around the old `FileBasedSource` 
constructor when `emptyMatchTreatment` was added. LGTM.
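The pattern behind this fix can be stated in isolation. Below is a minimal sketch with hypothetical names (FileSource and CompressedWrapper stand in for the real Beam classes): a wrapper that reaches its base class through an older constructor silently resets a setting its delegate already carries, and the fix is to forward that setting explicitly.

```java
// Hypothetical sketch of the bug pattern fixed here -- not the actual
// Beam classes. FileSource stands in for FileBasedSource,
// CompressedWrapper for CompressedSource.
class FileSource {
    final String spec;
    final String emptyMatchTreatment;

    // Older constructor: falls back to a default treatment.
    FileSource(String spec) {
        this(spec, "DISALLOW");
    }

    // Newer constructor: treatment supplied by the caller.
    FileSource(String spec, String emptyMatchTreatment) {
        this.spec = spec;
        this.emptyMatchTreatment = emptyMatchTreatment;
    }
}

class CompressedWrapper extends FileSource {
    final FileSource delegate;

    CompressedWrapper(FileSource delegate) {
        // The buggy version called super(delegate.spec), which silently
        // reset the treatment to the default; forwarding the delegate's
        // value mirrors what the merged change does.
        super(delegate.spec, delegate.emptyMatchTreatment);
        this.delegate = delegate;
    }
}
```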




Issue Time Tracking
---

Worklog Id: (was: 136288)
Time Spent: 20m  (was: 10m)

> TextIO.read transform does not respect .withEmptyMatchTreatment
> ---
>
> Key: BEAM-4772
> URL: https://issues.apache.org/jira/browse/BEAM-4772
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-core
>Affects Versions: 2.5.0
>Reporter: Samuel Waggoner
>Assignee: Kyle Winkelman
>Priority: Major
>  Time Spent: 20m
>  Remaining Estimate: 0h
>





[jira] [Assigned] (BEAM-4772) TextIO.read transform does not respect .withEmptyMatchTreatment

2018-08-20 Thread Andrew Pilloud (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4772?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Pilloud reassigned BEAM-4772:


Assignee: Kyle Winkelman  (was: Kenneth Knowles)

> TextIO.read transform does not respect .withEmptyMatchTreatment
> ---
>
> Key: BEAM-4772
> URL: https://issues.apache.org/jira/browse/BEAM-4772
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-core
>Affects Versions: 2.5.0
>Reporter: Samuel Waggoner
>Assignee: Kyle Winkelman
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>





[jira] [Work logged] (BEAM-5169) Add options for master URL and log level to Flink jobserver runShadow task

2018-08-20 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5169?focusedWorklogId=136287&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-136287
 ]

ASF GitHub Bot logged work on BEAM-5169:


Author: ASF GitHub Bot
Created on: 20/Aug/18 21:15
Start Date: 20/Aug/18 21:15
Worklog Time Spent: 10m 
  Work Description: tweise commented on issue #6249: [BEAM-5169] Add 
options for master URL and log level to Flink jobserver runShadow task.
URL: https://github.com/apache/beam/pull/6249#issuecomment-414465697
 
 
   Run Java PreCommit




Issue Time Tracking
---

Worklog Id: (was: 136287)
Time Spent: 0.5h  (was: 20m)

> Add options for master URL and log level to Flink jobserver runShadow task
> --
>
> Key: BEAM-5169
> URL: https://issues.apache.org/jira/browse/BEAM-5169
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-flink
>Affects Versions: 2.6.0
>Reporter: Thomas Weise
>Assignee: Thomas Weise
>Priority: Major
>  Labels: portability
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>






[jira] [Work logged] (BEAM-5168) Flink jobserver logging should be redirected to slf4j

2018-08-20 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5168?focusedWorklogId=136286&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-136286
 ]

ASF GitHub Bot logged work on BEAM-5168:


Author: ASF GitHub Bot
Created on: 20/Aug/18 21:14
Start Date: 20/Aug/18 21:14
Worklog Time Spent: 10m 
  Work Description: tweise commented on issue #6248: [BEAM-5168] Redirect 
Flink jobserver commons-logging to slf4j.
URL: https://github.com/apache/beam/pull/6248#issuecomment-414465336
 
 
   Run Java PreCommit




Issue Time Tracking
---

Worklog Id: (was: 136286)
Time Spent: 0.5h  (was: 20m)

> Flink jobserver logging should be redirected to slf4j
> -
>
> Key: BEAM-5168
> URL: https://issues.apache.org/jira/browse/BEAM-5168
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-flink
>Affects Versions: 2.6.0
>Reporter: Thomas Weise
>Assignee: Thomas Weise
>Priority: Major
>  Labels: portability
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Currently commons-logging goes nowhere, which makes certain issues really 
> hard to debug (http client uses it, for example). This can be fixed by using 
> the slf4j bridge.
>  





[jira] [Work logged] (BEAM-3954) Get Jenkins agents dockerized

2018-08-20 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-3954?focusedWorklogId=136283&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-136283
 ]

ASF GitHub Bot logged work on BEAM-3954:


Author: ASF GitHub Bot
Created on: 20/Aug/18 21:09
Start Date: 20/Aug/18 21:09
Worklog Time Spent: 10m 
  Work Description: yifanzou opened a new pull request #6254: Do Not Merge 
[BEAM-3954] Reproducible environment for Beam Jenkins tests
URL: https://github.com/apache/beam/pull/6254
 
 
   **Please** add a meaningful description for your change here
   
   
   
   Follow this checklist to help us incorporate your contribution quickly and 
easily:
   
- [ ] Format the pull request title like `[BEAM-XXX] Fixes bug in 
ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA 
issue, if applicable. This will automatically link the pull request to the 
issue.
- [ ] If this contribution is large, please file an Apache [Individual 
Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).
   
   It will help us expedite review of your Pull Request if you tag someone 
(e.g. `@username`) to look at it.
   
   Post-Commit Tests Status (on master branch)
   

   
   Lang | SDK | Apex | Dataflow | Flink | Gearpump | Samza | Spark
   --- | --- | --- | --- | --- | --- | --- | ---
   (build-status badge links for the Go, Java, and Python post-commit jobs omitted)
   
   
   
   
   




Issue Time Tracking
---

Worklog Id: (was: 136283)
Time Spent: 2h 50m  (was: 2h 40m)

> Get Jenkins agents dockerized 
> --
>
> Key: BEAM-3954
> URL: https://issues.apache.org/jira/browse/BEAM-3954
> Project: Beam
>  Issue Type: Sub-task
>  Components: build-system
>Reporter: yifan zou
>Assignee: yifan zou
>Priority: Major
>  Time Spent: 2h 50m
>  Remaining Estimate: 0h
>






[jira] [Work logged] (BEAM-3954) Get Jenkins agents dockerized

2018-08-20 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-3954?focusedWorklogId=136284&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-136284
 ]

ASF GitHub Bot logged work on BEAM-3954:


Author: ASF GitHub Bot
Created on: 20/Aug/18 21:09
Start Date: 20/Aug/18 21:09
Worklog Time Spent: 10m 
  Work Description: yifanzou commented on issue #6254: Do Not Merge 
[BEAM-3954] Reproducible environment for Beam Jenkins tests
URL: https://github.com/apache/beam/pull/6254#issuecomment-414464159
 
 
   Run Python PreCommit




Issue Time Tracking
---

Worklog Id: (was: 136284)
Time Spent: 3h  (was: 2h 50m)

> Get Jenkins agents dockerized 
> --
>
> Key: BEAM-3954
> URL: https://issues.apache.org/jira/browse/BEAM-3954
> Project: Beam
>  Issue Type: Sub-task
>  Components: build-system
>Reporter: yifan zou
>Assignee: yifan zou
>Priority: Major
>  Time Spent: 3h
>  Remaining Estimate: 0h
>






[jira] [Work logged] (BEAM-5141) Improve error message when SET unregistered options

2018-08-20 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5141?focusedWorklogId=136282&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-136282
 ]

ASF GitHub Bot logged work on BEAM-5141:


Author: ASF GitHub Bot
Created on: 20/Aug/18 21:01
Start Date: 20/Aug/18 21:01
Worklog Time Spent: 10m 
  Work Description: amaliujia commented on issue #6216: [BEAM-5141] Improve 
error message when SET unregistered options
URL: https://github.com/apache/beam/pull/6216#issuecomment-414461964
 
 
   This is a good idea. I can explore how to check options during SET.




Issue Time Tracking
---

Worklog Id: (was: 136282)
Time Spent: 6.5h  (was: 6h 20m)

> Improve error message when SET unregistered options 
> 
>
> Key: BEAM-5141
> URL: https://issues.apache.org/jira/browse/BEAM-5141
> Project: Beam
>  Issue Type: Improvement
>  Components: dsl-sql
>Reporter: Rui Wang
>Assignee: Rui Wang
>Priority: Major
> Fix For: Not applicable
>
>  Time Spent: 6.5h
>  Remaining Estimate: 0h
>






[jira] [Work logged] (BEAM-5141) Improve error message when SET unregistered options

2018-08-20 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5141?focusedWorklogId=136278&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-136278
 ]

ASF GitHub Bot logged work on BEAM-5141:


Author: ASF GitHub Bot
Created on: 20/Aug/18 20:55
Start Date: 20/Aug/18 20:55
Worklog Time Spent: 10m 
  Work Description: lukecwik commented on issue #6216: [BEAM-5141] Improve 
error message when SET unregistered options
URL: https://github.com/apache/beam/pull/6216#issuecomment-414460220
 
 
   That also sounds like a good idea and seems like it would be simpler to 
maintain. The cost of PipelineOptionsFactory.fromArgs(...) for each `SET` isn't 
a large price to pay.
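As a rough, self-contained illustration of that simpler approach (a hypothetical helper, not Beam's actual code): each SET name can be validated by probing the options interface for a matching getter, which is essentially the check that building the options from arguments would enforce.

```java
import java.lang.reflect.Method;

// Hypothetical sketch: reject "SET name = value" when no getter on the
// options interface matches the name. FakeOptions is a stand-in for a
// real PipelineOptions subinterface.
class SetOptionValidator {
    interface FakeOptions {
        String getRunner();
        int getParallelism();
    }

    static boolean isRegistered(String name) {
        // An option "runner" counts as registered iff a "getRunner" getter exists.
        String getter =
            "get" + Character.toUpperCase(name.charAt(0)) + name.substring(1);
        for (Method m : FakeOptions.class.getMethods()) {
            if (m.getName().equals(getter)) {
                return true;
            }
        }
        return false;
    }
}
```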




Issue Time Tracking
---

Worklog Id: (was: 136278)
Time Spent: 6h 20m  (was: 6h 10m)

> Improve error message when SET unregistered options 
> 
>
> Key: BEAM-5141
> URL: https://issues.apache.org/jira/browse/BEAM-5141
> Project: Beam
>  Issue Type: Improvement
>  Components: dsl-sql
>Reporter: Rui Wang
>Assignee: Rui Wang
>Priority: Major
> Fix For: Not applicable
>
>  Time Spent: 6h 20m
>  Remaining Estimate: 0h
>






[jira] [Work logged] (BEAM-5180) Broken FileResultCoder via parseSchema change

2018-08-20 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5180?focusedWorklogId=136277&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-136277
 ]

ASF GitHub Bot logged work on BEAM-5180:


Author: ASF GitHub Bot
Created on: 20/Aug/18 20:53
Start Date: 20/Aug/18 20:53
Worklog Time Spent: 10m 
  Work Description: apilloud commented on issue #6251: [BEAM-5180] Relax 
back restriction on parsing file scheme
URL: https://github.com/apache/beam/pull/6251#issuecomment-414459555
 
 
   FYI @angoenka @jkff




Issue Time Tracking
---

Worklog Id: (was: 136277)
Time Spent: 0.5h  (was: 20m)

> Broken FileResultCoder via parseSchema change
> -
>
> Key: BEAM-5180
> URL: https://issues.apache.org/jira/browse/BEAM-5180
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-core
>Affects Versions: 2.6.0
>Reporter: Jozef Vilcek
>Assignee: Kenneth Knowles
>Priority: Blocker
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Recently this commit
> [https://github.com/apache/beam/commit/3fff58c21f94415f3397e185377e36d3df662384]
> introduced stricter scheme parsing, which breaks the contract between 
> _FileResultCoder_ and _FileSystems.matchNewResource()_.
> The coder takes a _ResourceId_, serializes it via `_toString_`, and then 
> relies on the filesystem being able to parse it back again. Requiring a strict 
> _scheme://_ prefix breaks this at least for the Hadoop filesystem, which uses 
> _URI_ for _ResourceId_ and produces _toString()_ in the form `_hdfs:/some/path_`.
> I guess the _ResourceIdCoder_ is suffering the same problem.
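The round-trip failure described above can be reproduced with java.net.URI alone (the hdfs paths here are only illustrative): a URI built without an authority renders with a single slash after the scheme, so a parser that insists on a scheme:// prefix cannot read back what toString() produced.

```java
import java.net.URI;
import java.net.URISyntaxException;

class UriRoundTrip {
    // A URI with a null authority renders as "hdfs:/some/path" -- one slash.
    static String noAuthority() {
        try {
            return new URI("hdfs", null, "/some/path", null).toString();
        } catch (URISyntaxException e) {
            throw new RuntimeException(e);
        }
    }

    // With an authority present, the familiar "hdfs://host/..." form survives.
    static String withAuthority() {
        return URI.create("hdfs://namenode/some/path").toString();
    }
}
```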





[jira] [Work logged] (BEAM-5180) Broken FileResultCoder via parseSchema change

2018-08-20 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5180?focusedWorklogId=136276&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-136276
 ]

ASF GitHub Bot logged work on BEAM-5180:


Author: ASF GitHub Bot
Created on: 20/Aug/18 20:52
Start Date: 20/Aug/18 20:52
Worklog Time Spent: 10m 
  Work Description: apilloud commented on issue #6251: [BEAM-5180] Relax 
back restriction on parsing file scheme
URL: https://github.com/apache/beam/pull/6251#issuecomment-414459331
 
 
   From your bug report, it looks like you need `hdfs:/some/path` to work?




Issue Time Tracking
---

Worklog Id: (was: 136276)
Time Spent: 20m  (was: 10m)

> Broken FileResultCoder via parseSchema change
> -
>
> Key: BEAM-5180
> URL: https://issues.apache.org/jira/browse/BEAM-5180
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-core
>Affects Versions: 2.6.0
>Reporter: Jozef Vilcek
>Assignee: Kenneth Knowles
>Priority: Blocker
>  Time Spent: 20m
>  Remaining Estimate: 0h
>





[jira] [Work logged] (BEAM-3194) Support annotating that a DoFn requires stable / deterministic input for replay/retry

2018-08-20 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-3194?focusedWorklogId=136274&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-136274
 ]

ASF GitHub Bot logged work on BEAM-3194:


Author: ASF GitHub Bot
Created on: 20/Aug/18 20:44
Start Date: 20/Aug/18 20:44
Worklog Time Spent: 10m 
  Work Description: robinyqiu commented on a change in pull request #6220: 
[BEAM-3194] Add ValidatesRunner test for support of @RequiresStableInput
URL: https://github.com/apache/beam/pull/6220#discussion_r211401274
 
 

 ##
 File path: 
sdks/java/core/src/test/java/org/apache/beam/sdk/transforms/ParDoRequiresStableInputTest.java
 ##
 @@ -0,0 +1,166 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.sdk.transforms;
+
+import java.io.IOException;
+import java.nio.ByteBuffer;
+import java.nio.channels.WritableByteChannel;
+import java.nio.charset.Charset;
+import java.util.Date;
+import java.util.UUID;
+import org.apache.beam.sdk.Pipeline;
+import org.apache.beam.sdk.io.FileSystems;
+import org.apache.beam.sdk.io.fs.MatchResult;
+import org.apache.beam.sdk.io.fs.ResolveOptions.StandardResolveOptions;
+import org.apache.beam.sdk.io.fs.ResourceId;
+import org.apache.beam.sdk.options.Description;
+import org.apache.beam.sdk.options.PipelineOptionsFactory;
+import org.apache.beam.sdk.options.Validation.Required;
+import org.apache.beam.sdk.testing.FileChecksumMatcher;
+import org.apache.beam.sdk.testing.RetryFailures;
+import org.apache.beam.sdk.testing.TestPipeline;
+import org.apache.beam.sdk.testing.TestPipelineOptions;
+import org.apache.beam.sdk.testing.ValidatesRunner;
+import org.apache.beam.sdk.values.KV;
+import org.apache.beam.sdk.values.PCollection;
+import org.apache.beam.sdk.values.TupleTag;
+import org.apache.beam.sdk.values.TupleTagList;
+import org.junit.BeforeClass;
+import org.junit.Test;
+import org.junit.experimental.categories.Category;
+import org.junit.runner.RunWith;
+import org.junit.runners.JUnit4;
+
+/**
+ * ValidatesRunner test for the support of {@link
+ * org.apache.beam.sdk.transforms.DoFn.RequiresStableInput} annotation.
+ */
+@RunWith(JUnit4.class)
+public class ParDoRequiresStableInputTest {
+
+  private static final String VALUE = "value";
+  // SHA-1 hash of string "value"
+  private static final String VALUE_CHECKSUM = 
"f32b67c7e26342af42efabc674d441dca0a281c5";
+
+  private static class PairWithRandomKeyFn extends SimpleFunction<String, KV<String, String>> {
+@Override
+public KV<String, String> apply(String value) {
+  String key = UUID.randomUUID().toString();
+  return KV.of(key, value);
+}
+  }
+
+  private static class MakeSideEffectAndThenFailFn extends DoFn<KV<String, String>, String> {
+private final String outputPrefix;
+
+private MakeSideEffectAndThenFailFn(String outputPrefix) {
+  this.outputPrefix = outputPrefix;
+}
+
+@RequiresStableInput
+@ProcessElement
+public void processElement(ProcessContext c) throws Exception {
+  MatchResult matchResult = FileSystems.match(outputPrefix + "*");
+  boolean firstTime = (matchResult.metadata().size() == 0);
+
+  KV<String, String> kv = c.element();
+  writeTextToFileSideEffect(kv.getValue(), outputPrefix + kv.getKey());
+  if (firstTime) {
+throw new Exception("Deliberate failure: should happen only once.");
+  }
+}
+
+private static void writeTextToFileSideEffect(String text, String 
filename) throws IOException {
+  ResourceId rid = FileSystems.matchNewResource(filename, false);
+  WritableByteChannel chan = FileSystems.create(rid, "text/plain");
+  chan.write(ByteBuffer.wrap(text.getBytes(Charset.defaultCharset())));
+  chan.close();
+}
+  }
+
+  private static void 
runRequiresStableInputPipeline(RequiresStableInputTestOptions options) {
+Pipeline p = Pipeline.create(options);
+
+PCollection<String> singleton = p.apply("CreatePCollectionOfOneValue", 
Create.of(VALUE));
+singleton
+.apply("Single-PairWithRandomKey", MapElements.via(new 
PairWithRandomKeyFn()))
+.apply(
+"Single-MakeSideEffectAndThenFail",
+ParDo.of(new 

[jira] [Work logged] (BEAM-5141) Improve error message when SET unregistered options

2018-08-20 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5141?focusedWorklogId=136273&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-136273
 ]

ASF GitHub Bot logged work on BEAM-5141:


Author: ASF GitHub Bot
Created on: 20/Aug/18 20:42
Start Date: 20/Aug/18 20:42
Worklog Time Spent: 10m 
  Work Description: apilloud commented on issue #6216: [BEAM-5141] Improve 
error message when SET unregistered options
URL: https://github.com/apache/beam/pull/6216#issuecomment-414456316
 
 
   I think the thing that might be unclear here is that validation happens at 
the execution of `SELECT`, not the execution of `SET`. If we were doing 
validation at `SET` time, what you describe would be possible without an 
implementation of validation. Changing that behavior seems like it would 
actually be the correct fix here. If we validated the Pipeline Options when 
they were `SET`, there would be no need to change the error message: we would 
simply refuse to set the invalid option, and the existing message would still 
be correct.
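
   A minimal sketch of that behavior (class and method names are illustrative,
not Beam's actual API): validate the option name when `SET` executes, instead
of deferring validation to `SELECT`.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: reject an unknown option at SET time, so no
// deferred failure (and no new error message) is needed at SELECT time.
public class SetTimeValidatingOptions {
  private final Set<String> registered;
  private final Map<String, String> values = new HashMap<>();

  public SetTimeValidatingOptions(Set<String> registeredOptionNames) {
    this.registered = new HashSet<>(registeredOptionNames);
  }

  // Called for `SET name = value`; refuses to set an invalid option.
  public void set(String name, String value) {
    if (!registered.contains(name)) {
      throw new IllegalArgumentException("Unknown pipeline option: " + name);
    }
    values.put(name, value);
  }

  public String get(String name) {
    return values.get(name);
  }
}
```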


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 136273)
Time Spent: 6h 10m  (was: 6h)

> Improve error message when SET unregistered options 
> 
>
> Key: BEAM-5141
> URL: https://issues.apache.org/jira/browse/BEAM-5141
> Project: Beam
>  Issue Type: Improvement
>  Components: dsl-sql
>Reporter: Rui Wang
>Assignee: Rui Wang
>Priority: Major
> Fix For: Not applicable
>
>  Time Spent: 6h 10m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Build failed in Jenkins: beam_PostCommit_Java_GradleBuild #1301

2018-08-20 Thread Apache Jenkins Server
See 


Changes:

[pablo] bugfixs :  only use function not use attribute of __dict__

[pablo] minor : better if condition

--
[...truncated 19.79 MB...]
Aug 20, 2018 8:29:13 PM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-08-20T20:29:10.062Z: Fusing consumer SpannerIO.Write/Write 
mutations to Cloud Spanner/Batch mutations together into SpannerIO.Write/Write 
mutations to Cloud Spanner/Group by partition/GroupByWindow
Aug 20, 2018 8:29:13 PM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-08-20T20:29:10.114Z: Fusing consumer SpannerIO.Write/Write 
mutations to Cloud Spanner/Keys sample as 
view/ParMultiDo(ToIsmRecordForMapLike) into SpannerIO.Write/Write mutations to 
Cloud Spanner/Keys sample as 
view/GBKaSVForData/BatchViewOverrides.GroupByKeyAndSortValuesOnly/Read
Aug 20, 2018 8:29:13 PM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-08-20T20:29:10.160Z: Fusing consumer SpannerIO.Write/Write 
mutations to Cloud Spanner/Keys sample as 
view/GBKaSVForData/BatchViewOverrides.GroupByKeyAndSortValuesOnly/Write into 
SpannerIO.Write/Write mutations to Cloud Spanner/Keys sample as 
view/GBKaSVForData/ParDo(GroupByKeyHashAndSortByKeyAndWindow)
Aug 20, 2018 8:29:13 PM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-08-20T20:29:10.206Z: Fusing consumer SpannerIO.Write/Write 
mutations to Cloud Spanner/Keys sample as 
view/GBKaSVForData/ParDo(GroupByKeyHashAndSortByKeyAndWindow) into 
SpannerIO.Write/Write mutations to Cloud Spanner/Sample 
keys/Combine.GroupedValues/Extract
Aug 20, 2018 8:29:13 PM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-08-20T20:29:10.257Z: Fusing consumer SpannerIO.Write/Write 
mutations to Cloud Spanner/Sample keys/GroupByKey/Write into 
SpannerIO.Write/Write mutations to Cloud Spanner/Sample keys/GroupByKey/Reify
Aug 20, 2018 8:29:13 PM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-08-20T20:29:10.299Z: Fusing consumer SpannerIO.Write/Write 
mutations to Cloud Spanner/Sample keys/Combine.GroupedValues/Extract into 
SpannerIO.Write/Write mutations to Cloud Spanner/Sample 
keys/Combine.GroupedValues
Aug 20, 2018 8:29:13 PM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-08-20T20:29:10.345Z: Fusing consumer SpannerIO.Write/Write 
mutations to Cloud Spanner/Sample keys/GroupByKey/Reify into 
SpannerIO.Write/Write mutations to Cloud Spanner/Sample 
keys/GroupByKey+SpannerIO.Write/Write mutations to Cloud Spanner/Sample 
keys/Combine.GroupedValues/Partial
Aug 20, 2018 8:29:13 PM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-08-20T20:29:10.389Z: Fusing consumer SpannerIO.Write/Write 
mutations to Cloud Spanner/Sample keys/GroupByKey+SpannerIO.Write/Write 
mutations to Cloud Spanner/Sample keys/Combine.GroupedValues/Partial into 
SpannerIO.Write/Write mutations to Cloud Spanner/Extract keys
Aug 20, 2018 8:29:13 PM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-08-20T20:29:10.423Z: Fusing consumer SpannerIO.Write/Write 
mutations to Cloud Spanner/Wait.OnSignal/To wait view 
0/Sample.Any/Combine.globally(SampleAny)/Combine.perKey(SampleAny)/GroupByKey/Write
 into SpannerIO.Write/Write mutations to Cloud Spanner/Wait.OnSignal/To wait 
view 
0/Sample.Any/Combine.globally(SampleAny)/Combine.perKey(SampleAny)/GroupByKey/Reify
Aug 20, 2018 8:29:13 PM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-08-20T20:29:10.459Z: Fusing consumer SpannerIO.Write/Write 
mutations to Cloud Spanner/Keys sample as view/GBKaSVForSize/Write into 
SpannerIO.Write/Write mutations to Cloud Spanner/Keys sample as 
view/ParMultiDo(ToIsmRecordForMapLike)
Aug 20, 2018 8:29:13 PM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-08-20T20:29:10.505Z: Fusing consumer SpannerIO.Write/Write 
mutations to Cloud Spanner/Wait.OnSignal/To wait view 
0/View.AsList/ParDo(ToIsmRecordForGlobalWindow) into SpannerIO.Write/Write 
mutations to Cloud Spanner/Wait.OnSignal/To wait view 
0/Sample.Any/Flatten.Iterables/FlattenIterables/FlatMap
Aug 20, 2018 8:29:13 PM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-08-20T20:29:10.565Z: Fusing consumer SpannerIO.Write/Write 
mutations to Cloud Spanner/Keys sample as view/GBKaSVForKeys/Write into 
SpannerIO.Write/Write mutations to Cloud Spanner/Keys sample as 
view/ParMultiDo(ToIsmRecordForMapLike)
Aug 20, 2018 8:29:13 PM 

[jira] [Work logged] (BEAM-3954) Get Jenkins agents dockerized

2018-08-20 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-3954?focusedWorklogId=136269&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-136269
 ]

ASF GitHub Bot logged work on BEAM-3954:


Author: ASF GitHub Bot
Created on: 20/Aug/18 20:29
Start Date: 20/Aug/18 20:29
Worklog Time Spent: 10m 
  Work Description: yifanzou closed pull request #6243: Do Not Merge 
[BEAM-3954] Create reproducible environment for Jenkins build
URL: https://github.com/apache/beam/pull/6243
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git a/.test-infra/dockerized-jenkins/build-env/Dockerfile 
b/.test-infra/dockerized-jenkins/build-env/Dockerfile
new file mode 100644
index 000..c26f1abe6ad
--- /dev/null
+++ b/.test-infra/dockerized-jenkins/build-env/Dockerfile
@@ -0,0 +1,69 @@
+###
+#  Licensed to the Apache Software Foundation (ASF) under one
+#  or more contributor license agreements.  See the NOTICE file
+#  distributed with this work for additional information
+#  regarding copyright ownership.  The ASF licenses this file
+#  to you under the Apache License, Version 2.0 (the
+#  "License"); you may not use this file except in compliance
+#  with the License.  You may obtain a copy of the License at
+#
+#  http://www.apache.org/licenses/LICENSE-2.0
+#
+#  Unless required by applicable law or agreed to in writing, software
+#  distributed under the License is distributed on an "AS IS" BASIS,
+#  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#  See the License for the specific language governing permissions and
+# limitations under the License.
+###
+
+FROM openjdk:8-jdk
+MAINTAINER Beam d...@beam.apache.org
+
+# Install dependencies for Jenkins Python tests
+RUN apt-get update && \
+apt-get install -y \
+  python-pip \
+  python-virtualenv \
+  python-dev \
+  python-tox \
+  maven \
+  rsync \
+  time \
+  && rm -rf /var/lib/apt/lists/*
+
+RUN wget 
https://dl.google.com/dl/cloudsdk/channels/rapid/downloads/google-cloud-sdk-189.0.0-linux-x86_64.tar.gz
 -O gcloud.tar.gz && \
+tar xf gcloud.tar.gz && \
+./google-cloud-sdk/install.sh --quiet
+ENV PATH="/google-cloud-sdk/bin:${PATH}"
+RUN gcloud components update --quiet || echo 'gcloud components update failed' 
&& \
+rm gcloud.tar.gz
+
+# Add the entrypoint script which install gcloud sdk
+#COPY docker-entrypoint.sh /usr/local/bin/
+#RUN ln -s usr/local/bin/docker-entrypoint.sh /entrypoint.sh
+#ENTRYPOINT ["/usr/local/bin/docker-entrypoint.sh"]
+
+# Create beam user to validate the build on user space
+ENV USER=user \
+UID=1001 \
+GID=1002 \
+HOME=/home/user
+RUN groupadd --system --gid=$GID $USER; \
+useradd --system --uid=$UID --gid $USER $USER;
+RUN mkdir -p $HOME; \
+chown -R $USER:$USER $HOME; \
+chmod 777 $HOME;
+USER $USER
+WORKDIR $HOME
+#COPY my-first-project-190318.json $HOME/credentials.json
+#RUN  gcloud auth activate-service-account --key-file=credentials.json && \
+# gcloud config set project my-first-project-190318 && \
+# rm credentials.json
+
+
+#ARG URL=https://github.com/apache/beam
+#
+#RUN git clone $URL beam; \
+#  cd beam; \
+#  git config --local --add remote.origin.fetch 
'+refs/pull/*/head:refs/remotes/origin/pr/*'; \
+#  git fetch --quiet --all;
diff --git a/.test-infra/jenkins/PrecommitJobBuilder.groovy 
b/.test-infra/jenkins/PrecommitJobBuilder.groovy
index 37c46c35728..3580d864340 100644
--- a/.test-infra/jenkins/PrecommitJobBuilder.groovy
+++ b/.test-infra/jenkins/PrecommitJobBuilder.groovy
@@ -97,6 +97,11 @@ class PrecommitJobBuilder {
   'master',
   timeoutMins,
   allowRemotePoll) // needed for included regions PR triggering; see 
[JENKINS-23606]
+  wrappers {
+buildInDocker {
+  dockerfile('src/.test-infra/jenkins/', 'Dockerfile')
+}
+  }
   steps {
 gradle {
   rootBuildScriptDir(commonJobProperties.checkoutDir)
diff --git a/.test-infra/jenkins/job_docker_test.groovy 
b/.test-infra/jenkins/job_docker_test.groovy
new file mode 100644
index 000..1f624e41890
--- /dev/null
+++ b/.test-infra/jenkins/job_docker_test.groovy
@@ -0,0 +1,33 @@
+
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file 

[jira] [Work logged] (BEAM-4723) Enhance Datetime*Expression Datetime Type

2018-08-20 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4723?focusedWorklogId=136268&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-136268
 ]

ASF GitHub Bot logged work on BEAM-4723:


Author: ASF GitHub Bot
Created on: 20/Aug/18 20:28
Start Date: 20/Aug/18 20:28
Worklog Time Spent: 10m 
  Work Description: yifanzou commented on issue #5926: [BEAM-4723] [SQL] 
Support datetime type minus time interval
URL: https://github.com/apache/beam/pull/5926#issuecomment-414452099
 
 
   Run Java PreCommit




Issue Time Tracking
---

Worklog Id: (was: 136268)
Time Spent: 6h 20m  (was: 6h 10m)

> Enhance Datetime*Expression Datetime Type
> -
>
> Key: BEAM-4723
> URL: https://issues.apache.org/jira/browse/BEAM-4723
> Project: Beam
>  Issue Type: Bug
>  Components: dsl-sql
>Reporter: Kai Jiang
>Assignee: Kai Jiang
>Priority: Major
>  Time Spent: 6h 20m
>  Remaining Estimate: 0h
>
> Datetime*Expression only supports timestamp type for first operand now. We 
> should let it accept all Datetime_Types
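
An illustration of the enhancement the issue describes (using `java.time`, not
Beam SQL internals): the same minus-interval expression applied to all three
datetime operand families rather than only TIMESTAMP.

```java
import java.time.Duration;
import java.time.LocalDate;
import java.time.LocalDateTime;
import java.time.LocalTime;

// Illustrative only: what "accept all Datetime_Types" means for the
// minus-interval expression, shown with java.time equivalents.
public class DatetimeMinus {
  // TIMESTAMP '2018-08-20 12:00:00' - INTERVAL '1' HOUR
  static LocalDateTime minus(LocalDateTime ts, Duration interval) {
    return ts.minus(interval);
  }

  // DATE '2018-08-20' - INTERVAL '2' DAY (whole days only)
  static LocalDate minusDays(LocalDate date, long days) {
    return date.minusDays(days);
  }

  // TIME '00:30:00' - INTERVAL '1' HOUR (wraps past midnight)
  static LocalTime minus(LocalTime time, Duration interval) {
    return time.minus(interval);
  }
}
```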





[jira] [Work logged] (BEAM-5141) Improve error message when SET unregistered options

2018-08-20 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5141?focusedWorklogId=136267&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-136267
 ]

ASF GitHub Bot logged work on BEAM-5141:


Author: ASF GitHub Bot
Created on: 20/Aug/18 20:26
Start Date: 20/Aug/18 20:26
Worklog Time Spent: 10m 
  Work Description: lukecwik commented on issue #6216: [BEAM-5141] Improve 
error message when SET unregistered options
URL: https://github.com/apache/beam/pull/6216#issuecomment-414451527
 
 
   You're implementing custom input validation to get a custom SQL message in 
this PR (just using an exception to change the contract).
   
   The property descriptors should only be returned if all the classes are 
valid. So the majority of the validation is being done by 
PipelineOptionsFactory. The validation provided by SQL would be checking the 
list of already validated property descriptors for the property the user is 
trying to set.
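
   The check described above can be sketched with plain JavaBeans
introspection (Beam's `PipelineOptionsFactory` does something richer; the
class, interface, and method names here are illustrative):

```java
import java.beans.IntrospectionException;
import java.beans.Introspector;
import java.beans.PropertyDescriptor;

// Sketch: scan the property descriptors of an already-validated options
// type for the property name the user is trying to SET.
public class OptionPropertyLookup {
  // Stand-in for a registered PipelineOptions interface.
  public interface FakeOptions {
    String getRunner();
    void setRunner(String runner);
  }

  static boolean hasProperty(Class<?> optionsType, String name) {
    try {
      for (PropertyDescriptor pd :
          Introspector.getBeanInfo(optionsType).getPropertyDescriptors()) {
        if (pd.getName().equals(name)) {
          return true;
        }
      }
    } catch (IntrospectionException e) {
      // Treat an un-introspectable type as having no properties.
    }
    return false;
  }
}
```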




Issue Time Tracking
---

Worklog Id: (was: 136267)
Time Spent: 6h  (was: 5h 50m)

> Improve error message when SET unregistered options 
> 
>
> Key: BEAM-5141
> URL: https://issues.apache.org/jira/browse/BEAM-5141
> Project: Beam
>  Issue Type: Improvement
>  Components: dsl-sql
>Reporter: Rui Wang
>Assignee: Rui Wang
>Priority: Major
> Fix For: Not applicable
>
>  Time Spent: 6h
>  Remaining Estimate: 0h
>






[jira] [Work logged] (BEAM-4723) Enhance Datetime*Expression Datetime Type

2018-08-20 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4723?focusedWorklogId=136265&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-136265
 ]

ASF GitHub Bot logged work on BEAM-4723:


Author: ASF GitHub Bot
Created on: 20/Aug/18 20:25
Start Date: 20/Aug/18 20:25
Worklog Time Spent: 10m 
  Work Description: yifanzou commented on issue #5926: [BEAM-4723] [SQL] 
Support datetime type minus time interval
URL: https://github.com/apache/beam/pull/5926#issuecomment-414451194
 
 
   The seed job was overwritten by my test script accidentally. Re-run the seed 
job to recover the precommits...




Issue Time Tracking
---

Worklog Id: (was: 136265)
Time Spent: 6h  (was: 5h 50m)

> Enhance Datetime*Expression Datetime Type
> -
>
> Key: BEAM-4723
> URL: https://issues.apache.org/jira/browse/BEAM-4723
> Project: Beam
>  Issue Type: Bug
>  Components: dsl-sql
>Reporter: Kai Jiang
>Assignee: Kai Jiang
>Priority: Major
>  Time Spent: 6h
>  Remaining Estimate: 0h
>
> Datetime*Expression only supports timestamp type for first operand now. We 
> should let it accept all Datetime_Types





[jira] [Work logged] (BEAM-4723) Enhance Datetime*Expression Datetime Type

2018-08-20 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4723?focusedWorklogId=136266&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-136266
 ]

ASF GitHub Bot logged work on BEAM-4723:


Author: ASF GitHub Bot
Created on: 20/Aug/18 20:25
Start Date: 20/Aug/18 20:25
Worklog Time Spent: 10m 
  Work Description: yifanzou edited a comment on issue #5926: [BEAM-4723] 
[SQL] Support datetime type minus time interval
URL: https://github.com/apache/beam/pull/5926#issuecomment-414451194
 
 
   The seed job was overwritten by my test script accidentally. Re-running the 
seed job to recover the precommits...




Issue Time Tracking
---

Worklog Id: (was: 136266)
Time Spent: 6h 10m  (was: 6h)

> Enhance Datetime*Expression Datetime Type
> -
>
> Key: BEAM-4723
> URL: https://issues.apache.org/jira/browse/BEAM-4723
> Project: Beam
>  Issue Type: Bug
>  Components: dsl-sql
>Reporter: Kai Jiang
>Assignee: Kai Jiang
>Priority: Major
>  Time Spent: 6h 10m
>  Remaining Estimate: 0h
>
> Datetime*Expression only supports timestamp type for first operand now. We 
> should let it accept all Datetime_Types





[jira] [Work logged] (BEAM-4723) Enhance Datetime*Expression Datetime Type

2018-08-20 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4723?focusedWorklogId=136259&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-136259
 ]

ASF GitHub Bot logged work on BEAM-4723:


Author: ASF GitHub Bot
Created on: 20/Aug/18 20:24
Start Date: 20/Aug/18 20:24
Worklog Time Spent: 10m 
  Work Description: yifanzou commented on issue #5926: [BEAM-4723] [SQL] 
Support datetime type minus time interval
URL: https://github.com/apache/beam/pull/5926#issuecomment-414450882
 
 
   Run Seed Job




Issue Time Tracking
---

Worklog Id: (was: 136259)
Time Spent: 5h 50m  (was: 5h 40m)

> Enhance Datetime*Expression Datetime Type
> -
>
> Key: BEAM-4723
> URL: https://issues.apache.org/jira/browse/BEAM-4723
> Project: Beam
>  Issue Type: Bug
>  Components: dsl-sql
>Reporter: Kai Jiang
>Assignee: Kai Jiang
>Priority: Major
>  Time Spent: 5h 50m
>  Remaining Estimate: 0h
>
> Datetime*Expression only supports timestamp type for first operand now. We 
> should let it accept all Datetime_Types





[beam-site] 01/01: Prepare repository for deployment.

2018-08-20 Thread mergebot-role
This is an automated email from the ASF dual-hosted git repository.

mergebot-role pushed a commit to branch asf-site
in repository https://gitbox.apache.org/repos/asf/beam-site.git

commit 2bf7d7d73b4cbadc76e4543306586abd583d3ea7
Author: Mergebot 
AuthorDate: Mon Aug 20 20:23:04 2018 +

Prepare repository for deployment.
---
 content/documentation/programming-guide/index.html | 125 +++--
 1 file changed, 88 insertions(+), 37 deletions(-)

diff --git a/content/documentation/programming-guide/index.html 
b/content/documentation/programming-guide/index.html
index ade9977..6093382 100644
--- a/content/documentation/programming-guide/index.html
+++ b/content/documentation/programming-guide/index.html
@@ -456,8 +456,8 @@ potentially include information such as your project ID or 
a location for
 storing files.
 
 When you run the pipeline on a runner of your choice, a copy of the
-PipelineOptions will be available to your code. For example, you can read
-PipelineOptions from a DoFn’s Context.
+PipelineOptions will be available to your code. For example, if you add a 
PipelineOptions parameter
+to a DoFn’s @ProcessElement method, it 
will be populated by the system.
 
 2.1.1. Setting PipelineOptions from command-line 
arguments
 
@@ -971,9 +971,13 @@ look like this:
 Inside your DoFn subclass, you’ll write a method annotated 
with
 @ProcessElement where you provide the 
actual processing logic. You don’t need
 to manually extract the elements from the input collection; the Beam SDKs 
handle
-that for you. Your @ProcessElement 
method should accept an object of type
-ProcessContext. The ProcessContext object gives you access to an 
input
-element and a method for emitting an output element:
+that for you. Your @ProcessElement 
method should accept a parameter tagged with
+@Element, which will be populated with 
the input element. In order to output
+elements, the method can also take a parameter of type OutputReceiver which
+provides a method for emitting elements. The parameter types must match the 
input
+and output types of your DoFn or the 
framework will raise an error. Note: @Element and
+OutputReceiver were introduced in Beam 2.5.0; if using an earlier release of 
Beam, a
+ProcessContext parameter should be used instead.
 
 Inside your DoFn 
subclass, you’ll write a method process 
where you provide
 the actual processing logic. You don’t need to manually extract the elements
@@ -984,11 +988,9 @@ method.
 
static class ComputeWordLengthFn extends DoFn<String, Integer> {
   @ProcessElement
-  public void processElement(ProcessContext c) {
-// Get the input element from ProcessContext.
-String word = c.element();
-// Use ProcessContext.output to emit the output 
element.
-c.output(word.length());
+  public void processElement(@Element String word, OutputReceiver<Integer> out) {
+// Use OutputReceiver.output to emit the output 
element.
+out.output(word.length());
   }
 }
 
@@ -1002,8 +1004,8 @@ method.
 
 
   Note: If the elements in your input PCollection are key/value pairs, you
-can access the key or value by using ProcessContext.element().getKey() or
-ProcessContext.element().getValue(), 
respectively.
+can access the key or value by using element.getKey() or
+element.getValue(), respectively.
 
 
 A given DoFn instance generally gets 
invoked one or more times to process some
@@ -1020,10 +1022,10 @@ following requirements:
 
 
   You should not in any way modify an element returned by
-ProcessContext.element() or ProcessContext.sideInput() (the incoming
+the @Element annotation or ProcessContext.sideInput() (the incoming
 elements from the input collection).
-  Once you output a value using ProcessContext.output() or
-ProcessContext.sideOutput(), you should 
not modify that value in any way.
+  Once you output a value using OutputReceiver.output() you should not modify
+that value in any way.
 
 
 4.2.1.3. Lightweight DoFns and other 
abstractions
@@ -1047,8 +1049,8 @@ elements from the input collection).
   "ComputeWordLengths",  
   // the transform name
  ParDo.of(new DoFn<String, Integer>() {  
  // a DoFn as an anonymous inner class instance
   @ProcessElement
-  public void processElement(ProcessContext c) {
-c.output(c.element().length());
+  public void processElement(@Element String word, OutputReceiver<Integer> out) {
+out.output(word.length());
   }
 }));
 
@@ -1756,7 +1758,9 @@ state dependency in your user code.
 Beam SDKs are not thread-safe.
 
 
-In addition, it’s recommended that you make your function object 
idempotent.
+In addition, it’s recommended that you make your function object 
idempotent.
+Non-idempotent functions are supported by Beam, but require additional
+thought to ensure correctness when there are external side effects.
 
 
   Note: These requirements apply to subclasses of DoFn (a function object
@@ -1802,10 +1806,10 @@ function may be accessed from different threads.
 
 

[beam-site] branch asf-site updated (6a90c60 -> 2bf7d7d)

2018-08-20 Thread mergebot-role
This is an automated email from the ASF dual-hosted git repository.

mergebot-role pushed a change to branch asf-site
in repository https://gitbox.apache.org/repos/asf/beam-site.git.


from 6a90c60  Prepare repository for deployment.
 add 277cf87  Update programming guide to suggest using NewDoFn approach to 
parameters.
 add 18f6542  Address comments.
 add 5a7b4f9  Add compatibility warning.
 add 3b4c7f0  Fix language tab tags
 add 0ea9291  This closes #458
 new 2bf7d7d  Prepare repository for deployment.

The 1 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:
 content/documentation/programming-guide/index.html | 125 ++--
 src/documentation/programming-guide.md | 131 +++--
 2 files changed, 183 insertions(+), 73 deletions(-)



[jira] [Work logged] (BEAM-4723) Enhance Datetime*Expression Datetime Type

2018-08-20 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4723?focusedWorklogId=136258&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-136258
 ]

ASF GitHub Bot logged work on BEAM-4723:


Author: ASF GitHub Bot
Created on: 20/Aug/18 20:22
Start Date: 20/Aug/18 20:22
Worklog Time Spent: 10m 
  Work Description: amaliujia commented on issue #5926: [BEAM-4723] [SQL] 
Support datetime type minus time interval
URL: https://github.com/apache/beam/pull/5926#issuecomment-414450448
 
 
   run java precommit




Issue Time Tracking
---

Worklog Id: (was: 136258)
Time Spent: 5h 40m  (was: 5.5h)

> Enhance Datetime*Expression Datetime Type
> -
>
> Key: BEAM-4723
> URL: https://issues.apache.org/jira/browse/BEAM-4723
> Project: Beam
>  Issue Type: Bug
>  Components: dsl-sql
>Reporter: Kai Jiang
>Assignee: Kai Jiang
>Priority: Major
>  Time Spent: 5h 40m
>  Remaining Estimate: 0h
>
> Datetime*Expression only supports timestamp type for first operand now. We 
> should let it accept all Datetime_Types





[beam-site] 01/05: Update programming guide to suggest using NewDoFn approach to parameters.

2018-08-20 Thread mergebot-role
This is an automated email from the ASF dual-hosted git repository.

mergebot-role pushed a commit to branch mergebot
in repository https://gitbox.apache.org/repos/asf/beam-site.git

commit 277cf87efe02b925e4c893716e33c35ba1b13c79
Author: Reuven Lax 
AuthorDate: Tue Jun 5 14:49:46 2018 +0300

Update programming guide to suggest using NewDoFn approach to parameters.
---
 src/documentation/programming-guide.md | 129 +++--
 1 file changed, 92 insertions(+), 37 deletions(-)

diff --git a/src/documentation/programming-guide.md 
b/src/documentation/programming-guide.md
index 4b55fcb..aca4712 100644
--- a/src/documentation/programming-guide.md
+++ b/src/documentation/programming-guide.md
@@ -139,8 +139,8 @@ potentially include information such as your project ID or 
a location for
 storing files.
 
 When you run the pipeline on a runner of your choice, a copy of the
-PipelineOptions will be available to your code. For example, you can read
-PipelineOptions from a DoFn's Context.
+PipelineOptions will be available to your code. For example, if you add a 
PipelineOptions parameter
+to a DoFn's `@ProcessElement` method, it will be populated by the system.
 
  2.1.1. Setting PipelineOptions from command-line arguments 
{#pipeline-options-cli}
 
@@ -623,9 +623,11 @@ static class ComputeWordLengthFn extends DoFn<String, Integer> { ... }
 Inside your `DoFn` subclass, you'll write a method annotated with
 `@ProcessElement` where you provide the actual processing logic. You don't need
 to manually extract the elements from the input collection; the Beam SDKs 
handle
-that for you. Your `@ProcessElement` method should accept an object of type
-`ProcessContext`. The `ProcessContext` object gives you access to an input
-element and a method for emitting an output element:
+that for you. Your `@ProcessElement` method should accept a parameter tagged with
+`@Element`, which will be populated with the input element. In order to output
+elements, the method can also take a parameter of type `OutputReceiver` which
+provides a method for emitting elements. The parameter types must match the 
input
+and output types of your `DoFn` or the framework will raise an error.
 
 {:.language-py}
 Inside your `DoFn` subclass, you'll write a method `process` where you provide
@@ -638,11 +640,9 @@ method.
 ```java
static class ComputeWordLengthFn extends DoFn<String, Integer> {
   @ProcessElement
-  public void processElement(ProcessContext c) {
-// Get the input element from ProcessContext.
-String word = c.element();
-// Use ProcessContext.output to emit the output element.
-c.output(word.length());
+  public void processElement(@Element String word, OutputReceiver<Integer> out) {
+// Use OutputReceiver.output to emit the output element.
+out.output(word.length());
   }
 }
 ```
@@ -653,8 +653,8 @@ static class ComputeWordLengthFn extends DoFn<String, Integer> {
 
 {:.language-java}
 > **Note:** If the elements in your input `PCollection` are key/value pairs, 
 > you
-> can access the key or value by using `ProcessContext.element().getKey()` or
-> `ProcessContext.element().getValue()`, respectively.
+> can access the key or value by using `element.getKey()` or
+> `element.getValue()`, respectively.
 
 A given `DoFn` instance generally gets invoked one or more times to process some
 arbitrary bundle of elements. However, Beam doesn't guarantee an exact number of
@@ -670,10 +670,10 @@ following requirements:
 
 {:.language-java}
 * You should not in any way modify an element returned by
-  `ProcessContext.element()` or `ProcessContext.sideInput()` (the incoming
+  the `@Element` annotation or `ProcessContext.sideInput()` (the incoming
   elements from the input collection).
-* Once you output a value using `ProcessContext.output()` or
-  `ProcessContext.sideOutput()`, you should not modify that value in any way.
+* Once you output a value using `OutputReceiver.output()` you should not modify
+  that value in any way.
 
 # 4.2.1.3. Lightweight DoFns and other abstractions {#lightweight-dofns}
 
@@ -697,8 +697,8 @@ PCollection<Integer> wordLengths = words.apply(
  ParDo.of(new DoFn<String, Integer>() {// a DoFn as an anonymous inner class instance
   @ProcessElement
-  public void processElement(ProcessContext c) {
-c.output(c.element().length());
+  public void processElement(@Element String word, OutputReceiver out) {
+out.output(word.length());
   }
 }));
 ```
@@ -1043,7 +1043,7 @@ you need the combining strategy to change based on the key (for example, MIN for
 some users and MAX for other users), you can define a `KeyedCombineFn` to access
 the key within the combining strategy.
 
-# 4.2.4.3. Combining a PCollection into a single value {#combining-pcollection}
+# 4.2. Combining a PCollection into a single value {#combining-pcollection}
 
 Use the global combine to transform all of the elements in a given `PCollection`
 into a single value, 

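The diff above migrates the guide's examples from the `ProcessContext`-based signature to an `@Element`-annotated parameter plus an `OutputReceiver`. The shape of that new signature can be sketched without a Beam dependency; the `OutputReceiver` interface below is a simplified stand-in for Beam's `DoFn.OutputReceiver<T>`, not the real API:

```java
import java.util.ArrayList;
import java.util.List;

public class DoFnStyleDemo {
    // Simplified stand-in for Beam's DoFn.OutputReceiver<T>.
    interface OutputReceiver<T> {
        void output(T value);
    }

    // Mirrors the new-style @ProcessElement signature: the input element
    // arrives as a plain typed parameter and outputs go through the receiver,
    // instead of both flowing through a ProcessContext.
    static void processElement(String word, OutputReceiver<Integer> out) {
        out.output(word.length());
    }

    public static void main(String[] args) {
        List<Integer> lengths = new ArrayList<>();
        for (String word : new String[] {"hello", "beam"}) {
            // A method reference satisfies the single-method receiver interface.
            processElement(word, lengths::add);
        }
        System.out.println(lengths); // prints [5, 4]
    }
}
```

Because the element and receiver are ordinary parameters, such a method is directly unit-testable without constructing any context object, which is part of the motivation for the API change described in the diff.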
[beam-site] 04/05: Fix language tab tags

2018-08-20 Thread mergebot-role
This is an automated email from the ASF dual-hosted git repository.

mergebot-role pushed a commit to branch mergebot
in repository https://gitbox.apache.org/repos/asf/beam-site.git

commit 3b4c7f011e0196e9a4e0f1e90d28292ee850623a
Author: Melissa Pashniak 
AuthorDate: Mon Aug 20 12:16:39 2018 -0700

Fix language tab tags
---
 src/documentation/programming-guide.md | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/src/documentation/programming-guide.md b/src/documentation/programming-guide.md
index 15762b1..bd7c304 100644
--- a/src/documentation/programming-guide.md
+++ b/src/documentation/programming-guide.md
@@ -1519,7 +1519,9 @@ together.
 {% github_sample /apache/beam/blob/master/sdks/python/apache_beam/examples/snippets/snippets_test.py
 tag:model_pardo_with_undeclared_outputs
 %}```
 
+{:.language-java}
  4.5.3. Accessing additional parameters in your DoFn {#other-dofn-parameters}
+
 {:.language-java}
 In addition to the element and the `OutputReceiver`, Beam will populate other parameters to your DoFn's `@ProcessElement` method.
 Any combination of these parameters can be added to your process method in any order.
@@ -1569,7 +1571,7 @@ The `PipelineOptions` for the current pipeline can always be accessed in a proce
   }})
 ```
 
-{::.language-java}
+{:.language-java}
 `@OnTimer` methods can also access many of these parameters. Timestamp, window, `PipelineOptions`, `OutputReceiver`, and
 `MultiOutputReceiver` parameters can all be accessed in an `@OnTimer` method. In addition, an `@OnTimer` method can take
 a parameter of type `TimeDomain` which tells whether the timer is based on event time or processing time.

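For context, the `{:.language-java}` lines touched in this commit are kramdown block attributes; the Beam site uses these classes to scope a block to one language tab, and the malformed `{::.language-java}` form would not be applied. An illustrative fragment (a sketch of the convention, not text from the guide itself):

```
{:.language-java}
This paragraph is rendered only when the reader selects the Java tab.

{:.language-py}
This paragraph is rendered only when the reader selects the Python tab.
```

This is also why the commit adds a blank line after the 4.5.3 heading: a kramdown attribute list attaches to the adjacent block, so the heading must be separated from the `{:.language-java}` paragraph that follows it.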


[beam-site] 02/05: Address comments.

2018-08-20 Thread mergebot-role
This is an automated email from the ASF dual-hosted git repository.

mergebot-role pushed a commit to branch mergebot
in repository https://gitbox.apache.org/repos/asf/beam-site.git

commit 18f65424a5bbda1b209c3a3ca8f152a14c2484fa
Author: Reuven Lax 
AuthorDate: Mon Jun 18 07:08:17 2018 -0700

Address comments.
---
 src/documentation/programming-guide.md | 16 
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/src/documentation/programming-guide.md b/src/documentation/programming-guide.md
index aca4712..ca96472 100644
--- a/src/documentation/programming-guide.md
+++ b/src/documentation/programming-guide.md
@@ -623,7 +623,7 @@ static class ComputeWordLengthFn extends DoFn<String, Integer> { ... }
 Inside your `DoFn` subclass, you'll write a method annotated with
 `@ProcessElement` where you provide the actual processing logic. You don't need
 to manually extract the elements from the input collection; the Beam SDKs handle
-that for you. Your `@ProcessElement` method should accept parameter tagged with
+that for you. Your `@ProcessElement` method should accept a parameter tagged with
 `@Element`, which will be populated with the input element. In order to output
 elements, the method can also take a parameter of type `OutputReceiver` which
 provides a method for emitting elements. The parameter types must match the input
@@ -697,7 +697,7 @@ PCollection<Integer> wordLengths = words.apply(
   "ComputeWordLengths", // the transform name
  ParDo.of(new DoFn<String, Integer>() {// a DoFn as an anonymous inner class instance
   @ProcessElement
-  public void processElement(@Element String word, OutputReceiver out) {
+  public void processElement(@Element String word, OutputReceiver<Integer> out) {
 out.output(word.length());
   }
 }));
@@ -1043,7 +1043,7 @@ you need the combining strategy to change based on the key (for example, MIN for
 some users and MAX for other users), you can define a `KeyedCombineFn` to access
 the key within the combining strategy.
 
-# 4.2. Combining a PCollection into a single value {#combining-pcollection}
+# 4.2.4.3. Combining a PCollection into a single value {#combining-pcollection}
 
 Use the global combine to transform all of the elements in a given `PCollection`
 into a single value, represented in your pipeline as a new `PCollection`
@@ -1264,8 +1264,8 @@ In general, your user code must fulfill at least these requirements:
   Beam SDKs are not thread-safe*.
 
 In addition, it's recommended that you make your function object **idempotent**.
-Non-idempotent functions are supported by Beam, but xN require additional
-thought.
+Non-idempotent functions are supported by Beam, but require additional
+thought to ensure correctness when there are external side effects.
 
 > **Note:** These requirements apply to subclasses of `DoFn` (a function object
 > used with the [ParDo](#pardo) transform), `CombineFn` (a function object used
@@ -1307,7 +1307,7 @@ function may be accessed from different threads.
 
 It's recommended that you make your function object idempotent--that is, that it
 can be repeated or retried as often as necessary without causing unintended side
-effects. Non-idempotent functions are supported, however the beam model provides
+effects. Non-idempotent functions are supported, however the Beam model provides
 no guarantees as to the number of times your user code might be invoked or retried;
 as such, keeping your function object idempotent keeps your pipeline's output
 deterministic, and your transforms' behavior more predictable and easier to debug.
@@ -1517,7 +1517,7 @@ together.
 {% github_sample /apache/beam/blob/master/sdks/python/apache_beam/examples/snippets/snippets_test.py
 tag:model_pardo_with_undeclared_outputs
 %}```
 
- 4.5.3. Accessing othe parameters in your DoFn {#other-dofn-parameters}
+ 4.5.3. Accessing additional parameters in your DoFn {#other-dofn-parameters}
 {:.language-java}
 In addition to the element and the `OutputReceiver`, Beam will populate other parameters to your DoFn's `@ProcessElement` method.
 Any combination of these parameters can be added to your process method in any order.
@@ -1560,7 +1560,7 @@ you can determine whether this is an early or a late firing, and how many times
 
 {:.language-java}
 **PipelineOptions:**
-The `PipelineOptions` for the current pipeline can always be accessed in a process method by adding it as a paramet:
+The `PipelineOptions` for the current pipeline can always be accessed in a process method by adding it as a parameter:
 ```java
 .of(new DoFn() {
 public void processElement(@Element String word, PipelineOptions options) {

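The idempotency wording this commit refines — Beam may invoke or retry user code any number of times, so non-idempotent functions "require additional thought to ensure correctness" — can be illustrated with a stand-alone sketch (plain Java, no Beam API; the method and class names here are invented for illustration):

```java
import java.util.concurrent.atomic.AtomicInteger;

public class IdempotencyDemo {
    private static final AtomicInteger invocations = new AtomicInteger();

    // Idempotent: the result depends only on the input element, so a
    // runner retrying the same bundle produces identical output.
    static int wordLength(String word) {
        return word.length();
    }

    // Not idempotent: the result changes on every call because it folds in
    // hidden mutable state, so retries make pipeline output nondeterministic.
    static int taggedLength(String word) {
        return word.length() * 100 + invocations.incrementAndGet();
    }

    public static void main(String[] args) {
        // Simulate a runner processing the same element twice (e.g. a retry
        // after a worker failure).
        System.out.println(wordLength("beam") == wordLength("beam"));     // true
        System.out.println(taggedLength("beam") == taggedLength("beam")); // false
    }
}
```

The same reasoning applies to external side effects (writes to a database, RPCs): if the runner re-invokes the function, an idempotent design keeps the observable result unchanged.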