Build failed in Jenkins: beam_PostCommit_Python_ValidatesRunner_Dataflow #1492

2018-05-01 Thread Apache Jenkins Server
See 


--
Started by timer
[EnvInject] - Loading node environment variables.
Building remotely on beam23 (beam) in workspace 

 > git rev-parse --is-inside-work-tree # timeout=10
Fetching changes from the remote Git repository
 > git config remote.origin.url https://github.com/apache/beam.git # timeout=10
Fetching upstream changes from https://github.com/apache/beam.git
 > git --version # timeout=10
 > git fetch --tags --progress https://github.com/apache/beam.git 
 > +refs/heads/*:refs/remotes/origin/* 
 > +refs/pull/${ghprbPullId}/*:refs/remotes/origin/pr/${ghprbPullId}/*
 > git rev-parse origin/master^{commit} # timeout=10
Checking out Revision 96fb76560de44abcdcb915da543df7d45f6f6b3b (origin/master)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f 96fb76560de44abcdcb915da543df7d45f6f6b3b
Commit message: "[BEAM-4131] Include SDK into Python SDK harness container."
 > git rev-list --no-walk 96fb76560de44abcdcb915da543df7d45f6f6b3b # timeout=10
Cleaning workspace
 > git rev-parse --verify HEAD # timeout=10
Resetting working tree
 > git reset --hard # timeout=10
 > git clean -fdx # timeout=10
[EnvInject] - Executing scripts and injecting environment variables after the 
SCM step.
[EnvInject] - Injecting as environment variables the properties content 
SPARK_LOCAL_IP=127.0.0.1

[EnvInject] - Variables injected successfully.
[beam_PostCommit_Python_ValidatesRunner_Dataflow] $ /bin/bash -xe 
/tmp/jenkins6175827015196876227.sh
+ cd src
+ bash sdks/python/run_validatesrunner.sh

# pip install --user installation location.
LOCAL_PATH=$HOME/.local/bin/

# INFRA does not install virtualenv
pip install virtualenv --user
Requirement already satisfied: virtualenv in /usr/lib/python2.7/dist-packages 
(15.0.1)

# Virtualenv for the rest of the script to run setup & e2e tests
${LOCAL_PATH}/virtualenv sdks/python
sdks/python/run_validatesrunner.sh: line 38: 
/home/jenkins/.local/bin//virtualenv: No such file or directory
Build step 'Execute shell' marked build as failure
Not sending mail to unregistered user sweg...@google.com
Not sending mail to unregistered user pger...@us.ibm.com
Not sending mail to unregistered user aal...@gmail.com
Not sending mail to unregistered user sid...@google.com
Not sending mail to unregistered user katarzyna.kucharc...@polidea.com
Not sending mail to unregistered user ankurgoe...@gmail.com
Not sending mail to unregistered user hero...@google.com
Not sending mail to unregistered user ro...@frantil.com
Not sending mail to unregistered user w...@google.com
Not sending mail to unregistered user szewi...@gmail.com
Not sending mail to unregistered user git...@alasdairhodge.co.uk
Not sending mail to unregistered user ke...@google.com
Not sending mail to unregistered user ekirpic...@gmail.com
Not sending mail to unregistered user aljoscha.kret...@gmail.com
Not sending mail to unregistered user apill...@google.com
Not sending mail to unregistered user 
re...@relax-macbookpro2.roam.corp.google.com
Not sending mail to unregistered user kirpic...@google.com
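
[Editor's note] The failure above happens because pip reports virtualenv as
"already satisfied" in the system site-packages, so nothing is ever installed
under $HOME/.local/bin, yet the script hard-codes that path. A more defensive
sketch (the function name is ours, not the script's) would prefer whatever
virtualenv is already on PATH and fall back to the --user location only after
actually installing there:

```shell
# find_virtualenv: print the virtualenv executable to use. Prefer one already
# on PATH (e.g. the system package that satisfied pip above); otherwise do the
# --user install and use the well-known --user bin directory.
find_virtualenv() {
  if command -v virtualenv >/dev/null 2>&1; then
    command -v virtualenv
  else
    pip install --user virtualenv >/dev/null
    echo "$HOME/.local/bin/virtualenv"
  fi
}
```

The failing line could then read `"$(find_virtualenv)" sdks/python` instead of
`${LOCAL_PATH}/virtualenv sdks/python`.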


[jira] [Assigned] (BEAM-3714) JdbcIO.read() should create a forward-only, read-only result set

2018-05-01 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/BEAM-3714?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jean-Baptiste Onofré reassigned BEAM-3714:
--

Assignee: Jean-Baptiste Onofré  (was: Innocent)

> JdbcIO.read() should create a forward-only, read-only result set
> 
>
> Key: BEAM-3714
> URL: https://issues.apache.org/jira/browse/BEAM-3714
> Project: Beam
>  Issue Type: Bug
>  Components: io-java-jdbc
>Reporter: Eugene Kirpichov
>Assignee: Jean-Baptiste Onofré
>Priority: Major
> Fix For: 2.5.0
>
>  Time Spent: 4h
>  Remaining Estimate: 0h
>
> [https://stackoverflow.com/questions/48784889/streaming-data-from-cloudsql-into-dataflow/48819934#48819934]
>  - a user is trying to load a large table from MySQL, and the MySQL JDBC 
> driver requires special measures when loading large result sets.
> JdbcIO currently calls simply "connection.prepareStatement(query)" 
> https://github.com/apache/beam/blob/bb8c12c4956cbe3c6f2e57113e7c0ce2a5c05009/sdks/java/io/jdbc/src/main/java/org/apache/beam/sdk/io/jdbc/JdbcIO.java#L508
>  - it should specify type TYPE_FORWARD_ONLY and concurrency CONCUR_READ_ONLY 
> - these values should always be used.
> Seems that different databases have different requirements for streaming 
> result sets.
> E.g. MySQL requires setting fetch size; PostgreSQL says "The Connection must 
> not be in autocommit mode." 
> https://jdbc.postgresql.org/documentation/head/query.html#query-with-cursor . 
> Oracle, I think, doesn't have any special requirements, but I'm not certain. 
> Fetch size should probably still be set to a reasonably large value.
> Seems that the common denominator of these requirements is: set fetch size to 
> a reasonably large but not maximum value; disable autocommit (there's nothing 
> to commit in read() anyway).
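
[Editor's note] The common denominator described above can be sketched as
follows. This is an illustration, not the actual Beam patch; the class and
method names are ours, and the non-MySQL fetch size of 10,000 is an arbitrary
"reasonably large" placeholder. MySQL's Connector/J is the odd one out: it
only streams rows when the fetch size is Integer.MIN_VALUE.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

public class StreamingJdbc {

  // MySQL Connector/J streams only when fetchSize == Integer.MIN_VALUE;
  // other drivers take an ordinary positive batch size.
  static int streamingFetchSize(String jdbcUrl) {
    return jdbcUrl.startsWith("jdbc:mysql:") ? Integer.MIN_VALUE : 10_000;
  }

  static PreparedStatement prepareStreaming(Connection conn, String query)
      throws SQLException {
    // PostgreSQL only uses a cursor (and thus streams) when autocommit is
    // off; there is nothing to commit in a read anyway.
    conn.setAutoCommit(false);
    // Forward-only, read-only: the values the issue says should always be
    // used, instead of the bare connection.prepareStatement(query).
    PreparedStatement stmt =
        conn.prepareStatement(
            query, ResultSet.TYPE_FORWARD_ONLY, ResultSet.CONCUR_READ_ONLY);
    stmt.setFetchSize(streamingFetchSize(conn.getMetaData().getURL()));
    return stmt;
  }
}
```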



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Build failed in Jenkins: beam_PerformanceTests_Python #1215

2018-05-01 Thread Apache Jenkins Server
See 


--
[...truncated 4.44 KB...]
  Using cached 
https://files.pythonhosted.org/packages/a2/71/8273a7eeed0aff6a854237ab5453bc9aa67deb49df4832801c21f0ff3782/contextlib2-0.5.5-py2.py3-none-any.whl
Collecting pywinrm (from -r PerfKitBenchmarker/requirements.txt (line 25))
  Using cached 
https://files.pythonhosted.org/packages/0d/12/13a3117bbd2230043aa32dcfa2198c33269665eaa1a8fa26174ce49b338f/pywinrm-0.3.0-py2.py3-none-any.whl
Requirement already satisfied: six in /usr/local/lib/python2.7/dist-packages 
(from absl-py->-r PerfKitBenchmarker/requirements.txt (line 14)) (1.11.0)
Requirement already satisfied: MarkupSafe>=0.23 in 
/usr/local/lib/python2.7/dist-packages (from jinja2>=2.7->-r 
PerfKitBenchmarker/requirements.txt (line 15)) (1.0)
Collecting colorama; extra == "windows" (from colorlog[windows]==2.6.0->-r 
PerfKitBenchmarker/requirements.txt (line 17))
  Using cached 
https://files.pythonhosted.org/packages/db/c8/7dcf9dbcb22429512708fe3a547f8b6101c0d02137acbd892505aee57adf/colorama-0.3.9-py2.py3-none-any.whl
Collecting requests-ntlm>=0.3.0 (from pywinrm->-r 
PerfKitBenchmarker/requirements.txt (line 25))
  Using cached 
https://files.pythonhosted.org/packages/03/4b/8b9a1afde8072c4d5710d9fa91433d504325821b038e00237dc8d6d833dc/requests_ntlm-1.1.0-py2.py3-none-any.whl
Requirement already satisfied: requests>=2.9.1 in 
/usr/local/lib/python2.7/dist-packages (from pywinrm->-r 
PerfKitBenchmarker/requirements.txt (line 25)) (2.18.4)
Collecting xmltodict (from pywinrm->-r PerfKitBenchmarker/requirements.txt 
(line 25))
  Using cached 
https://files.pythonhosted.org/packages/42/a9/7e99652c6bc619d19d58cdd8c47560730eb5825d43a7e25db2e1d776ceb7/xmltodict-0.11.0-py2.py3-none-any.whl
Requirement already satisfied: cryptography>=1.3 in 
/usr/local/lib/python2.7/dist-packages (from requests-ntlm>=0.3.0->pywinrm->-r 
PerfKitBenchmarker/requirements.txt (line 25)) (2.2.2)
Collecting ntlm-auth>=1.0.2 (from requests-ntlm>=0.3.0->pywinrm->-r 
PerfKitBenchmarker/requirements.txt (line 25))
  Using cached 
https://files.pythonhosted.org/packages/69/bc/230987c0dc22c763529330b2e669dbdba374d6a10c1f61232274184731be/ntlm_auth-1.1.0-py2.py3-none-any.whl
Requirement already satisfied: certifi>=2017.4.17 in 
/usr/local/lib/python2.7/dist-packages (from requests>=2.9.1->pywinrm->-r 
PerfKitBenchmarker/requirements.txt (line 25)) (2018.4.16)
Requirement already satisfied: chardet<3.1.0,>=3.0.2 in 
/usr/local/lib/python2.7/dist-packages (from requests>=2.9.1->pywinrm->-r 
PerfKitBenchmarker/requirements.txt (line 25)) (3.0.4)
Requirement already satisfied: idna<2.7,>=2.5 in 
/usr/local/lib/python2.7/dist-packages (from requests>=2.9.1->pywinrm->-r 
PerfKitBenchmarker/requirements.txt (line 25)) (2.6)
Requirement already satisfied: urllib3<1.23,>=1.21.1 in 
/usr/local/lib/python2.7/dist-packages (from requests>=2.9.1->pywinrm->-r 
PerfKitBenchmarker/requirements.txt (line 25)) (1.22)
Requirement already satisfied: cffi>=1.7; platform_python_implementation != 
"PyPy" in /usr/local/lib/python2.7/dist-packages (from 
cryptography>=1.3->requests-ntlm>=0.3.0->pywinrm->-r 
PerfKitBenchmarker/requirements.txt (line 25)) (1.11.5)
Requirement already satisfied: enum34; python_version < "3" in 
/usr/local/lib/python2.7/dist-packages (from 
cryptography>=1.3->requests-ntlm>=0.3.0->pywinrm->-r 
PerfKitBenchmarker/requirements.txt (line 25)) (1.1.6)
Requirement already satisfied: asn1crypto>=0.21.0 in 
/usr/local/lib/python2.7/dist-packages (from 
cryptography>=1.3->requests-ntlm>=0.3.0->pywinrm->-r 
PerfKitBenchmarker/requirements.txt (line 25)) (0.24.0)
Requirement already satisfied: ipaddress; python_version < "3" in 
/usr/local/lib/python2.7/dist-packages (from 
cryptography>=1.3->requests-ntlm>=0.3.0->pywinrm->-r 
PerfKitBenchmarker/requirements.txt (line 25)) (1.0.22)
Requirement already satisfied: pycparser in 
/usr/local/lib/python2.7/dist-packages (from cffi>=1.7; 
platform_python_implementation != 
"PyPy"->cryptography>=1.3->requests-ntlm>=0.3.0->pywinrm->-r 
PerfKitBenchmarker/requirements.txt (line 25)) (2.18)
Installing collected packages: absl-py, colorama, colorlog, blinker, futures, 
pint, numpy, contextlib2, ntlm-auth, requests-ntlm, xmltodict, pywinrm
Successfully installed absl-py-0.2.0 blinker-1.4 colorama-0.3.9 colorlog-2.6.0 
contextlib2-0.5.5 futures-3.2.0 ntlm-auth-1.1.0 numpy-1.13.3 pint-0.8.1 
pywinrm-0.3.0 requests-ntlm-1.1.0 xmltodict-0.11.0
[beam_PerformanceTests_Python] $ /bin/bash -xe 
/tmp/jenkins4799797187760619386.sh
+ .env/bin/pip install -e 'src/sdks/python/[gcp,test]'
Obtaining 
file://
Collecting avro<2.0.0,>=1.8.1 (from apache-beam==2.5.0.dev0)
Requirement already satisfied: crcmod<2.0,>=1.7 in 
/usr/lib/python2.7/dist-packages (from apache-beam==2.5.0.dev0) (1.7)
Collecting dill==0.2.6 (from apache-beam==2.5.

Jenkins build is back to normal : beam_PerformanceTests_Compressed_TextIOIT_HDFS #116

2018-05-01 Thread Apache Jenkins Server
See 




Jenkins build is back to normal : beam_PerformanceTests_XmlIOIT_HDFS #115

2018-05-01 Thread Apache Jenkins Server
See 




Build failed in Jenkins: beam_PerformanceTests_MongoDBIO_IT #117

2018-05-01 Thread Apache Jenkins Server
See 


--
[...truncated 874.54 KB...]
at 
com.mongodb.MongoCollectionImpl.insertMany(MongoCollectionImpl.java:323)
at 
com.mongodb.MongoCollectionImpl.insertMany(MongoCollectionImpl.java:311)
at 
org.apache.beam.sdk.io.mongodb.MongoDbIO$Write$WriteFn.flush(MongoDbIO.java:667)
at 
org.apache.beam.sdk.io.mongodb.MongoDbIO$Write$WriteFn.processElement(MongoDbIO.java:652)
com.mongodb.MongoTimeoutException: Timed out after 3 ms while waiting for a 
server that matches WritableServerSelector. Client view of cluster state is 
{type=UNKNOWN, servers=[{address=35.226.49.216:27017, type=UNKNOWN, 
state=CONNECTING, exception={com.mongodb.MongoSocketOpenException: Exception 
opening socket}, caused by {java.net.SocketTimeoutException: connect timed 
out}}]
at 
com.mongodb.connection.BaseCluster.createTimeoutException(BaseCluster.java:369)
at com.mongodb.connection.BaseCluster.selectServer(BaseCluster.java:101)
at 
com.mongodb.binding.ClusterBinding$ClusterBindingConnectionSource.(ClusterBinding.java:75)
at 
com.mongodb.binding.ClusterBinding$ClusterBindingConnectionSource.(ClusterBinding.java:71)
at 
com.mongodb.binding.ClusterBinding.getWriteConnectionSource(ClusterBinding.java:68)
at 
com.mongodb.operation.OperationHelper.withConnection(OperationHelper.java:219)
at 
com.mongodb.operation.MixedBulkWriteOperation.execute(MixedBulkWriteOperation.java:168)
at 
com.mongodb.operation.MixedBulkWriteOperation.execute(MixedBulkWriteOperation.java:74)
at com.mongodb.Mongo.execute(Mongo.java:781)
at com.mongodb.Mongo$2.execute(Mongo.java:764)
at 
com.mongodb.MongoCollectionImpl.insertMany(MongoCollectionImpl.java:323)
at 
com.mongodb.MongoCollectionImpl.insertMany(MongoCollectionImpl.java:311)
at 
org.apache.beam.sdk.io.mongodb.MongoDbIO$Write$WriteFn.flush(MongoDbIO.java:667)
at 
org.apache.beam.sdk.io.mongodb.MongoDbIO$Write$WriteFn.processElement(MongoDbIO.java:652)

Build failed in Jenkins: beam_PerformanceTests_TextIOIT_HDFS #122

2018-05-01 Thread Apache Jenkins Server
See 


--
[...truncated 1.11 MB...]
at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1703)
at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1638)
at 
org.apache.hadoop.hdfs.DistributedFileSystem$7.doCall(DistributedFileSystem.java:448)
at 
org.apache.hadoop.hdfs.DistributedFileSystem$7.doCall(DistributedFileSystem.java:444)
at 
org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
at 
org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:459)
at 
org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:387)
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:911)
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:892)
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:789)
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:778)
at 
org.apache.beam.sdk.io.hdfs.HadoopFileSystem.create(HadoopFileSystem.java:109)
at 
org.apache.beam.sdk.io.hdfs.HadoopFileSystem.create(HadoopFileSystem.java:68)
at org.apache.beam.sdk.io.FileSystems.create(FileSystems.java:249)
at org.apache.beam.sdk.io.FileSystems.create(FileSystems.java:236)
at 
org.apache.beam.sdk.io.FileBasedSink$Writer.open(FileBasedSink.java:923)
at 
org.apache.beam.sdk.io.WriteFiles$WriteUnshardedTempFilesWithSpillingFn.processElement(WriteFiles.java:503)
Caused by: java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at 
sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
at 
org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:531)
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:495)
at 
org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:614)
at 
org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:712)
at org.apache.hadoop.ipc.Client$Connection.access$2900(Client.java:375)
at org.apache.hadoop.ipc.Client.getConnection(Client.java:1528)
at org.apache.hadoop.ipc.Client.call(Client.java:1451)
at org.apache.hadoop.ipc.Client.call(Client.java:1412)
at 
org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:229)
at com.sun.proxy.$Proxy62.create(Unknown Source)
at 
org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.create(ClientNamenodeProtocolTranslatorPB.java:296)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:191)
at 
org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
at com.sun.proxy.$Proxy63.create(Unknown Source)
at 
org.apache.hadoop.hdfs.DFSOutputStream.newStreamForCreate(DFSOutputStream.java:1623)
at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1703)
at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1638)
at 
org.apache.hadoop.hdfs.DistributedFileSystem$7.doCall(DistributedFileSystem.java:448)
at 
org.apache.hadoop.hdfs.DistributedFileSystem$7.doCall(DistributedFileSystem.java:444)
at 
org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
at 
org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:459)
at 
org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:387)
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:911)
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:892)
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:789)
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:778)
at 
org.apache.beam.sdk.io.hdfs.HadoopFileSystem.create(HadoopFileSystem.java:109)
at 
org.apache.beam.sdk.io.hdfs.HadoopFileSystem.create(HadoopFileSystem.java:68)
at org.apache.beam.sdk.io.FileSystems.create(FileSystems.java:249)
at org.apache.beam.sdk.io.FileSystems.create(FileSystems.java:236)
at 
org.apache.beam.sdk.io.FileBasedSink$Writer.open(FileBasedSink.java:923)
at 
org.apache.beam.sdk.io.WriteFiles$WriteUnshardedTempFilesWithSpillingFn.processElement(WriteFiles.java:503)
at 
org.apache.beam.sdk.io.WriteFiles$WriteUnshardedTempFilesWithSpillingFn$Do

Build failed in Jenkins: beam_PerformanceTests_AvroIOIT_HDFS #116

2018-05-01 Thread Apache Jenkins Server
See 


--
[...truncated 1.33 MB...]
at 
org.apache.beam.sdk.io.hdfs.HadoopFileSystem.create(HadoopFileSystem.java:68)
at org.apache.beam.sdk.io.FileSystems.create(FileSystems.java:249)
at org.apache.beam.sdk.io.FileSystems.create(FileSystems.java:236)
at 
org.apache.beam.sdk.io.FileBasedSink$Writer.open(FileBasedSink.java:923)
at 
org.apache.beam.sdk.io.WriteFiles$WriteUnshardedTempFilesWithSpillingFn.processElement(WriteFiles.java:503)
Caused by: java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at 
sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
at 
org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:531)
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:495)
at 
org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:614)
at 
org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:712)
at org.apache.hadoop.ipc.Client$Connection.access$2900(Client.java:375)
at org.apache.hadoop.ipc.Client.getConnection(Client.java:1528)
at org.apache.hadoop.ipc.Client.call(Client.java:1451)
at org.apache.hadoop.ipc.Client.call(Client.java:1412)
at 
org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:229)
at com.sun.proxy.$Proxy62.create(Unknown Source)
at 
org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.create(ClientNamenodeProtocolTranslatorPB.java:296)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:191)
at 
org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
at com.sun.proxy.$Proxy63.create(Unknown Source)
at 
org.apache.hadoop.hdfs.DFSOutputStream.newStreamForCreate(DFSOutputStream.java:1623)
at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1703)
at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1638)
at 
org.apache.hadoop.hdfs.DistributedFileSystem$7.doCall(DistributedFileSystem.java:448)
at 
org.apache.hadoop.hdfs.DistributedFileSystem$7.doCall(DistributedFileSystem.java:444)
at 
org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
at 
org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:459)
at 
org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:387)
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:911)
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:892)
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:789)
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:778)
at 
org.apache.beam.sdk.io.hdfs.HadoopFileSystem.create(HadoopFileSystem.java:109)
at 
org.apache.beam.sdk.io.hdfs.HadoopFileSystem.create(HadoopFileSystem.java:68)
at org.apache.beam.sdk.io.FileSystems.create(FileSystems.java:249)
at org.apache.beam.sdk.io.FileSystems.create(FileSystems.java:236)
at 
org.apache.beam.sdk.io.FileBasedSink$Writer.open(FileBasedSink.java:923)
at 
org.apache.beam.sdk.io.WriteFiles$WriteUnshardedTempFilesWithSpillingFn.processElement(WriteFiles.java:503)
at 
org.apache.beam.sdk.io.WriteFiles$WriteUnshardedTempFilesWithSpillingFn$DoFnInvoker.invokeProcessElement(Unknown
 Source)
at 
org.apache.beam.runners.core.SimpleDoFnRunner.invokeProcessElement(SimpleDoFnRunner.java:177)
at 
org.apache.beam.runners.core.SimpleDoFnRunner.processElement(SimpleDoFnRunner.java:138)
at 
com.google.cloud.dataflow.worker.SimpleParDoFn.processElement(SimpleParDoFn.java:323)
at 
com.google.cloud.dataflow.worker.util.common.worker.ParDoOperation.process(ParDoOperation.java:43)
at 
com.google.cloud.dataflow.worker.util.common.worker.OutputReceiver.process(OutputReceiver.java:48)
at 
com.google.cloud.dataflow.worker.AssignWindowsParDoFnFactory$AssignWindowsParDoFn.processElement(AssignWindowsParDoFnFactory.java:118)
at 
com.google.cloud.dataflow.worker.util.common.worker.ParDoOperation.process(ParDoOperation.java:43)
at 
com.google.cloud.dataflow.worker.util.common.worker.OutputReceiver.process(OutputReceiver.java:48)
at 
com.google.cloud.dataflow.worker.SimpleParDoFn$1

Build failed in Jenkins: beam_PostCommit_Java_GradleBuild #245

2018-05-01 Thread Apache Jenkins Server
See 


--
[...truncated 18.41 MB...]
May 01, 2018 12:49:45 PM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-05-01T12:49:37.907Z: Fusing consumer 
PAssert$3/CreateActual/View.AsSingleton/Combine.GloballyAsSingletonView/Combine.globally(Singleton)/Combine.perKey(Singleton)/GroupByKey/Reify
 into 
PAssert$3/CreateActual/View.AsSingleton/Combine.GloballyAsSingletonView/Combine.globally(Singleton)/Combine.perKey(Singleton)/GroupByKey+PAssert$3/CreateActual/View.AsSingleton/Combine.GloballyAsSingletonView/Combine.globally(Singleton)/Combine.perKey(Singleton)/Combine.GroupedValues/Partial
May 01, 2018 12:49:45 PM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-05-01T12:49:37.929Z: Fusing consumer 
PAssert$3/CreateActual/View.AsSingleton/Combine.GloballyAsSingletonView/Combine.globally(Singleton)/Combine.perKey(Singleton)/Combine.GroupedValues
 into 
PAssert$3/CreateActual/View.AsSingleton/Combine.GloballyAsSingletonView/Combine.globally(Singleton)/Combine.perKey(Singleton)/GroupByKey/Read
May 01, 2018 12:49:45 PM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-05-01T12:49:37.951Z: Fusing consumer 
PAssert$3/CreateActual/View.AsSingleton/Combine.GloballyAsSingletonView/Combine.globally(Singleton)/Combine.perKey(Singleton)/GroupByKey/Write
 into 
PAssert$3/CreateActual/View.AsSingleton/Combine.GloballyAsSingletonView/Combine.globally(Singleton)/Combine.perKey(Singleton)/GroupByKey/Reify
May 01, 2018 12:49:45 PM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-05-01T12:49:37.975Z: Fusing consumer 
PAssert$3/CreateActual/View.AsSingleton/Combine.GloballyAsSingletonView/Combine.globally(Singleton)/Combine.perKey(Singleton)/GroupByKey+PAssert$3/CreateActual/View.AsSingleton/Combine.GloballyAsSingletonView/Combine.globally(Singleton)/Combine.perKey(Singleton)/Combine.GroupedValues/Partial
 into 
PAssert$3/CreateActual/View.AsSingleton/Combine.GloballyAsSingletonView/Combine.globally(Singleton)/WithKeys/AddKeys/Map
May 01, 2018 12:49:45 PM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-05-01T12:49:37.999Z: Fusing consumer 
PAssert$3/CreateActual/ParDo(Anonymous) into 
PAssert$3/CreateActual/RewindowActuals/Window.Assign
May 01, 2018 12:49:45 PM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-05-01T12:49:38.023Z: Fusing consumer 
PAssert$3/CreateActual/View.AsSingleton/Combine.GloballyAsSingletonView/Combine.globally(Singleton)/WithKeys/AddKeys/Map
 into PAssert$3/CreateActual/ParDo(Anonymous)
May 01, 2018 12:49:45 PM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-05-01T12:49:38.053Z: Fusing consumer 
Combine.globally(Count)/ProduceDefault into 
Combine.globally(Count)/CreateVoid/Read(CreateSource)
May 01, 2018 12:49:45 PM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-05-01T12:49:38.086Z: Fusing consumer 
Combine.globally(Count)/View.AsIterable/ParDo(ToIsmRecordForGlobalWindow) into 
Combine.globally(Count)/Values/Values/Map
May 01, 2018 12:49:45 PM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-05-01T12:49:38.119Z: Fusing consumer 
Combine.globally(Count)/Combine.perKey(Count)/Combine.GroupedValues into 
Combine.globally(Count)/Combine.perKey(Count)/GroupByKey/Read
May 01, 2018 12:49:45 PM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-05-01T12:49:38.159Z: Fusing consumer 
Combine.globally(Count)/Combine.perKey(Count)/GroupByKey+Combine.globally(Count)/Combine.perKey(Count)/Combine.GroupedValues/Partial
 into Combine.globally(Count)/WithKeys/AddKeys/Map
May 01, 2018 12:49:45 PM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-05-01T12:49:38.190Z: Fusing consumer 
DatastoreV1.Read/Reshuffle/Reshuffle/GroupByKey/GroupByWindow into 
DatastoreV1.Read/Reshuffle/Reshuffle/GroupByKey/Read
May 01, 2018 12:49:45 PM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-05-01T12:49:38.214Z: Fusing consumer 
Combine.globally(Count)/Combine.perKey(Count)/GroupByKey/Reify into 
Combine.globally(Count)/Combine.perKey(Count)/GroupByKey+Combine.globally(Count)/Combine.perKey(Count)/Combine.GroupedValues/Partial
May 01, 2018 12:49:45 PM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-05-01T12:49:38.238Z: Fusing consumer 
Combine.globally(Count)/WithKeys/AddKeys/Map into DatastoreV1.Read/Read
May 01, 2018 12:49:45 PM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO

Build failed in Jenkins: beam_PostCommit_Python_ValidatesRunner_Dataflow #1493

2018-05-01 Thread Apache Jenkins Server
See 


--
Started by timer
[EnvInject] - Loading node environment variables.
Building remotely on beam23 (beam) in workspace 

 > git rev-parse --is-inside-work-tree # timeout=10
Fetching changes from the remote Git repository
 > git config remote.origin.url https://github.com/apache/beam.git # timeout=10
Fetching upstream changes from https://github.com/apache/beam.git
 > git --version # timeout=10
 > git fetch --tags --progress https://github.com/apache/beam.git 
 > +refs/heads/*:refs/remotes/origin/* 
 > +refs/pull/${ghprbPullId}/*:refs/remotes/origin/pr/${ghprbPullId}/*
 > git rev-parse origin/master^{commit} # timeout=10
Checking out Revision 96fb76560de44abcdcb915da543df7d45f6f6b3b (origin/master)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f 96fb76560de44abcdcb915da543df7d45f6f6b3b
Commit message: "[BEAM-4131] Include SDK into Python SDK harness container."
 > git rev-list --no-walk 96fb76560de44abcdcb915da543df7d45f6f6b3b # timeout=10
Cleaning workspace
 > git rev-parse --verify HEAD # timeout=10
Resetting working tree
 > git reset --hard # timeout=10
 > git clean -fdx # timeout=10
[EnvInject] - Executing scripts and injecting environment variables after the 
SCM step.
[EnvInject] - Injecting as environment variables the properties content 
SPARK_LOCAL_IP=127.0.0.1

[EnvInject] - Variables injected successfully.
[beam_PostCommit_Python_ValidatesRunner_Dataflow] $ /bin/bash -xe 
/tmp/jenkins6091071962498451368.sh
+ cd src
+ bash sdks/python/run_validatesrunner.sh

# pip install --user installation location.
LOCAL_PATH=$HOME/.local/bin/

# INFRA does not install virtualenv
pip install virtualenv --user
Requirement already satisfied: virtualenv in /usr/lib/python2.7/dist-packages 
(15.0.1)

# Virtualenv for the rest of the script to run setup & e2e tests
${LOCAL_PATH}/virtualenv sdks/python
sdks/python/run_validatesrunner.sh: line 38: 
/home/jenkins/.local/bin//virtualenv: No such file or directory
Build step 'Execute shell' marked build as failure
Not sending mail to unregistered user sweg...@google.com
Not sending mail to unregistered user pger...@us.ibm.com
Not sending mail to unregistered user aal...@gmail.com
Not sending mail to unregistered user sid...@google.com
Not sending mail to unregistered user katarzyna.kucharc...@polidea.com
Not sending mail to unregistered user ankurgoe...@gmail.com
Not sending mail to unregistered user hero...@google.com
Not sending mail to unregistered user ro...@frantil.com
Not sending mail to unregistered user w...@google.com
Not sending mail to unregistered user szewi...@gmail.com
Not sending mail to unregistered user git...@alasdairhodge.co.uk
Not sending mail to unregistered user ke...@google.com
Not sending mail to unregistered user ekirpic...@gmail.com
Not sending mail to unregistered user aljoscha.kret...@gmail.com
Not sending mail to unregistered user apill...@google.com
Not sending mail to unregistered user 
re...@relax-macbookpro2.roam.corp.google.com
Not sending mail to unregistered user kirpic...@google.com


[beam-site] 04/04: This closes #432

2018-05-01 Thread mergebot-role
This is an automated email from the ASF dual-hosted git repository.

mergebot-role pushed a commit to branch mergebot
in repository https://gitbox.apache.org/repos/asf/beam-site.git

commit 1c4f5ea311ea01b6e0dc531e2c80bd885835ca7e
Merge: 67836af 6e2f71a
Author: Mergebot 
AuthorDate: Tue May 1 08:43:37 2018 -0700

This closes #432

 src/_beam_team/team.md | 12 
 1 file changed, 12 insertions(+)

-- 
To stop receiving notification emails like this one, please contact
mergebot-r...@apache.org.


[beam-site] branch mergebot updated (8f89018 -> 1c4f5ea)

2018-05-01 Thread mergebot-role
This is an automated email from the ASF dual-hosted git repository.

mergebot-role pushed a change to branch mergebot
in repository https://gitbox.apache.org/repos/asf/beam-site.git.


 discard 8f89018  This closes #430
 discard e384683  [BEAM-4177] Clarify thread contraint in Programming Guide 
4.3.2
 new 658b5b8  Adding pabloem to the list of team members.
 new 68d2e71  Sorting by last name
 new 6e2f71a  Adding gris as committer as well.
 new 1c4f5ea  This closes #432

This update added new revisions after undoing existing revisions.
That is to say, some revisions that were in the old version of the
branch are not in the new version.  This situation occurs
when a user --force pushes a change and generates a repository
containing something like this:

 * -- * -- B -- O -- O -- O   (8f89018)
\
 N -- N -- N   refs/heads/mergebot (1c4f5ea)

You should already have received notification emails for all of the O
revisions, and so the following emails describe only the N revisions
from the common base, B.

Any revisions marked "omit" are not gone; other references still
refer to them.  Any revisions marked "discard" are gone forever.

The 4 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.
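The B / O / N branch shape described above can be reproduced with plain git in a throwaway repository; the commit messages and temp directory here are illustrative only:

```shell
#!/usr/bin/env bash
set -e
# Recreate the diagram: build B and O, then rewrite history so the
# branch diverges to N while O becomes unreachable from it.
repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.email dev@example.com   # throwaway identity for the demo
git config user.name dev
git commit -q --allow-empty -m "B: common base"
git commit -q --allow-empty -m "O: old revision"
old_head="$(git rev-parse HEAD)"
git reset -q --hard HEAD~1              # drop O, back to B
git commit -q --allow-empty -m "N: new revision"
# The branch now points at N; O is no longer reachable from it, which is
# exactly the state that `git push --force` publishes.
```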


Summary of changes:
 src/_beam_team/team.md | 12 
 src/documentation/programming-guide.md |  2 +-
 2 files changed, 13 insertions(+), 1 deletion(-)



[beam-site] 02/04: Sorting by last name

2018-05-01 Thread mergebot-role
This is an automated email from the ASF dual-hosted git repository.

mergebot-role pushed a commit to branch mergebot
in repository https://gitbox.apache.org/repos/asf/beam-site.git

commit 68d2e7143eb9cdf57827dd835ddf81d1ed00bec0
Author: Pablo 
AuthorDate: Sat Apr 28 11:13:21 2018 -0700

Sorting by last name
---
 src/_beam_team/team.md | 12 ++--
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/src/_beam_team/team.md b/src/_beam_team/team.md
index b5dc6f5..c89c9db 100644
--- a/src/_beam_team/team.md
+++ b/src/_beam_team/team.md
@@ -50,6 +50,12 @@ members:
     organization: Google
     roles: committer, PMC
     time_zone: "-8"
+  - name: Pablo Estrada
+    apache_id: pabloem
+    email: pabloem [at] apache [dot] org
+    organization: Google
+    roles: committer
+    time_zone: "-8"
   - name: Stephan Ewen
     apache_id: sewen
     email: sewen [at] apache [dot] org
@@ -194,10 +200,4 @@ members:
     organization: Alipay
     roles: committer
     time_zone: "+8"
-  - name: Pablo Estrada
-    apache_id: pabloem
-    email: pabloem [at] apache [dot] org
-    organization: Google
-    roles: committer
-    time_zone: "-8"
 ---



[beam-site] 03/04: Adding gris as committer as well.

2018-05-01 Thread mergebot-role
This is an automated email from the ASF dual-hosted git repository.

mergebot-role pushed a commit to branch mergebot
in repository https://gitbox.apache.org/repos/asf/beam-site.git

commit 6e2f71a79d47fd33cd9eb28a474c497cb7a3ca57
Author: Pablo 
AuthorDate: Mon Apr 30 14:49:10 2018 -0700

Adding gris as committer as well.
---
 src/_beam_team/team.md | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/src/_beam_team/team.md b/src/_beam_team/team.md
index c89c9db..4a475dd 100644
--- a/src/_beam_team/team.md
+++ b/src/_beam_team/team.md
@@ -44,6 +44,12 @@ members:
     organization: Talend
     roles: committer
     time_zone: "+1"
+  - name: Griselda Cuevas
+    apache_id: gris
+    email: gris [at] apache [dot] org
+    organization: Google
+    roles: committer
+    time_zone: "-8"
   - name: Luke Cwik
     apache_id: lcwik
     email: lcwik [at] apache [dot] org



[beam-site] 01/04: Adding pabloem to the list of team members.

2018-05-01 Thread mergebot-role
This is an automated email from the ASF dual-hosted git repository.

mergebot-role pushed a commit to branch mergebot
in repository https://gitbox.apache.org/repos/asf/beam-site.git

commit 658b5b8c12a948b90f41d14a58bf4631d1561b24
Author: Pablo 
AuthorDate: Fri Apr 27 15:07:28 2018 -0700

Adding pabloem to the list of team members.
---
 src/_beam_team/team.md | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/src/_beam_team/team.md b/src/_beam_team/team.md
index 3ef6ae6..b5dc6f5 100644
--- a/src/_beam_team/team.md
+++ b/src/_beam_team/team.md
@@ -194,4 +194,10 @@ members:
     organization: Alipay
     roles: committer
     time_zone: "+8"
+  - name: Pablo Estrada
+    apache_id: pabloem
+    email: pabloem [at] apache [dot] org
+    organization: Google
+    roles: committer
+    time_zone: "-8"
 ---



[beam-site] 01/01: Prepare repository for deployment.

2018-05-01 Thread mergebot-role
This is an automated email from the ASF dual-hosted git repository.

mergebot-role pushed a commit to branch asf-site
in repository https://gitbox.apache.org/repos/asf/beam-site.git

commit 6f36ad26814a5634a1a4285d5862ac8dea640e3b
Author: Mergebot 
AuthorDate: Tue May 1 08:46:55 2018 -0700

Prepare repository for deployment.
---
 content/contribute/team/index.html | 18 ++
 1 file changed, 18 insertions(+)

diff --git a/content/contribute/team/index.html 
b/content/contribute/team/index.html
index daba33d..ec763d9 100644
--- a/content/contribute/team/index.html
+++ b/content/contribute/team/index.html
@@ -237,6 +237,15 @@
 
   
 
+  Griselda Cuevas
+  gris
+  gris [at] apache [dot] org
+  Google
+  committer
+  -8
+
+  
+
   Luke Cwik
   lcwik
   lcwik [at] apache [dot] org
@@ -246,6 +255,15 @@
 
   
 
+  Pablo Estrada
+  pabloem
+  pabloem [at] apache [dot] org
+  Google
+  committer
+  -8
+
+  
+
   Stephan Ewen
   sewen
   sewen [at] apache [dot] org



[beam-site] branch asf-site updated (67836af -> 6f36ad2)

2018-05-01 Thread mergebot-role
This is an automated email from the ASF dual-hosted git repository.

mergebot-role pushed a change to branch asf-site
in repository https://gitbox.apache.org/repos/asf/beam-site.git.


from 67836af  Prepare repository for deployment.
 add 658b5b8  Adding pabloem to the list of team members.
 add 68d2e71  Sorting by last name
 add 6e2f71a  Adding gris as committer as well.
 add 1c4f5ea  This closes #432
 new 6f36ad2  Prepare repository for deployment.

The 1 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:
 content/contribute/team/index.html | 18 ++
 src/_beam_team/team.md | 12 
 2 files changed, 30 insertions(+)



[beam] branch master updated (96fb765 -> d5b0316)

2018-05-01 Thread tgroh
This is an automated email from the ASF dual-hosted git repository.

tgroh pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/beam.git.


from 96fb765  [BEAM-4131] Include SDK into Python SDK harness container.
 add 59a4a99  Mark some ReduceFnRunner arguments Nullable
 new d5b0316  Merge pull request #5237: Mark some ReduceFnRunner arguments 
Nullable

The 1 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:
 .../apache/beam/runners/core/ReduceFnContextFactory.java | 16 +++-
 .../org/apache/beam/runners/core/ReduceFnRunner.java | 15 +++
 2 files changed, 18 insertions(+), 13 deletions(-)

-- 
To stop receiving notification emails like this one, please contact
tg...@apache.org.


[beam] 01/01: Merge pull request #5237: Mark some ReduceFnRunner arguments Nullable

2018-05-01 Thread tgroh
This is an automated email from the ASF dual-hosted git repository.

tgroh pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/beam.git

commit d5b0316060c29581291a8e12058bb4ca2677f92b
Merge: 96fb765 59a4a99
Author: Thomas Groh 
AuthorDate: Tue May 1 08:56:33 2018 -0700

Merge pull request #5237: Mark some ReduceFnRunner arguments Nullable

 .../apache/beam/runners/core/ReduceFnContextFactory.java | 16 +++-
 .../org/apache/beam/runners/core/ReduceFnRunner.java | 15 +++
 2 files changed, 18 insertions(+), 13 deletions(-)



Build failed in Jenkins: beam_PostCommit_Python_ValidatesRunner_Dataflow #1494

2018-05-01 Thread Apache Jenkins Server
See 


Changes:

[tgroh] Mark some ReduceFnRunner arguments Nullable

--
Started by GitHub push by tgroh
[EnvInject] - Loading node environment variables.
Building remotely on beam23 (beam) in workspace 

 > git rev-parse --is-inside-work-tree # timeout=10
Fetching changes from the remote Git repository
 > git config remote.origin.url https://github.com/apache/beam.git # timeout=10
Fetching upstream changes from https://github.com/apache/beam.git
 > git --version # timeout=10
 > git fetch --tags --progress https://github.com/apache/beam.git 
 > +refs/heads/*:refs/remotes/origin/* 
 > +refs/pull/${ghprbPullId}/*:refs/remotes/origin/pr/${ghprbPullId}/*
 > git rev-parse origin/master^{commit} # timeout=10
Checking out Revision d5b0316060c29581291a8e12058bb4ca2677f92b (origin/master)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f d5b0316060c29581291a8e12058bb4ca2677f92b
Commit message: "Merge pull request #5237: Mark some ReduceFnRunner arguments 
Nullable"
 > git rev-list --no-walk 96fb76560de44abcdcb915da543df7d45f6f6b3b # timeout=10
Cleaning workspace
 > git rev-parse --verify HEAD # timeout=10
Resetting working tree
 > git reset --hard # timeout=10
 > git clean -fdx # timeout=10
[EnvInject] - Executing scripts and injecting environment variables after the 
SCM step.
[EnvInject] - Injecting as environment variables the properties content 
SPARK_LOCAL_IP=127.0.0.1

[EnvInject] - Variables injected successfully.
[beam_PostCommit_Python_ValidatesRunner_Dataflow] $ /bin/bash -xe 
/tmp/jenkins7102162756890609619.sh
+ cd src
+ bash sdks/python/run_validatesrunner.sh

# pip install --user installation location.
LOCAL_PATH=$HOME/.local/bin/

# INFRA does not install virtualenv
pip install virtualenv --user
Requirement already satisfied: virtualenv in /usr/lib/python2.7/dist-packages 
(15.0.1)

# Virtualenv for the rest of the script to run setup & e2e tests
${LOCAL_PATH}/virtualenv sdks/python
sdks/python/run_validatesrunner.sh: line 38: 
/home/jenkins/.local/bin//virtualenv: No such file or directory
Build step 'Execute shell' marked build as failure


[jira] [Commented] (BEAM-4191) WatermarkManagerTest has had a masked failure (both copies)

2018-05-01 Thread Thomas Groh (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-4191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16459811#comment-16459811
 ] 

Thomas Groh commented on BEAM-4191:
---

 

Can you provide any more context? The linked PR doesn't seem to expose anything 
in {{WatermarkManagerTest}}.

> WatermarkManagerTest has had a masked failure (both copies)
> ---
>
> Key: BEAM-4191
> URL: https://issues.apache.org/jira/browse/BEAM-4191
> Project: Beam
>  Issue Type: Bug
>  Components: runner-direct
>Reporter: Kenneth Knowles
>Assignee: Thomas Groh
>Priority: Major
>  Labels: sickbay
>
> Discovered in https://github.com/apache/beam/pull/4929, the test should be 
> sickbayed while the code is made correct.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (BEAM-3714) JdbcIO.read() should create a forward-only, read-only result set

2018-05-01 Thread Eugene Kirpichov (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-3714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16459818#comment-16459818
 ] 

Eugene Kirpichov commented on BEAM-3714:


JB, what's the reason for reassigning this issue to yourself? It has been 
successfully closed after having been implemented by Innocent.

> JdbcIO.read() should create a forward-only, read-only result set
> 
>
> Key: BEAM-3714
> URL: https://issues.apache.org/jira/browse/BEAM-3714
> Project: Beam
>  Issue Type: Bug
>  Components: io-java-jdbc
>Reporter: Eugene Kirpichov
>Assignee: Jean-Baptiste Onofré
>Priority: Major
> Fix For: 2.5.0
>
>  Time Spent: 4h
>  Remaining Estimate: 0h
>
> [https://stackoverflow.com/questions/48784889/streaming-data-from-cloudsql-into-dataflow/48819934#48819934]
>  - a user is trying to load a large table from MySQL, and the MySQL JDBC 
> driver requires special measures when loading large result sets.
> JdbcIO currently calls simply "connection.prepareStatement(query)" 
> https://github.com/apache/beam/blob/bb8c12c4956cbe3c6f2e57113e7c0ce2a5c05009/sdks/java/io/jdbc/src/main/java/org/apache/beam/sdk/io/jdbc/JdbcIO.java#L508
>  - it should specify type TYPE_FORWARD_ONLY and concurrency CONCUR_READ_ONLY 
> - these values should always be used.
> Seems that different databases have different requirements for streaming 
> result sets.
> E.g. MySQL requires setting fetch size; PostgreSQL says "The Connection must 
> not be in autocommit mode." 
> https://jdbc.postgresql.org/documentation/head/query.html#query-with-cursor . 
> Oracle, I think, doesn't have any special requirements but I don't know. 
> Fetch size should probably still be set to a reasonably large value.
> Seems that the common denominator of these requirements is: set fetch size to 
> a reasonably large but not maximum value; disable autocommit (there's nothing 
> to commit in read() anyway).
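The common denominator the ticket arrives at — forward-only, read-only, a reasonably large fetch size, autocommit off — can be sketched with plain `java.sql`; the class name and fetch size below are illustrative, not JdbcIO's actual code:

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

public class StreamingQuerySketch {
  /** Prepare a statement along the lines the ticket suggests for JdbcIO.read(). */
  static PreparedStatement prepareStreaming(Connection conn, String query)
      throws SQLException {
    conn.setAutoCommit(false);              // PostgreSQL needs this for cursor-based streaming
    PreparedStatement stmt = conn.prepareStatement(
        query,
        ResultSet.TYPE_FORWARD_ONLY,        // no scrolling back through results
        ResultSet.CONCUR_READ_ONLY);        // read-only result set
    stmt.setFetchSize(10_000);              // reasonably large, not unbounded
    return stmt;
  }
}
```

Note that MySQL's driver interprets fetch size specially for streaming, so the exact value would still need per-database tuning, as the ticket observes.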





[jira] [Resolved] (BEAM-3940) Update release documentation for Gradle

2018-05-01 Thread Scott Wegner (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3940?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Scott Wegner resolved BEAM-3940.

   Resolution: Fixed
Fix Version/s: Not applicable

This has been merged.

> Update release documentation for Gradle
> ---
>
> Key: BEAM-3940
> URL: https://issues.apache.org/jira/browse/BEAM-3940
> Project: Beam
>  Issue Type: Sub-task
>  Components: build-system
>Reporter: Alan Myrvold
>Assignee: Pablo Estrada
>Priority: Major
> Fix For: Not applicable
>
>
> [https://beam.apache.org/contribute/release-guide/] describes using mvn for 
> release:branch, release:prepare, and release:perform.
> These should all be converted to gradle.





Build failed in Jenkins: beam_PostCommit_Java_GradleBuild #246

2018-05-01 Thread Apache Jenkins Server
See 


Changes:

[tgroh] Mark some ReduceFnRunner arguments Nullable

--
[...truncated 18.79 MB...]

org.apache.beam.sdk.io.gcp.datastore.V1WriteIT > testE2EV1Write STANDARD_ERROR
May 01, 2018 4:43:40 PM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-05-01T16:43:38.805Z: Workers have started successfully.

org.apache.beam.sdk.io.gcp.datastore.V1ReadIT > 
testE2EV1ReadWithGQLQueryWithNoLimit STANDARD_ERROR
May 01, 2018 4:43:42 PM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-05-01T16:43:42.281Z: Workers have started successfully.
May 01, 2018 4:44:15 PM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-05-01T16:44:12.764Z: Executing operation 
DatastoreV1.Read/Reshuffle/Reshuffle/GroupByKey/Close
May 01, 2018 4:44:15 PM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-05-01T16:44:12.810Z: Executing operation 
DatastoreV1.Read/Reshuffle/Reshuffle/GroupByKey/Read+DatastoreV1.Read/Reshuffle/Reshuffle/GroupByKey/GroupByWindow+DatastoreV1.Read/Reshuffle/Reshuffle/ExpandIterable+DatastoreV1.Read/Reshuffle/Values/Values/Map+DatastoreV1.Read/Read+Combine.globally(Count)/WithKeys/AddKeys/Map+Combine.globally(Count)/Combine.perKey(Count)/GroupByKey+Combine.globally(Count)/Combine.perKey(Count)/Combine.GroupedValues/Partial+Combine.globally(Count)/Combine.perKey(Count)/GroupByKey/Reify+Combine.globally(Count)/Combine.perKey(Count)/GroupByKey/Write

org.apache.beam.sdk.io.gcp.datastore.V1WriteIT > testE2EV1Write STANDARD_ERROR
May 01, 2018 4:44:16 PM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-05-01T16:44:15.404Z: Cleaning up.
May 01, 2018 4:44:16 PM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-05-01T16:44:15.464Z: Stopping worker pool...

org.apache.beam.sdk.io.gcp.datastore.V1ReadIT > 
testE2EV1ReadWithGQLQueryWithNoLimit STANDARD_ERROR
May 01, 2018 4:44:24 PM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-05-01T16:44:23.002Z: Executing operation 
Combine.globally(Count)/Combine.perKey(Count)/GroupByKey/Close
May 01, 2018 4:44:24 PM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-05-01T16:44:23.146Z: Executing operation 
Combine.globally(Count)/Combine.perKey(Count)/GroupByKey/Read+Combine.globally(Count)/Combine.perKey(Count)/Combine.GroupedValues+Combine.globally(Count)/Combine.perKey(Count)/Combine.GroupedValues/Extract+Combine.globally(Count)/Values/Values/Map+PAssert$3/CreateActual/FilterActuals/Window.Assign+PAssert$3/CreateActual/GatherPanes/Reify.Window/ParDo(Anonymous)+PAssert$3/CreateActual/GatherPanes/WithKeys/AddKeys/Map+PAssert$3/CreateActual/GatherPanes/Window.Into()/Window.Assign+PAssert$3/CreateActual/GatherPanes/GroupByKey/Reify+PAssert$3/CreateActual/GatherPanes/GroupByKey/Write+Combine.globally(Count)/View.AsIterable/ParDo(ToIsmRecordForGlobalWindow)
May 01, 2018 4:44:36 PM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-05-01T16:44:34.875Z: Executing operation 
Combine.globally(Count)/View.AsIterable/CreateDataflowView
May 01, 2018 4:44:36 PM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-05-01T16:44:35.067Z: Executing operation 
Combine.globally(Count)/CreateVoid/Read(CreateSource)+Combine.globally(Count)/ProduceDefault+PAssert$3/CreateActual/FilterActuals/Window.Assign+PAssert$3/CreateActual/GatherPanes/Reify.Window/ParDo(Anonymous)+PAssert$3/CreateActual/GatherPanes/WithKeys/AddKeys/Map+PAssert$3/CreateActual/GatherPanes/Window.Into()/Window.Assign+PAssert$3/CreateActual/GatherPanes/GroupByKey/Reify+PAssert$3/CreateActual/GatherPanes/GroupByKey/Write
May 01, 2018 4:44:36 PM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-05-01T16:44:36.302Z: Executing operation 
PAssert$3/CreateActual/GatherPanes/GroupByKey/Close
May 01, 2018 4:44:36 PM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-05-01T16:44:36.432Z: Executing operation 
PAssert$3/CreateActual/GatherPanes/GroupByKey/Read+PAssert$3/CreateActual/GatherPanes/GroupByKey/GroupByWindow+PAssert$3/CreateActual/GatherPanes/Values/Values/Map+PAssert$3/CreateActual/ExtractPane/Map+PAssert$3/CreateActual/Flatten.Iterables/FlattenIterables/FlatMap+PAssert$3/CreateActual/RewindowActuals/Window.Assign+PAssert$3/CreateActual/ParDo(Anonymous)+PAssert$3/CreateActual/View.AsSingleton/Combine.GloballyAsSingletonView/Combine.globally(Singleton)/WithKeys/AddKeys/Map+PAssert$3/CreateActual/View.AsSingleton/Combine.GloballyAsSingletonView/Combine.globally

Build failed in Jenkins: beam_PostCommit_Python_ValidatesRunner_Dataflow #1495

2018-05-01 Thread Apache Jenkins Server
See 


Changes:

[Pablo] Add documentaton for quickstart tasks

--
Started by GitHub push by pabloem
[EnvInject] - Loading node environment variables.
Building remotely on beam23 (beam) in workspace 

 > git rev-parse --is-inside-work-tree # timeout=10
Fetching changes from the remote Git repository
 > git config remote.origin.url https://github.com/apache/beam.git # timeout=10
Fetching upstream changes from https://github.com/apache/beam.git
 > git --version # timeout=10
 > git fetch --tags --progress https://github.com/apache/beam.git 
 > +refs/heads/*:refs/remotes/origin/* 
 > +refs/pull/${ghprbPullId}/*:refs/remotes/origin/pr/${ghprbPullId}/*
 > git rev-parse origin/master^{commit} # timeout=10
Checking out Revision 92fd475afca09da7da1224775342bd668b53d83a (origin/master)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f 92fd475afca09da7da1224775342bd668b53d83a
Commit message: "Add documentaton for quickstart tasks"
 > git rev-list --no-walk d5b0316060c29581291a8e12058bb4ca2677f92b # timeout=10
Cleaning workspace
 > git rev-parse --verify HEAD # timeout=10
Resetting working tree
 > git reset --hard # timeout=10
 > git clean -fdx # timeout=10
[EnvInject] - Executing scripts and injecting environment variables after the 
SCM step.
[EnvInject] - Injecting as environment variables the properties content 
SPARK_LOCAL_IP=127.0.0.1

[EnvInject] - Variables injected successfully.
[beam_PostCommit_Python_ValidatesRunner_Dataflow] $ /bin/bash -xe 
/tmp/jenkins8566716041729701139.sh
+ cd src
+ bash sdks/python/run_validatesrunner.sh

# pip install --user installation location.
LOCAL_PATH=$HOME/.local/bin/

# INFRA does not install virtualenv
pip install virtualenv --user
Requirement already satisfied: virtualenv in /usr/lib/python2.7/dist-packages 
(15.0.1)

# Virtualenv for the rest of the script to run setup & e2e tests
${LOCAL_PATH}/virtualenv sdks/python
sdks/python/run_validatesrunner.sh: line 38: 
/home/jenkins/.local/bin//virtualenv: No such file or directory
Build step 'Execute shell' marked build as failure


[jira] [Commented] (BEAM-4191) WatermarkManagerTest has had a masked failure (both copies)

2018-05-01 Thread Kenneth Knowles (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-4191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16459880#comment-16459880
 ] 

Kenneth Knowles commented on BEAM-4191:
---

Cloned the wrong JIRA. It is https://github.com/apache/beam/pull/5161

> WatermarkManagerTest has had a masked failure (both copies)
> ---
>
> Key: BEAM-4191
> URL: https://issues.apache.org/jira/browse/BEAM-4191
> Project: Beam
>  Issue Type: Bug
>  Components: runner-direct
>Reporter: Kenneth Knowles
>Assignee: Thomas Groh
>Priority: Major
>  Labels: sickbay
>
> Discovered in https://github.com/apache/beam/pull/4929, the test should be 
> sickbayed while the code is made correct.





[jira] [Updated] (BEAM-4191) WatermarkManagerTest has had a masked failure (both copies)

2018-05-01 Thread Kenneth Knowles (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-4191?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kenneth Knowles updated BEAM-4191:
--
Description: Discovered in https://github.com/apache/beam/pull/5161, the 
test should be sickbayed while the code is made correct.  (was: Discovered in 
https://github.com/apache/beam/pull/4929, the test should be sickbayed while 
the code is made correct.)

> WatermarkManagerTest has had a masked failure (both copies)
> ---
>
> Key: BEAM-4191
> URL: https://issues.apache.org/jira/browse/BEAM-4191
> Project: Beam
>  Issue Type: Bug
>  Components: runner-direct
>Reporter: Kenneth Knowles
>Assignee: Thomas Groh
>Priority: Major
>  Labels: sickbay
>
> Discovered in https://github.com/apache/beam/pull/5161, the test should be 
> sickbayed while the code is made correct.





[jira] [Work logged] (BEAM-3906) Get Python Wheel Validation Automated

2018-05-01 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3906?focusedWorklogId=97150&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-97150
 ]

ASF GitHub Bot logged work on BEAM-3906:


Author: ASF GitHub Bot
Created on: 01/May/18 17:19
Start Date: 01/May/18 17:19
Worklog Time Spent: 10m 
  Work Description: yifanzou commented on issue #4943: [BEAM-3906] Automate 
Validation Aganist Python Wheel
URL: https://github.com/apache/beam/pull/4943#issuecomment-384811849
 
 
   Run Standalone Seed Job


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 97150)
Time Spent: 17h  (was: 16h 50m)

> Get Python Wheel Validation Automated
> -
>
> Key: BEAM-3906
> URL: https://issues.apache.org/jira/browse/BEAM-3906
> Project: Beam
>  Issue Type: Sub-task
>  Components: examples-python, testing
>Reporter: yifan zou
>Assignee: yifan zou
>Priority: Major
>  Time Spent: 17h
>  Remaining Estimate: 0h
>






[jira] [Updated] (BEAM-4142) HadoopResourceIdTest has had a masked failure

2018-05-01 Thread Kenneth Knowles (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-4142?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kenneth Knowles updated BEAM-4142:
--
Description: Sickbayed in https://github.com/apache/beam/pull/5161, the 
test should be fixed and no longer ignored.  (was: Discovered in 
https://github.com/apache/beam/pull/4929, the test should be sickbayed while 
the code is made correct.)

> HadoopResourceIdTest has had a masked failure
> -
>
> Key: BEAM-4142
> URL: https://issues.apache.org/jira/browse/BEAM-4142
> Project: Beam
>  Issue Type: Bug
>  Components: io-java-hadoop
>Reporter: Kenneth Knowles
>Priority: Major
>  Labels: sickbay
>
> Sickbayed in https://github.com/apache/beam/pull/5161, the test should be 
> fixed and no longer ignored.





[jira] [Updated] (BEAM-4143) GcsResourceIdTest has had a masked failure

2018-05-01 Thread Kenneth Knowles (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-4143?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kenneth Knowles updated BEAM-4143:
--
Description: Discovered in https://github.com/apache/beam/pull/5161, the 
test should be sickbayed while the code is made correct.  (was: Discovered in 
https://github.com/apache/beam/pull/4929, the test should be sickbayed while 
the code is made correct.)

> GcsResourceIdTest has had a masked failure
> --
>
> Key: BEAM-4143
> URL: https://issues.apache.org/jira/browse/BEAM-4143
> Project: Beam
>  Issue Type: Bug
>  Components: io-java-gcp
>Reporter: Kenneth Knowles
>Priority: Major
>  Labels: sickbay
>
> Discovered in https://github.com/apache/beam/pull/5161, the test should be 
> sickbayed while the code is made correct.





[jira] [Updated] (BEAM-4184) S3ResourceIdTest has had a masked failure

2018-05-01 Thread Kenneth Knowles (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-4184?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kenneth Knowles updated BEAM-4184:
--
Description: Discovered in https://github.com/apache/beam/pull/5161, the 
test should be sickbayed while the code is made correct.  (was: Discovered in 
https://github.com/apache/beam/pull/4929, the test should be sickbayed while 
the code is made correct.)

> S3ResourceIdTest has had a masked failure
> -
>
> Key: BEAM-4184
> URL: https://issues.apache.org/jira/browse/BEAM-4184
> Project: Beam
>  Issue Type: Bug
>  Components: io-java-aws
>Reporter: Kenneth Knowles
>Priority: Major
>  Labels: sickbay
>
> Discovered in https://github.com/apache/beam/pull/5161, the test should be 
> sickbayed while the code is made correct.





[jira] [Updated] (BEAM-4184) S3ResourceIdTest has had a masked failure

2018-05-01 Thread Kenneth Knowles (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-4184?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kenneth Knowles updated BEAM-4184:
--
Description: Sickbayed in https://github.com/apache/beam/pull/5161, the 
test should be fixed and no longer ignored.  (was: Discovered in 
https://github.com/apache/beam/pull/5161, the test should be sickbayed while 
the code is made correct.)

> S3ResourceIdTest has had a masked failure
> -
>
> Key: BEAM-4184
> URL: https://issues.apache.org/jira/browse/BEAM-4184
> Project: Beam
>  Issue Type: Bug
>  Components: io-java-aws
>Reporter: Kenneth Knowles
>Priority: Major
>  Labels: sickbay
>
> Sickbayed in https://github.com/apache/beam/pull/5161, the test should be 
> fixed and no longer ignored.





[jira] [Updated] (BEAM-4143) GcsResourceIdTest has had a masked failure

2018-05-01 Thread Kenneth Knowles (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-4143?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kenneth Knowles updated BEAM-4143:
--
Description: Sickbayed in https://github.com/apache/beam/pull/5161, the 
test should be fixed and no longer ignored.  (was: Discovered in 
https://github.com/apache/beam/pull/5161, the test should be sickbayed while 
the code is made correct.)

> GcsResourceIdTest has had a masked failure
> --
>
> Key: BEAM-4143
> URL: https://issues.apache.org/jira/browse/BEAM-4143
> Project: Beam
>  Issue Type: Bug
>  Components: io-java-gcp
>Reporter: Kenneth Knowles
>Priority: Major
>  Labels: sickbay
>
> Sickbayed in https://github.com/apache/beam/pull/5161, the test should be 
> fixed and no longer ignored.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (BEAM-4110) LocalResourceIdTest has had a masked failure

2018-05-01 Thread Kenneth Knowles (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-4110?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kenneth Knowles reassigned BEAM-4110:
-

Assignee: (was: Kenneth Knowles)

> LocalResourceIdTest has had a masked failure
> 
>
> Key: BEAM-4110
> URL: https://issues.apache.org/jira/browse/BEAM-4110
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-core
>Reporter: Kenneth Knowles
>Priority: Major
>  Labels: sickbay
>
> Sickbayed in https://github.com/apache/beam/pull/5161, the test should be 
> fixed and no longer ignored.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (BEAM-4144) SplittableParDoProcessFnTest has had a masked failure

2018-05-01 Thread Kenneth Knowles (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-4144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kenneth Knowles updated BEAM-4144:
--
Description: Sickbayed in https://github.com/apache/beam/pull/5161, the 
test should be fixed and no longer ignored.  (was: Discovered in 
https://github.com/apache/beam/pull/4929, the test should be sickbayed while 
the code is made correct.)

> SplittableParDoProcessFnTest has had a masked failure
> -
>
> Key: BEAM-4144
> URL: https://issues.apache.org/jira/browse/BEAM-4144
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-core
>Reporter: Kenneth Knowles
>Assignee: Eugene Kirpichov
>Priority: Major
>  Labels: sickbay
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Sickbayed in https://github.com/apache/beam/pull/5161, the test should be 
> fixed and no longer ignored.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (BEAM-4110) LocalResourceIdTest has had a masked failure

2018-05-01 Thread Kenneth Knowles (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-4110?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kenneth Knowles updated BEAM-4110:
--
Description: Sickbayed in https://github.com/apache/beam/pull/5161, the 
test should be fixed and no longer ignored.  (was: Discovered in 
https://github.com/apache/beam/pull/4929, the test should be sickbayed while 
the code is made correct.)

> LocalResourceIdTest has had a masked failure
> 
>
> Key: BEAM-4110
> URL: https://issues.apache.org/jira/browse/BEAM-4110
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-core
>Reporter: Kenneth Knowles
>Assignee: Kenneth Knowles
>Priority: Major
>  Labels: sickbay
>
> Sickbayed in https://github.com/apache/beam/pull/5161, the test should be 
> fixed and no longer ignored.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (BEAM-4057) Ensure generated pom don't break consumers

2018-05-01 Thread Kenneth Knowles (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-4057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16459887#comment-16459887
 ] 

Kenneth Knowles commented on BEAM-4057:
---

Are there known issues here? Let's make them specific and break out a JIRA for 
each one so they can be addressed. If we don't have any known issues, let's 
close this as obsolete.

> Ensure generated pom don't break consumers
> --
>
> Key: BEAM-4057
> URL: https://issues.apache.org/jira/browse/BEAM-4057
> Project: Beam
>  Issue Type: Sub-task
>  Components: build-system
>Reporter: Romain Manni-Bucau
>Priority: Major
>
> Off the top of my head, here are the requirements:
> 1. dependencies are all there (all scopes, and correctly scoped: provided or 
> test dependencies must not end up in compile scope, for instance)
> 2. META-INF should contain the pom.xml and pom.properties as Maven generates 
> them (they are consumed by tools and libraries to grab the dependencies or scan 
> some classpath/lib folder)
> 3. ensure that at least the compiler plugin is defined with the Java 
> version and compiler flags (one use case is checking whether -parameters is 
> activated, for instance)
> 4. (nice to have) don't put all the boilerplate in every pom (license, etc.) but 
> keep it in the parent pom as it was
> 5. (if possible) respect the hierarchy (parents) - this is sometimes used as a 
> shortcut for dependency analysis because it is faster than analyzing the 
> dependencies; probably not the best practice ever, but it is efficient in 
> general
> 6. ensure metadata used by mainstream tools like mvnrepository is there 
> (description etc.; should be a passthrough from Gradle)



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
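Requirement 3 in the BEAM-4057 list above (an explicit compiler plugin carrying the Java version and flags) could look like the following fragment in a generated pom. This is an illustrative sketch only; the plugin version and the Java level are assumptions, not what Beam's Gradle build actually emits:

```xml
<build>
  <plugins>
    <plugin>
      <groupId>org.apache.maven.plugins</groupId>
      <artifactId>maven-compiler-plugin</artifactId>
      <!-- version chosen for illustration -->
      <version>3.8.1</version>
      <configuration>
        <source>1.8</source>
        <target>1.8</target>
        <compilerArgs>
          <!-- flag that consumers may want to verify, per item 3 -->
          <arg>-parameters</arg>
        </compilerArgs>
      </configuration>
    </plugin>
  </plugins>
</build>
```

Tools that inspect the published pom could then check, for instance, whether -parameters is enabled.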


[beam-site] 04/04: This closes #423

2018-05-01 Thread mergebot-role
This is an automated email from the ASF dual-hosted git repository.

mergebot-role pushed a commit to branch mergebot
in repository https://gitbox.apache.org/repos/asf/beam-site.git

commit 53ec61b149e2ef9b1b079d953839d942accffb69
Merge: 6f36ad2 f8762f6
Author: Mergebot 
AuthorDate: Tue May 1 10:40:19 2018 -0700

This closes #423

 src/_includes/section-menu/sdks.html  |   1 +
 src/documentation/sdks/java-thirdparty.md | 100 ++
 src/documentation/sdks/java.md|   2 +
 3 files changed, 103 insertions(+)

-- 
To stop receiving notification emails like this one, please contact
mergebot-r...@apache.org.


[beam-site] 02/04: Fixed some layout issues

2018-05-01 Thread mergebot-role
This is an automated email from the ASF dual-hosted git repository.

mergebot-role pushed a commit to branch mergebot
in repository https://gitbox.apache.org/repos/asf/beam-site.git

commit 139141c3cae84c2423589b47f96f926291c512b7
Author: Niels Basjes 
AuthorDate: Thu Apr 19 16:37:41 2018 +0200

Fixed some layout issues
---
 src/documentation/sdks/java-extensions.md | 126 +-
 1 file changed, 71 insertions(+), 55 deletions(-)

diff --git a/src/documentation/sdks/java-extensions.md 
b/src/documentation/sdks/java-extensions.md
index 3b1524f..aeabc9f 100644
--- a/src/documentation/sdks/java-extensions.md
+++ b/src/documentation/sdks/java-extensions.md
@@ -59,16 +59,17 @@ PCollection>>> 
groupedAndSorted =
 SortValues.create(BufferedExternalSorter.options()));
 ```
 
-## Parsing Apache HTTPD and NGINX Access log files.
+## Parsing HTTPD/NGINX access logs.
 
 The Apache HTTPD webserver creates logfiles that contain valuable information 
about the requests that have been done to
-thie webserver. The format of these config files is a configuration option in 
the Apache HTTPD server so parsing this
+the webserver. The format of these config files is a configuration option in 
the Apache HTTPD server so parsing this
 into useful data elements is normally very hard to do.
 
-To solve this problem in an easy way a library was created that works in 
combination with Apache Beam.
+To solve this problem in an easy way a library was created that works in 
combination with Apache Beam
+and is capable of doing this for both the Apache HTTPD and NGINX.
 
-The basic idea is that you should be able to have a parser that you can 
construct by simply
-telling it with what configuration options the line was written.
+The basic idea is that the logformat specification is the schema used to 
create the line. 
+This parser is simply initialized with this schema and the list of fields you 
want to extract.
 
 ### Basic usage
 Full documentation can be found here 
[https://github.com/nielsbasjes/logparser](https://github.com/nielsbasjes/logparser)
 
@@ -76,9 +77,9 @@ Full documentation can be found here 
[https://github.com/nielsbasjes/logparser](
 First you put something like this in your pom.xml file:
 
 
-nl.basjes.parse.httpdlog
-httpdlog-parser
-5.0
+  nl.basjes.parse.httpdlog
+  httpdlog-parser
+  5.0
 
 
 Check 
[https://github.com/nielsbasjes/logparser](https://github.com/nielsbasjes/logparser)
 for the latest version.
@@ -95,7 +96,7 @@ that does not have ANY @Field annotations or setters. The 
"Object" class will do
 Parser dummyParser = new HttpdLoglineParser(Object.class, 
logformat);
 List possiblePaths = dummyParser.getPossiblePaths();
 for (String path: possiblePaths) {
-System.out.println(path);
+  System.out.println(path);
 }
 
 You will get a list that looks something like this:
@@ -140,21 +141,21 @@ So we can now add to this class a setter that simply 
receives a single value as
 
 @Field("IP:connection.client.host")
 public void setIP(final String value) {
-ip = value;
+  ip = value;
 }
 
 If we really want the name of the field we can also do this
 
 @Field("STRING:request.firstline.uri.query.img")
 public void setQueryImg(final String name, final String value) {
-results.put(name, value);
+  results.put(name, value);
 }
 
 This latter form is very handy because this way we can obtain all values for a 
wildcard field
 
 @Field("STRING:request.firstline.uri.query.*")
 public void setQueryStringValues(final String name, final String value) {
-results.put(name, value);
+  results.put(name, value);
 }
 
 Instead of using the annotations on the setters we can also simply tell the 
parser the name of the setter that must be 
@@ -164,39 +165,37 @@ called when an element is found.
 parser.addParseTarget("setQueryImg",
"STRING:request.firstline.uri.query.img");
 parser.addParseTarget("setQueryStringValues",   
"STRING:request.firstline.uri.query.*");
 
-### Using this in Apache Beam
+### Example
 
Assuming we have a String (the full log line) coming in and an instance 
of the WebEvent class coming out
(where the WebEvent already has the needed setters), the final code when 
using this in an Apache Beam project 
will end up looking something like this
-```
-PCollection filledWebEvents = input
-.apply("Extract Elements from logline",
-ParDo.of(new DoFn() {
-private Parser parser;
-
-@Setup
-public void setup() throws NoSuchMethodException {
-parser = new HttpdLoglineParser<>(WebEvent.class, 
getLogFormat());
-parser.addParseTarget("setIP",  
"IP:connection.client.host");
-parser.addParseTarget("setQueryImg",
"STRING:requ

[beam-site] 01/04: Document Java extensions for parsing Apache HTTPD logfiles and Useragent strings

2018-05-01 Thread mergebot-role
This is an automated email from the ASF dual-hosted git repository.

mergebot-role pushed a commit to branch mergebot
in repository https://gitbox.apache.org/repos/asf/beam-site.git

commit 5e7f1b2ccc3a04ccad148d25ea6204badd1c2e85
Author: Niels Basjes 
AuthorDate: Thu Apr 19 13:44:09 2018 +0200

Document Java extensions for parsing Apache HTTPD logfiles and Useragent 
strings
---
 src/documentation/sdks/java-extensions.md | 182 ++
 1 file changed, 182 insertions(+)

diff --git a/src/documentation/sdks/java-extensions.md 
b/src/documentation/sdks/java-extensions.md
index 7742345..3b1524f 100644
--- a/src/documentation/sdks/java-extensions.md
+++ b/src/documentation/sdks/java-extensions.md
@@ -58,3 +58,185 @@ PCollection>>> 
groupedAndSorted =
 grouped.apply(
 SortValues.create(BufferedExternalSorter.options()));
 ```
+
+## Parsing Apache HTTPD and NGINX Access log files.
+
+The Apache HTTPD webserver creates logfiles that contain valuable information 
about the requests that have been done to
+thie webserver. The format of these config files is a configuration option in 
the Apache HTTPD server so parsing this
+into useful data elements is normally very hard to do.
+
+To solve this problem in an easy way a library was created that works in 
combination with Apache Beam.
+
+The basic idea is that you should be able to have a parser that you can 
construct by simply
+telling it with what configuration options the line was written.
+
+### Basic usage
+Full documentation can be found here 
[https://github.com/nielsbasjes/logparser](https://github.com/nielsbasjes/logparser)
 
+
+First you put something like this in your pom.xml file:
+
+
+nl.basjes.parse.httpdlog
+httpdlog-parser
+5.0
+
+
+Check 
[https://github.com/nielsbasjes/logparser](https://github.com/nielsbasjes/logparser)
 for the latest version.
+
+Assume we have a logformat variable that looks something like this:
+
+String logformat = "%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" 
\"%{User-Agent}i\"";
+
+**Step 1: What CAN we get from this line?**
+
+To figure out what values we CAN get from this line we instantiate the parser 
with a dummy class
+that does not have ANY @Field annotations or setters. The "Object" class will 
do just fine for this purpose.
+
+Parser dummyParser = new HttpdLoglineParser(Object.class, 
logformat);
+List possiblePaths = dummyParser.getPossiblePaths();
+for (String path: possiblePaths) {
+System.out.println(path);
+}
+
+You will get a list that looks something like this:
+
+IP:connection.client.host
+NUMBER:connection.client.logname
+STRING:connection.client.user
+TIME.STAMP:request.receive.time
+TIME.DAY:request.receive.time.day
+TIME.MONTHNAME:request.receive.time.monthname
+TIME.MONTH:request.receive.time.month
+TIME.YEAR:request.receive.time.year
+TIME.HOUR:request.receive.time.hour
+TIME.MINUTE:request.receive.time.minute
+TIME.SECOND:request.receive.time.second
+TIME.MILLISECOND:request.receive.time.millisecond
+TIME.ZONE:request.receive.time.timezone
+HTTP.FIRSTLINE:request.firstline
+HTTP.METHOD:request.firstline.method
+HTTP.URI:request.firstline.uri
+HTTP.QUERYSTRING:request.firstline.uri.query
+STRING:request.firstline.uri.query.*
+HTTP.PROTOCOL:request.firstline.protocol
+HTTP.PROTOCOL.VERSION:request.firstline.protocol.version
+STRING:request.status.last
+BYTESCLF:response.body.bytes
+HTTP.URI:request.referer
+HTTP.QUERYSTRING:request.referer.query
+STRING:request.referer.query.*
+HTTP.USERAGENT:request.user-agent
+
+Now some of these lines contain a * .
+This is a wildcard that can be replaced with any 'name' if you need a specific 
value.
+You can also leave the '*' and get everything that is found in the actual log 
line.
+
+**Step 2 Create the receiving POJO**
+
+We need to create the receiving record class that is simply a POJO that does 
not need any interface or inheritance.
+In this class we create setters that will be called when the specified field 
has been found in the line.
+
+So we can now add to this class a setter that simply receives a single value 
as specified using the @Field annotation:
+
+@Field("IP:connection.client.host")
+public void setIP(final String value) {
+ip = value;
+}
+
+If we really want the name of the field we can also do this
+
+@Field("STRING:request.firstline.uri.query.img")
+public void setQueryImg(final String name, final String value) {
+results.put(name, value);
+}
+
+This latter form is very handy because this way we can obtain all values for a 
wildcard field
+
+@Field("STRING:request.firstline.uri.query.*")
+public void setQueryStringValues(final String name, final String value) {
+results.put(name, value);
+}
+
+Instead of using the annotations on the setters we can also simply tell the 
parser the name

[beam-site] branch mergebot updated (1c4f5ea -> 53ec61b)

2018-05-01 Thread mergebot-role
This is an automated email from the ASF dual-hosted git repository.

mergebot-role pushed a change to branch mergebot
in repository https://gitbox.apache.org/repos/asf/beam-site.git.


from 1c4f5ea  This closes #432
 add 6f36ad2  Prepare repository for deployment.
 new 5e7f1b2  Document Java extensions for parsing Apache HTTPD logfiles 
and Useragent strings
 new 139141c  Fixed some layout issues
 new f8762f6  Moved the 3rd party extensions to a separate page
 new 53ec61b  This closes #423

The 4 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:
 content/contribute/team/index.html|  18 ++
 src/_includes/section-menu/sdks.html  |   1 +
 src/documentation/sdks/java-thirdparty.md | 100 ++
 src/documentation/sdks/java.md|   2 +
 4 files changed, 121 insertions(+)
 create mode 100644 src/documentation/sdks/java-thirdparty.md

-- 
To stop receiving notification emails like this one, please contact
mergebot-r...@apache.org.


[beam-site] 03/04: Moved the 3rd party extensions to a separate page

2018-05-01 Thread mergebot-role
This is an automated email from the ASF dual-hosted git repository.

mergebot-role pushed a commit to branch mergebot
in repository https://gitbox.apache.org/repos/asf/beam-site.git

commit f8762f6328bcb584e5fc4b4e11bc14e5b870d195
Author: Niels Basjes 
AuthorDate: Thu Apr 26 23:08:21 2018 +0200

Moved the 3rd party extensions to a separate page
---
 src/_includes/section-menu/sdks.html  |   1 +
 src/documentation/sdks/java-extensions.md | 198 --
 src/documentation/sdks/java-thirdparty.md | 100 +++
 src/documentation/sdks/java.md|   2 +
 4 files changed, 103 insertions(+), 198 deletions(-)

diff --git a/src/_includes/section-menu/sdks.html 
b/src/_includes/section-menu/sdks.html
index faace4e..729258f 100644
--- a/src/_includes/section-menu/sdks.html
+++ b/src/_includes/section-menu/sdks.html
@@ -9,6 +9,7 @@

alt="External link.">
 
 Java 
SDK extensions
+Java 
3rd party extensions
 Nexmark 
benchmark suite
   
 
diff --git a/src/documentation/sdks/java-extensions.md 
b/src/documentation/sdks/java-extensions.md
index aeabc9f..7742345 100644
--- a/src/documentation/sdks/java-extensions.md
+++ b/src/documentation/sdks/java-extensions.md
@@ -58,201 +58,3 @@ PCollection>>> 
groupedAndSorted =
 grouped.apply(
 SortValues.create(BufferedExternalSorter.options()));
 ```
-
-## Parsing HTTPD/NGINX access logs.
-
-The Apache HTTPD webserver creates logfiles that contain valuable information 
about the requests that have been done to
-the webserver. The format of these config files is a configuration option in 
the Apache HTTPD server so parsing this
-into useful data elements is normally very hard to do.
-
-To solve this problem in an easy way a library was created that works in 
combination with Apache Beam
-and is capable of doing this for both the Apache HTTPD and NGINX.
-
-The basic idea is that the logformat specification is the schema used to 
create the line. 
-THis parser is simply initialized with this schema and the list of fields you 
want to extract.
-
-### Basic usage
-Full documentation can be found here 
[https://github.com/nielsbasjes/logparser](https://github.com/nielsbasjes/logparser)
 
-
-First you put something like this in your pom.xml file:
-
-
-  nl.basjes.parse.httpdlog
-  httpdlog-parser
-  5.0
-
-
-Check 
[https://github.com/nielsbasjes/logparser](https://github.com/nielsbasjes/logparser)
 for the latest version.
-
-Assume we have a logformat variable that looks something like this:
-
-String logformat = "%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" 
\"%{User-Agent}i\"";
-
-**Step 1: What CAN we get from this line?**
-
-To figure out what values we CAN get from this line we instantiate the parser 
with a dummy class
-that does not have ANY @Field annotations or setters. The "Object" class will 
do just fine for this purpose.
-
-Parser dummyParser = new HttpdLoglineParser(Object.class, 
logformat);
-List possiblePaths = dummyParser.getPossiblePaths();
-for (String path: possiblePaths) {
-  System.out.println(path);
-}
-
-You will get a list that looks something like this:
-
-IP:connection.client.host
-NUMBER:connection.client.logname
-STRING:connection.client.user
-TIME.STAMP:request.receive.time
-TIME.DAY:request.receive.time.day
-TIME.MONTHNAME:request.receive.time.monthname
-TIME.MONTH:request.receive.time.month
-TIME.YEAR:request.receive.time.year
-TIME.HOUR:request.receive.time.hour
-TIME.MINUTE:request.receive.time.minute
-TIME.SECOND:request.receive.time.second
-TIME.MILLISECOND:request.receive.time.millisecond
-TIME.ZONE:request.receive.time.timezone
-HTTP.FIRSTLINE:request.firstline
-HTTP.METHOD:request.firstline.method
-HTTP.URI:request.firstline.uri
-HTTP.QUERYSTRING:request.firstline.uri.query
-STRING:request.firstline.uri.query.*
-HTTP.PROTOCOL:request.firstline.protocol
-HTTP.PROTOCOL.VERSION:request.firstline.protocol.version
-STRING:request.status.last
-BYTESCLF:response.body.bytes
-HTTP.URI:request.referer
-HTTP.QUERYSTRING:request.referer.query
-STRING:request.referer.query.*
-HTTP.USERAGENT:request.user-agent
-
-Now some of these lines contain a * .
-This is a wildcard that can be replaced with any 'name' if you need a specific 
value.
-You can also leave the '*' and get everything that is found in the actual log 
line.
-
-**Step 2 Create the receiving POJO**
-
-We need to create the receiving record class that is simply a POJO that does 
not need any interface or inheritance.
-In this class we create setters that will be called when the specified field 
has been found in the line.
-
-So we can now add to this class a setter that simply receives a single value 
as specified using the @Field annotation:
-
-@Field

[beam-site] 01/01: Prepare repository for deployment.

2018-05-01 Thread mergebot-role
This is an automated email from the ASF dual-hosted git repository.

mergebot-role pushed a commit to branch asf-site
in repository https://gitbox.apache.org/repos/asf/beam-site.git

commit c77047075a163d2c19fc999cf717dd659eae315c
Author: Mergebot 
AuthorDate: Tue May 1 10:45:03 2018 -0700

Prepare repository for deployment.
---
 content/documentation/dsls/sql/index.html  |   1 +
 .../sdks/feature-comparison/index.html |   1 +
 .../documentation/sdks/java-extensions/index.html  |   1 +
 .../index.html | 147 ++---
 content/documentation/sdks/java/index.html |   3 +
 content/documentation/sdks/java/nexmark/index.html |   1 +
 .../documentation/sdks/python-custom-io/index.html |   1 +
 .../sdks/python-pipeline-dependencies/index.html   |   1 +
 .../sdks/python-type-safety/index.html |   1 +
 content/documentation/sdks/python/index.html   |   1 +
 10 files changed, 112 insertions(+), 46 deletions(-)

diff --git a/content/documentation/dsls/sql/index.html 
b/content/documentation/dsls/sql/index.html
index ecaf7c6..2f03a80 100644
--- a/content/documentation/dsls/sql/index.html
+++ b/content/documentation/dsls/sql/index.html
@@ -96,6 +96,7 @@

alt="External link.">
 
 Java SDK 
extensions
+Java 3rd party 
extensions
 Nexmark benchmark 
suite
   
 
diff --git a/content/documentation/sdks/feature-comparison/index.html 
b/content/documentation/sdks/feature-comparison/index.html
index f31a33d..0ed38ca 100644
--- a/content/documentation/sdks/feature-comparison/index.html
+++ b/content/documentation/sdks/feature-comparison/index.html
@@ -96,6 +96,7 @@

alt="External link.">
 
 Java SDK 
extensions
+Java 3rd party 
extensions
 Nexmark benchmark 
suite
   
 
diff --git a/content/documentation/sdks/java-extensions/index.html 
b/content/documentation/sdks/java-extensions/index.html
index 880b726..4496ff3 100644
--- a/content/documentation/sdks/java-extensions/index.html
+++ b/content/documentation/sdks/java-extensions/index.html
@@ -96,6 +96,7 @@

alt="External link.">
 
 Java SDK 
extensions
+Java 3rd party 
extensions
 Nexmark benchmark 
suite
   
 
diff --git a/content/documentation/sdks/java-extensions/index.html 
b/content/documentation/sdks/java-thirdparty/index.html
similarity index 67%
copy from content/documentation/sdks/java-extensions/index.html
copy to content/documentation/sdks/java-thirdparty/index.html
index 880b726..0967e4c 100644
--- a/content/documentation/sdks/java-extensions/index.html
+++ b/content/documentation/sdks/java-thirdparty/index.html
@@ -4,7 +4,7 @@
   
   
   
-  Beam Java SDK Extensions
+  Beam 3rd Party Java Extensions
   
   https://fonts.googleapis.com/css?family=Roboto:100,300,400"; 
rel="stylesheet">
@@ -15,7 +15,7 @@
   
   
   
-  https://beam.apache.org/documentation/sdks/java-extensions/"; 
data-proofer-ignore>
+  https://beam.apache.org/documentation/sdks/java-thirdparty/"; 
data-proofer-ignore>
   
   https://beam.apache.org/feed.xml";>
   

[beam-site] branch asf-site updated (6f36ad2 -> c770470)

2018-05-01 Thread mergebot-role
This is an automated email from the ASF dual-hosted git repository.

mergebot-role pushed a change to branch asf-site
in repository https://gitbox.apache.org/repos/asf/beam-site.git.


from 6f36ad2  Prepare repository for deployment.
 add 5e7f1b2  Document Java extensions for parsing Apache HTTPD logfiles 
and Useragent strings
 add 139141c  Fixed some layout issues
 add f8762f6  Moved the 3rd party extensions to a separate page
 add 53ec61b  This closes #423
 new c770470  Prepare repository for deployment.

The 1 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:
 content/documentation/dsls/sql/index.html  |   1 +
 .../sdks/feature-comparison/index.html |   1 +
 .../documentation/sdks/java-extensions/index.html  |   1 +
 .../index.html | 147 ++---
 content/documentation/sdks/java/index.html |   3 +
 content/documentation/sdks/java/nexmark/index.html |   1 +
 .../documentation/sdks/python-custom-io/index.html |   1 +
 .../sdks/python-pipeline-dependencies/index.html   |   1 +
 .../sdks/python-type-safety/index.html |   1 +
 content/documentation/sdks/python/index.html   |   1 +
 src/_includes/section-menu/sdks.html   |   1 +
 src/documentation/sdks/java-thirdparty.md  | 100 ++
 src/documentation/sdks/java.md |   2 +
 13 files changed, 215 insertions(+), 46 deletions(-)
 copy content/documentation/sdks/{java-extensions => 
java-thirdparty}/index.html (67%)
 create mode 100644 src/documentation/sdks/java-thirdparty.md

-- 
To stop receiving notification emails like this one, please contact
mergebot-r...@apache.org.


[jira] [Work logged] (BEAM-4177) Clarify thread constraints in Programming Guide 4.3.2

2018-05-01 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-4177?focusedWorklogId=97157&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-97157
 ]

ASF GitHub Bot logged work on BEAM-4177:


Author: ASF GitHub Bot
Created on: 01/May/18 17:48
Start Date: 01/May/18 17:48
Worklog Time Spent: 10m 
  Work Description: melap commented on issue #430: [BEAM-4177] Clarify 
thread contraint in Programming Guide 4.3.2
URL: https://github.com/apache/beam-site/pull/430#issuecomment-385738105
 
 
   @asfgit merge


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 97157)
Time Spent: 1h 20m  (was: 1h 10m)

> Clarify thread constraints in Programming Guide 4.3.2
> -
>
> Key: BEAM-4177
> URL: https://issues.apache.org/jira/browse/BEAM-4177
> Project: Beam
>  Issue Type: Improvement
>  Components: website
>Reporter: John MacMillan
>Assignee: Melissa Pashniak
>Priority: Trivial
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> In 
> [4.3.2|https://beam.apache.org/documentation/programming-guide/#user-code-thread-compatibility],
>  the sentence "Each instance of your function object is accessed by a single 
> thread on a worker instance, unless you explicitly create your own threads" 
> leaves some ambiguity about whether or not the instance is permanently tied 
> to a single thread, or just restricted to being active on a single thread at 
> a time.
> I suggest a minor change to:
> Each instance of your function object is accessed by a single thread at a 
> time on a worker instance, unless you explicitly create your own threads.
> I will prepare a pull request with this change.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
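The constraint discussed in BEAM-4177 above can be illustrated with a small, self-contained Java sketch. This is a hypothetical stand-in (WordLengthFn is not a real Beam class, and no Beam API is used): the point is only that when calls to one instance never overlap, unsynchronized instance state is safe even though different threads may make the calls over the instance's lifetime.

```java
// Hypothetical stand-in for a user function object; NOT the Beam API.
// The runner guarantees calls to one instance never overlap, so plain
// (unsynchronized) instance state is safe even if different threads
// are used over time.
class WordLengthFn {
    private int elementsSeen = 0;   // unsynchronized instance state

    int process(String element) {
        elementsSeen++;             // safe: one thread at a time
        return element.length();
    }

    int elementsSeen() { return elementsSeen; }
}

public class Main {
    public static void main(String[] args) throws InterruptedException {
        WordLengthFn fn = new WordLengthFn();

        // First "bundle" is processed on thread t1...
        Thread t1 = new Thread(() -> fn.process("hello"));
        t1.start();
        t1.join();   // ...and finishes completely...

        // ...before a different thread t2 touches the same instance.
        Thread t2 = new Thread(() -> fn.process("world"));
        t2.start();
        t2.join();

        // State accumulated across two threads without any locking.
        System.out.println(fn.elementsSeen());   // prints 2
    }
}
```

Here the join() between the two "bundles" plays the role of the runner's guarantee: it establishes a happens-before edge, so the second thread sees the state the first one wrote without any synchronization in the user code.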


[beam-site] 01/02: [BEAM-4177] Clarify thread contraint in Programming Guide 4.3.2

2018-05-01 Thread mergebot-role
This is an automated email from the ASF dual-hosted git repository.

mergebot-role pushed a commit to branch mergebot
in repository https://gitbox.apache.org/repos/asf/beam-site.git

commit 709841aff329b58fe9cd670c32edc9d5f96d1763
Author: John MacMillan 
AuthorDate: Thu Apr 26 10:32:50 2018 -0400

[BEAM-4177] Clarify thread contraint in Programming Guide 4.3.2

As checked on the dev mailing list, the instance will only be active on a
single thread at a time, not the more restrictive single thread ever.

Without clarification, some developers may conservatively assume the 
constraint
is more restrictive than it really is.
---
 src/documentation/programming-guide.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/documentation/programming-guide.md 
b/src/documentation/programming-guide.md
index 31db9be..b94fbee 100644
--- a/src/documentation/programming-guide.md
+++ b/src/documentation/programming-guide.md
@@ -1281,7 +1281,7 @@ Some other serializability factors you should keep in 
mind are:
  4.3.2. Thread-compatibility {#user-code-thread-compatibility}
 
 Your function object should be thread-compatible. Each instance of your 
function
-object is accessed by a single thread on a worker instance, unless you
+object is accessed by a single thread at a time on a worker instance, unless 
you
 explicitly create your own threads. Note, however, that **the Beam SDKs are not
 thread-safe**. If you create your own threads in your user code, you must
 provide your own synchronization. Note that static members in your function

-- 
To stop receiving notification emails like this one, please contact
mergebot-r...@apache.org.


[beam-site] 02/02: This closes #430

2018-05-01 Thread mergebot-role
This is an automated email from the ASF dual-hosted git repository.

mergebot-role pushed a commit to branch mergebot
in repository https://gitbox.apache.org/repos/asf/beam-site.git

commit 1daf7f237af78b9cae54e9e095a9c5c0f7c42f11
Merge: c770470 709841a
Author: Mergebot 
AuthorDate: Tue May 1 10:49:58 2018 -0700

This closes #430

 src/documentation/programming-guide.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

-- 
To stop receiving notification emails like this one, please contact
mergebot-r...@apache.org.


[beam-site] branch mergebot updated (53ec61b -> 1daf7f2)

2018-05-01 Thread mergebot-role
This is an automated email from the ASF dual-hosted git repository.

mergebot-role pushed a change to branch mergebot
in repository https://gitbox.apache.org/repos/asf/beam-site.git.


from 53ec61b  This closes #423
 add c770470  Prepare repository for deployment.
 new 709841a  [BEAM-4177] Clarify thread contraint in Programming Guide 
4.3.2
 new 1daf7f2  This closes #430

The 2 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:
 content/documentation/dsls/sql/index.html  |   1 +
 .../sdks/feature-comparison/index.html |   1 +
 .../documentation/sdks/java-extensions/index.html  |   1 +
 .../index.html | 147 ++---
 content/documentation/sdks/java/index.html |   3 +
 content/documentation/sdks/java/nexmark/index.html |   1 +
 .../documentation/sdks/python-custom-io/index.html |   1 +
 .../sdks/python-pipeline-dependencies/index.html   |   1 +
 .../sdks/python-type-safety/index.html |   1 +
 content/documentation/sdks/python/index.html   |   1 +
 src/documentation/programming-guide.md |   2 +-
 11 files changed, 113 insertions(+), 47 deletions(-)
 copy content/documentation/sdks/{java-extensions => java-thirdparty}/index.html (67%)



[jira] [Work logged] (BEAM-4177) Clarify thread constraints in Programming Guide 4.3.2

2018-05-01 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-4177?focusedWorklogId=97160&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-97160
 ]

ASF GitHub Bot logged work on BEAM-4177:


Author: ASF GitHub Bot
Created on: 01/May/18 17:55
Start Date: 01/May/18 17:55
Worklog Time Spent: 10m 
  Work Description: asfgit commented on issue #430: [BEAM-4177] Clarify thread contraint in Programming Guide 4.3.2
URL: https://github.com/apache/beam-site/pull/430#issuecomment-385740018
 
 
   Error: PR failed in verification; check the Jenkins job for more information.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 97160)
Time Spent: 1.5h  (was: 1h 20m)

> Clarify thread constraints in Programming Guide 4.3.2
> -
>
> Key: BEAM-4177
> URL: https://issues.apache.org/jira/browse/BEAM-4177
> Project: Beam
>  Issue Type: Improvement
>  Components: website
>Reporter: John MacMillan
>Assignee: Melissa Pashniak
>Priority: Trivial
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> In 
> [4.3.2|https://beam.apache.org/documentation/programming-guide/#user-code-thread-compatibility],
>  the sentence "Each instance of your function object is accessed by a single 
> thread on a worker instance, unless you explicitly create your own threads" 
> leaves some ambiguity about whether or not the instance is permanently tied 
> to a single thread, or just restricted to being active on a single thread at 
> a time.
> I suggest a minor change to:
> Each instance of your function object is accessed by a single thread at a 
> time on a worker instance, unless you explicitly create your own threads.
> I will prepare a pull request with this change.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-4177) Clarify thread constraints in Programming Guide 4.3.2

2018-05-01 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-4177?focusedWorklogId=97162&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-97162
 ]

ASF GitHub Bot logged work on BEAM-4177:


Author: ASF GitHub Bot
Created on: 01/May/18 17:56
Start Date: 01/May/18 17:56
Worklog Time Spent: 10m 
  Work Description: melap commented on issue #430: [BEAM-4177] Clarify thread contraint in Programming Guide 4.3.2
URL: https://github.com/apache/beam-site/pull/430#issuecomment-385740318
 
 
   @asfgit merge




Issue Time Tracking
---

Worklog Id: (was: 97162)
Time Spent: 1h 40m  (was: 1.5h)

> Clarify thread constraints in Programming Guide 4.3.2
> -
>
> Key: BEAM-4177
> URL: https://issues.apache.org/jira/browse/BEAM-4177
> Project: Beam
>  Issue Type: Improvement
>  Components: website
>Reporter: John MacMillan
>Assignee: Melissa Pashniak
>Priority: Trivial
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>





[beam-site] 01/02: [BEAM-4177] Clarify thread contraint in Programming Guide 4.3.2

2018-05-01 Thread mergebot-role

mergebot-role pushed a commit to branch mergebot
in repository https://gitbox.apache.org/repos/asf/beam-site.git

commit 03b7de1556c91d24555b560f0a4a262b01a15855
Author: John MacMillan 
AuthorDate: Thu Apr 26 10:32:50 2018 -0400

[BEAM-4177] Clarify thread contraint in Programming Guide 4.3.2

As checked on the dev mailing list, the instance will only be active on a
single thread at a time, not the more restrictive single thread ever.

Without clarification, some developers may conservatively assume the constraint
is more restrictive than it really is.
---
 src/documentation/programming-guide.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/documentation/programming-guide.md b/src/documentation/programming-guide.md
index 31db9be..b94fbee 100644
--- a/src/documentation/programming-guide.md
+++ b/src/documentation/programming-guide.md
@@ -1281,7 +1281,7 @@ Some other serializability factors you should keep in mind are:
  4.3.2. Thread-compatibility {#user-code-thread-compatibility}

 Your function object should be thread-compatible. Each instance of your function
-object is accessed by a single thread on a worker instance, unless you
+object is accessed by a single thread at a time on a worker instance, unless you
 explicitly create your own threads. Note, however, that **the Beam SDKs are not
 thread-safe**. If you create your own threads in your user code, you must
 provide your own synchronization. Note that static members in your function
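The contract in the patched sentence — one thread at a time per instance, with user-created threads needing their own synchronization — can be sketched in plain Python. This is a hypothetical stand-in for a Beam function object; the class and method names are illustrative, not Beam API:

```python
import threading

class CountingFn:
    """Hypothetical function object in the spirit of the guide's example.

    The runner promises each *instance* is invoked by only one thread
    at a time, so runner-driven calls alone would need no lock.  Because
    process() spawns its own thread, it must supply its own
    synchronization (the Beam SDKs are not thread-safe).
    """

    def __init__(self):
        self.count = 0
        # Guards count against the threads we create ourselves.
        self._lock = threading.Lock()

    def process(self, element):
        def side_work():
            with self._lock:
                self.count += 1

        t = threading.Thread(target=side_work)
        t.start()
        t.join()  # wait so this element's increment is visible below
        with self._lock:
            return self.count
```

Without the lock, the increment from the user-created thread could race with reads on the calling thread; the runner's one-thread-at-a-time guarantee covers only its own invocations.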



[beam-site] branch mergebot updated (1daf7f2 -> f7972c2)

2018-05-01 Thread mergebot-role

mergebot-role pushed a change to branch mergebot
in repository https://gitbox.apache.org/repos/asf/beam-site.git.


 discard 1daf7f2  This closes #430
 discard 709841a  [BEAM-4177] Clarify thread contraint in Programming Guide 4.3.2
 new 03b7de1  [BEAM-4177] Clarify thread contraint in Programming Guide 4.3.2
 new f7972c2  This closes #430

This update added new revisions after undoing existing revisions.
That is to say, some revisions that were in the old version of the
branch are not in the new version.  This situation occurs
when a user --force pushes a change and generates a repository
containing something like this:

 * -- * -- B -- O -- O -- O   (1daf7f2)
            \
             N -- N -- N      refs/heads/mergebot (f7972c2)

You should already have received notification emails for all of the O
revisions, and so the following emails describe only the N revisions
from the common base, B.

Any revisions marked "omit" are not gone; other references still
refer to them.  Any revisions marked "discard" are gone forever.
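The discard/new bookkeeping above is just reachability over commit parent links. A toy model (hypothetical: commits as dict keys mapping to parent lists, mirroring the diagram's common base B) shows why the O revisions are "gone forever" once no reference reaches them:

```python
def reachable(parents, tip):
    """Return every commit reachable from tip by walking parent links."""
    seen, stack = set(), [tip]
    while stack:
        commit = stack.pop()
        if commit in seen:
            continue
        seen.add(commit)
        stack.extend(parents.get(commit, []))
    return seen

# History from the email's diagram: B is the common base,
# O1..O3 the old branch (tip 1daf7f2), N1..N3 the rewritten one (tip f7972c2).
parents = {
    "B": [],
    "O1": ["B"], "O2": ["O1"], "O3": ["O2"],
    "N1": ["B"], "N2": ["N1"], "N3": ["N2"],
}

before = reachable(parents, "O3")   # what the ref used to cover
after = reachable(parents, "N3")    # what it covers after the force push
discarded = before - after          # commits no reference points to any more
```

Commits marked "omit" in the email would still be in some other ref's `reachable` set; "discard" commits fall into `discarded` for every ref.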

The 2 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:



[beam-site] 02/02: This closes #430

2018-05-01 Thread mergebot-role

mergebot-role pushed a commit to branch mergebot
in repository https://gitbox.apache.org/repos/asf/beam-site.git

commit f7972c279e1526db63ae9abe132772e70bbf484f
Merge: c770470 03b7de1
Author: Mergebot 
AuthorDate: Tue May 1 10:57:39 2018 -0700

This closes #430

 src/documentation/programming-guide.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)



[jira] [Work logged] (BEAM-3883) Python SDK stages artifacts when talking to job server

2018-05-01 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3883?focusedWorklogId=97166&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-97166
 ]

ASF GitHub Bot logged work on BEAM-3883:


Author: ASF GitHub Bot
Created on: 01/May/18 18:02
Start Date: 01/May/18 18:02
Worklog Time Spent: 10m 
  Work Description: angoenka commented on issue #5251: [BEAM-3883] Refactor and clean dependency.py to make it reusable with artifact service
URL: https://github.com/apache/beam/pull/5251#issuecomment-385742122
 
 
   cc: @jkff 




Issue Time Tracking
---

Worklog Id: (was: 97166)
Time Spent: 40m  (was: 0.5h)

> Python SDK stages artifacts when talking to job server
> --
>
> Key: BEAM-3883
> URL: https://issues.apache.org/jira/browse/BEAM-3883
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-core
>Reporter: Ben Sidhom
>Assignee: Ankur Goenka
>Priority: Major
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> The Python SDK does not currently stage its user-defined functions or 
> dependencies when talking to the job API. Artifacts that need to be staged 
> include the user code itself, any SDK components not included in the 
> container image, and the list of Python packages that must be installed at 
> runtime.
>  
> Artifacts that are currently expected can be found in the harness boot code: 
> [https://github.com/apache/beam/blob/58e3b06bee7378d2d8db1c8dd534b415864f63e1/sdks/python/container/boot.go#L52.]
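The staging work described in this issue amounts to enumerating local files the harness expects and shipping them to the artifact service. A minimal sketch, with file names that are illustrative guesses rather than the actual Beam staging manifest:

```python
import os

def collect_staging_artifacts(workdir,
                              candidates=("requirements.txt",
                                          "setup.py",
                                          "extra_package.tar.gz")):
    """Return the subset of well-known artifact files present in workdir.

    Hypothetical helper: a real implementation would also package the
    user code itself and any SDK components missing from the container
    image, per the harness boot code linked above.
    """
    found = []
    for name in candidates:
        path = os.path.join(workdir, name)
        if os.path.isfile(path):
            found.append(path)
    return found
```

The list produced this way would then be handed to whatever upload RPC the job server exposes; only the enumeration step is sketched here.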





Build failed in Jenkins: beam_PostCommit_Java_GradleBuild #247

2018-05-01 Thread Apache Jenkins Server
See 


Changes:

[Pablo] Add documentaton for quickstart tasks

--
[...truncated 18.45 MB...]
May 01, 2018 5:57:29 PM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-05-01T17:57:28.887Z: Fusing consumer 
PAssert$3/CreateActual/RewindowActuals/Window.Assign into 
PAssert$3/CreateActual/Flatten.Iterables/FlattenIterables/FlatMap
May 01, 2018 5:57:29 PM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-05-01T17:57:28.908Z: Fusing consumer 
PAssert$3/CreateActual/View.AsSingleton/Combine.GloballyAsSingletonView/Combine.globally(Singleton)/Combine.perKey(Singleton)/GroupByKey/Reify
 into 
PAssert$3/CreateActual/View.AsSingleton/Combine.GloballyAsSingletonView/Combine.globally(Singleton)/Combine.perKey(Singleton)/GroupByKey+PAssert$3/CreateActual/View.AsSingleton/Combine.GloballyAsSingletonView/Combine.globally(Singleton)/Combine.perKey(Singleton)/Combine.GroupedValues/Partial
May 01, 2018 5:57:29 PM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-05-01T17:57:28.932Z: Fusing consumer 
PAssert$3/CreateActual/View.AsSingleton/Combine.GloballyAsSingletonView/Combine.globally(Singleton)/Combine.perKey(Singleton)/Combine.GroupedValues
 into 
PAssert$3/CreateActual/View.AsSingleton/Combine.GloballyAsSingletonView/Combine.globally(Singleton)/Combine.perKey(Singleton)/GroupByKey/Read
May 01, 2018 5:57:29 PM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-05-01T17:57:28.953Z: Fusing consumer 
PAssert$3/CreateActual/View.AsSingleton/Combine.GloballyAsSingletonView/Combine.globally(Singleton)/Combine.perKey(Singleton)/GroupByKey/Write
 into 
PAssert$3/CreateActual/View.AsSingleton/Combine.GloballyAsSingletonView/Combine.globally(Singleton)/Combine.perKey(Singleton)/GroupByKey/Reify
May 01, 2018 5:57:29 PM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-05-01T17:57:28.974Z: Fusing consumer 
PAssert$3/CreateActual/View.AsSingleton/Combine.GloballyAsSingletonView/Combine.globally(Singleton)/Combine.perKey(Singleton)/GroupByKey+PAssert$3/CreateActual/View.AsSingleton/Combine.GloballyAsSingletonView/Combine.globally(Singleton)/Combine.perKey(Singleton)/Combine.GroupedValues/Partial
 into 
PAssert$3/CreateActual/View.AsSingleton/Combine.GloballyAsSingletonView/Combine.globally(Singleton)/WithKeys/AddKeys/Map
May 01, 2018 5:57:29 PM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-05-01T17:57:28.998Z: Fusing consumer 
PAssert$3/CreateActual/ParDo(Anonymous) into 
PAssert$3/CreateActual/RewindowActuals/Window.Assign
May 01, 2018 5:57:29 PM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-05-01T17:57:29.019Z: Fusing consumer 
PAssert$3/CreateActual/View.AsSingleton/Combine.GloballyAsSingletonView/Combine.globally(Singleton)/WithKeys/AddKeys/Map
 into PAssert$3/CreateActual/ParDo(Anonymous)
May 01, 2018 5:57:29 PM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-05-01T17:57:29.040Z: Fusing consumer 
Combine.globally(Count)/ProduceDefault into 
Combine.globally(Count)/CreateVoid/Read(CreateSource)
May 01, 2018 5:57:29 PM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-05-01T17:57:29.051Z: Fusing consumer 
Combine.globally(Count)/View.AsIterable/ParDo(ToIsmRecordForGlobalWindow) into 
Combine.globally(Count)/Values/Values/Map
May 01, 2018 5:57:29 PM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-05-01T17:57:29.062Z: Fusing consumer 
Combine.globally(Count)/Combine.perKey(Count)/Combine.GroupedValues into 
Combine.globally(Count)/Combine.perKey(Count)/GroupByKey/Read
May 01, 2018 5:57:29 PM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-05-01T17:57:29.084Z: Fusing consumer 
Combine.globally(Count)/Combine.perKey(Count)/GroupByKey+Combine.globally(Count)/Combine.perKey(Count)/Combine.GroupedValues/Partial
 into Combine.globally(Count)/WithKeys/AddKeys/Map
May 01, 2018 5:57:29 PM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-05-01T17:57:29.105Z: Fusing consumer 
DatastoreV1.Read/Reshuffle/Reshuffle/GroupByKey/GroupByWindow into 
DatastoreV1.Read/Reshuffle/Reshuffle/GroupByKey/Read
May 01, 2018 5:57:29 PM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-05-01T17:57:29.126Z: Fusing consumer 
Combine.globally(Count)/Combine.perKey(Count)/GroupByKey/Reify into 
Combine.globally(Count)/Combine.perKey(Count)/GroupByKey+Combine.globally(Count)/Combine.perKey(Count)/Combine.GroupedValues/Partial

[beam-site] 01/01: Prepare repository for deployment.

2018-05-01 Thread mergebot-role

mergebot-role pushed a commit to branch asf-site
in repository https://gitbox.apache.org/repos/asf/beam-site.git

commit caae35d3898ca9c5a8768997d96efcb20112a77a
Author: Mergebot 
AuthorDate: Tue May 1 11:04:22 2018 -0700

Prepare repository for deployment.
---
 content/documentation/programming-guide/index.html | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/content/documentation/programming-guide/index.html b/content/documentation/programming-guide/index.html
index 8fb9cd6..2b4fa76 100644
--- a/content/documentation/programming-guide/index.html
+++ b/content/documentation/programming-guide/index.html
@@ -1705,7 +1705,7 @@ that apply to the function object itself also apply to this outer class.
 4.3.2. Thread-compatibility

 Your function object should be thread-compatible. Each instance of your function
-object is accessed by a single thread on a worker instance, unless you
+object is accessed by a single thread at a time on a worker instance, unless you
 explicitly create your own threads. Note, however, that the Beam SDKs are not
 thread-safe. If you create your own threads in your user code, you must
 provide your own synchronization. Note that static members in your function



[beam-site] branch asf-site updated (c770470 -> caae35d)

2018-05-01 Thread mergebot-role

mergebot-role pushed a change to branch asf-site
in repository https://gitbox.apache.org/repos/asf/beam-site.git.


from c770470  Prepare repository for deployment.
 add 03b7de1  [BEAM-4177] Clarify thread contraint in Programming Guide 4.3.2
 add f7972c2  This closes #430
 new caae35d  Prepare repository for deployment.

The 1 revision listed above as "new" is entirely new to this
repository and will be described in a separate email.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:
 content/documentation/programming-guide/index.html | 2 +-
 src/documentation/programming-guide.md | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)



[jira] [Work logged] (BEAM-4177) Clarify thread constraints in Programming Guide 4.3.2

2018-05-01 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-4177?focusedWorklogId=97168&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-97168
 ]

ASF GitHub Bot logged work on BEAM-4177:


Author: ASF GitHub Bot
Created on: 01/May/18 18:04
Start Date: 01/May/18 18:04
Worklog Time Spent: 10m 
  Work Description: asfgit closed pull request #430: [BEAM-4177] Clarify thread contraint in Programming Guide 4.3.2
URL: https://github.com/apache/beam-site/pull/430
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is supplied below for
the sake of provenance:

diff --git a/src/documentation/programming-guide.md b/src/documentation/programming-guide.md
index 31db9bec2..b94fbeecf 100644
--- a/src/documentation/programming-guide.md
+++ b/src/documentation/programming-guide.md
@@ -1281,7 +1281,7 @@ Some other serializability factors you should keep in mind are:
  4.3.2. Thread-compatibility {#user-code-thread-compatibility}

 Your function object should be thread-compatible. Each instance of your function
-object is accessed by a single thread on a worker instance, unless you
+object is accessed by a single thread at a time on a worker instance, unless you
 explicitly create your own threads. Note, however, that **the Beam SDKs are not
 thread-safe**. If you create your own threads in your user code, you must
 provide your own synchronization. Note that static members in your function


 




Issue Time Tracking
---

Worklog Id: (was: 97168)
Time Spent: 1h 50m  (was: 1h 40m)

> Clarify thread constraints in Programming Guide 4.3.2
> -
>
> Key: BEAM-4177
> URL: https://issues.apache.org/jira/browse/BEAM-4177
> Project: Beam
>  Issue Type: Improvement
>  Components: website
>Reporter: John MacMillan
>Assignee: Melissa Pashniak
>Priority: Trivial
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>





[jira] [Closed] (BEAM-3983) BigQuery writes from pure SQL

2018-05-01 Thread Andrew Pilloud (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3983?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Pilloud closed BEAM-3983.

   Resolution: Fixed
Fix Version/s: 2.5.0

> BigQuery writes from pure SQL
> -
>
> Key: BEAM-3983
> URL: https://issues.apache.org/jira/browse/BEAM-3983
> Project: Beam
>  Issue Type: New Feature
>  Components: dsl-sql
>Reporter: Andrew Pilloud
>Assignee: Andrew Pilloud
>Priority: Major
> Fix For: 2.5.0
>
>  Time Spent: 12h 50m
>  Remaining Estimate: 0h
>
> It would be nice if you could write to BigQuery in SQL without writing any 
> java code. For example:
> {code:java}
> INSERT INTO bigquery SELECT * FROM PCOLLECTION{code}





Build failed in Jenkins: beam_PostCommit_Java_GradleBuild #248

2018-05-01 Thread Apache Jenkins Server
See 


--
[...truncated 746.04 KB...]
Build cache key for task ':beam-sdks-java-io-amazon-web-services:findbugsMain' 
is e13415ca1148ba00d4dbfcd1be4508d1
Task ':beam-sdks-java-io-amazon-web-services:findbugsMain' is not up-to-date 
because:
  No history is available.
Starting process 'Gradle FindBugs Worker 17'. Working directory: 

 Command: /usr/local/asfpackages/java/jdk1.8.0_152/bin/java 
-Djava.security.manager=worker.org.gradle.process.internal.worker.child.BootstrapSecurityManager
 -Dfile.encoding=UTF-8 -Duser.country=US -Duser.language=en -Duser.variant -cp 
/home/jenkins/.gradle/caches/4.7/workerMain/gradle-worker.jar 
worker.org.gradle.process.internal.worker.GradleWorkerMain 'Gradle FindBugs 
Worker 17'
Successfully started process 'Gradle FindBugs Worker 17'

> Task :beam-sdks-java-fn-execution:findbugsMain
Packing task ':beam-sdks-java-fn-execution:findbugsMain'
:beam-sdks-java-fn-execution:findbugsMain (Thread[Task worker for ':' Thread 
6,5,main]) completed. Took 9.028 secs.
:beam-sdks-java-fn-execution:test (Thread[Task worker for ':' Thread 6,5,main]) 
started.
Gradle Test Executor 18 started executing tests.

> Task :beam-sdks-java-fn-execution:test
Build cache key for task ':beam-sdks-java-fn-execution:test' is 
89d28dc601d4eb40cf374a0362f7bbf8
Task ':beam-sdks-java-fn-execution:test' is not up-to-date because:
  No history is available.
Starting process 'Gradle Test Executor 18'. Working directory: 

 Command: /usr/local/asfpackages/java/jdk1.8.0_152/bin/java 
-Djava.security.manager=worker.org.gradle.process.internal.worker.child.BootstrapSecurityManager
 -Dorg.gradle.native=false -Dfile.encoding=UTF-8 -Duser.country=US 
-Duser.language=en -Duser.variant -ea -cp 
/home/jenkins/.gradle/caches/4.7/workerMain/gradle-worker.jar 
worker.org.gradle.process.internal.worker.GradleWorkerMain 'Gradle Test 
Executor 18'
Successfully started process 'Gradle Test Executor 18'

> Task :beam-sdks-java-io-amazon-web-services:findbugsMain
Packing task ':beam-sdks-java-io-amazon-web-services:findbugsMain'
:beam-sdks-java-io-amazon-web-services:findbugsMain (Thread[Daemon 
worker,5,main]) completed. Took 7.309 secs.
:beam-sdks-java-io-amazon-web-services:test (Thread[Daemon worker,5,main]) 
started.
Gradle Test Executor 19 started executing tests.

> Task :beam-sdks-java-fn-execution:test

org.apache.beam.sdk.fn.channel.ManagedChannelFactoryTest > 
testEpollDomainSocketChannel SKIPPED

org.apache.beam.sdk.fn.channel.ManagedChannelFactoryTest > 
testEpollHostPortChannel SKIPPED

Gradle Test Executor 19 finished executing tests.

> Task :beam-sdks-java-io-amazon-web-services:test
Build cache key for task ':beam-sdks-java-io-amazon-web-services:test' is 
f13f52ee552ff8d698bb23c76422241b
Task ':beam-sdks-java-io-amazon-web-services:test' is not up-to-date because:
  No history is available.
Starting process 'Gradle Test Executor 19'. Working directory: 

 Command: /usr/local/asfpackages/java/jdk1.8.0_152/bin/java 
-Djava.security.manager=worker.org.gradle.process.internal.worker.child.BootstrapSecurityManager
 -Dorg.gradle.native=false -Dfile.encoding=UTF-8 -Duser.country=US 
-Duser.language=en -Duser.variant -ea -cp 
/home/jenkins/.gradle/caches/4.7/workerMain/gradle-worker.jar 
worker.org.gradle.process.internal.worker.GradleWorkerMain 'Gradle Test 
Executor 19'
Successfully started process 'Gradle Test Executor 19'

org.apache.beam.sdk.io.aws.s3.S3ResourceIdTest > testInvalidBucket SKIPPED

org.apache.beam.sdk.io.aws.s3.S3ResourceIdTest > 
testInvalidBucketWithUnderscore SKIPPED

org.apache.beam.sdk.io.aws.s3.S3ResourceIdTest > testResourceIdTester SKIPPED
Finished generating test XML results (0.001 secs) into: 

Generating HTML test report...
Finished generating test html results (0.004 secs) into: 

Packing task ':beam-sdks-java-io-amazon-web-services:test'
:beam-sdks-java-io-amazon-web-services:test (Thread[Daemon worker,5,main]) 
completed. Took 2.103 secs.
:beam-sdks-java-io-amazon-web-services:check (Thread[Daemon worker,5,main]) 
started.

> Task :beam-sdks-java-io-amazon-web-services:check
Skipping task ':beam-sdks-java-io-amazon-web-services:check' as it has no 
actions.
:beam-sdks-java-io-amazon-web-services:check (Thread[Daemon worker,5,main]) 
completed. Took 0.0 secs.
:beam-sdks-java-io-amazon-web-services:build (Thread[D

[jira] [Work logged] (BEAM-4044) Take advantage of Calcite DDL

2018-05-01 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-4044?focusedWorklogId=97171&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-97171
 ]

ASF GitHub Bot logged work on BEAM-4044:


Author: ASF GitHub Bot
Created on: 01/May/18 18:15
Start Date: 01/May/18 18:15
Worklog Time Spent: 10m 
  Work Description: apilloud opened a new pull request #5254: [BEAM-4044] [SQL] Simplify TableProvider interface
URL: https://github.com/apache/beam/pull/5254
 
 
   This should make it clear that tables are in memory. Also deleted some unused functions from the interface.
   
   
   
   Follow this checklist to help us incorporate your contribution quickly and 
easily:
   
- [X] Make sure there is a [JIRA 
issue](https://issues.apache.org/jira/projects/BEAM/issues/) filed for the 
change (usually before you start working on it).  Trivial changes like typos do 
not require a JIRA issue.  Your pull request should address just this issue, 
without pulling in other changes.
- [X] Format the pull request title like `[BEAM-XXX] Fixes bug in 
ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA 
issue.
- [X] Write a pull request description that is detailed enough to 
understand:
  - [X] What the pull request does
  - [X] Why it does it
  - [X] How it does it
  - [X] Why this approach
- [X] Each commit in the pull request should have a meaningful subject line 
and body.
- [X] Run `./gradlew build` to make sure basic checks pass. A more thorough 
check will be performed on your pull request automatically.
- [ ] If this contribution is large, please file an Apache [Individual 
Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).
   
   




Issue Time Tracking
---

Worklog Id: (was: 97171)
Time Spent: 9h  (was: 8h 50m)

> Take advantage of Calcite DDL
> -
>
> Key: BEAM-4044
> URL: https://issues.apache.org/jira/browse/BEAM-4044
> Project: Beam
>  Issue Type: New Feature
>  Components: dsl-sql
>Reporter: Andrew Pilloud
>Assignee: Andrew Pilloud
>Priority: Major
>  Time Spent: 9h
>  Remaining Estimate: 0h
>
> In Calcite 1.15 support for abstract DDL moved into calcite core. We should 
> take advantage of that.





Build failed in Jenkins: beam_PerformanceTests_Python #1216

2018-05-01 Thread Apache Jenkins Server
See 


Changes:

[tgroh] Mark some ReduceFnRunner arguments Nullable

[Pablo] Add documentaton for quickstart tasks

--
[...truncated 4.42 KB...]
  Using cached 
https://files.pythonhosted.org/packages/a2/71/8273a7eeed0aff6a854237ab5453bc9aa67deb49df4832801c21f0ff3782/contextlib2-0.5.5-py2.py3-none-any.whl
Collecting pywinrm (from -r PerfKitBenchmarker/requirements.txt (line 25))
  Using cached 
https://files.pythonhosted.org/packages/0d/12/13a3117bbd2230043aa32dcfa2198c33269665eaa1a8fa26174ce49b338f/pywinrm-0.3.0-py2.py3-none-any.whl
Requirement already satisfied: six in /usr/local/lib/python2.7/dist-packages 
(from absl-py->-r PerfKitBenchmarker/requirements.txt (line 14)) (1.11.0)
Requirement already satisfied: MarkupSafe>=0.23 in 
/usr/local/lib/python2.7/dist-packages (from jinja2>=2.7->-r 
PerfKitBenchmarker/requirements.txt (line 15)) (1.0)
Collecting colorama; extra == "windows" (from colorlog[windows]==2.6.0->-r 
PerfKitBenchmarker/requirements.txt (line 17))
  Using cached 
https://files.pythonhosted.org/packages/db/c8/7dcf9dbcb22429512708fe3a547f8b6101c0d02137acbd892505aee57adf/colorama-0.3.9-py2.py3-none-any.whl
Collecting requests-ntlm>=0.3.0 (from pywinrm->-r 
PerfKitBenchmarker/requirements.txt (line 25))
  Using cached 
https://files.pythonhosted.org/packages/03/4b/8b9a1afde8072c4d5710d9fa91433d504325821b038e00237dc8d6d833dc/requests_ntlm-1.1.0-py2.py3-none-any.whl
Requirement already satisfied: requests>=2.9.1 in 
/usr/local/lib/python2.7/dist-packages (from pywinrm->-r 
PerfKitBenchmarker/requirements.txt (line 25)) (2.18.4)
Collecting xmltodict (from pywinrm->-r PerfKitBenchmarker/requirements.txt 
(line 25))
  Using cached 
https://files.pythonhosted.org/packages/42/a9/7e99652c6bc619d19d58cdd8c47560730eb5825d43a7e25db2e1d776ceb7/xmltodict-0.11.0-py2.py3-none-any.whl
Requirement already satisfied: cryptography>=1.3 in 
/usr/local/lib/python2.7/dist-packages (from requests-ntlm>=0.3.0->pywinrm->-r 
PerfKitBenchmarker/requirements.txt (line 25)) (2.2.2)
Collecting ntlm-auth>=1.0.2 (from requests-ntlm>=0.3.0->pywinrm->-r 
PerfKitBenchmarker/requirements.txt (line 25))
  Using cached 
https://files.pythonhosted.org/packages/69/bc/230987c0dc22c763529330b2e669dbdba374d6a10c1f61232274184731be/ntlm_auth-1.1.0-py2.py3-none-any.whl
Requirement already satisfied: certifi>=2017.4.17 in 
/usr/local/lib/python2.7/dist-packages (from requests>=2.9.1->pywinrm->-r 
PerfKitBenchmarker/requirements.txt (line 25)) (2018.4.16)
Requirement already satisfied: chardet<3.1.0,>=3.0.2 in 
/usr/local/lib/python2.7/dist-packages (from requests>=2.9.1->pywinrm->-r 
PerfKitBenchmarker/requirements.txt (line 25)) (3.0.4)
Requirement already satisfied: idna<2.7,>=2.5 in 
/usr/local/lib/python2.7/dist-packages (from requests>=2.9.1->pywinrm->-r 
PerfKitBenchmarker/requirements.txt (line 25)) (2.6)
Requirement already satisfied: urllib3<1.23,>=1.21.1 in 
/usr/local/lib/python2.7/dist-packages (from requests>=2.9.1->pywinrm->-r 
PerfKitBenchmarker/requirements.txt (line 25)) (1.22)
Requirement already satisfied: cffi>=1.7; platform_python_implementation != 
"PyPy" in /usr/local/lib/python2.7/dist-packages (from 
cryptography>=1.3->requests-ntlm>=0.3.0->pywinrm->-r 
PerfKitBenchmarker/requirements.txt (line 25)) (1.11.5)
Requirement already satisfied: enum34; python_version < "3" in 
/usr/local/lib/python2.7/dist-packages (from 
cryptography>=1.3->requests-ntlm>=0.3.0->pywinrm->-r 
PerfKitBenchmarker/requirements.txt (line 25)) (1.1.6)
Requirement already satisfied: asn1crypto>=0.21.0 in 
/usr/local/lib/python2.7/dist-packages (from 
cryptography>=1.3->requests-ntlm>=0.3.0->pywinrm->-r 
PerfKitBenchmarker/requirements.txt (line 25)) (0.24.0)
Requirement already satisfied: ipaddress; python_version < "3" in 
/usr/local/lib/python2.7/dist-packages (from 
cryptography>=1.3->requests-ntlm>=0.3.0->pywinrm->-r 
PerfKitBenchmarker/requirements.txt (line 25)) (1.0.22)
Requirement already satisfied: pycparser in 
/usr/local/lib/python2.7/dist-packages (from cffi>=1.7; 
platform_python_implementation != 
"PyPy"->cryptography>=1.3->requests-ntlm>=0.3.0->pywinrm->-r 
PerfKitBenchmarker/requirements.txt (line 25)) (2.18)
Installing collected packages: absl-py, colorama, colorlog, blinker, futures, 
pint, numpy, contextlib2, ntlm-auth, requests-ntlm, xmltodict, pywinrm
Successfully installed absl-py-0.2.0 blinker-1.4 colorama-0.3.9 colorlog-2.6.0 
contextlib2-0.5.5 futures-3.2.0 ntlm-auth-1.1.0 numpy-1.13.3 pint-0.8.1 
pywinrm-0.3.0 requests-ntlm-1.1.0 xmltodict-0.11.0
[beam_PerformanceTests_Python] $ /bin/bash -xe 
/tmp/jenkins3393256317851533590.sh
+ .env/bin/pip install -e 'src/sdks/python/[gcp,test]'
Obtaining 
file://
Collecting avro<2.0.0,>=1.8.1 (from apache-beam==2.5.0.dev0)
Requirement already satisfied: crcmod<2.0,>=1
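Specifiers like `chardet<3.1.0,>=3.0.2` in the log above are comma-separated clauses that must all hold for an installed version to satisfy the requirement. As a toy illustration of how such a specifier is evaluated (a deliberately simplified subset of PEP 440 — real resolution should use pip or the `packaging` library), consider:

```python
def parse_version(v):
    # Toy dotted-numeric parser; real pip uses full PEP 440 parsing.
    return tuple(int(part) for part in v.split("."))


def satisfies(version, spec):
    """Check `version` against a comma-ANDed specifier such as '<3.1.0,>=3.0.2'.

    Simplified subset of PEP 440: only numeric versions and <, <=, >, >=, ==.
    """
    v = parse_version(version)
    for clause in spec.split(","):
        op = clause.rstrip("0123456789.")          # operator prefix, e.g. ">="
        bound = parse_version(clause[len(op):])     # numeric bound, e.g. (3, 0, 2)
        ok = {"<": v < bound, "<=": v <= bound,
              ">": v > bound, ">=": v >= bound,
              "==": v == bound}[op]
        if not ok:
            return False                            # clauses are ANDed
    return True


# Specifiers taken from the log above:
print(satisfies("3.0.4", "<3.1.0,>=3.0.2"))   # chardet 3.0.4 -> True
print(satisfies("1.22", "<1.23,>=1.21.1"))    # urllib3 1.22 -> True
```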

[jira] [Work logged] (BEAM-4044) Take advantage of Calcite DDL

2018-05-01 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-4044?focusedWorklogId=97172&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-97172
 ]

ASF GitHub Bot logged work on BEAM-4044:


Author: ASF GitHub Bot
Created on: 01/May/18 18:17
Start Date: 01/May/18 18:17
Worklog Time Spent: 10m 
  Work Description: apilloud commented on issue #5254: [BEAM-4044] [SQL] 
Simplify TableProvider interface
URL: https://github.com/apache/beam/pull/5254#issuecomment-385746088
 
 
   R: @kennknowles This is some cleanup you requested in #5220. Should make it 
clear why dropTable is a no-op.
   cc: @akedin 


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 97172)
Time Spent: 9h 10m  (was: 9h)

> Take advantage of Calcite DDL
> -
>
> Key: BEAM-4044
> URL: https://issues.apache.org/jira/browse/BEAM-4044
> Project: Beam
>  Issue Type: New Feature
>  Components: dsl-sql
>Reporter: Andrew Pilloud
>Assignee: Andrew Pilloud
>Priority: Major
>  Time Spent: 9h 10m
>  Remaining Estimate: 0h
>
> In Calcite 1.15, support for abstract DDL moved into Calcite core. We should 
> take advantage of that.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Build failed in Jenkins: beam_PerformanceTests_JDBC #520

2018-05-01 Thread Apache Jenkins Server
See 


Changes:

[tgroh] Mark some ReduceFnRunner arguments Nullable

[Pablo] Add documentation for quickstart tasks

--
[...truncated 921.54 KB...]
[INFO] Excluding io.opencensus:opencensus-contrib-grpc-util:jar:0.7.0 from the 
shaded jar.
[INFO] Excluding io.dropwizard.metrics:metrics-core:jar:3.1.2 from the shaded 
jar.
[INFO] Excluding com.google.protobuf:protobuf-java:jar:3.2.0 from the shaded 
jar.
[INFO] Excluding io.netty:netty-tcnative-boringssl-static:jar:1.1.33.Fork26 
from the shaded jar.
[INFO] Excluding 
com.google.api.grpc:proto-google-cloud-spanner-admin-database-v1:jar:0.1.9 from 
the shaded jar.
[INFO] Excluding com.google.api.grpc:proto-google-common-protos:jar:0.1.9 from 
the shaded jar.
[INFO] Excluding com.google.api-client:google-api-client:jar:1.23.0 from the 
shaded jar.
[INFO] Excluding com.google.oauth-client:google-oauth-client:jar:1.23.0 from 
the shaded jar.
[INFO] Excluding com.google.http-client:google-http-client:jar:1.23.0 from the 
shaded jar.
[INFO] Excluding org.apache.httpcomponents:httpclient:jar:4.0.1 from the shaded 
jar.
[INFO] Excluding org.apache.httpcomponents:httpcore:jar:4.0.1 from the shaded 
jar.
[INFO] Excluding commons-codec:commons-codec:jar:1.3 from the shaded jar.
[INFO] Excluding com.google.http-client:google-http-client-jackson2:jar:1.23.0 
from the shaded jar.
[INFO] Excluding 
com.google.apis:google-api-services-dataflow:jar:v1b3-rev221-1.23.0 from the 
shaded jar.
[INFO] Excluding 
com.google.apis:google-api-services-clouddebugger:jar:v2-rev233-1.23.0 from the 
shaded jar.
[INFO] Excluding 
com.google.apis:google-api-services-storage:jar:v1-rev124-1.23.0 from the 
shaded jar.
[INFO] Excluding com.google.auth:google-auth-library-credentials:jar:0.7.1 from 
the shaded jar.
[INFO] Excluding com.google.auth:google-auth-library-oauth2-http:jar:0.7.1 from 
the shaded jar.
[INFO] Excluding com.google.cloud.bigdataoss:util:jar:1.4.5 from the shaded jar.
[INFO] Excluding com.google.api-client:google-api-client-java6:jar:1.23.0 from 
the shaded jar.
[INFO] Excluding com.google.api-client:google-api-client-jackson2:jar:1.23.0 
from the shaded jar.
[INFO] Excluding com.google.oauth-client:google-oauth-client-java6:jar:1.23.0 
from the shaded jar.
[INFO] Replacing original artifact with shaded artifact.
[INFO] Replacing 

 with 

[INFO] Replacing original test artifact with shaded test artifact.
[INFO] Replacing 

 with 

[INFO] Dependency-reduced POM written at: 

[INFO] 
[INFO] --- maven-failsafe-plugin:2.21.0:integration-test (default) @ 
beam-sdks-java-io-jdbc ---
[INFO] Failsafe report directory: 

[INFO] parallel='all', perCoreThreadCount=true, threadCount=4, 
useUnlimitedThreads=false, threadCountSuites=0, threadCountClasses=0, 
threadCountMethods=0, parallelOptimized=true
[INFO] 
[INFO] ---
[INFO]  T E S T S
[INFO] ---
[INFO] Running org.apache.beam.sdk.io.jdbc.JdbcIOIT
[ERROR] Tests run: 2, Failures: 0, Errors: 2, Skipped: 0, Time elapsed: 0 s <<< 
FAILURE! - in org.apache.beam.sdk.io.jdbc.JdbcIOIT
[ERROR] org.apache.beam.sdk.io.jdbc.JdbcIOIT  Time elapsed: 0 s  <<< ERROR!
org.postgresql.util.PSQLException: The connection attempt failed.
at 
org.postgresql.core.v3.ConnectionFactoryImpl.openConnectionImpl(ConnectionFactoryImpl.java:272)
at 
org.postgresql.core.ConnectionFactory.openConnection(ConnectionFactory.java:51)
at org.postgresql.jdbc.PgConnection.&lt;init&gt;(PgConnection.java:215)
at org.postgresql.Driver.makeConnection(Driver.java:404)
at org.postgresql.Driver.connect(Driver.java:272)
at java.sql.DriverManager.getConnection(DriverManager.java:664)
at java.sql.DriverManager.getConnection(DriverManager.java:247)
at 
org.postgresql.ds.common.BaseDataSource.getConnection(BaseDataSource.java:86)
at 
org.postgresql.ds.common.BaseDataSource.getConnection(BaseDataSource.java:71)
at 
org.apache.beam.sdk.io.comm

[jira] [Created] (BEAM-4218) Javadoc build failing after beam/pull/5181

2018-05-01 Thread Alan Myrvold (JIRA)
Alan Myrvold created BEAM-4218:
--

 Summary: Javadoc build failing after beam/pull/5181
 Key: BEAM-4218
 URL: https://issues.apache.org/jira/browse/BEAM-4218
 Project: Beam
  Issue Type: Bug
  Components: runner-direct
Affects Versions: 2.5.0
Reporter: Alan Myrvold
Assignee: Alan Myrvold
 Fix For: 2.5.0


The javadoc build is failing after beam/pull/5181
/home/jenkins/jenkins-slave/workspace/beam_Release_Gradle_NightlySnapshot/src/runners/local-java/src/main/java/org/apache/beam/runners/local/Bundle.java:53:
 error: reference not found
   * processing time \{@link TimerData timer} at the time this bundle was 
committed, including any



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (BEAM-4057) Ensure generated pom don't break consumers

2018-05-01 Thread Scott Wegner (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-4057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16459977#comment-16459977
 ] 

Scott Wegner commented on BEAM-4057:


It seems like the high-order item here is "Ensure Gradle can produce a valid 
release".

Let's keep this open until we've validated the 2.5.0 release. [~kenn] / 
[~romain.manni-bucau], do we have a standard release validation set? For 2.3.0 
I see https://s.apache.org/beam-2.3.0-release-validation. If there are 
additional items to validate, let's propose adding them to the standard release 
validation so we ensure we uphold them for every release.

> Ensure generated pom don't break consumers
> --
>
> Key: BEAM-4057
> URL: https://issues.apache.org/jira/browse/BEAM-4057
> Project: Beam
>  Issue Type: Sub-task
>  Components: build-system
>Reporter: Romain Manni-Bucau
>Priority: Major
>
> Off the top of my head, here are the requirements:
> 1. dependencies are all there (all scopes and well scoped: this means that 
> provided or test dependencies are not in compile scope, for instance)
> 2. META-INF should contain the pom.xml and pom.properties as Maven generates 
> them (they are consumed by tools and libraries to grab the dependencies or scan 
> some classpath/lib folder)
> 3. ensure at least the compiler plugin is defined with the Java 
> version + compiler flags (one usage is to check whether -parameters is activated, 
> for instance)
> 4. (nice to have) don't put all the boilerplate in all poms (license, etc.) but 
> keep it in the parent pom as before
> 5. (if possible) respect the hierarchy (parents) - this is sometimes used as 
> a shortcut for dependency analysis because it is faster than analyzing the 
> dependencies; probably not the best practice ever, but it is efficient in 
> general
> 6. ensure metadata used by mainstream tools like mvnrepository is present 
> (description etc.; should be a passthrough from Gradle)



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-4218) Javadoc build failing after beam/pull/5181

2018-05-01 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-4218?focusedWorklogId=97174&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-97174
 ]

ASF GitHub Bot logged work on BEAM-4218:


Author: ASF GitHub Bot
Created on: 01/May/18 18:22
Start Date: 01/May/18 18:22
Worklog Time Spent: 10m 
  Work Description: alanmyrvold opened a new pull request #5255: 
[BEAM-4218] Fix failing javadoc build
URL: https://github.com/apache/beam/pull/5255
 
 
   Fix failing javadoc build
   
   
   
   Follow this checklist to help us incorporate your contribution quickly and 
easily:
   
- [ ] Make sure there is a [JIRA 
issue](https://issues.apache.org/jira/projects/BEAM/issues/) filed for the 
change (usually before you start working on it).  Trivial changes like typos do 
not require a JIRA issue.  Your pull request should address just this issue, 
without pulling in other changes.
- [ ] Format the pull request title like `[BEAM-XXX] Fixes bug in 
ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA 
issue.
- [ ] Write a pull request description that is detailed enough to 
understand:
  - [ ] What the pull request does
  - [ ] Why it does it
  - [ ] How it does it
  - [ ] Why this approach
- [ ] Each commit in the pull request should have a meaningful subject line 
and body.
- [ ] Run `./gradlew build` to make sure basic checks pass. A more thorough 
check will be performed on your pull request automatically.
- [ ] If this contribution is large, please file an Apache [Individual 
Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).
   
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 97174)
Time Spent: 10m
Remaining Estimate: 0h

> Javadoc build failing after beam/pull/5181
> --
>
> Key: BEAM-4218
> URL: https://issues.apache.org/jira/browse/BEAM-4218
> Project: Beam
>  Issue Type: Bug
>  Components: runner-direct
>Affects Versions: 2.5.0
>Reporter: Alan Myrvold
>Assignee: Alan Myrvold
>Priority: Major
> Fix For: 2.5.0
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> The javadoc build is failing after beam/pull/5181
> /home/jenkins/jenkins-slave/workspace/beam_Release_Gradle_NightlySnapshot/src/runners/local-java/src/main/java/org/apache/beam/runners/local/Bundle.java:53:
>  error: reference not found
>* processing time \{@link TimerData timer} at the time this bundle was 
> committed, including any



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-4044) Take advantage of Calcite DDL

2018-05-01 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-4044?focusedWorklogId=97176&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-97176
 ]

ASF GitHub Bot logged work on BEAM-4044:


Author: ASF GitHub Bot
Created on: 01/May/18 18:23
Start Date: 01/May/18 18:23
Worklog Time Spent: 10m 
  Work Description: akedin commented on a change in pull request #5254: 
[BEAM-4044] [SQL] Simplify TableProvider interface
URL: https://github.com/apache/beam/pull/5254#discussion_r185294687
 
 

 ##
 File path: 
sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/meta/provider/bigquery/BigQueryTableProvider.java
 ##
 @@ -41,7 +39,7 @@
  * LOCATION '[PROJECT_ID]:[DATASET].[TABLE]'
  * }
  */
-public class BigQueryTableProvider implements TableProvider {
+public class BigQueryTableProvider extends InMemoryTableProvider {
 
 Review comment:
   it reads as if the BigQuery table (with all its data) is going to be 
in-memory; is that the case?


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 97176)
Time Spent: 9h 20m  (was: 9h 10m)

> Take advantage of Calcite DDL
> -
>
> Key: BEAM-4044
> URL: https://issues.apache.org/jira/browse/BEAM-4044
> Project: Beam
>  Issue Type: New Feature
>  Components: dsl-sql
>Reporter: Andrew Pilloud
>Assignee: Andrew Pilloud
>Priority: Major
>  Time Spent: 9h 20m
>  Remaining Estimate: 0h
>
> In Calcite 1.15, support for abstract DDL moved into Calcite core. We should 
> take advantage of that.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-4218) Javadoc build failing after beam/pull/5181

2018-05-01 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-4218?focusedWorklogId=97175&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-97175
 ]

ASF GitHub Bot logged work on BEAM-4218:


Author: ASF GitHub Bot
Created on: 01/May/18 18:23
Start Date: 01/May/18 18:23
Worklog Time Spent: 10m 
  Work Description: alanmyrvold commented on issue #5255: [BEAM-4218] Fix 
failing javadoc build
URL: https://github.com/apache/beam/pull/5255#issuecomment-385747601
 
 
   +R: @tgroh  PTAL?


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 97175)
Time Spent: 20m  (was: 10m)

> Javadoc build failing after beam/pull/5181
> --
>
> Key: BEAM-4218
> URL: https://issues.apache.org/jira/browse/BEAM-4218
> Project: Beam
>  Issue Type: Bug
>  Components: runner-direct
>Affects Versions: 2.5.0
>Reporter: Alan Myrvold
>Assignee: Alan Myrvold
>Priority: Major
> Fix For: 2.5.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> The javadoc build is failing after beam/pull/5181
> /home/jenkins/jenkins-slave/workspace/beam_Release_Gradle_NightlySnapshot/src/runners/local-java/src/main/java/org/apache/beam/runners/local/Bundle.java:53:
>  error: reference not found
>* processing time \{@link TimerData timer} at the time this bundle was 
> committed, including any



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Build failed in Jenkins: beam_PerformanceTests_MongoDBIO_IT #118

2018-05-01 Thread Apache Jenkins Server
See 


Changes:

[tgroh] Mark some ReduceFnRunner arguments Nullable

[Pablo] Add documentation for quickstart tasks

--
[...truncated 899.29 KB...]
at 
com.mongodb.MongoCollectionImpl.insertMany(MongoCollectionImpl.java:323)
at 
com.mongodb.MongoCollectionImpl.insertMany(MongoCollectionImpl.java:311)
at 
org.apache.beam.sdk.io.mongodb.MongoDbIO$Write$WriteFn.flush(MongoDbIO.java:667)
at 
org.apache.beam.sdk.io.mongodb.MongoDbIO$Write$WriteFn.processElement(MongoDbIO.java:652)
com.mongodb.MongoTimeoutException: Timed out after 3 ms while waiting for a 
server that matches WritableServerSelector. Client view of cluster state is 
{type=UNKNOWN, servers=[{address=104.198.180.190:27017, type=UNKNOWN, 
state=CONNECTING, exception={com.mongodb.MongoSocketOpenException: Exception 
opening socket}, caused by {java.net.SocketTimeoutException: connect timed 
out}}]
at 
com.mongodb.connection.BaseCluster.createTimeoutException(BaseCluster.java:369)
at com.mongodb.connection.BaseCluster.selectServer(BaseCluster.java:101)
at 
com.mongodb.binding.ClusterBinding$ClusterBindingConnectionSource.&lt;init&gt;(ClusterBinding.java:75)
at 
com.mongodb.binding.ClusterBinding$ClusterBindingConnectionSource.&lt;init&gt;(ClusterBinding.java:71)
at 
com.mongodb.binding.ClusterBinding.getWriteConnectionSource(ClusterBinding.java:68)
at 
com.mongodb.operation.OperationHelper.withConnection(OperationHelper.java:219)
at 
com.mongodb.operation.MixedBulkWriteOperation.execute(MixedBulkWriteOperation.java:168)
at 
com.mongodb.operation.MixedBulkWriteOperation.execute(MixedBulkWriteOperation.java:74)
at com.mongodb.Mongo.execute(Mongo.java:781)
at com.mongodb.Mongo$2.execute(Mongo.java:764)
at 
com.mongodb.MongoCollectionImpl.insertMany(MongoCollectionImpl.java:323)
at 
com.mongodb.MongoCollectionImpl.insertMany(MongoCollectionImpl.java:311)
at 
org.apache.beam.sdk.io.mongodb.MongoDbIO$Write$WriteFn.flush(MongoDbIO.java:667)
at 
org.apache.beam.sdk.io.mongodb.MongoDbIO$Write$WriteFn.processElement(MongoDbIO.java:652)
com.mongodb.MongoTimeoutException: Timed out after 3 ms while waiting for a 
server that matches WritableServerSelector. Client view of cluster state is 
{type=UNKNOWN, servers=[{address=104.198.180.190:27017, type=UNKNOWN, 
state=CONNECTING, exception={com.mongodb.MongoSocketOpenException: Exception 
opening socket}, caused by {java.net.SocketTimeoutException: connect timed 
out}}]
at 
com.mongodb.connection.BaseCluster.createTimeoutException(BaseCluster.java:369)
at com.mongodb.connection.BaseCluster.selectServer(BaseCluster.java:101)
at 
com.mongodb.binding.ClusterBinding$ClusterBindingConnectionSource.&lt;init&gt;(ClusterBinding.java:75)
at 
com.mongodb.binding.ClusterBinding$ClusterBindingConnectionSource.&lt;init&gt;(ClusterBinding.java:71)
at 
com.mongodb.binding.ClusterBinding.getWriteConnectionSource(ClusterBinding.java:68)
at 
com.mongodb.operation.OperationHelper.withConnection(OperationHelper.java:219)
at 
com.mongodb.operation.MixedBulkWriteOperation.execute(MixedBulkWriteOperation.java:168)
at 
com.mongodb.operation.MixedBulkWriteOperation.execute(MixedBulkWriteOperation.java:74)
at com.mongodb.Mongo.execute(Mongo.java:781)
at com.mongodb.Mongo$2.execute(Mongo.java:764)
at 
com.mongodb.MongoCollectionImpl.insertMany(MongoCollectionImpl.java:323)
at 
com.mongodb.MongoCollectionImpl.insertMany(MongoCollectionImpl.java:311)
at 
org.apache.beam.sdk.io.mongodb.MongoDbIO$Write$WriteFn.flush(MongoDbIO.java:667)
at 
org.apache.beam.sdk.io.mongodb.MongoDbIO$Write$WriteFn.processElement(MongoDbIO.java:652)
com.mongodb.MongoTimeoutException: Timed out after 3 ms while waiting for a 
server that matches WritableServerSelector. Client view of cluster state is 
{type=UNKNOWN, servers=[{address=104.198.180.190:27017, type=UNKNOWN, 
state=CONNECTING, exception={com.mongodb.MongoSocketOpenException: Exception 
opening socket}, caused by {java.net.SocketTimeoutException: connect timed 
out}}]
at 
com.mongodb.connection.BaseCluster.createTimeoutException(BaseCluster.java:369)
at com.mongodb.connection.BaseCluster.selectServer(BaseCluster.java:101)
at 
com.mongodb.binding.ClusterBinding$ClusterBindingConnectionSource.&lt;init&gt;(ClusterBinding.java:75)
at 
com.mongodb.binding.ClusterBinding$ClusterBindingConnectionSource.&lt;init&gt;(ClusterBinding.java:71)
at 
com.mongodb.binding.ClusterBinding.getWriteConnectionSource(ClusterBinding.java:68)
at 
com.mongodb.operation.OperationHelper.withConnection(OperationHelper.java:219)
at 
com.mongodb.operation.MixedBulkWriteOperation.execute(MixedBulkWriteOperation.java:168)
at 
com.mongodb.operation.

Build failed in Jenkins: beam_PerformanceTests_AvroIOIT_HDFS #117

2018-05-01 Thread Apache Jenkins Server
See 


Changes:

[tgroh] Mark some ReduceFnRunner arguments Nullable

[Pablo] Add documentation for quickstart tasks

--
[...truncated 968.38 KB...]
at 
org.apache.beam.sdk.io.hdfs.HadoopFileSystem.create(HadoopFileSystem.java:68)
at org.apache.beam.sdk.io.FileSystems.create(FileSystems.java:249)
at org.apache.beam.sdk.io.FileSystems.create(FileSystems.java:236)
at 
org.apache.beam.sdk.io.FileBasedSink$Writer.open(FileBasedSink.java:923)
at 
org.apache.beam.sdk.io.WriteFiles$WriteUnshardedTempFilesWithSpillingFn.processElement(WriteFiles.java:503)
Caused by: java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at 
sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
at 
org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:531)
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:495)
at 
org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:614)
at 
org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:712)
at org.apache.hadoop.ipc.Client$Connection.access$2900(Client.java:375)
at org.apache.hadoop.ipc.Client.getConnection(Client.java:1528)
at org.apache.hadoop.ipc.Client.call(Client.java:1451)
at org.apache.hadoop.ipc.Client.call(Client.java:1412)
at 
org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:229)
at com.sun.proxy.$Proxy62.create(Unknown Source)
at 
org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.create(ClientNamenodeProtocolTranslatorPB.java:296)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:191)
at 
org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
at com.sun.proxy.$Proxy63.create(Unknown Source)
at 
org.apache.hadoop.hdfs.DFSOutputStream.newStreamForCreate(DFSOutputStream.java:1623)
at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1703)
at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1638)
at 
org.apache.hadoop.hdfs.DistributedFileSystem$7.doCall(DistributedFileSystem.java:448)
at 
org.apache.hadoop.hdfs.DistributedFileSystem$7.doCall(DistributedFileSystem.java:444)
at 
org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
at 
org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:459)
at 
org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:387)
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:911)
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:892)
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:789)
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:778)
at 
org.apache.beam.sdk.io.hdfs.HadoopFileSystem.create(HadoopFileSystem.java:109)
at 
org.apache.beam.sdk.io.hdfs.HadoopFileSystem.create(HadoopFileSystem.java:68)
at org.apache.beam.sdk.io.FileSystems.create(FileSystems.java:249)
at org.apache.beam.sdk.io.FileSystems.create(FileSystems.java:236)
at 
org.apache.beam.sdk.io.FileBasedSink$Writer.open(FileBasedSink.java:923)
at 
org.apache.beam.sdk.io.WriteFiles$WriteUnshardedTempFilesWithSpillingFn.processElement(WriteFiles.java:503)
at 
org.apache.beam.sdk.io.WriteFiles$WriteUnshardedTempFilesWithSpillingFn$DoFnInvoker.invokeProcessElement(Unknown
 Source)
at 
org.apache.beam.runners.core.SimpleDoFnRunner.invokeProcessElement(SimpleDoFnRunner.java:177)
at 
org.apache.beam.runners.core.SimpleDoFnRunner.processElement(SimpleDoFnRunner.java:138)
at 
com.google.cloud.dataflow.worker.SimpleParDoFn.processElement(SimpleParDoFn.java:323)
at 
com.google.cloud.dataflow.worker.util.common.worker.ParDoOperation.process(ParDoOperation.java:43)
at 
com.google.cloud.dataflow.worker.util.common.worker.OutputReceiver.process(OutputReceiver.java:48)
at 
com.google.cloud.dataflow.worker.AssignWindowsParDoFnFactory$AssignWindowsParDoFn.processElement(AssignWindowsParDoFnFactory.java:118)
at 
com.google.cloud.dataflow.worker.util.common.worker.ParDoOperation.process(ParDoOperation.java:43)
at 
com.google.cloud.dataflow.worker.u

Build failed in Jenkins: beam_PerformanceTests_TextIOIT_HDFS #123

2018-05-01 Thread Apache Jenkins Server
See 


Changes:

[tgroh] Mark some ReduceFnRunner arguments Nullable

[Pablo] Add documentation for quickstart tasks

--
[...truncated 1.02 MB...]
at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1703)
at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1638)
at 
org.apache.hadoop.hdfs.DistributedFileSystem$7.doCall(DistributedFileSystem.java:448)
at 
org.apache.hadoop.hdfs.DistributedFileSystem$7.doCall(DistributedFileSystem.java:444)
at 
org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
at 
org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:459)
at 
org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:387)
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:911)
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:892)
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:789)
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:778)
at 
org.apache.beam.sdk.io.hdfs.HadoopFileSystem.create(HadoopFileSystem.java:109)
at 
org.apache.beam.sdk.io.hdfs.HadoopFileSystem.create(HadoopFileSystem.java:68)
at org.apache.beam.sdk.io.FileSystems.create(FileSystems.java:249)
at org.apache.beam.sdk.io.FileSystems.create(FileSystems.java:236)
at 
org.apache.beam.sdk.io.FileBasedSink$Writer.open(FileBasedSink.java:923)
at 
org.apache.beam.sdk.io.WriteFiles$WriteUnshardedTempFilesWithSpillingFn.processElement(WriteFiles.java:503)
Caused by: java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at 
sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
at 
org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:531)
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:495)
at 
org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:614)
at 
org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:712)
at org.apache.hadoop.ipc.Client$Connection.access$2900(Client.java:375)
at org.apache.hadoop.ipc.Client.getConnection(Client.java:1528)
at org.apache.hadoop.ipc.Client.call(Client.java:1451)
at org.apache.hadoop.ipc.Client.call(Client.java:1412)
at 
org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:229)
at com.sun.proxy.$Proxy62.create(Unknown Source)
at 
org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.create(ClientNamenodeProtocolTranslatorPB.java:296)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:191)
at 
org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
at com.sun.proxy.$Proxy63.create(Unknown Source)
at 
org.apache.hadoop.hdfs.DFSOutputStream.newStreamForCreate(DFSOutputStream.java:1623)
at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1703)
at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1638)
at 
org.apache.hadoop.hdfs.DistributedFileSystem$7.doCall(DistributedFileSystem.java:448)
at 
org.apache.hadoop.hdfs.DistributedFileSystem$7.doCall(DistributedFileSystem.java:444)
at 
org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
at 
org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:459)
at 
org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:387)
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:911)
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:892)
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:789)
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:778)
at 
org.apache.beam.sdk.io.hdfs.HadoopFileSystem.create(HadoopFileSystem.java:109)
at 
org.apache.beam.sdk.io.hdfs.HadoopFileSystem.create(HadoopFileSystem.java:68)
at org.apache.beam.sdk.io.FileSystems.create(FileSystems.java:249)
at org.apache.beam.sdk.io.FileSystems.create(FileSystems.java:236)
at 
org.apache.beam.sdk.io.FileBasedSink$Writer.open(FileBasedSink.java:923)
at 
org.apache.beam.sdk.io.WriteFiles$WriteUnshardedTempFilesWithSpillingFn

[jira] [Work logged] (BEAM-4061) Chaining SpannerIO#write() transforms

2018-05-01 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-4061?focusedWorklogId=97179&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-97179
 ]

ASF GitHub Bot logged work on BEAM-4061:


Author: ASF GitHub Bot
Created on: 01/May/18 18:37
Start Date: 01/May/18 18:37
Worklog Time Spent: 10m 
  Work Description: mairbek commented on a change in pull request #4264: 
[BEAM-4061] Introduced SpannerWriteResult
URL: https://github.com/apache/beam/pull/4264#discussion_r185298381
 
 

 ##
 File path: 
sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/spanner/SpannerWriteResult.java
 ##
 @@ -0,0 +1,70 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.sdk.io.gcp.spanner;
+
+import com.google.cloud.spanner.Mutation;
+import com.google.common.collect.ImmutableMap;
+import java.util.Map;
+import org.apache.beam.sdk.Pipeline;
+import org.apache.beam.sdk.transforms.PTransform;
+import org.apache.beam.sdk.values.PCollection;
+import org.apache.beam.sdk.values.PInput;
+import org.apache.beam.sdk.values.POutput;
+import org.apache.beam.sdk.values.PValue;
+import org.apache.beam.sdk.values.TupleTag;
+
+/**
+ * A result of {@link SpannerIO#write()} transform. Use {@link 
#getFailedMutations} to access
+ * failed Mutations.
+ */
+public class SpannerWriteResult implements POutput {
+  private final Pipeline pipeline;
+  private final PCollection<Void> output;
+  private final PCollection<Mutation> failedMutations;
+
+  public SpannerWriteResult(Pipeline pipeline, PCollection<Void> output,
+      PCollection<Mutation> failedMutations) {
+    this.pipeline = pipeline;
+    this.output = output;
+    this.failedMutations = failedMutations;
+  }
+
+  @Override
+  public Pipeline getPipeline() {
+    return pipeline;
+  }
+
+  @Override
+  public Map<TupleTag<?>, PValue> expand() {
+    return ImmutableMap.of(new TupleTag<Mutation>("failedMutation"), failedMutations);
+  }
+
+  public PCollection<Mutation> getFailedMutations() {
+    return failedMutations;
+  }
+
+  public PCollection<Void> getOutput() {
+    return output;
+  }
+
+  @Override
+  public void finishSpecifyingOutput(String transformName, PInput input,
 
 Review comment:
   it's defined in `POutput`.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 97179)
Time Spent: 1h  (was: 50m)

> Chaining SpannerIO#write() transforms
> -
>
> Key: BEAM-4061
> URL: https://issues.apache.org/jira/browse/BEAM-4061
> Project: Beam
>  Issue Type: Bug
>  Components: io-java-gcp
>Reporter: Mairbek Khadikov
>Assignee: Mairbek Khadikov
>Priority: Major
> Fix For: 2.5.0
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> It should be possible to chain several Cloud Spanner writes. In practice, we 
> can leverage the Wait.on transform by returning a result object from 
> SpannerIO#write.
> One particular example where this feature is useful is a full database import, 
> where data in parent tables must be written before interleaved tables. See 
> more about table hierarchies in Spanner here: 
> https://cloud.google.com/spanner/docs/schema-and-data-model#creating_a_hierarchy_of_interleaved_tables
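
The chaining contract described in the issue can be sketched without Beam at all. The stdlib-only Python below only illustrates the ordering guarantee that Wait.on provides (the child-table write starts only after the parent write's result signal exists); the function names and the dict-based signal are hypothetical illustrations, not Beam's API.

```python
def chain_writes(write_parents, write_children):
    """Run write_children only after write_parents has fully completed.

    This mimics the contract of Beam's Wait.on: the second write is
    gated on a signal carried by the first write's result object
    (analogous to SpannerWriteResult).
    """
    signal = write_parents()
    if signal.get("done"):          # Wait.on blocks until the signal exists
        return write_children()
    raise RuntimeError("parent write did not complete")

order = []

def write_parents():
    order.append("Singers")         # parent table rows first
    return {"done": True}

def write_children():
    order.append("Albums")          # interleaved (child) table rows second
    return {"done": True}

chain_writes(write_parents, write_children)
# order is now ["Singers", "Albums"]
```

The point of returning a result object from SpannerIO#write is exactly to have such a signal available for downstream gating.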



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (BEAM-4219) Make Portable ULR use workerId in GRPC channel

2018-05-01 Thread Ankur Goenka (JIRA)
Ankur Goenka created BEAM-4219:
--

 Summary: Make Portable ULR use workerId in GRPC channel
 Key: BEAM-4219
 URL: https://issues.apache.org/jira/browse/BEAM-4219
 Project: Beam
  Issue Type: Bug
  Components: runner-core
Reporter: Ankur Goenka
Assignee: Ankur Goenka


The internal implementation of the Data service uses workerId. Make sure that 

[https://github.com/apache/beam/blob/master/runners/java-fn-execution/src/main/java/org/apache/beam/runners/fnexecution/data/GrpcDataService.java]

also uses workerId.
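
The motivation (routing data-plane traffic by worker) can be illustrated with a toy stdlib sketch. The class and names below (`DataService`, the `worker_id` key standing in for a gRPC channel header) are illustrative assumptions, not Beam's actual Fn API surface.

```python
class DataService:
    """Toy stand-in for a data service that multiplexes element streams.

    Each SDK harness identifies itself with a worker id; the service
    must use that id to deliver elements to the right worker. Without
    a worker id on the channel, the service cannot tell which queue a
    connecting harness should drain.
    """

    def __init__(self):
        self._queues = {}

    def send(self, worker_id, element):
        self._queues.setdefault(worker_id, []).append(element)

    def receive(self, worker_id):
        return self._queues.get(worker_id, [])

svc = DataService()
svc.send("worker-1", b"bundle-a")
svc.send("worker-2", b"bundle-b")
# svc.receive("worker-1") -> [b"bundle-a"]
```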



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (BEAM-4219) Make Portable ULR use workerId in GRPC channel

2018-05-01 Thread Ankur Goenka (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-4219?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16460003#comment-16460003
 ] 

Ankur Goenka commented on BEAM-4219:


cc: [~lcwik]

> Make Portable ULR use workerId in GRPC channel
> --
>
> Key: BEAM-4219
> URL: https://issues.apache.org/jira/browse/BEAM-4219
> Project: Beam
>  Issue Type: Bug
>  Components: runner-core
>Reporter: Ankur Goenka
>Assignee: Ankur Goenka
>Priority: Major
>
> The internal implementation of the Data service uses workerId. Make sure that 
> [https://github.com/apache/beam/blob/master/runners/java-fn-execution/src/main/java/org/apache/beam/runners/fnexecution/data/GrpcDataService.java]
> also uses workerId.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-4044) Take advantage of Calcite DDL

2018-05-01 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-4044?focusedWorklogId=97181&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-97181
 ]

ASF GitHub Bot logged work on BEAM-4044:


Author: ASF GitHub Bot
Created on: 01/May/18 18:47
Start Date: 01/May/18 18:47
Worklog Time Spent: 10m 
  Work Description: apilloud commented on a change in pull request #5254: 
[BEAM-4044] [SQL] Simplify TableProvider interface
URL: https://github.com/apache/beam/pull/5254#discussion_r185300879
 
 

 ##
 File path: 
sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/meta/provider/bigquery/BigQueryTableProvider.java
 ##
 @@ -41,7 +39,7 @@
  * LOCATION '[PROJECT_ID]:[DATASET].[TABLE]'
  * }
  */
-public class BigQueryTableProvider implements TableProvider {
+public class BigQueryTableProvider extends InMemoryTableProvider {
 
 Review comment:
   It is going to be in-memory in the sense that the metadata is in-memory 
only. I'm happy to change the name to something less confusing. Do you have any 
suggestions?


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 97181)
Time Spent: 9.5h  (was: 9h 20m)

> Take advantage of Calcite DDL
> -
>
> Key: BEAM-4044
> URL: https://issues.apache.org/jira/browse/BEAM-4044
> Project: Beam
>  Issue Type: New Feature
>  Components: dsl-sql
>Reporter: Andrew Pilloud
>Assignee: Andrew Pilloud
>Priority: Major
>  Time Spent: 9.5h
>  Remaining Estimate: 0h
>
> In Calcite 1.15, support for abstract DDL moved into Calcite core. We should 
> take advantage of that.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-4061) Chaining SpannerIO#write() transforms

2018-05-01 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-4061?focusedWorklogId=97191&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-97191
 ]

ASF GitHub Bot logged work on BEAM-4061:


Author: ASF GitHub Bot
Created on: 01/May/18 19:08
Start Date: 01/May/18 19:08
Worklog Time Spent: 10m 
  Work Description: mairbek commented on a change in pull request #4264: 
[BEAM-4061] Introduced SpannerWriteResult
URL: https://github.com/apache/beam/pull/4264#discussion_r185306314
 
 

 ##
 File path: 
sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/spanner/SpannerWriteResult.java
 ##
 @@ -0,0 +1,70 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.sdk.io.gcp.spanner;
+
+import com.google.cloud.spanner.Mutation;
+import com.google.common.collect.ImmutableMap;
+import java.util.Map;
+import org.apache.beam.sdk.Pipeline;
+import org.apache.beam.sdk.transforms.PTransform;
+import org.apache.beam.sdk.values.PCollection;
+import org.apache.beam.sdk.values.PInput;
+import org.apache.beam.sdk.values.POutput;
+import org.apache.beam.sdk.values.PValue;
+import org.apache.beam.sdk.values.TupleTag;
+
+/**
+ * A result of {@link SpannerIO#write()} transform. Use {@link 
#getFailedMutations} to access
+ * failed Mutations.
 
 Review comment:
   Done


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 97191)
Time Spent: 1h 10m  (was: 1h)

> Chaining SpannerIO#write() transforms
> -
>
> Key: BEAM-4061
> URL: https://issues.apache.org/jira/browse/BEAM-4061
> Project: Beam
>  Issue Type: Bug
>  Components: io-java-gcp
>Reporter: Mairbek Khadikov
>Assignee: Mairbek Khadikov
>Priority: Major
> Fix For: 2.5.0
>
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> It should be possible to chain several Cloud Spanner writes. In practice, we 
> can leverage the Wait.on transform by returning a result object from 
> SpannerIO#write.
> One particular example where this feature is useful is a full database import, 
> where data in parent tables must be written before interleaved tables. See 
> more about table hierarchies in Spanner here: 
> https://cloud.google.com/spanner/docs/schema-and-data-model#creating_a_hierarchy_of_interleaved_tables



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (BEAM-4215) Update Python SDK dependencies for 2.5.0 release

2018-05-01 Thread Valentyn Tymofieiev (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-4215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16460031#comment-16460031
 ] 

Valentyn Tymofieiev commented on BEAM-4215:
---

[https://pypi.org/project/oauth2client/] says oauth2client is now deprecated. 
No more features will be added to the libraries and the core team is turning 
down support. We recommend you use 
[google-auth|https://google-auth.readthedocs.io/] and 
[oauthlib|http://oauthlib.readthedocs.io/]. 
  
 That said, I think we should remove oauth2client as a direct Beam dependency here: 
[https://github.com/apache/beam/blob/92fd475afca09da7da1224775342bd668b53d83a/sdks/python/setup.py#L103]
  
 Then, the GCP dependencies 
 ([https://github.com/apache/beam/blob/92fd475afca09da7da1224775342bd668b53d83a/sdks/python/setup.py#L119])
 will pick up an appropriate version of the OAuth client for the dependencies.  
  
 As things are right now, the Beam user experience will not be very nice for 
Pip 10 users, since after installing Apache Beam GCP, every pip operation 
results in warnings:
{panel}
gapic-google-cloud-pubsub-v1 0.15.4 has requirement 
oauth2client<4.0dev,>=2.0.0, but you'll have oauth2client 4.1.2 which is 
incompatible.
 googledatastore 7.0.1 has requirement oauth2client<4.0.0,>=2.0.1, but you'll 
have oauth2client 4.1.2 which is incompatible.
 proto-google-cloud-pubsub-v1 0.15.4 has requirement 
oauth2client<4.0dev,>=2.0.0, but you'll have oauth2client 4.1.2 which is 
incompatible.
 proto-google-cloud-datastore-v1 0.90.4 has requirement 
oauth2client<4.0dev,>=2.0.0, but you'll have oauth2client 4.1.2 which is 
incompatible.
{panel}
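
The proposed fix amounts to splitting the dependency lists so the core list carries no oauth2client pin and the GCP extras' own requirements resolve it. A minimal sketch of that shape, with hypothetical list contents (only the `googledatastore==7.0.1` pin comes from the warnings above; the rest is illustrative):

```python
# Core install_requires: no direct oauth2client pin.
REQUIRED_PACKAGES = [
    'httplib2>=0.8',
    # 'oauth2client>=2.0.1',   # removed: deprecated upstream
]

# GCP extras: each package declares its own oauth2client constraint,
# e.g. googledatastore 7.0.1 requires oauth2client<4.0.0,>=2.0.1, so
# pip resolves a compatible version without Beam pinning one itself.
GCP_REQUIREMENTS = [
    'googledatastore==7.0.1',
    'google-cloud-pubsub',
]

def core_has_direct_oauth2client():
    return any(p.startswith('oauth2client') for p in REQUIRED_PACKAGES)
```

With this split, the pip 10 incompatibility warnings disappear because no list pins oauth2client above what the GCP clients allow.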
 

> Update Python SDK dependencies for 2.5.0 release
> 
>
> Key: BEAM-4215
> URL: https://issues.apache.org/jira/browse/BEAM-4215
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-py-core
>Affects Versions: 2.5.0
>Reporter: Valentyn Tymofieiev
>Assignee: Ahmet Altay
>Priority: Major
> Fix For: 2.5.0
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-4061) Chaining SpannerIO#write() transforms

2018-05-01 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-4061?focusedWorklogId=97192&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-97192
 ]

ASF GitHub Bot logged work on BEAM-4061:


Author: ASF GitHub Bot
Created on: 01/May/18 19:08
Start Date: 01/May/18 19:08
Worklog Time Spent: 10m 
  Work Description: mairbek commented on a change in pull request #4264: 
[BEAM-4061] Introduced SpannerWriteResult
URL: https://github.com/apache/beam/pull/4264#discussion_r185306335
 
 

 ##
 File path: 
sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/spanner/SpannerWriteResult.java
 ##
 @@ -0,0 +1,70 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.sdk.io.gcp.spanner;
+
+import com.google.cloud.spanner.Mutation;
+import com.google.common.collect.ImmutableMap;
+import java.util.Map;
+import org.apache.beam.sdk.Pipeline;
+import org.apache.beam.sdk.transforms.PTransform;
+import org.apache.beam.sdk.values.PCollection;
+import org.apache.beam.sdk.values.PInput;
+import org.apache.beam.sdk.values.POutput;
+import org.apache.beam.sdk.values.PValue;
+import org.apache.beam.sdk.values.TupleTag;
+
+/**
+ * A result of {@link SpannerIO#write()} transform. Use {@link 
#getFailedMutations} to access
+ * failed Mutations.
+ */
+public class SpannerWriteResult implements POutput {
+  private final Pipeline pipeline;
+  private final PCollection<Void> output;
+  private final PCollection<Mutation> failedMutations;
+
+  public SpannerWriteResult(Pipeline pipeline, PCollection<Void> output,
+      PCollection<Mutation> failedMutations) {
+    this.pipeline = pipeline;
+    this.output = output;
+    this.failedMutations = failedMutations;
+  }
+
+  @Override
+  public Pipeline getPipeline() {
+    return pipeline;
+  }
+
+  @Override
+  public Map<TupleTag<?>, PValue> expand() {
+    return ImmutableMap.of(new TupleTag<Mutation>("failedMutation"), failedMutations);
 
 Review comment:
   Done


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 97192)
Time Spent: 1h 20m  (was: 1h 10m)

> Chaining SpannerIO#write() transforms
> -
>
> Key: BEAM-4061
> URL: https://issues.apache.org/jira/browse/BEAM-4061
> Project: Beam
>  Issue Type: Bug
>  Components: io-java-gcp
>Reporter: Mairbek Khadikov
>Assignee: Mairbek Khadikov
>Priority: Major
> Fix For: 2.5.0
>
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> It should be possible to chain several Cloud Spanner writes. In practice, we 
> can leverage the Wait.on transform by returning a result object from 
> SpannerIO#write.
> One particular example where this feature is useful is a full database import, 
> where data in parent tables must be written before interleaved tables. See 
> more about table hierarchies in Spanner here: 
> https://cloud.google.com/spanner/docs/schema-and-data-model#creating_a_hierarchy_of_interleaved_tables



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-4061) Chaining SpannerIO#write() transforms

2018-05-01 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-4061?focusedWorklogId=97194&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-97194
 ]

ASF GitHub Bot logged work on BEAM-4061:


Author: ASF GitHub Bot
Created on: 01/May/18 19:09
Start Date: 01/May/18 19:09
Worklog Time Spent: 10m 
  Work Description: mairbek commented on a change in pull request #4264: 
[BEAM-4061] Introduced SpannerWriteResult
URL: https://github.com/apache/beam/pull/4264#discussion_r185306589
 
 

 ##
 File path: 
sdks/java/io/google-cloud-platform/src/test/java/org/apache/beam/sdk/io/gcp/spanner/SpannerWriteIT.java
 ##
 @@ -128,19 +140,72 @@ public void testWrite() throws Exception {
 .withInstanceId(options.getInstanceId())
 .withDatabaseId(databaseName));
 
-p.run();
-DatabaseClient databaseClient =
-spanner.getDatabaseClient(
-DatabaseId.of(
-project, options.getInstanceId(), databaseName));
+PipelineResult result = p.run();
+result.waitUntilFinish();
+assertThat(result.getState(), is(PipelineResult.State.DONE));
+assertThat(countNumberOfRecords(), equalTo((long) numRecords));
 
 Review comment:
   Isn't it better for the test to compute that without going through beam? 
`testSequentialWrite` already tests `Wait.on`.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 97194)
Time Spent: 1.5h  (was: 1h 20m)

> Chaining SpannerIO#write() transforms
> -
>
> Key: BEAM-4061
> URL: https://issues.apache.org/jira/browse/BEAM-4061
> Project: Beam
>  Issue Type: Bug
>  Components: io-java-gcp
>Reporter: Mairbek Khadikov
>Assignee: Mairbek Khadikov
>Priority: Major
> Fix For: 2.5.0
>
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> It should be possible to chain several Cloud Spanner writes. In practice, we 
> can leverage the Wait.on transform by returning a result object from 
> SpannerIO#write.
> One particular example where this feature is useful is a full database import, 
> where data in parent tables must be written before interleaved tables. See 
> more about table hierarchies in Spanner here: 
> https://cloud.google.com/spanner/docs/schema-and-data-model#creating_a_hierarchy_of_interleaved_tables



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (BEAM-3714) JdbcIO.read() should create a forward-only, read-only result set

2018-05-01 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/BEAM-3714?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jean-Baptiste Onofré reassigned BEAM-3714:
--

Assignee: Innocent  (was: Jean-Baptiste Onofré)

> JdbcIO.read() should create a forward-only, read-only result set
> 
>
> Key: BEAM-3714
> URL: https://issues.apache.org/jira/browse/BEAM-3714
> Project: Beam
>  Issue Type: Bug
>  Components: io-java-jdbc
>Reporter: Eugene Kirpichov
>Assignee: Innocent
>Priority: Major
> Fix For: 2.5.0
>
>  Time Spent: 4h
>  Remaining Estimate: 0h
>
> [https://stackoverflow.com/questions/48784889/streaming-data-from-cloudsql-into-dataflow/48819934#48819934]
>  - a user is trying to load a large table from MySQL, and the MySQL JDBC 
> driver requires special measures when loading large result sets.
> JdbcIO currently calls simply "connection.prepareStatement(query)" 
> https://github.com/apache/beam/blob/bb8c12c4956cbe3c6f2e57113e7c0ce2a5c05009/sdks/java/io/jdbc/src/main/java/org/apache/beam/sdk/io/jdbc/JdbcIO.java#L508
>  - it should specify type TYPE_FORWARD_ONLY and concurrency CONCUR_READ_ONLY 
> - these values should always be used.
> Seems that different databases have different requirements for streaming 
> result sets.
> E.g. MySQL requires setting fetch size; PostgreSQL says "The Connection must 
> not be in autocommit mode." 
> https://jdbc.postgresql.org/documentation/head/query.html#query-with-cursor . 
> Oracle, I think, doesn't have any special requirements but I don't know. 
> Fetch size should probably still be set to a reasonably large value.
> Seems that the common denominator of these requirements is: set fetch size to 
> a reasonably large but not maximum value; disable autocommit (there's nothing 
> to commit in read() anyway).
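
The common denominator described above (forward-only streaming, a bounded fetch size, autocommit off) is a JDBC-specific recipe, but the pattern is language-neutral. As an analogy only, the stdlib `sqlite3` sketch below streams rows in bounded batches; JDBC's `TYPE_FORWARD_ONLY`/`CONCUR_READ_ONLY` constants and `setFetchSize` have no direct sqlite3 equivalents, so `fetchmany` stands in for the fetch-size hint and the forward-only generator stands in for the non-scrollable cursor.

```python
import sqlite3

def stream_rows(conn, query, fetch_size=1000):
    """Yield rows forward-only, at most fetch_size at a time.

    Mirrors the JDBC recipe: never materialize the full result set
    (no fetchall), and never scroll backwards.
    """
    cur = conn.cursor()
    cur.execute(query)
    while True:
        batch = cur.fetchmany(fetch_size)   # bounded batch, not fetchall()
        if not batch:
            break
        for row in batch:
            yield row

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER)")
conn.executemany("INSERT INTO t VALUES (?)", [(i,) for i in range(5)])
rows = list(stream_rows(conn, "SELECT id FROM t ORDER BY id", fetch_size=2))
# rows == [(0,), (1,), (2,), (3,), (4,)]
```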



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (BEAM-3714) JdbcIO.read() should create a forward-only, read-only result set

2018-05-01 Thread JIRA

[ 
https://issues.apache.org/jira/browse/BEAM-3714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16460042#comment-16460042
 ] 

Jean-Baptiste Onofré commented on BEAM-3714:


Just a mistake from my side ;) Sorry about that.

> JdbcIO.read() should create a forward-only, read-only result set
> 
>
> Key: BEAM-3714
> URL: https://issues.apache.org/jira/browse/BEAM-3714
> Project: Beam
>  Issue Type: Bug
>  Components: io-java-jdbc
>Reporter: Eugene Kirpichov
>Assignee: Jean-Baptiste Onofré
>Priority: Major
> Fix For: 2.5.0
>
>  Time Spent: 4h
>  Remaining Estimate: 0h
>
> [https://stackoverflow.com/questions/48784889/streaming-data-from-cloudsql-into-dataflow/48819934#48819934]
>  - a user is trying to load a large table from MySQL, and the MySQL JDBC 
> driver requires special measures when loading large result sets.
> JdbcIO currently calls simply "connection.prepareStatement(query)" 
> https://github.com/apache/beam/blob/bb8c12c4956cbe3c6f2e57113e7c0ce2a5c05009/sdks/java/io/jdbc/src/main/java/org/apache/beam/sdk/io/jdbc/JdbcIO.java#L508
>  - it should specify type TYPE_FORWARD_ONLY and concurrency CONCUR_READ_ONLY 
> - these values should always be used.
> Seems that different databases have different requirements for streaming 
> result sets.
> E.g. MySQL requires setting fetch size; PostgreSQL says "The Connection must 
> not be in autocommit mode." 
> https://jdbc.postgresql.org/documentation/head/query.html#query-with-cursor . 
> Oracle, I think, doesn't have any special requirements but I don't know. 
> Fetch size should probably still be set to a reasonably large value.
> Seems that the common denominator of these requirements is: set fetch size to 
> a reasonably large but not maximum value; disable autocommit (there's nothing 
> to commit in read() anyway).



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-3042) Add tracking of bytes read / time spent when reading side inputs

2018-05-01 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3042?focusedWorklogId=97206&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-97206
 ]

ASF GitHub Bot logged work on BEAM-3042:


Author: ASF GitHub Bot
Created on: 01/May/18 19:46
Start Date: 01/May/18 19:46
Worklog Time Spent: 10m 
  Work Description: pabloem commented on issue #5075: [BEAM-3042] Refactor 
of TransformIOCounters (performance, inheritance).
URL: https://github.com/apache/beam/pull/5075#issuecomment-385769586
 
 
   Confirmed that there is no regression. Also updated PR. Merging.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 97206)
Time Spent: 3h 20m  (was: 3h 10m)

> Add tracking of bytes read / time spent when reading side inputs
> 
>
> Key: BEAM-3042
> URL: https://issues.apache.org/jira/browse/BEAM-3042
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Pablo Estrada
>Assignee: Pablo Estrada
>Priority: Major
>  Time Spent: 3h 20m
>  Remaining Estimate: 0h
>
> It is difficult for Dataflow users to understand how modifying a pipeline or 
> data set can affect how much inter-transform IO is used in their job. The 
> intent of this feature request is to help users understand how side inputs 
> behave when they are consumed.
> This will allow users to understand how much time and how much data their 
> pipeline uses to read/write to inter-transform IO. Users will also be able to 
> modify their pipelines and understand how their changes affect these IO 
> metrics.
> For further information, please review the internal Google doc 
> go/insights-transform-io-design-doc.
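
The mechanism the refactor lands can be sketched in a few lines of plain Python. This simplified stand-in mirrors the TransformIOCounter pattern from the merged diff: bytes are attributed to whichever step is currently consuming the side input, and non-positive counts are ignored; the class and step names here are illustrative, not Beam's.

```python
class SideInputByteCounter:
    """Simplified sketch of a per-step side-input byte counter."""

    def __init__(self):
        self.counters = {}        # step name -> bytes read
        self._latest_step = None

    def update_current_step(self, step_name):
        # Under fusion, the step consuming the side input can change;
        # re-attribute subsequent bytes to the new step.
        if step_name != self._latest_step:
            self._latest_step = step_name

    def add_bytes_read(self, count):
        if count > 0 and self._latest_step is not None:
            self.counters[self._latest_step] = (
                self.counters.get(self._latest_step, 0) + count)

c = SideInputByteCounter()
c.update_current_step("ParDo(ReadSideInput)")
c.add_bytes_read(128)
c.add_bytes_read(0)               # ignored, matching the refactor's guard
c.update_current_step("ParDo(Join)")
c.add_bytes_read(64)
# c.counters == {"ParDo(ReadSideInput)": 128, "ParDo(Join)": 64}
```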



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[beam] branch master updated: [BEAM-3042] Refactor of TransformIOCounters (performance, inheritance). (#5075)

2018-05-01 Thread pabloem
This is an automated email from the ASF dual-hosted git repository.

pabloem pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/beam.git


The following commit(s) were added to refs/heads/master by this push:
 new a0036d5  [BEAM-3042] Refactor of TransformIOCounters (performance, 
inheritance). (#5075)
a0036d5 is described below

commit a0036d5ab77be31ed090b3338faf2b32784399e0
Author: Pablo 
AuthorDate: Tue May 1 12:47:03 2018 -0700

[BEAM-3042] Refactor of TransformIOCounters (performance, inheritance). 
(#5075)

* Refactor of TransformIOCounters (performance, inheritance).
---
 .../apache_beam/runners/worker/opcounters.pxd  | 15 ++--
 .../apache_beam/runners/worker/opcounters.py   | 90 +-
 .../apache_beam/runners/worker/operations.py   |  2 +-
 .../apache_beam/runners/worker/sideinputs.py   |  7 +-
 4 files changed, 69 insertions(+), 45 deletions(-)

diff --git a/sdks/python/apache_beam/runners/worker/opcounters.pxd 
b/sdks/python/apache_beam/runners/worker/opcounters.pxd
index 40ca72d..0bcd428 100644
--- a/sdks/python/apache_beam/runners/worker/opcounters.pxd
+++ b/sdks/python/apache_beam/runners/worker/opcounters.pxd
@@ -22,21 +22,26 @@ from apache_beam.utils.counters cimport Counter
 
 
 cdef class TransformIOCounter(object):
+  cdef readonly object _counter_factory
+  cdef readonly object _state_sampler
+  cdef Counter bytes_read_counter
+  cdef object scoped_state
+  cdef object _latest_step
+
   cpdef update_current_step(self)
   cpdef add_bytes_read(self, libc.stdint.int64_t n)
   cpdef __enter__(self)
   cpdef __exit__(self, exc_type, exc_value, traceback)
 
 
+cdef class NoOpTransformIOCounter(TransformIOCounter):
+  pass
+
+
 cdef class SideInputReadCounter(TransformIOCounter):
-  cdef readonly object _counter_factory
-  cdef readonly object _state_sampler
   cdef readonly object declaring_step
   cdef readonly object input_index
 
-  cdef Counter bytes_read_counter
-  cdef object scoped_state
-
 
 cdef class SumAccumulator(object):
   cdef libc.stdint.int64_t _value
diff --git a/sdks/python/apache_beam/runners/worker/opcounters.py 
b/sdks/python/apache_beam/runners/worker/opcounters.py
index 17fead2..0e4ee0a 100644
--- a/sdks/python/apache_beam/runners/worker/opcounters.py
+++ b/sdks/python/apache_beam/runners/worker/opcounters.py
@@ -41,24 +41,66 @@ class TransformIOCounter(object):
   Some examples of IO can be side inputs, shuffle, or streaming state.
   """
 
+  def __init__(self, counter_factory, state_sampler):
+    """Create a new IO read counter.
+
+    Args:
+      counter_factory: A counters.CounterFactory to create byte counters.
+      state_sampler: A statesampler.StateSampler to transition into read states.
+    """
+    self._counter_factory = counter_factory
+    self._state_sampler = state_sampler
+    self._latest_step = None
+    self.bytes_read_counter = None
+    self.scoped_state = None
+
   def update_current_step(self):
-    """Update the current step within a stage as it may have changed.
+    """Update the current running step.
 
-    If the state changed, it would mean that an initial step passed a
-    data-accessor (such as a side input / shuffle Iterable) down to the
-    next step in a stage.
+    Due to the fusion optimization, user code may choose to emit the data
+    structure that holds side inputs (Iterable, Dict, or others). This call
+    updates the current step, to attribute the data consumption to the step
+    that is responsible for actual consumption.
+
+    CounterName uses the io_target field for information pertinent to the
+    consumption of IO.
     """
+    current_state = self._state_sampler.current_state()
+    current_step_name = current_state.name.step_name
+    if current_step_name != self._latest_step:
+      self._latest_step = current_step_name
+      self._update_counters_for_requesting_step(current_step_name)
+
+  def _update_counters_for_requesting_step(self, step_name):
     pass
 
   def add_bytes_read(self, count):
-    pass
+    if count > 0 and self.bytes_read_counter:
+      self.bytes_read_counter.update(count)
+
+  def __enter__(self):
+    self.scoped_state.__enter__()
+
+  def __exit__(self, exception_type, exception_value, traceback):
+    self.scoped_state.__exit__(exception_type, exception_value, traceback)
+
+
+class NoOpTransformIOCounter(TransformIOCounter):
+  """All operations for IO tracking are no-ops."""
+
+  def __init__(self):
+    super(NoOpTransformIOCounter, self).__init__(None, None)
 
-  def __exit__(self, exc_type, exc_value, traceback):
-    """Exit the IO state."""
+  def update_current_step(self):
     pass
 
   def __enter__(self):
-    """Enter the IO state. This should track time spent blocked on IO."""
+    pass
+
+  def __exit__(self, exception_type, exception_value, traceback):
+    pass
+
+  def add_bytes_read(self, count):
     pass
 
 
@@ -93,8 +135,7 @@ class SideInputReadCounter(TransformIOCounter):
 s

[jira] [Work logged] (BEAM-3042) Add tracking of bytes read / time spent when reading side inputs

2018-05-01 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3042?focusedWorklogId=97207&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-97207
 ]

ASF GitHub Bot logged work on BEAM-3042:


Author: ASF GitHub Bot
Created on: 01/May/18 19:47
Start Date: 01/May/18 19:47
Worklog Time Spent: 10m 
  Work Description: pabloem closed pull request #5075: [BEAM-3042] Refactor 
of TransformIOCounters (performance, inheritance).
URL: https://github.com/apache/beam/pull/5075
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git a/sdks/python/apache_beam/runners/worker/opcounters.pxd 
b/sdks/python/apache_beam/runners/worker/opcounters.pxd
index 40ca72dde7e..0bcd42848d2 100644
--- a/sdks/python/apache_beam/runners/worker/opcounters.pxd
+++ b/sdks/python/apache_beam/runners/worker/opcounters.pxd
@@ -22,21 +22,26 @@ from apache_beam.utils.counters cimport Counter
 
 
 cdef class TransformIOCounter(object):
+  cdef readonly object _counter_factory
+  cdef readonly object _state_sampler
+  cdef Counter bytes_read_counter
+  cdef object scoped_state
+  cdef object _latest_step
+
   cpdef update_current_step(self)
   cpdef add_bytes_read(self, libc.stdint.int64_t n)
   cpdef __enter__(self)
   cpdef __exit__(self, exc_type, exc_value, traceback)
 
 
+cdef class NoOpTransformIOCounter(TransformIOCounter):
+  pass
+
+
 cdef class SideInputReadCounter(TransformIOCounter):
-  cdef readonly object _counter_factory
-  cdef readonly object _state_sampler
   cdef readonly object declaring_step
   cdef readonly object input_index
 
-  cdef Counter bytes_read_counter
-  cdef object scoped_state
-
 
 cdef class SumAccumulator(object):
   cdef libc.stdint.int64_t _value
diff --git a/sdks/python/apache_beam/runners/worker/opcounters.py 
b/sdks/python/apache_beam/runners/worker/opcounters.py
index 17fead28ed2..0e4ee0a05dc 100644
--- a/sdks/python/apache_beam/runners/worker/opcounters.py
+++ b/sdks/python/apache_beam/runners/worker/opcounters.py
@@ -41,24 +41,66 @@ class TransformIOCounter(object):
   Some examples of IO can be side inputs, shuffle, or streaming state.
   """
 
+  def __init__(self, counter_factory, state_sampler):
+"""Create a new IO read counter.
+
+Args:
+  counter_factory: A counters.CounterFactory to create byte counters.
+  state_sampler: A statesampler.StateSampler to transition into read 
states.
+"""
+self._counter_factory = counter_factory
+self._state_sampler = state_sampler
+self._latest_step = None
+self.bytes_read_counter = None
+self.scoped_state = None
+
   def update_current_step(self):
-    """Update the current step within a stage as it may have changed.
+    """Update the current running step.
 
-    If the state changed, it would mean that an initial step passed a
-    data-accessor (such as a side input / shuffle Iterable) down to the
-    next step in a stage.
+    Due to the fusion optimization, user code may choose to emit the data
+    structure that holds side inputs (Iterable, Dict, or others). This call
+    updates the current step, to attribute the data consumption to the step
+    that is responsible for actual consumption.
+
+    CounterName uses the io_target field for information pertinent to the
+    consumption of IO.
     """
+    current_state = self._state_sampler.current_state()
+    current_step_name = current_state.name.step_name
+    if current_step_name != self._latest_step:
+      self._latest_step = current_step_name
+      self._update_counters_for_requesting_step(current_step_name)
+
+  def _update_counters_for_requesting_step(self, step_name):
     pass
 
   def add_bytes_read(self, count):
-    pass
+    if count > 0 and self.bytes_read_counter:
+      self.bytes_read_counter.update(count)
+
+  def __enter__(self):
+    self.scoped_state.__enter__()
+
+  def __exit__(self, exception_type, exception_value, traceback):
+    self.scoped_state.__exit__(exception_type, exception_value, traceback)
+
+
+class NoOpTransformIOCounter(TransformIOCounter):
+  """All operations for IO tracking are no-ops."""
+
+  def __init__(self):
+    super(NoOpTransformIOCounter, self).__init__(None, None)
 
-  def __exit__(self, exc_type, exc_value, traceback):
-    """Exit the IO state."""
+  def update_current_step(self):
     pass
 
   def __enter__(self):
-    """Enter the IO state. This should track time spent blocked on IO."""
+    pass
+
+  def __exit__(self, exception_type, exception_value, traceback):
+    pass
+
+  def add_bytes_read(self, count):
     pass
 
 
@@ -93,8 +135,7 @@ def __init__(self, counter_factory, state_sampler, declaring_step,
 side input, and input_index is the index of the PCollectionV
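For readers skimming the truncated patch above, the new counter hierarchy can be sketched standalone as follows. The `counter_factory` and `state_sampler` here are simplified stand-ins (a counter class, a zero-argument callable returning the current step name) rather than the real `apache_beam` classes; only the control flow mirrors the patch:

```python
class SimpleCounter:
    """Accumulates a single total (stand-in for counters.Counter)."""
    def __init__(self):
        self.total = 0

    def update(self, n):
        self.total += n


class TransformIOCounter:
    """Tracks bytes read and attributes them to the currently running step."""

    def __init__(self, counter_factory, state_sampler):
        self._counter_factory = counter_factory
        self._state_sampler = state_sampler
        self._latest_step = None
        self.bytes_read_counter = None

    def update_current_step(self):
        # Re-create counters only when the consuming step actually changed,
        # which keeps the per-element hot path cheap.
        step = self._state_sampler()  # stand-in: returns current step name
        if step != self._latest_step:
            self._latest_step = step
            self._update_counters_for_requesting_step(step)

    def _update_counters_for_requesting_step(self, step_name):
        # Stand-in: attach a fresh byte counter for the new step.
        self.bytes_read_counter = self._counter_factory()

    def add_bytes_read(self, count):
        if count > 0 and self.bytes_read_counter:
            self.bytes_read_counter.update(count)


class NoOpTransformIOCounter(TransformIOCounter):
    """All IO-tracking operations are no-ops, for when tracking is disabled."""

    def __init__(self):
        super().__init__(None, None)

    def update_current_step(self):
        pass

    def add_bytes_read(self, count):
        pass
```

The no-op subclass lets callers hold a counter object unconditionally and skip per-element `if tracking_enabled:` checks.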

[jira] [Work logged] (BEAM-4061) Chaining SpannerIO#write() transforms

2018-05-01 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-4061?focusedWorklogId=97208&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-97208
 ]

ASF GitHub Bot logged work on BEAM-4061:


Author: ASF GitHub Bot
Created on: 01/May/18 19:47
Start Date: 01/May/18 19:47
Worklog Time Spent: 10m 
  Work Description: jkff commented on issue #4264: [BEAM-4061] Introduced 
SpannerWriteResult
URL: https://github.com/apache/beam/pull/4264#issuecomment-385769771
 
 
   retest this please


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 97208)
Time Spent: 1h 40m  (was: 1.5h)

> Chaining SpannerIO#write() transforms
> -
>
> Key: BEAM-4061
> URL: https://issues.apache.org/jira/browse/BEAM-4061
> Project: Beam
>  Issue Type: Bug
>  Components: io-java-gcp
>Reporter: Mairbek Khadikov
>Assignee: Mairbek Khadikov
>Priority: Major
> Fix For: 2.5.0
>
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> It should be possible to chain several Cloud Spanner writes. In practice, we
> can leverage the Wait.on transform by returning a result object from
> SpannerIO#write.
> One particular example where this feature is useful is a full database import,
> where data in parent tables must be inserted before interleaved tables. See
> more about table hierarchies in Spanner here:
> https://cloud.google.com/spanner/docs/schema-and-data-model#creating_a_hierarchy_of_interleaved_tables



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
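The Wait.on-based chaining discussed in BEAM-4061 above can be illustrated with a plain-Python analogue (hypothetical names; this is not the Beam Java API): a write returns a result token, and a dependent write only proceeds once every token it waits on has completed.

```python
class WriteResult:
    """Token returned by a write; downstream stages can wait on it."""
    def __init__(self):
        self.done = False


def write_table(rows, sink, wait_on=()):
    """Append rows to sink, but only after all waited-on writes completed."""
    if not all(r.done for r in wait_on):
        raise RuntimeError("upstream write has not finished yet")
    sink.extend(rows)
    result = WriteResult()
    result.done = True  # synchronous sketch: complete on return
    return result


# Parent rows must land before interleaved (child) rows:
database = []
parent_done = write_table(["Singers row"], database)
write_table(["Albums row (interleaved)"], database, wait_on=(parent_done,))
```

In the actual PR, the returned object is a `SpannerWriteResult` that a second `SpannerIO.write()` can gate on via `Wait.on`, giving exactly this parent-before-interleaved ordering.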


[jira] [Work logged] (BEAM-4061) Chaining SpannerIO#write() transforms

2018-05-01 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-4061?focusedWorklogId=97210&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-97210
 ]

ASF GitHub Bot logged work on BEAM-4061:


Author: ASF GitHub Bot
Created on: 01/May/18 19:54
Start Date: 01/May/18 19:54
Worklog Time Spent: 10m 
  Work Description: jkff commented on a change in pull request #4264: 
[BEAM-4061] Introduced SpannerWriteResult
URL: https://github.com/apache/beam/pull/4264#discussion_r185315783
 
 

 ##
 File path: sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/spanner/SpannerIO.java
 ##
 @@ -777,21 +797,23 @@ public PDone expand(PCollection input) {
           .apply("Sample keys", sampler)
           .apply("Keys sample as view", View.asMap());
 
+      TupleTag mainTag = new TupleTag<>("mainOut");
+      TupleTag<MutationGroup> failedTag = new TupleTag<>("failedMutations");
       // Assign partition based on the closest element in the sample and group mutations.
       AssignPartitionFn assignPartitionFn = new AssignPartitionFn(keySample);
-      serialized
+      PCollectionTuple result = serialized
           .apply("Partition input", ParDo.of(assignPartitionFn).withSideInputs(keySample))
           .setCoder(KvCoder.of(StringUtf8Coder.of(), SerializedMutationCoder.of()))
-          .apply("Group by partition", GroupByKey.create())
-          .apply(
-              "Batch mutations together",
+          .apply("Group by partition", GroupByKey.create()).apply("Batch mutations together",
               ParDo.of(new BatchFn(spec.getBatchSizeBytes(), spec.getSpannerConfig(), schemaView))
-                  .withSideInputs(schemaView))
-          .apply(
-              "Write mutations to Spanner",
-              ParDo.of(new WriteToSpannerFn(spec.getSpannerConfig())));
-      return PDone.in(input.getPipeline());
-
+                  .withSideInputs(schemaView)).apply("Write mutations to Spanner",
+              ParDo.of(new WriteToSpannerFn(spec.getSpannerConfig(), spec.getFailureMode(), failedTag))
+                  .withOutputTags(mainTag, TupleTagList.of(failedTag)));
+      PCollection<MutationGroup> failedMutations = result.get(failedTag);
+      failedMutations.setCoder(SerializableCoder.of(MutationGroup.class));
 
 Review comment:
   It's unfortunate that we can't directly use MutationGroupEncoder as a Coder, 
since it requires a Schema known at construction time...


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 97210)
Time Spent: 1h 50m  (was: 1h 40m)

> Chaining SpannerIO#write() transforms
> -
>
> Key: BEAM-4061
> URL: https://issues.apache.org/jira/browse/BEAM-4061
> Project: Beam
>  Issue Type: Bug
>  Components: io-java-gcp
>Reporter: Mairbek Khadikov
>Assignee: Mairbek Khadikov
>Priority: Major
> Fix For: 2.5.0
>
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> It should be possible to chain several Cloud Spanner writes. In practice, we
> can leverage the Wait.on transform by returning a result object from
> SpannerIO#write.
> One particular example where this feature is useful is a full database import,
> where data in parent tables must be inserted before interleaved tables. See
> more about table hierarchies in Spanner here:
> https://cloud.google.com/spanner/docs/schema-and-data-model#creating_a_hierarchy_of_interleaved_tables



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-4061) Chaining SpannerIO#write() transforms

2018-05-01 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-4061?focusedWorklogId=97211&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-97211
 ]

ASF GitHub Bot logged work on BEAM-4061:


Author: ASF GitHub Bot
Created on: 01/May/18 19:54
Start Date: 01/May/18 19:54
Worklog Time Spent: 10m 
  Work Description: jkff commented on a change in pull request #4264: 
[BEAM-4061] Introduced SpannerWriteResult
URL: https://github.com/apache/beam/pull/4264#discussion_r185316268
 
 

 ##
 File path: sdks/java/io/google-cloud-platform/src/test/java/org/apache/beam/sdk/io/gcp/spanner/SpannerWriteIT.java
 ##
 @@ -128,19 +140,72 @@ public void testWrite() throws Exception {
         .withInstanceId(options.getInstanceId())
         .withDatabaseId(databaseName));
 
-    p.run();
-    DatabaseClient databaseClient =
-        spanner.getDatabaseClient(
-            DatabaseId.of(
-                project, options.getInstanceId(), databaseName));
+    PipelineResult result = p.run();
+    result.waitUntilFinish();
+    assertThat(result.getState(), is(PipelineResult.State.DONE));
+    assertThat(countNumberOfRecords(), equalTo((long) numRecords));
 
 Review comment:
   In general, "write-then-read" tests are preferred in Beam, where both write 
and read are done using your transform. They can be done in separate pipelines 
in one test method, or in one pipeline. I'm OK keeping this as-is for this PR 
though.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 97211)
Time Spent: 2h  (was: 1h 50m)

> Chaining SpannerIO#write() transforms
> -
>
> Key: BEAM-4061
> URL: https://issues.apache.org/jira/browse/BEAM-4061
> Project: Beam
>  Issue Type: Bug
>  Components: io-java-gcp
>Reporter: Mairbek Khadikov
>Assignee: Mairbek Khadikov
>Priority: Major
> Fix For: 2.5.0
>
>  Time Spent: 2h
>  Remaining Estimate: 0h
>
> It should be possible to chain several Cloud Spanner writes. In practice, we
> can leverage the Wait.on transform by returning a result object from
> SpannerIO#write.
> One particular example where this feature is useful is a full database import,
> where data in parent tables must be inserted before interleaved tables. See
> more about table hierarchies in Spanner here:
> https://cloud.google.com/spanner/docs/schema-and-data-model#creating_a_hierarchy_of_interleaved_tables



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Build failed in Jenkins: beam_PostCommit_Python_ValidatesRunner_Dataflow #1496

2018-05-01 Thread Apache Jenkins Server
See 


Changes:

[github] [BEAM-3042] Refactor of TransformIOCounters (performance, inheritance).

--
Started by GitHub push by pabloem
[EnvInject] - Loading node environment variables.
Building remotely on beam23 (beam) in workspace 

 > git rev-parse --is-inside-work-tree # timeout=10
Fetching changes from the remote Git repository
 > git config remote.origin.url https://github.com/apache/beam.git # timeout=10
Fetching upstream changes from https://github.com/apache/beam.git
 > git --version # timeout=10
 > git fetch --tags --progress https://github.com/apache/beam.git 
 > +refs/heads/*:refs/remotes/origin/* 
 > +refs/pull/${ghprbPullId}/*:refs/remotes/origin/pr/${ghprbPullId}/*
 > git rev-parse origin/master^{commit} # timeout=10
Checking out Revision a0036d5ab77be31ed090b3338faf2b32784399e0 (origin/master)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f a0036d5ab77be31ed090b3338faf2b32784399e0
Commit message: "[BEAM-3042] Refactor of TransformIOCounters (performance, 
inheritance). (#5075)"
 > git rev-list --no-walk 92fd475afca09da7da1224775342bd668b53d83a # timeout=10
Cleaning workspace
 > git rev-parse --verify HEAD # timeout=10
Resetting working tree
 > git reset --hard # timeout=10
 > git clean -fdx # timeout=10
[EnvInject] - Executing scripts and injecting environment variables after the 
SCM step.
[EnvInject] - Injecting as environment variables the properties content 
SPARK_LOCAL_IP=127.0.0.1

[EnvInject] - Variables injected successfully.
[beam_PostCommit_Python_ValidatesRunner_Dataflow] $ /bin/bash -xe 
/tmp/jenkins2566173868859748733.sh
+ cd src
+ bash sdks/python/run_validatesrunner.sh

# pip install --user installation location.
LOCAL_PATH=$HOME/.local/bin/

# INFRA does not install virtualenv
pip install virtualenv --user
Requirement already satisfied: virtualenv in /usr/lib/python2.7/dist-packages 
(15.0.1)

# Virtualenv for the rest of the script to run setup & e2e tests
${LOCAL_PATH}/virtualenv sdks/python
sdks/python/run_validatesrunner.sh: line 38: 
/home/jenkins/.local/bin//virtualenv: No such file or directory
Build step 'Execute shell' marked build as failure
Not sending mail to unregistered user sweg...@google.com
Not sending mail to unregistered user pger...@us.ibm.com
Not sending mail to unregistered user aal...@gmail.com
Not sending mail to unregistered user sid...@google.com
Not sending mail to unregistered user katarzyna.kucharc...@polidea.com
Not sending mail to unregistered user ankurgoe...@gmail.com
Not sending mail to unregistered user hero...@google.com
Not sending mail to unregistered user ro...@frantil.com
Not sending mail to unregistered user w...@google.com
Not sending mail to unregistered user szewi...@gmail.com
Not sending mail to unregistered user git...@alasdairhodge.co.uk
Not sending mail to unregistered user ke...@google.com
Not sending mail to unregistered user ekirpic...@gmail.com
Not sending mail to unregistered user aljoscha.kret...@gmail.com
Not sending mail to unregistered user apill...@google.com
Not sending mail to unregistered user 
re...@relax-macbookpro2.roam.corp.google.com
Not sending mail to unregistered user kirpic...@google.com
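The failure above is a path mismatch, not a missing package: pip reports virtualenv "already satisfied" in the system `dist-packages`, so `pip install --user` places nothing under `$HOME/.local/bin`, and the script's hard-coded `${LOCAL_PATH}/virtualenv` then does not exist. A more defensive lookup (a hypothetical helper sketched here, not the actual run_validatesrunner.sh) would fall back to whatever `virtualenv` is on the PATH:

```python
import os
import shutil

def find_virtualenv(local_bin):
    """Prefer the pip --user install location, fall back to the PATH."""
    candidate = os.path.join(local_bin, "virtualenv")
    if os.access(candidate, os.X_OK):
        return candidate
    # shutil.which returns None when virtualenv is not on the PATH either,
    # letting the caller fail with a clear message instead of "No such file".
    return shutil.which("virtualenv")
```

The same fallback can be written in two lines of bash with `command -v virtualenv`; the point is to avoid assuming the `--user` install actually happened.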


[jira] [Work logged] (BEAM-4061) Chaining SpannerIO#write() transforms

2018-05-01 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-4061?focusedWorklogId=97218&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-97218
 ]

ASF GitHub Bot logged work on BEAM-4061:


Author: ASF GitHub Bot
Created on: 01/May/18 20:31
Start Date: 01/May/18 20:31
Worklog Time Spent: 10m 
  Work Description: jkff commented on issue #4264: [BEAM-4061] Introduced 
SpannerWriteResult
URL: https://github.com/apache/beam/pull/4264#issuecomment-385780951
 
 
   There seems to be a checkstyle violation that's legit, could you fix it? But 
also a bunch of Java compiler OOMs, not sure why is that, will ping dev@.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 97218)
Time Spent: 2h 10m  (was: 2h)

> Chaining SpannerIO#write() transforms
> -
>
> Key: BEAM-4061
> URL: https://issues.apache.org/jira/browse/BEAM-4061
> Project: Beam
>  Issue Type: Bug
>  Components: io-java-gcp
>Reporter: Mairbek Khadikov
>Assignee: Mairbek Khadikov
>Priority: Major
> Fix For: 2.5.0
>
>  Time Spent: 2h 10m
>  Remaining Estimate: 0h
>
> It should be possible to chain several Cloud Spanner writes. In practice, we
> can leverage the Wait.on transform by returning a result object from
> SpannerIO#write.
> One particular example where this feature is useful is a full database import,
> where data in parent tables must be inserted before interleaved tables. See
> more about table hierarchies in Spanner here:
> https://cloud.google.com/spanner/docs/schema-and-data-model#creating_a_hierarchy_of_interleaved_tables



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-4218) Javadoc build failing after beam/pull/5181

2018-05-01 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-4218?focusedWorklogId=97219&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-97219
 ]

ASF GitHub Bot logged work on BEAM-4218:


Author: ASF GitHub Bot
Created on: 01/May/18 20:33
Start Date: 01/May/18 20:33
Worklog Time Spent: 10m 
  Work Description: tgroh commented on a change in pull request #5255: 
[BEAM-4218] Fix failing javadoc build
URL: https://github.com/apache/beam/pull/5255#discussion_r185326239
 
 

 ##
 File path: runners/local-java/src/main/java/org/apache/beam/runners/local/Bundle.java
 ##
 @@ -50,8 +50,8 @@
* past this point before consuming this bundle.
*
   * This value is no greater than the earliest incomplete processing time or synchronized
-   * processing time {@link TimerData timer} at the time this bundle was committed, including any
-   * timers that fired to produce this bundle.
+   * processing time {@link org.apache.beam.runners.core.TimerInternals.TimerData timer} at the time
 
 Review comment:
   I would just remove the dependency and the `link` instead of adding a 
dependency for javadoc only


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 97219)
Time Spent: 0.5h  (was: 20m)

> Javadoc build failing after beam/pull/5181
> --
>
> Key: BEAM-4218
> URL: https://issues.apache.org/jira/browse/BEAM-4218
> Project: Beam
>  Issue Type: Bug
>  Components: runner-direct
>Affects Versions: 2.5.0
>Reporter: Alan Myrvold
>Assignee: Alan Myrvold
>Priority: Major
> Fix For: 2.5.0
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> The javadoc build is failing after beam/pull/5181
> /home/jenkins/jenkins-slave/workspace/beam_Release_Gradle_NightlySnapshot/src/runners/local-java/src/main/java/org/apache/beam/runners/local/Bundle.java:53:
>  error: reference not found
>* processing time \{@link TimerData timer} at the time this bundle was 
> committed, including any



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-4144) SplittableParDoProcessFnTest has had a masked failure

2018-05-01 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-4144?focusedWorklogId=97220&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-97220
 ]

ASF GitHub Bot logged work on BEAM-4144:


Author: ASF GitHub Bot
Created on: 01/May/18 20:36
Start Date: 01/May/18 20:36
Worklog Time Spent: 10m 
  Work Description: kennknowles commented on issue #5250: [BEAM-4144] Fixes 
SplittableParDoProcessFnTest
URL: https://github.com/apache/beam/pull/5250#issuecomment-385782345
 
 
   LGTM when it passes


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 97220)
Time Spent: 20m  (was: 10m)

> SplittableParDoProcessFnTest has had a masked failure
> -
>
> Key: BEAM-4144
> URL: https://issues.apache.org/jira/browse/BEAM-4144
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-core
>Reporter: Kenneth Knowles
>Assignee: Eugene Kirpichov
>Priority: Major
>  Labels: sickbay
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Sickbayed in https://github.com/apache/beam/pull/5161, the test should be 
> fixed and no longer ignored.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[beam] 01/01: Merge pull request #5186: Findbugs cleanup; no functional changes

2018-05-01 Thread tgroh
This is an automated email from the ASF dual-hosted git repository.

tgroh pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/beam.git

commit 643e3da528b7147c1e56023ac4868ad254c6cdaf
Merge: a0036d5 f6d7c21
Author: Thomas Groh 
AuthorDate: Tue May 1 13:41:26 2018 -0700

Merge pull request #5186: Findbugs cleanup; no functional changes

 .../fnexecution/control/SdkHarnessClientTest.java  |  2 +-
 .../apache/beam/sdk/coders/DoubleCoderTest.java|  4 ++--
 .../sdk/options/ProxyInvocationHandlerTest.java| 22 +++---
 .../sdk/transforms/display/DisplayDataTest.java|  6 +++---
 4 files changed, 17 insertions(+), 17 deletions(-)

-- 
To stop receiving notification emails like this one, please contact
tg...@apache.org.


[beam] branch master updated (a0036d5 -> 643e3da5)

2018-05-01 Thread tgroh
This is an automated email from the ASF dual-hosted git repository.

tgroh pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/beam.git.


from a0036d5  [BEAM-3042] Refactor of TransformIOCounters (performance, 
inheritance). (#5075)
 add f6d7c21  Findbugs cleanup; no functional changes
 new 643e3da5 Merge pull request #5186: Findbugs cleanup; no functional 
changes

The 1 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:
 .../fnexecution/control/SdkHarnessClientTest.java  |  2 +-
 .../apache/beam/sdk/coders/DoubleCoderTest.java|  4 ++--
 .../sdk/options/ProxyInvocationHandlerTest.java| 22 +++---
 .../sdk/transforms/display/DisplayDataTest.java|  6 +++---
 4 files changed, 17 insertions(+), 17 deletions(-)

-- 
To stop receiving notification emails like this one, please contact
tg...@apache.org.


Build failed in Jenkins: beam_PostCommit_Java_GradleBuild #249

2018-05-01 Thread Apache Jenkins Server
See 


Changes:

[github] [BEAM-3042] Refactor of TransformIOCounters (performance, inheritance).

--
[...truncated 404.95 KB...]
[INFO] 
Expiring Daemon because JVM Tenured space is exhausted
:beam-sdks-java-maven-archetypes-examples:generateAndBuildArchetypeTest (Thread[Task worker for ':' Thread 4,5,main]) completed. Took 1 mins 17.058 secs.
:beam-sdks-java-io-elasticsearch:compileJava (Thread[Task worker for ':' Thread 4,5,main]) started.

> Task :beam-sdks-java-io-elasticsearch:compileJava
Build cache key for task ':beam-sdks-java-io-elasticsearch:compileJava' is deceaf7f645c42d66b7353db4c7e39d7
Task ':beam-sdks-java-io-elasticsearch:compileJava' is not up-to-date because:
  No history is available.
Custom actions are attached to task ':beam-sdks-java-io-elasticsearch:compileJava'.
All input files are considered out-of-date for incremental task ':beam-sdks-java-io-elasticsearch:compileJava'.
Compiling with error-prone compiler

Expiring Daemon because JVM Tenured space is exhausted
Expiring Daemon because JVM Tenured space is exhausted
Expiring Daemon because JVM Tenured space is exhausted
Expiring Daemon because JVM Tenured space is exhausted
Expiring Daemon because JVM Tenured space is exhausted
Expiring Daemon because JVM Tenured space is exhausted
Expiring Daemon because JVM Tenured space is exhausted
Expiring Daemon because JVM Tenured space is exhausted
Expiring Daemon because JVM Tenured space is exhausted

> Task :beam-sdks-java-extensions-sorter:compileJava
An exception has occurred in the compiler ((version info not available)). Please file a bug against the Java compiler via the Java bug reporting page (http://bugreport.java.com) after checking the Bug Database (http://bugs.java.com) for duplicates. Include your program and the following diagnostic in your report. Thank you.
java.lang.OutOfMemoryError: Java heap space
at com.sun.tools.javac.util.ByteBuffer.<init>(ByteBuffer.java:59)
at com.sun.tools.javac.jvm.ClassWriter.<init>(ClassWriter.java:122)
at com.sun.tools.javac.jvm.ClassWriter.instance(ClassWriter.java:166)

Expiring Daemon because JVM Tenured space is exhausted

> Task :beam-sdks-java-extensions-sorter:compileJava
at com.sun.tools.javac.comp.Modules.<init>(Modules.java:198)
at com.sun.tools.javac.comp.Modules.instance(Modules.java:174)
at com.sun.tools.javac.code.Symtab.<init>(Symtab.java:481)
at com.sun.tools.javac.code.Symtab.instance(Symtab.java:88)
at com.sun.tools.javac.comp.Attr.<init>(Attr.java:128)
at com.sun.tools.javac.comp.Attr.instance(Attr.java:119)
at com.sun.tools.javac.comp.Annotate.<init>(Annotate.java:105)
at com.sun.tools.javac.comp.Annotate.instance(Annotate.java:80)
at com.sun.tools.javac.jvm.ClassReader.<init>(ClassReader.java:252)
at com.sun.tools.javac.jvm.ClassReader.instance(ClassReader.java:245)
at com.sun.tools.javac.code.ClassFinder.<init>(ClassFinder.java:183)
at com.sun.tools.javac.code.ClassFinder.instance(ClassFinder.java:176)
at com.sun.tools.javac.main.JavaCompiler.<init>(JavaCompiler.java:379)
at com.sun.tools.javac.main.JavaCompiler.instance(JavaCompiler.java:112)
at com.sun.tools.javac.api.JavacTaskImpl.prepareCompiler(JavacTaskImpl.java:196)
at com.sun.tools.javac.api.JavacTaskImpl.lambda$doCall$0(JavacTaskImpl.java:97)
at com.sun.tools.javac.api.JavacTaskImpl$$Lambda$1377/678085447.call(Unknown Source)
at com.sun.tools.javac.api.JavacTaskImpl.handleExceptions(JavacTaskImpl.java:142)
at com.sun.tools.javac.api.JavacTaskImpl.doCall(JavacTaskImpl.java:96)
at com.sun.tools.javac.api.JavacTaskImpl.call(JavacTaskImpl.java:90)

Expiring Daemon because JVM Tenured space is exhausted

> Task :beam-sdks-java-extensions-sorter:compileJava
at com.google.errorprone.BaseErrorProneCompiler.run(BaseErrorProneCompiler.java:137)
at com.google.errorprone.BaseErrorProneCompiler.run(BaseErrorProneCompiler.java:108)
at com.google.errorprone.ErrorProneCompiler.run(ErrorProneCompiler.java:118)
at com.google.errorprone.ErrorProneCompiler.compile(ErrorProneCompiler.java:65)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at net.ltgt.gradle.errorprone.ErrorProneCompiler.execute(ErrorProneCompiler.java:66)

> Task :beam-sdks-java-core:compileTestJava
An exception has occurred in the compiler ((version info not available)). Please file a bug against the Java compiler via the Java bug reporting page (http://b

[jira] [Work logged] (BEAM-3972) Flink runner translates batch pipelines directly by proto

2018-05-01 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3972?focusedWorklogId=97225&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-97225
 ]

ASF GitHub Bot logged work on BEAM-3972:


Author: ASF GitHub Bot
Created on: 01/May/18 20:58
Start Date: 01/May/18 20:58
Worklog Time Spent: 10m 
  Work Description: bsidhom commented on issue #5226: [BEAM-3972] Translate 
portable batch pipelines by proto
URL: https://github.com/apache/beam/pull/5226#issuecomment-385788230
 
 
   Run Dataflow ValidatesRunner


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 97225)
Time Spent: 5h 20m  (was: 5h 10m)

> Flink runner translates batch pipelines directly by proto
> -
>
> Key: BEAM-3972
> URL: https://issues.apache.org/jira/browse/BEAM-3972
> Project: Beam
>  Issue Type: Bug
>  Components: runner-flink
>Reporter: Ben Sidhom
>Assignee: Ben Sidhom
>Priority: Major
>  Time Spent: 5h 20m
>  Remaining Estimate: 0h
>
> The non-portable runner uses rehydrated pipelines which lack necessary 
> information. The portable Flink runner needs to translate pipelines directly 
> by proto in order to wire components into individual executable stages 
> correctly.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-3883) Python SDK stages artifacts when talking to job server

2018-05-01 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3883?focusedWorklogId=97227&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-97227
 ]

ASF GitHub Bot logged work on BEAM-3883:


Author: ASF GitHub Bot
Created on: 01/May/18 21:02
Start Date: 01/May/18 21:02
Worklog Time Spent: 10m 
  Work Description: angoenka commented on issue #5251: [BEAM-3883] Refactor 
and clean dependency.py to make it reusable with artifact service
URL: https://github.com/apache/beam/pull/5251#issuecomment-385789245
 
 
   R: @aaltay 


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 97227)
Time Spent: 50m  (was: 40m)

> Python SDK stages artifacts when talking to job server
> --
>
> Key: BEAM-3883
> URL: https://issues.apache.org/jira/browse/BEAM-3883
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-core
>Reporter: Ben Sidhom
>Assignee: Ankur Goenka
>Priority: Major
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> The Python SDK does not currently stage its user-defined functions or 
> dependencies when talking to the job API. Artifacts that need to be staged 
> include the user code itself, any SDK components not included in the 
> container image, and the list of Python packages that must be installed at 
> runtime.
>  
> Artifacts that are currently expected can be found in the harness boot code: 
> [https://github.com/apache/beam/blob/58e3b06bee7378d2d8db1c8dd534b415864f63e1/sdks/python/container/boot.go#L52.]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-3883) Python SDK stages artifacts when talking to job server

2018-05-01 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3883?focusedWorklogId=97231&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-97231
 ]

ASF GitHub Bot logged work on BEAM-3883:


Author: ASF GitHub Bot
Created on: 01/May/18 21:10
Start Date: 01/May/18 21:10
Worklog Time Spent: 10m 
  Work Description: aaltay commented on issue #5251: [BEAM-3883] Refactor 
and clean dependency.py to make it reusable with artifact service
URL: https://github.com/apache/beam/pull/5251#issuecomment-385791414
 
 
   R: @tvalentyn  Valentyn worked on refactoring this code, he could provide 
good insights. I will also review it.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 97231)
Time Spent: 1h  (was: 50m)

> Python SDK stages artifacts when talking to job server
> --
>
> Key: BEAM-3883
> URL: https://issues.apache.org/jira/browse/BEAM-3883
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-core
>Reporter: Ben Sidhom
>Assignee: Ankur Goenka
>Priority: Major
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> The Python SDK does not currently stage its user-defined functions or 
> dependencies when talking to the job API. Artifacts that need to be staged 
> include the user code itself, any SDK components not included in the 
> container image, and the list of Python packages that must be installed at 
> runtime.
>  
> Artifacts that are currently expected can be found in the harness boot code: 
> [https://github.com/apache/beam/blob/58e3b06bee7378d2d8db1c8dd534b415864f63e1/sdks/python/container/boot.go#L52.]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (BEAM-4220) Java is missing an unbounded example for wordcount series

2018-05-01 Thread Ahmet Altay (JIRA)
Ahmet Altay created BEAM-4220:
-

 Summary: Java is missing an unbounded example for wordcount series
 Key: BEAM-4220
 URL: https://issues.apache.org/jira/browse/BEAM-4220
 Project: Beam
  Issue Type: Bug
  Components: website
Reporter: Ahmet Altay
Assignee: Ahmet Altay


Python has a streaming wordcount example mentioned in the docs; Java does not.

cc: [~melap]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (BEAM-4221) Python is missing a windowed wordcount example

2018-05-01 Thread Ahmet Altay (JIRA)
Ahmet Altay created BEAM-4221:
-

 Summary: Python is missing a windowed wordcount example
 Key: BEAM-4221
 URL: https://issues.apache.org/jira/browse/BEAM-4221
 Project: Beam
  Issue Type: Bug
  Components: website
Reporter: Ahmet Altay
Assignee: Ahmet Altay


Java has a GCS-to-GCS windowed wordcount example mentioned in the docs; Python
does not. Since Python currently does not support per-window writes, we could
update the existing windowed wordcount example to be GCS-to-BigQuery to closely
match the Java example and the website.

cc: [~melap]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

