Problem with updating beam SDK

2020-02-07 Thread vivek chaurasiya
Hi team, We had Beam SDK 2.5 running on the AWS EMR Spark distribution 5.17. Essentially, our Beam code was just reading a bunch of files from GCS and pushing them to Elasticsearch in AWS using Beam's ElasticsearchIO class ( https://beam.apache.org/releases/javadoc/2.0.0/index.html?org/apache/beam/sdk/io/el

Re: Running Beam on Flink

2020-02-07 Thread Ankur Goenka
That seems to be a problem. When I try the command, I get $ telnet localhost 8099 Trying ::1... Connected to localhost. Escape character is '^]'. ^C Connection closed by foreign host. On Fri, Feb 7, 2020 at 5:34 PM Xander Song wrote: > Thanks for the response. After entering telnet localhost 8

Re: Running Beam on Flink

2020-02-07 Thread Xander Song
Thanks for the response. After entering telnet localhost 8099, I receive Trying ::1... telnet: connect to address ::1: Connection refused Trying 127.0.0.1... telnet: connect to address 127.0.0.1: Connection refused telnet: Unable to connect to remote host On Fri, Feb 7, 2020 at 11:41 AM Anku

Re: Running Beam on Flink

2020-02-07 Thread Ankur Goenka
It seems that pipeline submission from the SDK is not able to reach the job server that was started in Docker. Can you try running "telnet localhost 8099" to make sure that pipeline submission can reach the job server? On Thu, Feb 6, 2020 at 8:16 PM Xander Song wrote: > I am having difficulty followi
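The telnet check suggested above can also be scripted. The sketch below is a stdlib-only Python illustration (not part of the original thread): it attempts a TCP connection to the job-server port, which distinguishes the "Connection refused" case reported earlier in the thread from a successful connect.

```python
import socket

def can_reach(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds.

    This is the same reachability test as `telnet host port`: it only
    verifies that something is listening, not that it is a Beam job server.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # ConnectionRefusedError, timeouts, DNS failures
        return False

if __name__ == "__main__":
    # 8099 is the job-server port used throughout this thread.
    print("job server reachable:", can_reach("localhost", 8099))
```

If this prints False while the Docker job server is running, the container's port is likely not published to the host (e.g. the `-p 8099:8099` mapping is missing).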

Re: dataflow job was working fine last night and it isn't now

2020-02-07 Thread Alan Krumholz
perfect! thank you! On Fri, Feb 7, 2020 at 10:54 AM Valentyn Tymofieiev wrote: > Thanks for your feedback. We expect that this issue will be fixed in > cloudpickle==1.3.0. Per [1], this release may be available next week. > > After that you can install the fixed version of cloudpickle until the

Re: dataflow job was working fine last night and it isn't now

2020-02-07 Thread Valentyn Tymofieiev
Thanks for your feedback. We expect that this issue will be fixed in cloudpickle==1.3.0. Per [1], this release may be available next week. After that you can install the fixed version of cloudpickle until the AI notebook image picks up the new version. [1] https://github.com/cloudpipe/cloudpickle
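Until the notebook image picks up the fixed release, the pin can be checked programmatically. The helper below is an illustrative, stdlib-only sketch (the helper names are mine, not from the thread, and the simple dotted-integer parsing ignores pre-release suffixes); it tests whether an installed version string meets the cloudpickle==1.3.0 minimum mentioned above.

```python
def version_tuple(version: str) -> tuple:
    """Parse a dotted release string like '1.3.0' into a comparable tuple of ints."""
    return tuple(int(part) for part in version.split("."))

def has_fix(installed: str, minimum: str = "1.3.0") -> bool:
    """True if the installed cloudpickle version is at least the release carrying the fix."""
    return version_tuple(installed) >= version_tuple(minimum)
```

In a notebook, the installed version string can be obtained with `importlib.metadata.version("cloudpickle")` (Python 3.8+) and passed to `has_fix`.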

Re: Beam-SQL: How to create a UDF that returns a Map ?

2020-02-07 Thread Andrew Pilloud
Thanks for reporting and finding the root cause! Last I heard Calcite was going to start a release shortly. We plan to update once the next version is out. Andrew On Fri, Feb 7, 2020 at 4:38 AM Niels Basjes wrote: > Hi, > > I've done some serious debugging and traced the problem to what seems t

Re: Running a Beam Pipeline on GCP Dataproc Flink Cluster

2020-02-07 Thread Paweł Kordek
Hi, I had a similar use case recently, and adding a metadata key solved the issue: https://github.com/GoogleCloudDataproc/initialization-actions/pull/334. You keep the original initialization action and add, for example (using gcloud), '--metadata flink-snapshot-url=http://mirrors.up.pt/pub/apache/f

Re: Running a Beam Pipeline on GCP Dataproc Flink Cluster

2020-02-07 Thread Ismaël Mejía
+user@beam.apache.org On Fri, Feb 7, 2020 at 12:54 AM Xander Song wrote: > I am attempting to run a Beam pipeline on a GCP Dataproc Flink cluster. I > have followed the instructions at this repo to > create

Re: Beam-SQL: How to create a UDF that returns a Map ?

2020-02-07 Thread Niels Basjes
Hi, I've done some serious debugging and traced the problem to what seems to be the root cause. The class org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.sql.type.JavaToSqlTypeConversionRules does not have a mapping from java.util.Map to SqlTypeName.MAP. As a consequence the JavaType(M

Beam-SQL: How to create a UDF that returns a Map ?

2020-02-07 Thread Niels Basjes
Hi, My context: Java 8, Beam 2.19.0. *TLDR*: How do I create a Beam-SQL UDF that returns a Map? I have a library ( https://yauaa.basjes.nl ) that people would like to use in combination with Beam-SQL. The essence of this library is that a String goes in and a Key-Value set (Map) comes out. I've