Using the flink CLI option --pyRequirements

2021-10-18 Thread Francis Conroy
just can't submit a pyflink job to my cluster when using the --pyRequirements option. I started going down the line of debugging the flink CLI using intellij idea, but wasn't able to figure out how to make my venv with pyflink installed available to the debug environment. Thanks, Francis Co

FlinkKafkaProducer deprecated in 1.14 but pyflink binding still present?

2021-10-25 Thread Francis Conroy
Looks like this got deprecated in 1.14 in favour of KafkaSink/KafkaSource but the python binding didn't get updated? Can someone confirm this? Francis Conroy -- This email and any attachments are proprietary and confidential and are intended solely for the use of the individual to whom

Re: PyFlink Perfomance

2021-12-21 Thread Francis Conroy
I've just run an analysis using a similar example which involves a single python flatmap operator and we're getting 100x less through by using python over java. I'm interested to know if you can do such a comparison. I'm using Flink 14.0. Thanks, Francis On Thu, 18 Nov 2021 at 02:20, Thomas Portu

Re: FlinkKafkaProducer deprecated in 1.14 but pyflink binding still present?

2021-12-21 Thread Francis Conroy
; Yes, you are right. It's still not updated in PyFlink as > KafkaSource/KafkaSink are still not supported in PyFlink. Hopeful we could > add that support in 1.15 and then we could deprecate/remove the legacy > interfaces. > > Regards, > Dian > > On Tue, Oct 26, 2021

Re: PyFlink Perfomance

2021-12-21 Thread Francis Conroy
Hi Dian, I'll build up something similar and post it, my current test code contains proprietary information. On Wed, 22 Dec 2021 at 14:49, Dian Fu wrote: > Hi Francis, > > Could you share the benchmark code you use? > > Regards, > Dian > > On Wed, Dec 22, 20

pyflink mixed with Java operators

2022-01-06 Thread Francis Conroy
Hi all, Does anyone know if it's possible to specify a java map function at some intermediate point in a pyflink job? In this case SimpleCountMeasurementsPerUUID is a flink java MapFunction. The reason we want to do this is that performance in pyflink seems quite poor. e.g. import logging impor

Re: pyflink mixed with Java operators

2022-01-10 Thread Francis Conroy
DataStream(ds._j_data_stream.map(jvm.com.example.MyJavaMapFunction())) > ) > ``` > > Regards, > Dian > > On Fri, Jan 7, 2022 at 1:00 PM Francis Conroy < > francis.con...@switchdin.com> wrote: > >> Hi all, >> >> Does anyone know if it's possible

Re: MAP data type (PyFlink)

2022-01-30 Thread Francis Conroy
Hi Philippe, I don't think it's quite that simple unfortunately. A python dict can map from any hashable type to any value, however the 'equivalent' POJO, 'Map' in this case, requires all key types to be the same and all value types to be the same. You cannot specify multiple types for the key or

Re: Socket stream source in Python?

2022-01-30 Thread Francis Conroy
Hi Philippe, after checking the source Flink master I think you're right, there is currently no binding from python to Flink socketTextStream (via py4j) in pyFlink. The py4j interface isn't too complicated to modify for some tasks and I suspect that it should be fairly trivial to extend pyflink to

Flink SQL kafka debezium CDC and postgreSQL

2022-02-09 Thread Francis Conroy
Hi all, I'm using flink 1.13.5 (as I was originally using the ververica Flink CDC connector) and am trying to understand something. I'm just using the Flink SQL CLI at this stage to verify that I can stream a PostgreSQL table into Flink SQL to compute a continuous materialised view. I was inspecti

Change column names Pyflink Table/Datastream API

2022-02-15 Thread Francis Conroy
Hi all, I'm hoping to be able to change the column names when creating a table from a datastream, the flatmap function generating the stream is returning a Tuple4. It's currently working as follows: inputmetrics = table_env.from_data_stream(ds, Schema.new_builder()

Re: Change column names Pyflink Table/Datastream API

2022-02-16 Thread Francis Conroy
ple code? Besides, > another way you may try is `inputmetrics.alias("timestamp, device, name, > value")`. > > Regards, > Dian > > On Wed, Feb 16, 2022 at 8:14 AM Francis Conroy < > francis.con...@switchdin.com> wrote: > >> Hi all, >> >> I&

Re: CVE-2021-44228 - Log4j2 vulnerability

2022-02-20 Thread Francis Conroy
The release notification email came out a few days ago. On Mon, 21 Feb 2022 at 14:18, Surendra Lalwani wrote: > Hi Team, > > Any updates on Flink 1.13.6 version release? > > Regards, > Surendra Lalwani > > > On Fri, Feb 4, 2022 at 1:23 PM Martijn Visser > wrote: > >> Hi Surendra, >> >> You ca

Re: Flink 1.15 deduplication view and lookup join

2022-02-22 Thread Francis Conroy
://nightlies.apache.org/flink/flink-docs-master/docs/dev/table/concepts/versioned_tables/ > [2] > https://nightlies.apache.org/flink/flink-docs-master/docs/dev/table/sql/queries/joins/#temporal-joins > > > > --Original Mail -- > *Sender:*Francis Conroy >

Joining a flink materialised view table with a streaming table

2022-02-22 Thread Francis Conroy
Hi all, I recently put up a question about a deduplication query related to a join and realised that I was probably asking the wrong question. I'm using Flink 1.15-SNAPSHOT (97ddc39945cda9bf1f52ab159852fdb606201cf2) as we're using the RabbitMQ connector with pyflink. We won't go to prod until 1.15

pyflink object to java object

2022-02-24 Thread Francis Conroy
Hi all, we're using pyflink for most of our flink work and are sometimes into a java process function. Our new java process function takes an argument in in the constructor which is a Row containing default values. I've declared my Row in pyflink like this: default_row = Row(ep_uuid="",

Re: pyflink object to java object

2022-02-28 Thread Francis Conroy
> https://github.com/apache/flink/blob/master/flink-core/src/main/java/org/apache/flink/types/RowUtils.java# > > Best, > Xingbo > > Francis Conroy 于2022年2月25日周五 14:35写道: > >> Hi all, >> >> we're using pyflink for most of our flink work and are sometimes i

Official Flink operator additional class paths

2022-04-07 Thread Francis Conroy
Hi all, thanks in advance for any tips. I've been trying to specify some additional classpaths in my kubernetes yaml file when using the official flink operator and nothing seems to work. I know the technique for getting my job jar works fine since it's finding the class ok, but I cannot get the

Re: Official Flink operator additional class paths

2022-04-07 Thread Francis Conroy
ader automatically without any configuration. > > BTW, I am not aware of any other bugs which will cause pipeline classpath > not take effect except FLINK-21289[1]. > > [1]. https://issues.apache.org/jira/browse/FLINK-21289 > > Best, > Yang > > Francis Conroy 于2022年4月7日周

Re: UUID on TableAPI

2022-04-25 Thread Francis Conroy
Hi Quynh, Set the savepoint UUID? Is that what you mean? The savepoint UUIDs are issued dynamically when you request them, flink won't know automatically what the last savepoint was, but you can start a new job and restore from a savepoint by passing in the UUID. All that said there are limitatio

Re: UUID on TableAPI

2022-04-25 Thread Francis Conroy
hat can implement in Table API ? > > Best, > Quynh > > > > Sent from Mail <https://go.microsoft.com/fwlink/?LinkId=550986> for > Windows > > > > *From: *Francis Conroy > *Sent: *Tuesday, April 26, 2022 7:07 AM > *To: *lan tran > *Cc: *us

Using the official flink operator and kubernetes secrets

2022-04-28 Thread Francis Conroy
Hi all, I'm trying to use a kubernetes secret as a command line argument in my job and the text replacement doesn't seem to be happening. I've verified passing the custom args via the command line on my local flink cluster but can't seem to get the environment var replacement to work. apiVersion:

Re: How should I call external HTTP services with PyFlink?

2022-05-02 Thread Francis Conroy
Hi Dhavan, We have looked at using pyflink for data stream enrichment and found the performance lacking compared to the java counterpart. One option for you might be to use statefun for the enrichment stages. We've also changed our model for enrichment, we're pushing the enrichment data into the p

Re: Using the official flink operator and kubernetes secrets

2022-05-04 Thread Francis Conroy
- "--kafka.sasl.username" > - "$(KAFKA_SASL_USERNAME)" > - "--kafka.sasl.password" > - "$(KAFKA_SASL_PASSWORD)" > ​ > > It would be a great addition, simplifying job startup decision-making > while following existing convent

Re: Decompressing RMQ streaming messages

2022-07-21 Thread Francis Conroy
Hi Venkat, there's nothing that I know of, but I've written a zlib decompressor for our payloads which was pretty straightforward. public class ZlibDeserializationSchema extends AbstractDeserializationSchema { @Override public byte[] deserialize(byte[] message) throws IOException {

Re: Decompressing RMQ streaming messages

2022-07-24 Thread Francis Conroy
ame, however I get > an error. > > Following is the error - > > java.util.zip.DataFormatException: incorrect header check. > > I see multiple errors, i beleive for every message i am seeing this stack > trace? > > Any idea as to what could be causing this? > > T

Re: Handling environment variables in Flink and within the Flink Operator

2022-08-23 Thread Francis Conroy
kSlots: "1" kubernetes.env.secretKeyRef: "env:DJANGO_TOKEN,secret:special-secret-token,key:token" Unfortunately, I don't think this is in the docs yet. Francis Conroy I Software Engineer Level 1, Building B, 91 Parry Street Newcastle NSW 2302 *P* (02) 4786 0426 Ext