Re: Joins not joining

2021-11-14 Thread Curt Buechter
ning for joins. Flink's SQL planner > does this for you. If you look into the web UI you'll see an arrow marked > with HASH pointing from sources to join operators. It means that records > flowing through this arrow will be distributed to the corresponding > parallelism according to the

Joins not joining

2021-11-14 Thread Curt Buechter
Hi, I'd like to understand a little more how joins work. I have a fairly simple LEFT JOIN query, and I'm seeing spotty results on the joins. I know there is a record on the right side, but sometimes it produces a result, and sometimes it doesn't. Sample query: SELECT a.id, b.val1, c.val2 FROM

Re: pyflink keyed stream checkpoint error

2021-10-13 Thread Curt Buechter
OOM killer > > messages or process status? > > > > Regards, > > Roman > > > > On Wed, Sep 22, 2021 at 6:30 PM Curt Buechter > wrote: > >> > >> Hi, > >> I'm getting an error after enabling checkpointing in my pyflin

Re: pyflink keyed stream checkpoint error

2021-09-23 Thread Curt Buechter
ble that the python process crashed or hung up? (probably > > performing a snapshot) > > Could you validate this by checking the OS logs for OOM killer > > messages or process status? > > > > Regards, > > Roman > > > > On Wed, Sep 22, 2021 at 6:30 PM C

Re: flink rest endpoint creation failure

2021-09-23 Thread Curt Buechter
, Sep 22, 2021 at 11:46 AM Curt Buechter wrote: > Hi, > I'm getting an error that happens randomly when starting a flink > application. > > For context, this is running in YARN on AWS. This application is one that > converts from the Table API to the Stream API, so two flink

flink rest endpoint creation failure

2021-09-22 Thread Curt Buechter
Hi, I'm getting an error that happens randomly when starting a flink application. For context, this is running in YARN on AWS. This application is one that converts from the Table API to the Stream API, so two flink applications/jobmanagers are trying to start up. I think what happens is that the

pyflink keyed stream checkpoint error

2021-09-22 Thread Curt Buechter
Hi, I'm getting an error after enabling checkpointing in my pyflink application that uses a keyed stream and rocksdb state. Here is the error message: 2021-09-22 16:18:14,408 INFO org.apache.flink.contrib.streaming.state.RocksDBKeyedStateBackend [] - Closed RocksDB State Backend. Cleaning up

pyflink table to datastream

2021-09-03 Thread Curt Buechter
I have a question about how the conversion from Table API to Datastream API actually works under the covers. If I have a Table API operation that creates a random id, like: SELECT id, CAST(UUID() AS VARCHAR) as random_id FROM table ...then I convert this table to a datastream with

IllegalArgumentException: The configured managed memory fraction for Python worker process must be within (0, 1], was: %s

2021-07-28 Thread Curt Buechter
I have a pyflink job that starts using the Datastream api and converts to the Table API. In the datastream portion, there is a MapFunction. I am getting the following error: flink run -py sample.py java.lang.IllegalArgumentException: The configured managed memory fraction for Python worker

ImportError: No module named pyflink

2021-07-27 Thread Curt Buechter
This feels like the simplest error, but I'm struggling to get past it. I can run pyflink jobs locally just fine by submitting them either via `python sample.py` or `flink run --target local -py sample.py`. But, when I try to execute on a remote worker node, it always fails with this error:

Re: PyFlink kafka producer topic override

2021-06-24 Thread Curt Buechter
ctors/flink-connector-kafka/src/main/java/org/apache/flink/streaming/connectors/kafka/KafkaContextAware.java#L48 >> >> Regards, >> Dian >> >> 2021年6月24日 上午11:45,Curt Buechter 写道: >> >> Hi Dian, >> Thanks for the reply. >> I don't think a filter fun

Re: PyFlink kafka producer topic override

2021-06-23 Thread Curt Buechter
lit is still not supported. Does it make sense for you to split the stream using a filter function? There is some overhead compared the built-in stream.split as you need to provide a filter function for each sub-stream and so a record will evaluated multiple times.> > > > 2021年6月

PyFlink kafka producer topic override

2021-06-23 Thread Curt Buechter
Hi, New PyFlink user here. Loving it so far. The first major problem I've run into is that I cannot create a Kafka Producer with dynamic topics. I see that this has been available for quite some time in Java with Keyed Serialization using the getTargetTopic method. Another way to do this in Java