Hi Liu,
indeed, my current experience migrating old DataSet code to the new DataStream
API is really frustrating.
It's very complicated to write a Source (unless you use the deprecated
SourceFunction or TableSource, which are easier), and some operations are
needlessly complicated because they require a window even though batch jobs
should not involve any windowing (as in this case for outer joins, or for
DataSet broadcasting). I hope things will improve for batch scenarios in the
future.
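
To make the use case concrete, here is a minimal plain-Java sketch of the
per-key logic a coGroup-style outer join would need in order to drop
already-indexed documents. It deliberately uses no Flink API: the class name,
the method names, and the driver that groups records by key are all
hypothetical stand-ins for what the framework would do, and the record itself
is assumed to be the join key.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class AntiJoinSketch {

    // Per-key logic, shaped like a coGroup callback: 'docs' and 'indexed' hold
    // the records of a single key from each side. The documents are emitted
    // only when the indexed side has no record for that key.
    public static void coGroupOneKey(List<String> docs, List<String> indexed, List<String> out) {
        if (indexed.isEmpty()) {
            out.addAll(docs);
        }
    }

    // Hypothetical driver standing in for the framework: groups both inputs by
    // key (assumption: the document ID is the key) and calls the per-key logic
    // once for every key present on the left side. Keys that appear only on the
    // indexed side are skipped because they could never emit anything.
    public static List<String> excludeIndexed(List<String> allDocs, List<String> indexedDocs) {
        Map<String, List<String>> left = new LinkedHashMap<>();
        for (String id : allDocs) {
            left.computeIfAbsent(id, k -> new ArrayList<>()).add(id);
        }
        Map<String, List<String>> right = new LinkedHashMap<>();
        for (String id : indexedDocs) {
            right.computeIfAbsent(id, k -> new ArrayList<>()).add(id);
        }

        List<String> out = new ArrayList<>();
        for (Map.Entry<String, List<String>> e : left.entrySet()) {
            coGroupOneKey(e.getValue(), right.getOrDefault(e.getKey(), List.of()), out);
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> allDocs = List.of("doc-1", "doc-2", "doc-3");
        List<String> indexedDocs = List.of("doc-2");
        // Prints the documents that still need indexing.
        System.out.println(excludeIndexed(allDocs, indexedDocs));
    }
}
```

Nothing in this logic depends on time, which is why having to pick a window
strategy for it in batch mode feels out of place.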

Best,
Flavio

On Wed, Aug 9, 2023 at 4:55 AM liu ron <ron9....@gmail.com> wrote:

> Hi, Flavio
>
> IMO, the current DataStream API is not aligned with DataSet in terms of
> capabilities. I think you can try it with GlobalWindow. Another possible
> solution is to convert the DataStream to a Table[1] first and then do the
> join with the Table API.
>
> [1]
> https://nightlies.apache.org/flink/flink-docs-master/docs/dev/table/tableapi/
>
> Best,
> Ron
>
> On Tue, Aug 8, 2023 at 00:23, Flavio Pompermaier <pomperma...@okkam.it> wrote:
>
>> Hello everybody,
>> I have a use case where I need to exclude from a DataStream (which is
>> technically a DataSet, since I work in batch mode) all already-indexed
>> documents.
>> My idea is to perform an outer join, but I didn't find any simple example
>> of DataStream working in batch mode. I've tried using coGroup(), but then
>> it requires me to specify a window strategy. In batch mode I wouldn't
>> expect that. Can I use a global window?
>>
>> Thanks in advance,
>> Flavio
>>
>
