DataStream API in Batch Execution mode

2021-06-07 Thread Marco Villalobos
How do I use a hierarchical directory structure as a file source in S3 when
using the DataStream API in Batch Execution mode?

I have been trying to find out whether the API supports that. Currently our
data is organized by years, halves, quarters, and months, but before I launch
the job I flatten the file structure just to process the right set of files.


Re: DataStream API in Batch Execution mode

2021-06-07 Thread Guowei Ma
Hi, Marco

I think you could try `FileSource`; you can find an example in [1].
`FileSource` scans the files under the given directory recursively.
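
For example, a rough sketch of a batch job over an S3 prefix might look like
the following (the bucket path is just a placeholder, and this assumes Flink
1.13, where the text-line reader is `TextLineFormat`; the S3 filesystem plugin
also has to be available to the cluster):

import org.apache.flink.api.common.RuntimeExecutionMode;
import org.apache.flink.api.common.eventtime.WatermarkStrategy;
import org.apache.flink.connector.file.src.FileSource;
import org.apache.flink.connector.file.src.reader.TextLineFormat;
import org.apache.flink.core.fs.Path;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class RecursiveS3BatchJob {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env =
                StreamExecutionEnvironment.getExecutionEnvironment();
        // Run the DataStream program with batch semantics over bounded input.
        env.setRuntimeMode(RuntimeExecutionMode.BATCH);

        // Point the source at the top-level prefix; nested "directories"
        // (years/halves/quarters/months) are enumerated recursively.
        FileSource<String> source =
                FileSource.forRecordStreamFormat(
                                new TextLineFormat(),
                                new Path("s3://my-bucket/data/")) // placeholder path
                        .build();

        DataStream<String> lines =
                env.fromSource(source, WatermarkStrategy.noWatermarks(), "s3-file-source");

        lines.print();
        env.execute("recursive s3 batch read");
    }
}
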
Would you mind opening an issue about the missing documentation?

[1]
https://github.com/apache/flink/blob/master/flink-connectors/flink-connector-files/src/test/java/org/apache/flink/connector/file/src/FileSourceTextLinesITCase.java
Best,
Guowei




Re: DataStream API in Batch Execution mode

2021-06-09 Thread Marco Villalobos
That worked.  Thank you very much.
