Separate from the Java code, there is a fluentd agent that uploads the log
files to Stackdriver and then deletes them.

If you log far too much, you have to decide whether to reduce logging or to
block your pipeline's processing. One option is to add a root logging handler
that monitors disk space and throttles logging when too much of it is used.
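A minimal sketch of what such a handler could look like with java.util.logging (the class and parameter names here are illustrative, not part of the Beam or Dataflow API):

```java
import java.io.File;
import java.util.logging.Handler;
import java.util.logging.Level;
import java.util.logging.LogRecord;

/**
 * Hypothetical sketch: wraps another handler and drops low-severity
 * records whenever free space on the log partition falls below a
 * threshold, so a chatty pipeline degrades gracefully instead of
 * filling the disk.
 */
public class DiskAwareHandler extends Handler {
  private final Handler delegate;
  private final File partition;
  private final long minFreeBytes;

  public DiskAwareHandler(Handler delegate, File partition, long minFreeBytes) {
    this.delegate = delegate;
    this.partition = partition;
    this.minFreeBytes = minFreeBytes;
  }

  @Override
  public void publish(LogRecord record) {
    // When the disk is nearly full, keep only WARNING and above.
    if (partition.getUsableSpace() < minFreeBytes
        && record.getLevel().intValue() < Level.WARNING.intValue()) {
      return;
    }
    delegate.publish(record);
  }

  @Override public void flush() { delegate.flush(); }
  @Override public void close() { delegate.close(); }
}
```

You would install it once at startup on the root logger, wrapping whatever handler is already configured; real code would also want rate limiting rather than a hard severity cutoff.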

On Fri, Aug 30, 2019 at 9:49 AM Talat Uyarer <[email protected]>
wrote:

> Thank you Lukasz for replying back to me. I have a long-lived stream
> processor, and I am getting out-of-disk-space errors on the Dataflow worker
> machines a couple of days after I submit the job.
>
> As you said, Dataflow workers provide very limited logging options. However,
> I checked the code of the Dataflow worker's logging handler, and it looks
> like it creates a new log file each time but never deletes previous log
> files or limits how many are created. Could this be a bug in the Dataflow
> worker logging?
>
>
> https://github.com/apache/beam/blob/master/runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/logging/DataflowWorkerLoggingInitializer.java#L41
>
> https://github.com/apache/beam/blob/master/runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/logging/DataflowWorkerLoggingHandler.java
>
> Thanks
>
> On Thu, Aug 29, 2019 at 9:44 PM Lukasz Cwik <[email protected]> wrote:
>
>> The only logging options that Dataflow exposes today limit what gets
>> logged and not anything about how many rotated logs there are or how big
>> they are.
>>
>> All Dataflow logging options are available here:
>> https://github.com/apache/beam/blob/master/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/options/DataflowWorkerLoggingOptions.java
>>
>> On Wed, Aug 28, 2019 at 7:00 PM Talat Uyarer <
>> [email protected]> wrote:
>>
>>> Hi All,
>>>
>>> This is my first message to this mailing list. Please let me know if I am
>>> sending it to the wrong list.
>>>
>>> My stream processing job runs on the Google Cloud Dataflow engine, and I
>>> use Stackdriver for logging. I added slf4j-jdk14 and slf4j-api as runtime
>>> dependencies to enable Stackdriver. However, my pipeline creates lots of
>>> logs and my instances are running out of disk space. I looked for settings
>>> to rotate logs and limit the number of log files, but I could not find
>>> any. How can I configure this?
>>>
>>> Thanks
>>>
>>>
