Hi, Rui.
Overall LGTM now.
Thanks for driving this again!

On Wed, Nov 15, 2023 at 2:07 PM Rui Fan <1996fan...@gmail.com> wrote:

> Hi Hangxiang,
>
> Thanks for your feedback!
>
> > I saw some users may debug operator correctness using print() or
> System.out
> > which may bring lots of info.
> > It's fine if files are separate, otherwise these will make
> taskMnanager.log
> > roll quickly.
> > But I think it's fine if we don't use it as default first.
>
> Got it, we will support redirect System.out to LOG, and keep
> original behaviour(DEFAULT enum) as default value.
>
>
> > > Do you mean like this: LOG.info("taskName {} : {}", taskName,
> > > userLogContext)?
> >
> > Yes, That's what I mean.
>
> Sounds make sense, I added a new option to FLIP[1],
> it's taskmanager.system-out.log.thread-name.enabled: false.
>
> Why log the thread name here?
>
> Flink's RichFunction can get subtaskId, but redirecting
> System.out to LOG is public code, so it is difficult to get subtaskId.
> After consideration, a workaround solution is to log thread name,
> because the thread name of the default Task Thread contains the
> task name and subtask id. And this solution can support all threads,
> because some non-task threads may also call println.
>
> [1] https://cwiki.apache.org/confluence/x/4guZE
>
> Best,
> Rui
>
> On Wed, Nov 15, 2023 at 10:26 AM Hangxiang Yu <master...@gmail.com> wrote:
>
> > Hi, Rui.
> >
> > As I mentioned before: the user didn't really want
> > > to log into taskmanager.out, it just happened by accident.
> > > So, if users change the System.out to LOG.info, it still happen.
> >
> > Thanks for the feedback.
> > I saw some users may debug operator correctness using print() or
> System.out
> > which may bring lots of info.
> > It's fine if files are separate, otherwise these will make
> taskMnanager.log
> > roll quickly.
> > But I think it's fine if we don't use it as default first.
> >
> > Do you mean like this: LOG.info("taskName {} : {}", taskName,
> > > userLogContext)?
> > >
> >
> > Yes, That's what I mean.
> >
> > On Sat, Nov 11, 2023 at 12:54 AM Piotr Nowojski <
> piotr.nowoj...@gmail.com>
> > wrote:
> >
> > > Thanks! :)
> > >
> > > Best, Piotrek
> > >
> > > czw., 9 lis 2023 o 16:15 Rui Fan <1996fan...@gmail.com> napisał(a):
> > >
> > > > Hi Piotr,
> > > >
> > > > Thanks for your feedback!
> > > >
> > > > > Or implement your own loop? It shouldn't be more than a couple of
> > > lines.
> > > >
> > > > Implementing it directly is fine, I have updated the FLIP.
> > > > And this logic can be found in the  `isLineEnded` method.
> > > >
> > > > Best,
> > > > Rui
> > > >
> > > > On Thu, Nov 9, 2023 at 11:00 PM Piotr Nowojski <
> > piotr.nowoj...@gmail.com
> > > >
> > > > wrote:
> > > >
> > > > > Hi Rui,
> > > > >
> > > > > > I see java8 doesn't have `Arrays.equals(int[] a, int aFromIndex,
> > int
> > > > > > aToIndex, int[] b, int bFromIndex, int bToIndex)`,
> > > > > > and java11 has it. Do you have any other suggestions for java8?
> > > > >
> > > > > Maybe use `ByteBuffer.wrap`?
> > > > >
> > > > > ByteBuffer.wrap(array, ..., ...).equals(ByteBuffer.wrap(array2,
> ...,
> > > > ...))
> > > > >
> > > > > This shouldn't have overheads as far as I remember.
> > > > >
> > > > > Or implement your own loop? It shouldn't be more than a couple of
> > > lines.
> > > > >
> > > > > Best,
> > > > > Piotrek
> > > > >
> > > > > czw., 9 lis 2023 o 06:43 Rui Fan <1996fan...@gmail.com>
> napisał(a):
> > > > >
> > > > > > Hi Piotr, Archit, Feng and Hangxiang:
> > > > > >
> > > > > > Thanks a lot for your feedback!
> > > > > >
> > > > > > Following is my comment, please correct me if I misunderstood
> > > anything!
> > > > > >
> > > > > > To Piotr:
> > > > > >
> > > > > > > Is there a reason why you are suggesting to copy out bytes from
> > > `buf`
> > > > > to
> > > > > > `bytes`,
> > > > > > > instead of using `Arrays.equals(int[] a, int aFromIndex, int
> > > > aToIndex,
> > > > > > int[] b, int bFromIndex, int bToIndex)`?
> > > > > >
> > > > > > I see java8 doesn't have `Arrays.equals(int[] a, int aFromIndex,
> > int
> > > > > > aToIndex, int[] b, int bFromIndex, int bToIndex)`,
> > > > > > and java11 has it. Do you have any other suggestions for java8?
> > > > > >
> > > > > > Also, this code doesn't run in production. As the comment of
> > > > > > System.lineSeparator():
> > > > > >
> > > > > > > On UNIX systems, it returns {@code "\n"}; on Microsoft
> > > > > > > Windows systems it returns {@code "\r\n"}.
> > > > > >
> > > > > > So Mac and Linux just return one character, we will compare
> > > > > > one byte directly.
> > > > > >
> > > > > >
> > > > > >
> > > > > > To Feng:
> > > > > >
> > > > > > > Will they be written to the taskManager.log file by default
> > > > > > > or the taskManager.out file?
> > > > > >
> > > > > > I prefer LOG as the default value for
> taskmanager.system-out.mode.
> > > > > > It's useful for job stability and doesn't introduce significant
> > > impact
> > > > to
> > > > > > users. Also, our production has already used this feature for
> > > > > > more than 1 years, it works well.
> > > > > >
> > > > > > However, I write the DEFAULT as the default value for
> > > > > > taskmanager.system-out.mode, because when the community
> introduces
> > > > > > new options, the default value often selects the original
> behavior.
> > > > > >
> > > > > > Looking forward to hear more thoughts from community about this
> > > > > > default value.
> > > > > >
> > > > > > > If we can make taskmanager.out splittable and rolling, would it
> > be
> > > > > > > easier for users to use this feature?
> > > > > >
> > > > > > Making taskmanager.out splittable and rolling is a good choice!
> > > > > > I have some concerns about it:
> > > > > >
> > > > > > 1. Users may also want to use LOG.info in their code and just
> > > > > >   accidentally use System.out.println. It is possible that they
> > will
> > > > > >   also find the logs directly in taskmanager.log.
> > > > > > 2. I'm not sure whether the rolling strategy is easy to
> implement.
> > > > > >   If we do it, it's necessary to define a series of flink options
> > > > similar
> > > > > >   to log options, such as: fileMax(how many files should be
> > > retained),
> > > > > >   fileSize(The max size each file), fileNamePatten (The suffix of
> > > file
> > > > > > name),
> > > > > > 3. Check the file size periodically: all logs are written by log
> > > > plugin,
> > > > > >   they can check the log file size after writing. However,
> > System.out
> > > > > >   are written directly. And flink must start a thread to check
> the
> > > > latest
> > > > > >   taskmanager.out size periodically. If it's too quick, most of
> job
> > > > > aren't
> > > > > >   necessary. If it's too slow, the file size cannot be controlled
> > > > > properly.
> > > > > >
> > > > > > Redirect it to LOG.info may be a reasonable and easy choice.
> > > > > > The user didn't really want to log into taskmanager.out, it just
> > > > > > happened by accident.
> > > > > >
> > > > > >
> > > > > >
> > > > > > To Hangxiang:
> > > > > >
> > > > > > > 1. I have a similar concern as Feng. Will we redirect to
> another
> > > log
> > > > > file
> > > > > > > not taskManager.log ?
> > > > > >
> > > > > > Please see my last comment, thanks!
> > > > > >
> > > > > > > taskManager.log contains lots of important information like
> init
> > > log.
> > > > > It
> > > > > > > will be rolled quickly if we redirect out and error here.
> > > > > >
> > > > > > IIUC, this issue isn't caused by System.out, and it can happen if
> > > user
> > > > > > call a lot of LOG.info. As I mentioned before: the user didn't
> > really
> > > > > want
> > > > > > to log into taskmanager.out, it just happened by accident.
> > > > > > So, if users change the System.out to LOG.info, it still happen.
> > > > > >
> > > > > > > 2. Since we have redirected to LOG mode, Could we also log the
> > > > subtask
> > > > > > info
> > > > > > > ? It may help us to debug granularly.
> > > > > >
> > > > > > I'm not sure what `log the subtask info` means. Let me confirm
> with
> > > you
> > > > > > first.
> > > > > > Do you mean like this: LOG.info("taskName {} : {}", taskName,
> > > > > > userLogContext)?
> > > > > >
> > > > > > Best,
> > > > > > Rui
> > > > > >
> > > > > > On Thu, Nov 9, 2023 at 11:47 AM Hangxiang Yu <
> master...@gmail.com>
> > > > > wrote:
> > > > > >
> > > > > > > Hi, Rui.
> > > > > > > Thanks for the proposal. It sounds reasonable.
> > > > > > > I have some questions, PTAL:
> > > > > > > 1. I have a similar concern as Feng. Will we redirect to
> another
> > > log
> > > > > file
> > > > > > > not taskManager.log ?
> > > > > > > taskManager.log contains lots of important information like
> init
> > > log.
> > > > > It
> > > > > > > will be rolled quickly if we redirect out and error here.
> > > > > > > 2. Since we have redirected to LOG mode, Could we also log the
> > > > subtask
> > > > > > info
> > > > > > > ? It may help us to debug granularly.
> > > > > > >
> > > > > > > On Thu, Nov 9, 2023 at 9:47 AM Feng Jin <jinfeng1...@gmail.com
> >
> > > > wrote:
> > > > > > >
> > > > > > > > Hi, Rui.
> > > > > > > >
> > > > > > > > Thank you for initiating this proposal.
> > > > > > > >
> > > > > > > > I have a question regarding redirecting stdout and stderr to
> > LOG:
> > > > > > > >
> > > > > > > > Will they be written to the taskManager.log file by default
> or
> > > the
> > > > > > > > taskManager.out file?
> > > > > > > > If we can make taskmanager.out splittable and rolling, would
> it
> > > be
> > > > > > easier
> > > > > > > > for users to use this feature?
> > > > > > > >
> > > > > > > > Best,
> > > > > > > > Feng
> > > > > > > >
> > > > > > > > On Thu, Nov 9, 2023 at 3:15 AM Archit Goyal
> > > > > > <argo...@linkedin.com.invalid
> > > > > > > >
> > > > > > > > wrote:
> > > > > > > >
> > > > > > > > > Hi Rui,
> > > > > > > > >
> > > > > > > > > Thanks for the proposal.
> > > > > > > > >
> > > > > > > > > The proposed solution of supporting System out and err to
> be
> > > > > > redirected
> > > > > > > > to
> > > > > > > > > LOG or discarded and introducing an enum and two options to
> > > > manage
> > > > > > > this,
> > > > > > > > > seems reasonable.
> > > > > > > > >
> > > > > > > > > +1
> > > > > > > > >
> > > > > > > > > Thanks,
> > > > > > > > > Archit Goyal
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > From: Piotr Nowojski <pnowoj...@apache.org>
> > > > > > > > > Date: Wednesday, November 8, 2023 at 7:38 AM
> > > > > > > > > To: dev@flink.apache.org <dev@flink.apache.org>
> > > > > > > > > Subject: Re: [DISCUSS] FLIP-390: Support System out and err
> > to
> > > be
> > > > > > > > > redirected to LOG or discarded
> > > > > > > > > Hi Rui,
> > > > > > > > >
> > > > > > > > > Thanks for the proposal.
> > > > > > > > >
> > > > > > > > > +1 I don't have any major comments :)
> > > > > > > > >
> > > > > > > > > One nit. In `SystemOutRedirectToLog` in this code:
> > > > > > > > >
> > > > > > > > >            System.arraycopy(buf, count -
> > LINE_SEPARATOR_LENGTH,
> > > > > > bytes,
> > > > > > > 0,
> > > > > > > > > LINE_SEPARATOR_LENGTH);
> > > > > > > > >             return Arrays.equals(LINE_SEPARATOR_BYTES,
> bytes)
> > > > > > > > >
> > > > > > > > > Is there a reason why you are suggesting to copy out bytes
> > from
> > > > > `buf`
> > > > > > > to
> > > > > > > > > `bytes`,
> > > > > > > > > instead of using `Arrays.equals(int[] a, int aFromIndex,
> int
> > > > > > aToIndex,
> > > > > > > > > int[] b, int bFromIndex, int bToIndex)`?
> > > > > > > > >
> > > > > > > > > Best,
> > > > > > > > > Piotrek
> > > > > > > > >
> > > > > > > > > śr., 8 lis 2023 o 11:53 Rui Fan <1996fan...@gmail.com>
> > > > napisał(a):
> > > > > > > > >
> > > > > > > > > > Hi all!
> > > > > > > > > >
> > > > > > > > > > I would like to start a discussion of FLIP-390: Support
> > > System
> > > > > out
> > > > > > > and
> > > > > > > > > err
> > > > > > > > > > to be redirected to LOG or discarded[1].
> > > > > > > > > >
> > > > > > > > > > In various production environments, either cloud native
> or
> > > > > physical
> > > > > > > > > > machines, the disk space that Flink TaskManager can use
> is
> > > > > limited.
> > > > > > > > > >
> > > > > > > > > > In general, the flink users shouldn't use the
> > > > > `System.out.println`
> > > > > > in
> > > > > > > > > > production,
> > > > > > > > > > however this may happen when the number of Flink jobs and
> > job
> > > > > > > > developers
> > > > > > > > > > is very large. Flink job may use System.out to output a
> > large
> > > > > > amount
> > > > > > > of
> > > > > > > > > > data
> > > > > > > > > > to the taskmanager.out file. This file will not roll, it
> > will
> > > > > > always
> > > > > > > > > > increment.
> > > > > > > > > > Eventually the upper limit of what the TM can be used for
> > is
> > > > > > reached.
> > > > > > > > > >
> > > > > > > > > > We can support System out and err to be redirected to LOG
> > or
> > > > > > > discarded,
> > > > > > > > > > the LOG can roll and won't increment forever.
> > > > > > > > > >
> > > > > > > > > > This feature is useful for SREs who maintain Flink
> > > > environments,
> > > > > > they
> > > > > > > > can
> > > > > > > > > > redirect System.out to LOG by default. Although the cause
> > of
> > > > this
> > > > > > > > problem
> > > > > > > > > > is
> > > > > > > > > > that the user's code is not standardized, for SRE,
> pushing
> > > > users
> > > > > to
> > > > > > > > > modify
> > > > > > > > > > the code one by one is usually a very time-consuming
> > > operation.
> > > > > > It's
> > > > > > > > also
> > > > > > > > > > useful for job stability where System.out is accidentally
> > > used.
> > > > > > > > > >
> > > > > > > > > > Looking forward to your feedback, thanks~
> > > > > > > > > >
> > > > > > > > > > [1]
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Fcwiki.apache.org%2Fconfluence%2Fx%2F4guZE&data=05%7C01%7Cargoyal%40linkedin.com%7C937821de7bd846e6b97408dbe070beae%7C72f988bf86f141af91ab2d7cd011db47%7C0%7C0%7C638350547072823674%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=zEv6B0Xiq2SNuU6Fm%2BAXnH%2BRILbm6Q0uDRbN7h6iaPM%3D&reserved=0
> > > > > > > > > <https://cwiki.apache.org/confluence/x/4guZE>
> > > > > > > > > >
> > > > > > > > > > Best,
> > > > > > > > > > Rui
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > --
> > > > > > > Best,
> > > > > > > Hangxiang.
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> >
> > --
> > Best,
> > Hangxiang.
> >
>


-- 
Best,
Hangxiang.

Reply via email to