Re: [VOTE] Vote for formats of printing performance tracing log file:

Jialin Qiao Sun, 28 Jun 2020 05:44:34 -0700

Hi,

One drawback of the one-line format is that the pieces of information, such as 
the number of tsfiles and the average chunk size, are not gathered at the same 
time. The number of tsfiles is known from the very beginning, but the average 
chunk size is available only when the query is done. If the query takes a long 
time, we would get nothing until it finishes.


So the tracing goal is to print the information as soon as we get it. 

Lines from concurrent queries can interleave (e.g., query-1, query-2, query-1), 
which is why we put a query id at the front of each log line.
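
As a minimal sketch of that idea (not the code in the PR; `TraceLog` and its 
method names are hypothetical, and Python is used only for brevity), each item 
is written as its own line, prefixed with the query id, the moment it is known. 
One write call per line under a lock keeps each line intact even when lines 
from different queries interleave:

```python
import io
import threading


class TraceLog:
    """Hypothetical tracer: one prefixed line per item, written immediately."""

    def __init__(self, writer):
        self._writer = writer          # any file-like object with write()/flush()
        self._lock = threading.Lock()  # serializes writes from query threads

    def trace(self, query_id, key, value):
        line = f"Query Id: {query_id} - {key}: {value}\n"
        with self._lock:
            # A single write per line, so a line is never split by another thread.
            self._writer.write(line)
            self._writer.flush()       # visible right away, before the query ends


def run_demo():
    out = io.StringIO()
    log = TraceLog(out)

    def query(query_id):
        log.trace(query_id, "Number of tsfiles", 2)          # known at the start
        log.trace(query_id, "Average size of chunks", 4113)  # known at the end

    threads = [threading.Thread(target=query, args=(i,)) for i in (2, 6)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return out.getvalue().splitlines()
```

The order of lines across queries is not fixed, but every line carries its own 
query id, so it can still be attributed to the right query.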

We could use "grep query-id tracing.log > query-id.log" to get the whole 
tracing log of a particular query. 
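
If grep is not enough for later analysis, the same split can be done in a few 
lines of Python. This is only a sketch, assuming every line follows the 
"Query Id: N - key: value" layout of format [1]; `group_by_query` is a 
hypothetical helper, not part of IoTDB:

```python
import re
from collections import defaultdict

# Assumed line layout, taken from format [1]: "Query Id: <n> - <key>: <value>"
LINE = re.compile(r"^Query Id: (\d+) - ([^:]+): (.*)$")


def group_by_query(lines):
    """Collect interleaved tracing lines into {query_id: {key: value}}."""
    grouped = defaultdict(dict)
    for line in lines:
        match = LINE.match(line.strip())
        if match is None:
            continue  # skip blank or unrelated lines
        query_id, key, value = match.groups()
        grouped[int(query_id)][key] = value
    return dict(grouped)


# Interleaved lines from two concurrent queries, as in the example in this thread.
sample = [
    "Query Id: 2 - Start time: 2020-06-23 20:02:22.205",
    "Query Id: 6 - Start time: 2020-06-23 20:02:22.205",
    "Query Id: 6 - Query Statement: select * from root",
    "Query Id: 2 - Number of tsfiles: 2",
]
result = group_by_query(sample)
```

Here `result[2]` and `result[6]` each hold only the fields of their own query, 
regardless of the order in which the lines were written.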

Thanks,
--
Jialin Qiao
School of Software, Tsinghua University

乔嘉林
清华大学 软件学院

> -----Original Message-----
> From: "Xiangdong Huang" <[email protected]>
> Sent: 2020-06-28 16:44:32 (Sunday)
> To: dev <[email protected]>
> Cc: 
> Subject: Re: [VOTE] Vote for formats of printing performance tracing log file:
> 
> > Will multiple lines lead to messy printing under multiple threads?
> 
> I read the related PR. These lines are written by calling writer.write()
> once and the write method is thread-safe...
> 
> But I would agree with you if putting them on one line makes analysis
> easier.
> 
> Best,
> -----------------------------------
> Xiangdong Huang
> School of Software, Tsinghua University
> 
>  黄向东
> 清华大学 软件学院
> 
> 
> On Sun, Jun 28, 2020 at 4:20 PM, Dawei Liu <[email protected]> wrote:
> 
> > Hi,
> >
> >
> > I have a question. Why not print all the information on the same line?
> > Will multiple lines lead to messy printing under multiple threads?
> >
> >
> > Example:
> >
> >
> > Query Id: 2 - Start time: 2020-06-23 20:02:22.205
> > Query Id: 6 - Start time: 2020-06-23 20:02:22.205
> >
> > Query Id: 6 - Query Statement: select * from root
> > Query Id: 2 - Query Statement: select * from root
> > Query Id: 2 - Number of series paths: 3
> > Query Id: 2 - Number of tsfiles: 2
> > Query Id: 2 - Number of sequence files: 2
> > Query Id: 6 - Number of series paths: 3
> > Query Id: 2 - Number of unsequence files: 0
> > Query Id: 2 - Number of chunks: 3
> > Query Id: 2 - Average size of chunks: 4113
> >
> >
> >
> >
> >
> > Best,
> > —————————————————
> > Dawei Liu
> > On 06/28/2020 15:51, Xiangdong Huang <[email protected]> wrote:
> > I suggest formatting the log so that it can be easily analyzed in the future.
> >
> > Yes, it will be better if we can use `grep` or some other Linux commands
> > to get a structured format.
> >
> > Best,
> > -----------------------------------
> > Xiangdong Huang
> > School of Software, Tsinghua University
> >
> > 黄向东
> > 清华大学 软件学院
> >
> >
> > On Sun, Jun 28, 2020 at 11:31 AM, Jialin Qiao <[email protected]> wrote:
> >
> > Hi,
> >
> > [1] is ok. I suggest formatting the log so that it can be easily analyzed
> > in the future.
> >
> > Thanks,
> > --
> > Jialin Qiao
> > School of Software, Tsinghua University
> >
> > 乔嘉林
> > 清华大学 软件学院
> >
> > -----Original Message-----
> > From: "孙泽嵩" <[email protected]>
> > Sent: 2020-06-24 00:45:23 (Wednesday)
> > To: [email protected]
> > Cc:
> > Subject: Re: [VOTE] Vote for formats of printing performance tracing log file:
> >
> > Hi Xiangwei,
> >
> > I vote for [1] since it’s clearer and better aligned.
> >
> >
> > Best,
> > -----------------------------------
> > Zesong Sun
> > School of Software, Tsinghua University
> >
> > 孙泽嵩
> > 清华大学 软件学院
> >
> > On Jun 23, 2020, at 23:31, Xiangdong Huang <[email protected]> wrote:
> >
> > Hi,
> >
> > Also vote for the first format.
> >
> > By the way, can we add more info, e.g.,
> >
> > - the ratio of reads served from the (metadata and data) caches
> > - the number of chunks for which only the headers are read.
> >
> > Best,
> > -----------------------------------
> > Xiangdong Huang
> > School of Software, Tsinghua University
> >
> > 黄向东
> > 清华大学 软件学院
> >
> >
> > On Tue, Jun 23, 2020 at 9:23 PM, Xiangwei Wei <[email protected]> wrote:
> >
> > Hi, guys
> >
> > Sorry, I put the vote content in another email. I will add it here.
> >
> >
> > Hi, guys
> >
> > I'd like to get a little advice from you.
> >
> > The following are three formats of printing performance tracing log
> > file:
> >
> > [1]
> > -----------------------------
> > Query Id: 2 - Start time: 2020-06-23 20:02:22.205
> > Query Id: 2 - Query Statement: select * from root
> > Query Id: 2 - Number of series paths: 3
> > Query Id: 2 - Number of tsfiles: 2
> > Query Id: 2 - Number of sequence files: 2
> > Query Id: 2 - Number of unsequence files: 0
> > Query Id: 2 - Number of chunks: 3
> > Query Id: 2 - Average size of chunks: 4113
> >
> > [2]
> > -----------------------------
> > Query Id: 2 - Start time: 2020-06-23 19:38:35.109 | Query Statement: select * from root | Number of series paths: 3
> > Query Id: 2 - Number of tsfiles: 2 | Number of sequence files: 2 | Number of unsequence files: 0
> > Query Id: 2 - Number of chunks: 3 | Average size of chunks: 4113
> >
> > [3]
> > -----------------------------
> > Query Id: 2 - Start time: 2020-06-23 19:38:35.109, Query Statement: select * from root, Number of series paths: 3
> > Query Id: 2 - Number of tsfiles: 2, Number of sequence files: 2, Number of unsequence files: 0
> > Query Id: 2 - Number of chunks: 3, Average size of chunks: 4113
> >
> > Which do you prefer? Any other suggestions are OK.
> >
> > Please leave your opinion.
> >
> > On Tue, Jun 23, 2020 at 9:08 PM, 田原 <[email protected]> wrote:
> >
> > Hi,
> >
> > xiangwei
> >
> > I prefer the first one.
> >
> > vote for [1].
> >
> > Best,
> > ---------------
> > Yuan Tian
> >
> >
> >
> >
> >
> >
> > --
> > Best,
> > Xiangwei Wei
> >
> >
> >
> >
