Re: Flink SQL job metric names are too long and exhaust Prometheus memory

2023-06-15 Post by daniel sun
Unsubscribe



Re: Flink SQL job metric names are too long and exhaust Prometheus memory

2023-06-15 Post by im huzi
Unsubscribe



Re: Re: Flink SQL job metric names are too long and exhaust Prometheus memory

2023-06-14 Post by Feng Jin
After setting that option, the task name is simplified as well.


Best,
Feng


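Re: Re: Flink SQL job metric names are too long and exhaust Prometheus memory

2023-06-13 Post by casel.chen
Thanks. Besides the operator name, I see that the task name generated for a Flink SQL job is also very long. Is there currently a way to simplify it? For example: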


flink_taskmanager_job_task_operator_fetch_total{job_id="4c24ce399f369ba2b7ae5ce51ec034d3",task_id="5c4ca2fea30dcf09bf3ee40c495fe808",task_attempt_id="5110227bf582bd21ecf6102625fadc16",host="172_19_197_35",operator_id="5c4ca2fea30dcf09bf3ee40c495fe808",operator_name="Source:_TableSourceScan_table___hive__default__top_trans_orderfields__acc_sp",task_name="Source:_TableSourceScan_table___hive__default__top_trans_orderfields__acc_split_bunch__acct_code__acct_fee_date__acct_finish_time__acct_id__acct_message__acct_stat__acct_trans_date__acct_trans_id__acqr_inst_id__actual_pay_channel__actual_pay_channel_sub_mer_id__agent_id__area_info__atu_sub_mer_id__auth_flag__auth_no__bagent_id__bagent_name__bank_date__bank_id__bank_mer_id__bank_mer_name__bank_name__bank_resp_code__bank_resp_desc__bank_seq_id__bank_term_id__bank_type__batch_id__busscode__busstype__card_bank_id__card_channel_type__card_sign__cash_req_date__cash_resp_code__cash_resp_desc__cash_trans_id__cashier_amt__cashier_version__channel_code__channel_finish_time__channel_message__channel_stat__channel_type__check_cash_date__check_cash_flag__chk_time__close_trans_stat__cloud_pay__correct_stat__create_time__creator__credit_fee_amt__credit_type__db_unit__dc_response__dc_type__debit_fee_amt__debit_fee_formula__dev_type__devs_id__discount_amt__div_info__double_exempt__double_limit_amt__fee_acct_id__fee_allowance_flag__fee_amt__fee_flag__fee_formula__fee_huifu_id__fee_member_id__fee_real_acct_id__fee_real_cust_id__fee_rec_type__fee_source__fee_split_type__fq_fee_amt__fq_mer_discount_flag__fq_ref_fee_amt__gate_id__goods_desc__helipay_fee_account_amt__helipay_fee_rate__hf_seq_id__huifu_id__icc_data__id__is_acct_div__is_acct_div_param__is_delay_acct__is_deleted__is_route__iss_inst_id__labels__lc__market_flag__maze_bg_date__maze_bg_seq_id__maze_pnr_dev_id__maze_resp_code__maze_resp_desc__mcc__mer_info__mer_name__mer_oper_id__mer_ord_id__mer_priv__modifier__modify_time__mypaytsf_discount__network__oper_type__ord_amt__ord_id__org_acct_id__org_auth_code__org_auth_no__org_huifu_seq_id__org_ord_id__org_trans_date__out_ord_id__out_trans_id__pa_mer_id__pa_product_id__pa_trans_id__par__party_order_id__pay_amt__pay_card_id__pay_card_id_enc__pay_channel__pay_channel_id__pay_scene__pay_type__pnr_dev_id__pos_mer_id__pos_mer_name__pos_term_id__posp_seq_id__product_id__promotion_detail__real_acct_id__real_cust_id__real_gate_id__real_pay_type__ref_amt__ref_cnt__ref_fee_amt__ref_num__region_id__remark__req_date__req_seq_id__route_mer_id__route_region_id__route_terminal_id__send_time__settle_amt__settle_trans_stat__shop_name__sn_code__source_region_id__subsidy_amt__subsidy_ref_amt__subsidy_stat__sys_id__sys_trace_audit_num__term_batch_id__term_div_coupon_type__time_expire__trans_close_notify_url__trans_date__trans_finish_time__trans_notify_url__trans_stat__trans_type__un_scene_info__unconfirm_amt__unconfirm_fee_amt__version__Calc_select__huifu_id__trans_dateUTF_16LE_top:base:channel_merch_product_relationhuifu_idUTF_16LE__product_id__AS__f184___whereproduct_id_UTF_16LE_MCS_:VARCHAR_2147483647__CHARACTER_SET__UTF_16LE___AND__trans_statUTF_16LE_S_:VARCHAR_2147483647__CHARACTER_SET__UTF_16LE_LookupJoin_table__hive_default_redis_dim_channel_merch_product_relation___joinType__InnerJoin___async__false___lookup__product_id__f184___where___channel_id_UTF_16LE__:VARCHAR_2147483647__CHARACTER_SET__UTF_16LE_select__huifu_id__trans_date___f184__product_id__channel_id__Calc_select__trans_date__channel_id__UTF_16LE_top:base:org_infohuifu_id__AS__f189__LookupJoin_table__hive_default_redis_dim_org_info___joinType__LeftOuterJoin___async__false___lookup__org_cust_id__f189___select__trans_date__channel_id___f189__org_cust_id__huifu_fst_org__huifu_sec_org__huifu_thd_org__huifu_for_org__huifu_sales_sub__Calc_select__trans_date_AS_transDate__channel_id_AS_serviceId__CASE__huifu_fst_org_IS_NULL_OR__TRIM_FLAG_BOTHUTF_16LE_huifu_fst_org_UTF_16LE__:VARCHAR_2147483647__CHARACTER_SET__UTF_16LE___OR__TRIM_FLAG_BOTHUTF_16LE_huifu_fst_org_UTF_16LE_null_:VARCHAR_2147483647__CHARACTER_SET__UTF_16LE__UTF_16LE_defalut_:VARCHAR_2147483647__CHARACTER_SET__UTF_16LE___TRIM_FLAG_BOTHUTF_16LE_huifu_fst_org___AS_huifuFstOrg__CASE__huifu_sec_org_IS_NULL_OR__TRIM_FLAG_BOTHUTF_16LE_huifu_sec_org_UTF_16LE__:VARCHAR_2147483647__CHARACTER_SET__UTF_16LE___OR__TRIM_FLAG_BOTHUTF_16LE_huifu_sec_org_UTF_16LE_null_:VARCHAR_2147483647__CHARACTER_SET__UTF_16LE__UTF_16LE_defalut_:VARCHAR_2147483647__CHARACTER_SET__UTF_16LE___TRIM_FLAG_BOTHUTF_16LE_huifu_sec_org___AS_huifuSecOrg__CASE__huifu_thd_org_IS_NULL_OR__TRIM_FLAG_BOTHUTF_16LE_huifu_thd_org_UTF_16LE__:VARCHAR_2147483647__CHARACTER_SET__UTF_16LE___OR__TRIM_FLAG_BOTHUTF_16LE_huifu_thd_org_UTF_16LE_null_:VARCHAR_2147483647__CHARACTE
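
To see which labels account for the size of a sample like the one above, a small script can measure each label value. This is only a rough sketch: it assumes the input is a single line in the Prometheus text exposition format, and the function names `label_lengths` and `longest_labels` are made up for illustration.

```python
import re

# Matches name="value" pairs; values may contain backslash-escaped characters.
LABEL_RE = re.compile(r'(\w+)="((?:[^"\\]|\\.)*)"')


def label_lengths(sample_line):
    """Return {label name: length of its value} for one exposition-format line."""
    start = sample_line.find("{")
    end = sample_line.rfind("}")
    if start == -1 or end == -1:
        return {}  # the sample carries no labels
    pairs = LABEL_RE.findall(sample_line[start + 1:end])
    return {name: len(value) for name, value in pairs}


def longest_labels(sample_line, top=3):
    """Return the `top` (label name, length) pairs, longest first."""
    lengths = label_lengths(sample_line)
    return sorted(lengths.items(), key=lambda kv: kv[1], reverse=True)[:top]
```

Running `longest_labels` over a few lines copied from a job's metrics output should show whether `task_name`, `operator_name`, or some other label dominates.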

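Re: Flink SQL job metric names are too long and exhaust Prometheus memory

2023-06-12 Post by Feng Jin
Hi casel,

1. Consider using Flink 1.15, which can generate simplified operator names:

https://nightlies.apache.org/flink/flink-docs-release-1.15/docs/dev/table/config/#table-exec-simplify-operator-name-enabled

2. If you do not need historical metrics, Flink also provides a REST API for fetching instantaneous metric values directly: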

https://nightlies.apache.org/flink/flink-docs-master/docs/ops/rest_api/#jobmanager-metrics


Best,
Feng
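
A rough sketch of option 2 using only the Python standard library. It assumes the JobManager REST endpoint is reachable at a base URL such as `http://localhost:8081`, and that job-level metrics are served from `/jobs/<jobid>/metrics?get=...` as a JSON array of `{"id", "value"}` objects, which is how I understand the REST API documentation linked above; verify this against your Flink version. The function names and the metric names in the usage note are illustrative only.

```python
import json
from urllib.parse import quote, urlencode
from urllib.request import urlopen


def job_metrics_url(base_url, job_id, metric_names):
    """Build the URL for the job-level metrics endpoint (assumed path)."""
    # safe="," keeps the commas that separate the requested metric names.
    query = urlencode({"get": ",".join(metric_names)}, safe=",")
    return f"{base_url.rstrip('/')}/jobs/{quote(job_id)}/metrics?{query}"


def parse_metrics(body):
    """Turn a JSON array of {"id": ..., "value": ...} objects into a dict."""
    return {entry["id"]: entry["value"] for entry in json.loads(body)}


def fetch_job_metrics(base_url, job_id, metric_names, timeout=5.0):
    """Fetch the current values once; nothing is stored as a time series."""
    url = job_metrics_url(base_url, job_id, metric_names)
    with urlopen(url, timeout=timeout) as resp:
        return parse_metrics(resp.read().decode("utf-8"))
```

For example, `fetch_job_metrics("http://localhost:8081", "4c24ce399f369ba2b7ae5ce51ec034d3", ["uptime", "numRestarts"])` would return a dict of the current values, provided those metrics exist for the job. Because only the requested metrics are read on demand, the long `task_name` and `operator_name` labels never reach Prometheus.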



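Flink SQL job metric names are too long and exhaust Prometheus memory

2023-06-12 Post by casel.chen
We run more than 200 Flink SQL jobs in production. After we connected them to Prometheus monitoring (Prometheus scrapes the job metrics periodically), Prometheus ran out of memory within a short time, even though it had been given 64 GB. On investigation, the cause turned out to be overly long metric names.
A Flink SQL job's metric name is generally composed of the job name plus the operator name, and the operator name is assembled from the SQL text, so a query that selects many fields or is otherwise complex easily produces a very long name.
Is there a good way to solve this problem?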