Hi team,
I am trying to learn about Hive's cost-based optimizer (CBO) because I need to do some performance tuning for my ETL job. I found the Confluence doc below, but I am not sure whether it is the latest version. Can anyone confirm that?

https://cwiki.apache.org/confluence/display/Hive/Cost-based+optimization+in+Hive

Another question: we have developed a UDTF to help us parse logs, used like this:

select my-udtf(log) as (id, name, time) from tb_log

Do you have any better ideas for this scenario?

BTW, the version of Hive we use is above 3.0, and our data volume grows at the PB level every day.

Thanks in advance,
Samuel