Re: [DISCUSS] CALCITE-3661, CALCITE-3665, MaterializationTest vs HR schema statistics

2020-01-04 Thread Vladimir Sitnikov
Jin>In ReflectiveSchema, Statistics of FieldTable is given as UNKNOWN[1][2]. Please check [CALCITE-3661] Derive rowCount statistics for tables in ReflectiveSchema that are based on arrays/collections and [CALCITE-3680] Add ability to express unique constraints in ReflectiveSchema commits in

Re: [DISCUSS] CALCITE-3661, CALCITE-3665, MaterializationTest vs HR schema statistics

2020-01-04 Thread XING JIN
Hi, Vladimir ~ In ReflectiveSchema, the Statistics of FieldTable is given as UNKNOWN[1][2]. When reading a table's row count, if no statistics are given, a default value of 100 is returned [3] -- this is relatively large compared with the sizes of the fields defined in HRFKUKSchema. When a materialized

[DISCUSS] CALCITE-3661, CALCITE-3665, MaterializationTest vs HR schema statistics

2020-01-03 Thread Vladimir Sitnikov
Hi, It looks like MaterializationTest heavily relies on inaccurate statistics for the hr.emps and hr.depts tables. I was trying to improve statistics estimation for better join planning (see https://github.com/apache/calcite/pull/1712), and it looks like better estimates open the eyes of the