I think this is a promising idea. Here are the specific requirements our
company foresees for this feature:

1. **Transition to Paimon Tables from Hive ODS Tables**: Our current system
has a large number of Hive ODS tables, partitioned by day. Each partition
holds the complete business data sourced directly from MySQL. We are
considering an in-place transition to Paimon tables, for two reasons. First,
we would not have to modify the SQL in our existing Hive batch processing
jobs. Second, the transition would give us near-real-time data access,
cutting the delay to a few minutes, along with stream reading.

2. **Integration with Historical Hive Partitions**: Hive has long been a core
part of our data platform, and it holds over a thousand partitions. Ideally,
a view table would combine a Paimon table with these historical Hive
partitions: a query against the view would read from the Paimon tag when a
tag exists for the requested partition, and from the historical Hive
partition otherwise.

3. **Tag-Based Processing with 'dt'**: Our tags are keyed on the 'dt' value.
Queries over these tags should therefore support range predicates such as
BETWEEN ... AND, comparisons such as greater than or less than, and GROUP BY
on the tag. For example, the system should handle queries like:
```SQL
-- ods_table, a, and b are placeholders for the table name and the dt bounds.
SELECT dt, COUNT(*) FROM ods_table WHERE dt BETWEEN a AND b GROUP BY dt;
```
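
To make points 2 and 3 more concrete, here is a rough Python sketch of the behaviour we have in mind. It is only an illustration, not a proposal for the actual implementation: the function names (`resolve_source`, `count_by_dt`) and the in-memory dictionaries standing in for Paimon tags and Hive partitions are hypothetical, and it assumes 'dt' values are 'YYYY-MM-DD' strings so that string comparison matches date order.

```python
def resolve_source(dt, paimon_tags, hive_partitions):
    """Pick the backend for one dt: the Paimon tag if present, else Hive."""
    if dt in paimon_tags:
        return "paimon"
    if dt in hive_partitions:
        return "hive"
    return None  # no data for this dt in either backend


def count_by_dt(start, end, paimon_tags, hive_partitions):
    """Emulate: SELECT dt, COUNT(*) ... WHERE dt BETWEEN start AND end GROUP BY dt."""
    counts = {}
    # The view exposes the union of tag names and historical partition names.
    for dt in sorted(set(paimon_tags) | set(hive_partitions)):
        if not (start <= dt <= end):  # BETWEEN is inclusive on both ends
            continue
        source = resolve_source(dt, paimon_tags, hive_partitions)
        rows = paimon_tags[dt] if source == "paimon" else hive_partitions[dt]
        counts[dt] = len(rows)
    return counts


# Hypothetical data: each dt maps to the list of rows stored under it.
paimon_tags = {"2023-08-24": ["r1", "r2", "r3"], "2023-08-25": ["r1", "r2"]}
hive_partitions = {"2023-08-23": ["r1"], "2023-08-24": ["old"]}

result = count_by_dt("2023-08-23", "2023-08-25", paimon_tags, hive_partitions)
```

In this sketch, a 'dt' that exists on both sides (here `2023-08-24`) is served from the Paimon tag, and the older `2023-08-23` falls back to the Hive partition.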

Best,
ZhuoyuChen

On Fri, Aug 25, 2023 at 13:58, Jingsong Li <[email protected]> wrote:

> Hi, devs.
>
> Now, Paimon supports tags, which provide a snapshot view for time travel.
> This can work much like a partitioned table, replacing Hive full
> partitioned tables and incremental partitioned tables.
>
> But this requires users to change their SQL to use time travel, and time
> travel is not easy to use in Hive SQL at the moment.
>
> So, I plan to create a new feature, the view table: we can create a view
> table that maps a non-partitioned table to a partitioned table whose
> partition field is the tag. This feature can make a Paimon table 100%
> compatible with an old Hive table.
>
> What do you think?
>
> Any requirements?
>
> Best,
> Jingsong
>
