Hi,
I am trying to use Druid to store detailed data, with compute engines such as
Presto and Spark pulling data directly from Druid to run SQL queries. With
Druid's columnar storage and inverted indexes, I expect this to give good query
performance while still supporting real-time data writes.
Of course, there may be some problems. Here are the challenges I have
come up with so far.
1. Every row requires a time attribute
For dimension tables, the __time column can be given a default constant
value; fact tables generally already have a time column.
2. Index overhead
For a string column that does not need an index, building the inverted
index can be disabled, which reduces the index creation overhead.
3. Data types
I think finer-grained data types would have several advantages:
1. They would be friendlier to users migrating from other
databases.
2. Appropriate data types can reduce storage space and, indirectly,
speed up computation.
That said, I think the current set of data types may already be sufficient.
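
To make points 1 and 2 concrete, here is a rough sketch of the relevant part of an ingestion spec, built as a Python dict. It relies on two Druid options as I understand them: `missingValue` in `timestampSpec` (used when the timestamp column is null or missing) and `createBitmapIndex` on string dimensions. The datasource name, column names, and the constant timestamp are hypothetical, and this is only the `dataSchema` fragment, not a complete spec.

```python
import json


def dimension_table_schema(datasource, constant_time="2000-01-01T00:00:00Z"):
    """Build a dataSchema fragment for a table with no natural time column."""
    return {
        "dataSource": datasource,
        "timestampSpec": {
            # Hypothetical column name that is absent from the input, so
            # every row falls back to the constant `missingValue` (point 1).
            "column": "no_such_column",
            "format": "auto",
            "missingValue": constant_time,
        },
        "dimensionsSpec": {
            "dimensions": [
                # Typed numeric dimension (relates to point 3).
                {"type": "long", "name": "user_id"},
                # String column that keeps its default inverted index.
                {"type": "string", "name": "country"},
                # String column that is never filtered on: skip the
                # inverted (bitmap) index to save build cost (point 2).
                {"type": "string", "name": "comment", "createBitmapIndex": False},
            ]
        },
        # Keep every input row as-is, since the goal is detailed data.
        "granularitySpec": {"queryGranularity": "none", "rollup": False},
    }


if __name__ == "__main__":
    print(json.dumps(dimension_table_schema("dim_users"), indent=2))
```

Whether these options behave this way on a given Druid version would need to be checked against its ingestion spec documentation.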
I am very much looking forward to your views on this solution. Is it feasible?
Thanks