Update:

StarRocks [1] is a next-gen, sub-second MPP database for full analysis
scenarios, including multi-dimensional analytics, real-time analytics, and
ad-hoc queries. Their team is planning to integrate Iceberg tables as
StarRocks external tables in the next month [2], so that people can
connect the data lake and the StarRocks warehouse in the same engine.
The excellent performance of StarRocks will also help accelerate analysis
of and access to Iceberg tables. I think this is a great thing for both
the Iceberg community and the StarRocks community. Could we add an extra
project for the StarRocks integration work to the Apache Iceberg
roadmap [3]?

[1].  https://github.com/StarRocks/starrocks
[2].  https://github.com/StarRocks/starrocks/issues/1030
[3].  https://github.com/apache/iceberg/projects

On Mon, Nov 1, 2021 at 11:52 PM Ryan Blue <b...@tabular.io> wrote:

> I closed the upgrade project and marked the FLIP-27 project priority 1.
> Thanks for all the work to get this done!
>
> On Sun, Oct 31, 2021 at 8:10 PM OpenInx <open...@gmail.com> wrote:
>
>> Update:
>>
>> I think the project [Flink: Upgrade to 1.13.2][1] on the roadmap can be
>> closed now, because all of its issues have been addressed.
>>
>> [1]. https://github.com/apache/iceberg/projects/12
>>
>> On Tue, Sep 21, 2021 at 6:17 PM Eduard Tudenhoefner <edu...@dremio.com>
>> wrote:
>>
>>> I created a Roadmap section in
>>> https://github.com/apache/iceberg/pull/3163 that links to the
>>> planning boards that Jack created. I figured it makes sense to link
>>> available design docs directly on those boards (as was already done),
>>> because then the design docs are closer to the set of related issues.
>>>
>>> On Mon, Sep 20, 2021 at 10:02 PM Ryan Blue <b...@tabular.io> wrote:
>>>
>>>> Thanks, Jack!
>>>>
>>>> Eduard, I think that's a good idea. We should have a roadmap page as
>>>> well that links to the projects that Jack just created.
>>>>
>>>> On Mon, Sep 20, 2021 at 12:57 PM Jack Ye <yezhao...@gmail.com> wrote:
>>>>
>>>>> It seems like we have reached some consensus around the projects
>>>>> listed here. I have created corresponding GitHub projects for each:
>>>>> https://github.com/apache/iceberg/projects
>>>>>
>>>>> Related design docs are also linked there.
>>>>>
>>>>> Best,
>>>>> Jack Ye
>>>>>
>>>>> On Sun, Sep 19, 2021 at 11:18 PM Eduard Tudenhoefner <
>>>>> edu...@dremio.com> wrote:
>>>>>
>>>>>> Would it make sense to have a section on the website where we collect
>>>>>> all the links to the design docs/specs? That would make them easier to
>>>>>> find than searching for things on the ML.
>>>>>>
>>>>>> I was thinking of something like this for each component:
>>>>>> * link to the ML discussion
>>>>>> * link to the actual Spec/Design Doc
>>>>>>
>>>>>> Thoughts?
>>>>>>
>>>>>> On Fri, Sep 10, 2021 at 11:38 PM Ryan Blue <b...@tabular.io> wrote:
>>>>>>
>>>>>>> Hi everyone,
>>>>>>>
>>>>>>> At the last sync meeting, we brought up publishing a community
>>>>>>> roadmap and brainstormed the many features and initiatives that the
>>>>>>> community is working on. In this thread, I want to make sure that we
>>>>>>> have a good list of what people are thinking about, and I think we
>>>>>>> should try to categorize the projects by size and general priority.
>>>>>>> When we reach a rough agreement, I’ll write this up and post it on the
>>>>>>> ASF site along with links to some projects in GitHub.
>>>>>>>
>>>>>>> My rationale for attempting to prioritize projects is that if we try
>>>>>>> to do too many things, we will make slower progress across everything
>>>>>>> rather than getting a few important items done. I know that priorities
>>>>>>> don’t align very cleanly in practice, but it is hopefully worth trying.
>>>>>>> To come up with a priority, I’m trying to keep top priority items to a
>>>>>>> minimum by including only one from each group (Spark, Flink, Python,
>>>>>>> etc.). The remaining items are split between priority 2 and 3.
>>>>>>> Priority 3 is not urgent, including things that can be plugged in
>>>>>>> (like other IO libraries), docs, etc. Everything else is priority 2.
>>>>>>>
>>>>>>> That something isn’t priority 1 doesn’t mean it isn’t important or
>>>>>>> progressing, just that it isn’t the current focus. I think of it this
>>>>>>> way: if someone has extra time to review something, what should be
>>>>>>> next? That’s top priority.
>>>>>>>
>>>>>>> Here’s my rough categorization. If you disagree, please speak up:
>>>>>>>
>>>>>>>    - If you think that something should be top priority, what gets
>>>>>>>    moved to priority 2?
>>>>>>>    - Should the priority for a project in 2 or 3 change?
>>>>>>>    - Is the S/M/L size of a project wrong?
>>>>>>>
>>>>>>> Top priority, 1:
>>>>>>>
>>>>>>>    - API: Iceberg 1.0 [medium]
>>>>>>>    - Spark: Merge-on-read plans [large]
>>>>>>>    - Maintenance: Delete file compaction [medium]
>>>>>>>    - Flink: Upgrade to 1.13.2 (document compatibility) [medium]
>>>>>>>    - Python: Pythonic refactor [medium]
>>>>>>>
>>>>>>> Priority 2:
>>>>>>>
>>>>>>>    - ORC: Support delete files stored as ORC [small]
>>>>>>>    - Spark: DSv2 streaming improvements [small]
>>>>>>>    - Flink: Inline file compaction [small]
>>>>>>>    - Flink: Support UPSERT [small]
>>>>>>>    - Views: Spec [medium]
>>>>>>>    - Spec: Z-ordering / Space-filling curves [medium]
>>>>>>>    - Spec: Snapshot tagging and branching [small]
>>>>>>>    - Spec: Secondary indexes [large]
>>>>>>>    - Spec v3: Encryption [large]
>>>>>>>    - Spec v3: Relative paths [large]
>>>>>>>    - Spec v3: Default field values [medium]
>>>>>>>
>>>>>>> Priority 3:
>>>>>>>
>>>>>>>    - Docs: versioned docs [medium]
>>>>>>>    - IO: Support Aliyun OSS/DLF [medium]
>>>>>>>    - IO: Support Dell ECS [medium]
>>>>>>>
>>>>>>> External:
>>>>>>>
>>>>>>>    - Trino: Bucketed joins [small]
>>>>>>>    - Trino: Row-level delete support [medium]
>>>>>>>    - Trino: Merge-on-read plans [medium]
>>>>>>>    - Trino: Multi-catalog support [small]
>>>>>>>
>>>>>>> --
>>>>>>> Ryan Blue
>>>>>>> Tabular
>>>>>>>
>>>>>>
>>>>
>>>> --
>>>> Ryan Blue
>>>> Tabular
>>>>
>>>
>
> --
> Ryan Blue
> Tabular
>
