Gidon, I think that the v3 part of encryption is really about documenting
how it works and adding that to the spec. Right now we have hooks for
building some encryption, but almost no requirements in the spec for how
to use them across implementations. This is fine while we're working on
defining encryption, but we eventually want to update the spec.

Jack, I'm happy to add the external PrestoDB items to the roadmap. I'm just
not quite sure what to do here since we aren't tracking them in the Iceberg
community ourselves. I listed those as external so that we can publish
links to where those are tracked in other communities. We can add as many
of these as we want.

Anton, I agree. The goal here is to identify the top priority items to help
direct review effort. We want everything to continue progressing, but I
think it's good to identify where we as a community want to focus review
time.

It sounds like one area of uncertainty is FLIP-27 vs Flink 1.13.2. Can
someone summarize the status of Flink and what we need? I don't think I
understand it well enough to suggest which one takes priority.

Ryan

On Mon, Sep 13, 2021 at 7:54 PM Anton Okolnychyi
<aokolnyc...@apple.com.invalid> wrote:

> The discussed roadmap makes sense to me. I think it is important to agree
> on what we should do first as the review pool is limited. There are more
> and more large items that are half done or half discussed. I think we
> had better focus on finishing them quickly and then move on to something
> else, as opposed to making very minor progress on a number of issues.
>
> To be clear, it is not like other things are not important or we should
> stop their development. It is more about making sure certain high-priority
> features for most folks in the community get enough attention.
>
> - Anton
>
> On 13 Sep 2021, at 12:19, Jack Ye <yezhao...@gmail.com> wrote:
>
> I'd like to also propose adding the following in the external section:
> 1. the PrestoDB equivalent for each item listed for Trino. I am not sure
> what the best way to track them is, but I feel it's better to list and
> track them separately. I have talked with the people currently maintaining
> the PrestoDB Iceberg connector (mostly at Twitter), and they would like to
> take a different route from Trino to fully remove Hive dependencies in the
> connector. This means the 2 connectors will likely diverge in
> implementation in the near future.
> 2. adding a medium item for Trino and PrestoDB Avro support
> 3. adding a small item for Trino and PrestoDB full system table support
> (the system table schemas in them are diverging from core and are missing
> a few of the latest system tables)
>
> For the items listed with "Spec" and "Spec v3", what are the key
> differences? I thought we were treating any new spec changes after the
> format v2 vote as v3.
>
> Best,
> Jack Ye
>
> On Mon, Sep 13, 2021 at 7:13 AM Gidon Gershinsky <gg5...@gmail.com> wrote:
>
>> Hi Ryan,
>>
>> I just wonder whether encryption should be a Spec v3 category. We have the
>> key_metadata fields in both data_file and manifest_file structs, which
>> might be sufficient for reasonable basic encryption support.
>> But I certainly agree this is an L-sized project.
>>
>> Cheers, Gidon
>>
>>
>> On Sat, Sep 11, 2021 at 12:38 AM Ryan Blue <b...@tabular.io> wrote:
>>
>>> Hi everyone,
>>>
>>> At the last sync meeting, we brought up publishing a community roadmap
>>> and brainstormed the many features and initiatives that the community is
>>> working on. In this thread, I want to make sure that we have a good list of
>>> what people are thinking about and I think we should try to categorize the
>>> projects by size and general priority. When we reach a rough agreement,
>>> I’ll write this up and post it on the ASF site along with links to some
>>> projects in GitHub.
>>>
>>> My rationale for attempting to prioritize projects is that if we try to
>>> do too many things, we will make slower progress across everything rather
>>> than getting a few important items done. I know that priorities don’t align
>>> very cleanly in practice, but it is hopefully worth trying. To come up with
>>> a priority, I’m trying to keep top priority items to a minimum by including
>>> only one from each group (Spark, Flink, Python, etc.). The remaining items
>>> are split between priority 2 and 3. Priority 3 is not urgent, including
>>> things that can be plugged in (like other IO libraries), docs, etc.
>>> Everything else is priority 2.
>>>
>>> That something isn’t priority 1 doesn’t mean it isn’t important or
>>> progressing, just that it isn’t the current focus. I think of it this way:
>>> if someone has extra time to review something, what should be next? That’s
>>> top priority.
>>>
>>> Here’s my rough categorization. If you disagree, please speak up:
>>>
>>>    - If you think that something should be top priority, what gets
>>>    moved to priority 2?
>>>    - Should the priority for a project in 2 or 3 change?
>>>    - Is the S/M/L size of a project wrong?
>>>
>>> Top priority, 1:
>>>
>>>    - API: Iceberg 1.0 [medium]
>>>    - Spark: Merge-on-read plans [large]
>>>    - Maintenance: Delete file compaction [medium]
>>>    - Flink: Upgrade to 1.13.2 (document compatibility) [medium]
>>>    - Python: Pythonic refactor [medium]
>>>
>>> Priority 2:
>>>
>>>    - ORC: Support delete files stored as ORC [small]
>>>    - Spark: DSv2 streaming improvements [small]
>>>    - Flink: Inline file compaction [small]
>>>    - Flink: Support UPSERT [small]
>>>    - Views: Spec [medium]
>>>    - Spec: Z-ordering / Space-filling curves [medium]
>>>    - Spec: Snapshot tagging and branching [small]
>>>    - Spec: Secondary indexes [large]
>>>    - Spec v3: Encryption [large]
>>>    - Spec v3: Relative paths [large]
>>>    - Spec v3: Default field values [medium]
>>>
>>> Priority 3:
>>>
>>>    - Docs: versioned docs [medium]
>>>    - IO: Support Aliyun OSS/DLF [medium]
>>>    - IO: Support Dell ECS [medium]
>>>
>>> External:
>>>
>>>    - Trino: Bucketed joins [small]
>>>    - Trino: Row-level delete support [medium]
>>>    - Trino: Merge-on-read plans [medium]
>>>    - Trino: Multi-catalog support [small]
>>>
>>> --
>>> Ryan Blue
>>> Tabular
>>>
>>
>

-- 
Ryan Blue
Tabular
