Re: [DISCUSS] Variant Spec Location

2024-08-15 Thread Jingsong Li
Thanks all for the discussion. The Apache Paimon community is also considering support for the Variant type, and we certainly hope to stay consistent with Iceberg. Beyond the Paimon community, various compute engines, such as Flink and StarRocks, also need to adapt to this type.

Re: Call for Ryan Blue to Step Down as PMC Chair

2024-06-04 Thread Jingsong Li
Hi, +1 to Jean-Baptiste. I am not a PMC member, but what I see in the Iceberg community is Ryan's dedication to making this community better. He has invested a lot of energy in the community, and I have learned a lot from him. I personally trust Ryan, and I don't think he would do anything that

Re: [VOTE] Release Apache Iceberg 0.10.0 RC4

2020-11-04 Thread Jingsong Li
+1
1. Download the source tarball, signature (.asc), and checksum (.sha512): OK
2. Import gpg keys: download KEYS and run gpg --import /path/to/downloaded/KEYS (optional if this hasn’t changed): OK
3. Verify the signature by running: gpg --verify apache-iceberg-xx.tar.gz.asc: OK
4. Verify the
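The checksum part of the checklist above can be sketched in Python as follows. This is only a minimal illustration, not the project's release tooling: the function names `sha512_of` and `checksum_matches` are hypothetical, and the sketch assumes the `.sha512` file starts with the hex digest (the layout produced by `sha512sum`); other checksum file layouts would need different parsing.

```python
import hashlib
from pathlib import Path


def sha512_of(path, chunk_size=1 << 20):
    """Return the hex SHA-512 digest of a file, reading it in chunks."""
    digest = hashlib.sha512()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def checksum_matches(tarball, checksum_file):
    """Compare a tarball's digest with the one recorded in a checksum file.

    Assumption: the first whitespace-separated token of the checksum
    file is the hex digest.
    """
    tokens = Path(checksum_file).read_text().split()
    if not tokens:
        return False
    return tokens[0].lower() == sha512_of(tarball)
```

Usage would look like `checksum_matches("apache-iceberg-xx.tar.gz", "apache-iceberg-xx.tar.gz.sha512")`, with the file names substituted for the release being verified.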

Re: [VOTE] Release Apache Iceberg 0.10.0 RC2

2020-11-02 Thread Jingsong Li
+1
1. Download the source tarball, signature (.asc), and checksum (.sha512): OK
2. Import gpg keys: download KEYS and run gpg --import /path/to/downloaded/KEYS (optional if this hasn’t changed): OK
3. Verify the signature by running: gpg --verify apache-iceberg-xx-incubating.tar.gz.asc: OK
4.

Re: Subscribe to the iceberg dev mail list

2020-09-23 Thread Jingsong Li
+ @Kun Liu

Best,
Jingsong

On Thu, Sep 24, 2020 at 11:33 AM Junjie Chen wrote:
> Hi Kun
>
> I think you want to send mail to dev-subscr...@iceberg.apache.org.
>
> On Thu, Sep 24, 2020 at 11:25 AM Kun Liu wrote:
>
>> Hi,
>>
>> subscribe to the dev mail list.
>>
>> Best,
>> Kun Liu

Re: [DISCUSS] September board report

2020-09-08 Thread Jingsong Li
+1 Thanks Ryan for reporting.

On Wed, Sep 9, 2020 at 3:59 AM Ryan Blue wrote:
> Hi everyone,
>
> It’s time for our board report, which I think is the last monthly report.
> Here’s what I have so far. Please comment and reply with anything that I’ve
> missed!
>
> rb
> Description:
>
> Apache Ice

Re: Timestamp Based Incremental Reading in Iceberg ...

2020-09-08 Thread Jingsong Li
+1 for keeping timestamps linear. In the implementation, the writer may only need to look at the previous snapshot's timestamp. We are trying to think of Iceberg as a message queue. Take the popular queue Kafka as an example: Iceberg has snapshotId and timestamp; correspondingly, Kafka has offset and
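One way to picture the "writer only looks at the previous snapshot timestamp" idea is the small sketch below. It is an assumption made for illustration, not Iceberg's actual commit logic: the function name `next_snapshot_timestamp` and the choice to bump by one millisecond when the clock has not advanced are both hypothetical.

```python
def next_snapshot_timestamp(previous_ts_ms, wall_clock_ms):
    """Pick a timestamp for a new snapshot so that timestamps stay linear.

    previous_ts_ms: timestamp of the previous snapshot, or None if the
                    table has no snapshot yet.
    wall_clock_ms:  the writer's current clock reading in milliseconds.

    Hypothetical rule: never go backwards; if the clock is not ahead of
    the previous snapshot, use previous + 1 ms instead.
    """
    if previous_ts_ms is None:
        return wall_clock_ms
    return max(wall_clock_ms, previous_ts_ms + 1)
```

With a rule like this, a reader that resumes from a timestamp can treat it much like a position in a log, because a later snapshot never carries an earlier timestamp, even when writers' clocks are skewed.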

Re: [DISCUSS] Rename iceberg-hive module?

2020-08-19 Thread Jingsong Li
+1 for `iceberg-hive-metastore`
I'm confused about `iceberg-hive` and `iceberg-mr`.

Best,
Jingsong

On Thu, Aug 20, 2020 at 9:48 AM Dongjoon Hyun wrote:
> +1 for `iceberg-hive-metastore`.
>
> Maybe, is `Apache Iceberg 1.0.0` a good candidate to have that breaking
> change?
>
> Bests,
> Dongjoon

Re: Effect of enabling 'write.metadata.delete-after-commit.enabled'

2020-07-28 Thread Jingsong Li
Thanks Jungtaek for starting this discussion. Our team wants to ingest data into an Iceberg table at a one-minute frequency. This frequency can also lead to a large number of small files. Auto compaction (rewriting manifest and data files) in the streaming sink (writer) looks wonderful.

Re: New committer: Shardul Mahadik

2020-07-22 Thread Jingsong Li
Congratulations Shardul! Well deserved!

Best,
Jingsong

On Thu, Jul 23, 2020 at 7:27 AM Anton Okolnychyi wrote:
> Congrats and welcome! Keep up the good work!
>
> - Anton
>
> On 22 Jul 2020, at 16:02, RD wrote:
>
> Congratulations Shardul! Well deserved!
>
> -Best,
> R.
>
> On Wed, Jul 22, 2020

Re: question about reader task planning using SupportsReportStatistics

2020-07-17 Thread Jingsong Li
> scan);
>>> }
>>> }
>>>
>>> return tasks;
>>> }
>>>
>>> On Fri, Jul 17, 2020 at 9:35 AM Sud wrote:
>>>
>>>> Thanks @Jingsong for reply
>>>>
>>>> Yes one additio

Re: question about reader task planning & BinPacking

2020-07-17 Thread Jingsong Li
Hi Sud,

A batch read of an Iceberg table should just read the latest snapshot. I think the issue in this case is that your large tables have a large number of manifest files.

1. The simple way is to reduce the number of manifest files:
- To reduce the number of manifest files, you can try `Actions.rewriteManifests` (Than

Re: Iceberg with high frequency data!

2020-07-15 Thread Jingsong Li
Hi Ashish,

Here is my thinking: IIUC, the Spark writer (record writer) also buffers files as Iceberg DataFiles. For every micro-batch, Spark:
- Closes the DataFiles at the end (at least one file per task, if the task has records)
- Collects them on the driver side and does a snapshot commit.

So, you can choose the tr

Re: [VOTE] Release Apache Iceberg 0.9.0 RC5

2020-07-13 Thread Jingsong Li
+1 (non-binding)
- Verified signature and checksum
- Built from source and ran tests
- Validated Spark3: used Ryan's example command and played with Spark3; looks very good.
- Validated vectorized reads: enabled vectorization; works well.

Best,
Jingsong

On Mon, Jul 13, 2020 at 2:37 PM Gautam

The relationship between issues and pull requests

2020-07-01 Thread Jingsong Li
Hi, When I look at the issues, it is difficult for me to determine which ones are already in progress and how they were fixed. To understand an issue, it seems that I need to read the discussion in the issue, then search for the merged pull request. This is not convenient. Can community ma