Re: Spark cannot read iceberg tables which were originally written by Impala

2024-01-03 Thread Zoltán Borók-Nagy
also produced correct Parquet files, but that's beyond our control and there's, no doubt, a ton of data already in that format. This could also be part of our v3 work, where I think we intend to add binary to string type promotion to the format.

Re: Spark cannot read iceberg tables which were originally written by Impala

2023-12-26 Thread Zoltán Borók-Nagy
Hey Everyone, Thank you for raising this issue and reaching out to the Impala community. Let me clarify that the problem only happens when there is a legacy Hive table written by Impala, which is then converted to Iceberg. When Impala writes into an Iceberg table there is no problem with

Re: [DISCUSS] Switch to JDK 11 for releases?

2023-04-27 Thread Zoltán Borók-Nagy
javac is not an optimizing compiler and there should not be much difference in the performance of the jars produced by different compilers; these changes might be worthwhile for the project to declare a newer compile-time JDK across all modules, and

Re: Support create table like for Iceberg table?

2023-04-26 Thread Zoltán Borók-Nagy
As a reference, Impala can also do Hive-style CREATE TABLE x LIKE y for Iceberg tables. You can see various examples at https://github.com/apache/impala/blob/master/testdata/workloads/functional-query/queries/QueryTest/iceberg-create-table-like-table.test - Zoltan On Wed, Apr 26, 2023 at 4:10 AM

Re: [DISCUSS] Switch to JDK 11 for releases?

2023-04-24 Thread Zoltán Borók-Nagy
Like Hive, Impala is not compatible with Java 11 right now. This work is in progress: https://issues.apache.org/jira/browse/IMPALA-11360 - Zoltan On Mon, Apr 24, 2023 at 11:07 AM Mass Dosage wrote: I agree with Ryan, unless you can change the source version there's not that much

Re: C++/Rust SDK sync

2023-04-12 Thread Zoltán Borók-Nagy
Hi, I am also interested in the discussion, all those times work for me. Cheers, Zoltan On Wed, Apr 12, 2023 at 4:17 AM Chao Sun wrote: > We are also interested in this discussion. Internally, we have been > working on something similar in Rust, so it'd be great if we can > combine the

Re: Temporal Iceberg Service

2022-09-01 Thread Zoltán Borók-Nagy
Hi Taher, I think most of your questions are answered in the Scan Planning section at the Iceberg spec page: https://iceberg.apache.org/spec/#scan-planning To give you some specific answers as well: Equality Deletes: data and delete files have sequence numbers from which readers can infer the
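The sequence-number rule the reply points at can be sketched briefly. This is a hedged illustration of the scan-planning section of the Iceberg spec, not code from any engine: equality deletes apply to data files with a strictly lower data sequence number (so they cannot delete rows written in the same commit), while position deletes also apply at an equal sequence number.

```python
# Hedged sketch of the delete-file applicability rule from the Iceberg
# spec's scan-planning section, based on data sequence numbers.

def equality_delete_applies(data_seq: int, delete_seq: int) -> bool:
    # Equality deletes apply only to strictly older data files.
    return data_seq < delete_seq

def position_delete_applies(data_seq: int, delete_seq: int) -> bool:
    # Position deletes also apply to data files from the same commit.
    return data_seq <= delete_seq

# A delete file committed at sequence number 5:
assert equality_delete_applies(4, 5)       # older data: applies
assert not equality_delete_applies(5, 5)   # same commit: does not apply
assert position_delete_applies(5, 5)       # same commit: applies
```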

Impala reading V2 tables design doc

2022-07-08 Thread Zoltán Borók-Nagy
Hi Iceberg/Impala Team, We've been working on adding read support for Iceberg V2 tables in Impala. In the first round we're focusing on position deletes. We are thinking about different approaches so I've written a design doc about it:

Re: Matching iceberg data types to Parquet data types

2021-08-27 Thread Zoltán Borók-Nagy
Hi, You can find information on type mappings here: https://iceberg.apache.org/spec/#parquet 1. Iceberg timestamps have microsecond precision. In Parquet they are stored as INT64s with the TIMESTAMP_MICROS annotation. 2. Iceberg limits decimal precision to 38:
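The two mappings mentioned can be illustrated with a short, hedged example (plain Python, no Iceberg library involved): the physical timestamp value is microseconds since the epoch stored as an INT64, and 38 significant digits is the widest decimal the format allows.

```python
# Hedged illustration of the mappings described above; the concrete
# timestamp is an arbitrary example value.
from datetime import datetime, timezone
from decimal import Decimal

ts = datetime(2021, 8, 27, 12, 0, 0, 123456, tzinfo=timezone.utc)
# Physical representation: microseconds since epoch, as an INT64
# (TIMESTAMP_MICROS annotation in Parquet).
micros = int(ts.timestamp()) * 1_000_000 + ts.microsecond
assert micros == 1_630_065_600_123_456

# 38 significant digits is the maximum decimal precision in Iceberg.
max_precision_value = Decimal("9" * 38)
assert len(max_precision_value.as_tuple().digits) == 38
```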

Re: question about the iceberg manifest/manifest list/metadata api

2021-06-08 Thread Zoltán Borók-Nagy
On 05/27/2021 16:54, Zoltán Borók-Nagy wrote: Hi Yong Yang, It is supported by Iceberg, and this is

Re: question about the iceberg manifest/manifest list/metadata api

2021-05-27 Thread Zoltán Borók-Nagy
Hi Yong Yang, It is supported by Iceberg, and this is exactly how Impala is working. I.e. Impala's Parquet writer writes the data files, then we use Iceberg's API to append them to the table. You can find the relevant code here:
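The workflow described (write Parquet files externally, then register them through the Iceberg API) can be sketched with the Iceberg Java API. This is a hedged sketch, not the actual Impala code; the table location, file path, size, and record count are illustrative placeholders.

```java
// Sketch only: register an externally written Parquet file with an
// Iceberg table. All paths and metrics below are made-up placeholders.
import org.apache.hadoop.conf.Configuration;
import org.apache.iceberg.DataFile;
import org.apache.iceberg.DataFiles;
import org.apache.iceberg.PartitionSpec;
import org.apache.iceberg.Table;
import org.apache.iceberg.hadoop.HadoopTables;

public class AppendExample {
    public static void main(String[] args) {
        Table table = new HadoopTables(new Configuration())
            .load("hdfs://nn/warehouse/db/tbl");          // hypothetical location

        DataFile dataFile = DataFiles.builder(PartitionSpec.unpartitioned())
            .withPath("hdfs://nn/warehouse/db/tbl/data/part-00000.parquet")
            .withFormat("PARQUET")
            .withFileSizeInBytes(1024L)                   // placeholder
            .withRecordCount(100L)                        // placeholder
            .build();

        // Appending the file commits a new table snapshot.
        table.newAppend().appendFile(dataFile).commit();
    }
}
```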

Re: Dynamic INSERT OVERWRITE

2021-01-30 Thread Zoltán Borók-Nagy
you want to overwrite a day, you pass a filter for that day. Another way around this problem is to support MERGE INTO, which will detect the files that need to be changed and correctly rewrite them, wherever they are in the table. rb On Fri, Jan 29, 2021 at 10:14 AM

Dynamic INSERT OVERWRITE

2021-01-29 Thread Zoltán Borók-Nagy
Hey everyone, I'm currently working on the INSERT OVERWRITE statement for Iceberg tables in Impala. Seems like ReplacePartitions is the perfect interface for this job: https://github.infra.cloudera.com/CDH/iceberg/blob/cdpd-master/api/src/main/java/org/apache/iceberg/ReplacePartitions.java IIUC
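For reference, a minimal sketch of how the ReplacePartitions interface is used via the Iceberg Java API. This is not Impala's actual code; the `table` and `newFiles` variables are assumed to exist.

```java
// Sketch only: dynamic overwrite of exactly the partitions touched by
// the new data files; untouched partitions are left as-is.
ReplacePartitions overwrite = table.newReplacePartitions();
for (DataFile dataFile : newFiles) {
    overwrite.addFile(dataFile);   // replaces this file's partition
}
overwrite.commit();                // single atomic snapshot commit
```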

Re: Welcoming Peter Vary as a new committer!

2021-01-26 Thread Zoltán Borók-Nagy
Congrats, Peter! On Tue, Jan 26, 2021 at 5:47 AM ForwardXu wrote: Congratulations Peter! -- Original message -- From: "dev"; Sent: Tuesday, January 26, 2021, 4:25 AM; To: "dev"; Subject: Re: Welcoming Peter Vary as a new committer! Congratulations! On Mon, 25

Re: Iceberg/Hive properties handling

2020-12-01 Thread Zoltán Borók-Nagy
pass table properties from Hive or Impala. If we exclude a prefix or specific properties, then everything but the properties reserved for locating the table is passed as the user would expect. I don't have a strong opinion about this, but yeah, maybe this behavior would cause t

Re: Iceberg/Hive properties handling

2020-11-30 Thread Zoltán Borók-Nagy
Thanks, Peter. I answered inline. On Mon, Nov 30, 2020 at 3:13 PM Peter Vary wrote: Hi Zoltan, Answers below: On Nov 30, 2020, at 14:19, Zoltán Borók-Nagy wrote: Hi, Thanks for the replies. My take fo

Re: Iceberg/Hive properties handling

2020-11-30 Thread Zoltán Borók-Nagy
properties to SERDEPROPERTIES? - Shall we define a prefix for setting Iceberg table properties from Hive queries and omitting other engine-specific properties? Thanks, Peter On Nov 27, 2020, at 17:45, Mass Dosage wrote: I like

Re: Iceberg/Hive properties handling

2020-11-26 Thread Zoltán Borók-Nagy
Hi, The above aligns with what we did in Impala, i.e. we store information about table loading in HMS table properties. We are just a bit more explicit about which catalog to use. We have table property 'iceberg.catalog' to determine the catalog type, right now the supported values are
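The 'iceberg.catalog' property described can be shown with a short DDL fragment. This is a hedged, made-up example in current Impala syntax (the table name and property value are illustrative, and the exact set of supported values is truncated in the message above):

```sql
-- Hypothetical example: tell the engine which catalog implementation
-- to use for loading this Iceberg table.
CREATE TABLE db.ice_tbl (i INT)
STORED AS ICEBERG
TBLPROPERTIES ('iceberg.catalog' = 'hadoop.tables');
```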

Re: Iceberg - Hive schema synchronization

2020-11-25 Thread Zoltán Borók-Nagy
Hi Everyone, In Impala we face the same challenges. I think a strict 1-to-1 type mapping would be beneficial because that way we could derive the Iceberg schema from the Hive schema, not just the other way around. So we could just naturally create Iceberg tables via DDL. We should use the same

INSERT to Iceberg tables from Impala

2020-09-11 Thread Zoltán Borók-Nagy
Hi, I'm going to add INSERT support for Iceberg tables in Impala. To start, I created the following design doc: https://docs.google.com/document/d/1_KL0YptDKwhiXvJyx4Vb-yZjggrPQAW2yjeGV4C0vMU/edit?usp=sharing All comments are welcome. Thanks, Zoltan