Re: [DISCUSS] Switch to JDK 11 for releases?

2023-04-24 Thread Mass Dosage
I agree with Ryan, unless you can change the source version there's not that much point. On the Hive front, as you can see from that ticket it's been open for 4(!) years and hasn't received much action recently. I think it's one of the reasons AWS EMR still defaults to Java 8. It would be really g

Re: Cannot build iceberg locally

2021-05-06 Thread Mass Dosage
Hello Taher, Can you share a bit more of the error message you're seeing? Perhaps attach a longer portion of the log showing all the Gradle(?) output? Where exactly is the problem occurring that you can't resolve classes in the relocated package? Thanks, Adrian On Thu, 6 May 2021 at 13:28, Tahe

Re: introductory Iceberg blog post

2021-01-28 Thread Mass Dosage
>> Thanks for sharing this, Adrian! >> >> On Thu, Jan 28, 2021 at 1:54 AM Mass Dosage wrote: >> >>> Hello all, >>> >>> As you may be aware Expedia Group helped contribute Hive read support to >>> Iceberg last year. We finally got around to publi

introductory Iceberg blog post

2021-01-28 Thread Mass Dosage
Hello all, As you may be aware Expedia Group helped contribute Hive read support to Iceberg last year. We finally got around to publishing a blog post about this which also includes an overview of Iceberg and why we think it's so useful. If you're interested you can read it here: https://medium.c

Re: Welcoming Peter Vary as a new committer!

2021-01-25 Thread Mass Dosage
Nice one, well done Peter! On Mon, 25 Jan 2021 at 19:46, Daniel Weeks wrote: > Congratulations, Peter! > > On Mon, Jan 25, 2021, 11:27 AM Jungtaek Lim > wrote: > >> Congratulations Peter! Well deserved! >> >> On Tue, Jan 26, 2021 at 3:40 AM Wing Yew Poon >> wrote: >> >>> Congratulations Peter!

Re: S3 strong read-after-write consistency

2020-12-14 Thread Mass Dosage
supposed to fix the issue (according to Ryan). Is it safe without S3FileIO >> to use Hive catalog + Hadoop API for S3 now? >> >> On Wed, 2 Dec 2020 at 6:54 PM, Vivekanand Vellanki wrote: >> >>> Iceberg tables backed by HadoopTables and HadoopCatalog require an >

S3 strong read-after-write consistency

2020-12-02 Thread Mass Dosage
Hello all, Yesterday AWS announced that S3 now has strong read-after-write consistency: https://aws.amazon.com/blogs/aws/amazon-s3-update-strong-read-after-write-consistency https://aws.amazon.com/s3/consistency/ Does this mean that Iceberg tables backed by HadoopTables and HadoopCatalog can no

Re: Iceberg/Hive properties handling

2020-11-27 Thread Mass Dosage
I like these suggestions, comments inline below on the last round... On Thu, 26 Nov 2020 at 09:45, Zoltán Borók-Nagy wrote: > Hi, > > The above aligns with what we did in Impala, i.e. we store information > about table loading in HMS table properties. We are just a bit more > explicit about whic

Re: CI logging question

2020-11-23 Thread Mass Dosage
, if you find > flaky tests for the Hive/Tez related tests, please notify me, Laszlo Pinter > or Marton Bod. > > Thanks, > Peter > > > On Nov 19, 2020, at 16:02, Peter Vary wrote: > > Created the pull request for it: > https://github.com/apache/iceberg/pull/1789

Re: Proposal for additional fields in Iceberg manifest files

2020-11-20 Thread Mass Dosage
+1 - I also like the idea of having more data profiling info for the partition but worry about hostnames and IP addresses and maintaining those as things change, especially if you have hundreds of hosts, I'd rather leave that to the name node. On Fri, 20 Nov 2020 at 17:48, Ryan Blue wrote: > Tha

Re: CI logging question

2020-11-18 Thread Mass Dosage
I can definitely see how having more detailed logs could be useful so I like what you're suggesting. I guess another option could be to make this configurable so you can pass in an argument to turn on the "showStandardStreams", by default it's false but while you're debugging this issue it would be

Re: [VOTE] Release Apache Iceberg 0.10.0 RC5

2020-11-09 Thread Mass Dosage
+1 (non-binding) I tested the Hive read path in distributed mode for HadoopTables-backed Iceberg tables and it worked fine. On Sun, 8 Nov 2020 at 18:06, Anton Okolnychyi wrote: > Hi everyone, > > I propose the following RC to be released as official Apache Iceberg > 0.10.0 release. > > The comm

Re: [VOTE] Release Apache Iceberg 0.10.0 RC4

2020-11-05 Thread Mass Dosage
+1 non-binding on RC4. I tested out the Hive read path on a distributed cluster using HadoopTables. On Thu, 5 Nov 2020 at 04:46, Dongjoon Hyun wrote: > +1 for 0.10.0 RC4. > > Bests, > Dongjoon. > > On Wed, Nov 4, 2020 at 7:17 PM Jingsong Li wrote: > >> +1 >> >> 1. Download the source tarball, s

Re: [VOTE] Release Apache Iceberg 0.10.0 RC2

2020-11-02 Thread Mass Dosage
+1 (non-binding) I ran the RC against a set of integration tests I have for a subset of the Hive2 read functionality on a distributed cluster and it worked fine. On Mon, 2 Nov 2020 at 04:05, Simon Su wrote: > + 1 (non-binding) > 1. Build code pass all UTs. > 2. Test Flink iceberg sink failover,

Re: Travis build question

2020-09-16 Thread Mass Dosage
This is what it's failing with, right? org.apache.iceberg.hadoop.TestHadoopCatalog > testVersionHintFile FAILED org.apache.iceberg.exceptions.NoSuchTableException: Table does not exist: tbl at org.apache.iceberg.BaseMetastoreCatalog.loadTable(BaseMetastoreCatalog.java:108) at

Re: Iceberg sync notes - 9 September 2020

2020-09-15 Thread Mass Dosage
I'm fine with not waiting for Hive projection. What is in master now is enough to do an end-to-end Hive read, I'd prefer to have that out there sooner so we can start trying it out as opposed to delaying this release for the projection. Thanks, Adrian On Mon, 14 Sep 2020 at 23:38, Ryan Blue wro

Re: Upgrade components to Hive 3 and Hadoop 3

2020-09-14 Thread Mass Dosage
+1 for doing this in a way that keeps Hive 2 support as that's still our primary Hive version in production and will be for quite some time. On Mon, 14 Sep 2020 at 09:00, Marton Bod wrote: > Hi Ryan, > > Thanks, I absolutely agree with you that we should keep support for Hive2 > as well. I've cr

Re: [DISCUSS] Rename iceberg-hive module?

2020-09-03 Thread Mass Dosage
he rename >> >> On Thu, Aug 20, 2020 at 7:22 AM Junjie Chen >> wrote: >> >>> +1 for `iceberg-hive-metastore`, also +1 to have a new module to contain >>> the `iceberg-mr`. >>> >>> On Thu, Aug 20, 2020 at 8:13 PM Saisai Shao >>

Re: Question about Iceberg release cadence

2020-08-27 Thread Mass Dosage
I'm all for a release. The only thing still required for basic Hive read support (other than documentation of course!) is producing a *single* jar that can be added to Hive's classpath, the PR for that is at https://github.com/apache/iceberg/pull/1267. Thanks, Adrian On Thu, 27 Aug 2020 at 01:26

Re: Hive Iceberg writes

2020-08-27 Thread Mass Dosage
We're definitely interested in this too but haven't started work on it yet. It has been discussed at our community syncs as something quite a few people are interested in so if nobody else responds a good starting point would probably be an early WIP PR that everyone can follow and contribute to.

Re: [DISCUSS] Rename iceberg-hive module?

2020-08-20 Thread Mass Dosage
+1 for `iceberg-hive-metastore` as I found this confusing when I first started working with the code. On Thu, 20 Aug 2020 at 03:27, Jungtaek Lim wrote: > +1 for `iceberg-hive-metastore` and also +1 for RD's proposal. > > Thanks, > Jungtaek Lim (HeartSaVioR) > > > > On Thu, Aug 20, 2020 at 11:20

Re: [DISCUSS] July board report

2020-07-08 Thread Mass Dosage
LGTM! On Tue, 7 Jul 2020 at 21:27, Ryan Blue wrote: > Hi everyone, > > Here's my draft report for July. Feel free to comment and suggest updates > that I've missed. Thanks! > > rb > > ## Description: > Apache Iceberg is a table format for huge analytic datasets that is > designed > for high perf

Iceberg at Subsurface Conference

2020-07-08 Thread Mass Dosage
Hello all, You might be interested to know that Christine Mathiesen and I will be presenting our work on adding Hive read support to Iceberg at the upcoming Subsurface Cloud Data Lake conference. The talk is entitled "Hiveberg: Integrating Apache Iceberg with the Hive Metastore". You can regi

Re: failing tests on master

2020-06-29 Thread Mass Dosage
Yes, I merged them into our branches this afternoon and can confirm that the tests now pass. Thanks! On Mon, 29 Jun 2020 at 19:24, Ryan Blue wrote: > I merged this over the weekend, so it should be fixed now. Did it work for > you? > > On Fri, Jun 26, 2020 at 11:48 AM Mass D

Re: failing tests on master

2020-06-26 Thread Mass Dosage
this in >> https://github.com/apache/iceberg/pull/1127 >> >> Cheers, >> >> On Fri, Jun 26, 2020 at 5:26 AM Mass Dosage wrote: >> >>> Hello all, >>> >>> For the past week or so I've noticed failing builds on a local checkout >

failing tests on master

2020-06-26 Thread Mass Dosage
Hello all, For the past week or so I've noticed failing builds on a local checkout of master. I have raised an issue here: https://github.com/apache/iceberg/issues/1113 (there was initially one failing test, there are now two) Someone else raised a similar issue with one of the same failing te

Re: Iceberg community sync this week

2020-06-17 Thread Mass Dosage
I can't do Friday at 9:00 PDT but could do 11 or 12 PDT. On Wed, 17 Jun 2020 at 01:30, Ryan Blue wrote: > Sounds like we should not plan to move the sync tomorrow, and should set > up another discussion about getting the Hive work in and deduplicating > effort. > > When is a time that would work

Re: CI for Iceberg

2020-06-05 Thread Mass Dosage
ent pull requests > will fail. > > I think what you ran into here was an issue with a new test that pushed > down timestamp filters and an ORC timestamp correctness bug in stats. That > should be fixed now. The current master builds are green. > > On Fri, Jun 5, 2020 at 5:38 AM

Re: CI for Iceberg

2020-06-05 Thread Mass Dosage
I've now taken a closer look and see that the Travis file does actually build Iceberg ;) I'm still curious how something managed to get merged into master while failing the tests though? On Fri, 5 Jun 2020 at 13:13, Mass Dosage wrote: > Hello all, > > I just wanted to know if there is any

CI for Iceberg

2020-06-05 Thread Mass Dosage
Hello all, I just wanted to know if there is any CI set up for Iceberg? I noticed that if I pull the current master branch I get failing tests (see below for stack traces, Ryan - we talked about this last night but it's still happening). So this made me wonder why there isn't some CI set up to che

Re: [VOTE] Graduate to a top-level project

2020-05-15 Thread Mass Dosage
+1 as a member of the community (non-binding) On Thu, 14 May 2020 at 23:12, Gautam wrote: > +1 We've come a long way :-) > > On Wed, May 13, 2020 at 1:07 AM Dongjoon Hyun > wrote: > >> +1 for graduation! >> >> Bests, >> Dongjoon. >> >> On Tue, May 12, 2020 at 11:59 PM Driesprong, Fokko >> wrot

Re: [VOTE] Release Apache Iceberg 0.8.0-incubating RC2

2020-04-30 Thread Mass Dosage
The build for RC2 worked fine for me, I didn't get a failure on "TestHiveTableConcurrency". Perhaps there is some kind of race condition in the test? I have seen timeout errors like that when I ran tests on an overloaded machine, could that have been the case? On Thu, 30 Apr 2020 at 08:32, OpenInx

Re: [VOTE] Release Apache Iceberg 0.8.0-incubating RC1

2020-04-29 Thread Mass Dosage
+1 (non-binding) (I assume only Apache/Iceberg members have binding votes?) Similar to others I verified: √ RAT checks passed √ signature is correct √ checksum is correct √ build from source √ run tests locally Thanks, Adrian On Tue, 28 Apr 2020 at 21:45, Ryan Blue wrote: > Here are the step

Re: Iceberg community sync notes - 15 April 2020

2020-04-17 Thread Mass Dosage
od to get the > community's feedback on how to proceed. > > -best, > R. > > On Fri, Apr 17, 2020 at 6:28 AM Mass Dosage wrote: > >> Thanks for the detailed notes Ryan. My thoughts on a few of the topics... >> >> 0.8.0 release - my general preference is t

Re: Iceberg community sync notes - 15 April 2020

2020-04-17 Thread Mass Dosage
In the sync, we thought that it would be good to wait and get these in. >> Please reply to this if you agree or disagree. >> >> Thanks! >> >> *Attendees*: >> >>- Ryan Blue >>- Dan Weeks >>- Anjali Norwood >>- Jun Ma >

Re: [Discuss] Merge spark-3 branch into master

2020-03-26 Thread Mass Dosage
s://github.com/palantir/gradle-consistent-versions> >>>>>>>> takes. Gradle Consistent Versions is specifically opinionated towards >>>>>>>> building against one version of a library across all modules in the >>>>>>>> build. >

Re: Shall we start a regular community sync up?

2020-03-18 Thread Mass Dosage
something in the evening here in > California, right? > > On Wed, Mar 18, 2020 at 10:06 AM Mass Dosage wrote: > >> +1 to monthly or fortnightly. >> >> On Wed, 18 Mar 2020 at 16:22, Miao Wang wrote: >> >>> +1. Monthly or Bi-Weekly. >>> >>>

Re: Shall we start a regular community sync up?

2020-03-18 Thread Mass Dosage
+1 to monthly or fortnightly. On Wed, 18 Mar 2020 at 16:22, Miao Wang wrote: > +1. Monthly or Bi-Weekly. > > > > *From: *OpenInx > *Reply-To: *"dev@iceberg.apache.org" > *Date: *Wednesday, March 18, 2020 at 8:20 AM > *To: *"dev@iceberg.apache.org" > *Cc: *Ryan Blue > *Subject: *Re: Shall we

Re: [Discuss] Merge spark-3 branch into master

2020-03-03 Thread Mass Dosage
+1 for a 0.8.0 release with Spark 2.4 and then move on for Spark 3.0 when it's ready. On Tue, 3 Mar 2020 at 16:32, Ryan Blue wrote: > Thanks for bringing this up, Saisai. I tried to do this a couple of months > ago, but ran into a problem with dependency locks. I couldn't get two > different ver

Re: Hive Metastore integration future

2020-02-19 Thread Mass Dosage
er away for Spark as >>> it is not allowing interaction with Hive intrinsic services like the >>> metastore anyway. It might be that you can run the Hive 3 metastore for >>> now but the paths forward don't suggest that is accessible for much into >>> th

Re: Hive Metastore integration future

2020-01-29 Thread Mass Dosage
On the topic of Hive versions - we've definitely experienced some issues trying to programmatically use the iceberg-spark-runtime artifact in unit tests (it uses Hive 1.2 as mentioned above). We then tried to also use some other common Hive testing libraries like HiveRunner