Impact on Spark-Iceberg usage on missing to enforce clustering/sort requirement (SPARK-23889)

2020-09-16 Thread Jungtaek Lim
Hi all, Recently I played around with a partitioned Iceberg table in Spark, and realized it requires a manual sort. I had to google to find a workaround - I guess there's no documentation, unless I'm missing something. While I encountered this with a DataFrame writer, I suspect there would be more

Re: Travis build question

2020-09-16 Thread Ryan Blue
The problem is in a new test, so merging or rebasing probably won't help. I'll run the tests and see if I can reproduce the error in my environment.
On Wed, Sep 16, 2020 at 11:26 AM Mass Dosage wrote:
> This is what it's failing with right?
>
> org.apache.iceberg.hadoop.TestHadoopCatalog >

Re: Travis build question

2020-09-16 Thread Mass Dosage
This is what it's failing with right?
org.apache.iceberg.hadoop.TestHadoopCatalog > testVersionHintFile FAILED
    org.apache.iceberg.exceptions.NoSuchTableException: Table does not exist: tbl
        at org.apache.iceberg.BaseMetastoreCatalog.loadTable(BaseMetastoreCatalog.java:108)
        at

Travis build question

2020-09-16 Thread Peter Vary
Hi Team, I'm struggling with a PR (https://github.com/apache/iceberg/pull/1465) where a test is green when I run it in IntelliJ and also green when I run it from the command line; I even ran the tests successfully on Linux with the command: ./gradlew :iceberg-core:test The problem is that the test

Use Iceberg for a time-series data lake

2020-09-16 Thread Yi Chen
Hi Iceberg Dev, We are looking into Iceberg as a data lake solution to replace a legacy system that has been in place for many years. Our data (~10+ PB in total) is time-series tabular data. We built a proof-of-concept earlier, which ended up with a design very similar to Iceberg's, especially on the table

Re: How to time travel using Presto?

2020-09-16 Thread 李响
Thanks @torres, it works for me!
On Wed, Sep 16, 2020 at 7:33 AM Gustavo Torres Torres wrote:
> I believe you can see all snapshots available from a table by running:
>
> SELECT *
> FROM my_db."my_table$snapshots"
>
> From there you can get all different snapshot_id available. Then you can