timestamp type with only microsecond precision

2021-03-01 Thread Steven Wu
Right now, the Iceberg timestamp type supports only microsecond precision. Support for other precisions, such as milliseconds, would be useful because they are commonly used. If we want to use hidden partitioning (date or hour) on a timestamp field with millisecond precision, now we have to convert
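
The snippet above is cut off, so the exact conversion the author has in mind is not shown. As a minimal sketch, assuming the field arrives as epoch milliseconds in a Java long, a writer could widen it to epoch microseconds before storing it. The class and method names below are hypothetical and are not part of the Iceberg API; the hour bucket is plain "hours since the epoch" arithmetic included only for illustration.

```java
public class MillisToMicros {
    private static final long MICROS_PER_MILLI = 1_000L;
    private static final long MICROS_PER_HOUR = 3_600_000_000L;

    // Hypothetical helper: widen epoch milliseconds to epoch microseconds.
    // Math.multiplyExact throws ArithmeticException instead of silently overflowing.
    static long millisToMicros(long epochMillis) {
        return Math.multiplyExact(epochMillis, MICROS_PER_MILLI);
    }

    // Hypothetical helper: whole hours since the epoch for a microsecond timestamp.
    // floorDiv keeps pre-1970 values in the correct (negative) hour.
    static long hoursFromMicros(long epochMicros) {
        return Math.floorDiv(epochMicros, MICROS_PER_HOUR);
    }

    public static void main(String[] args) {
        long epochMillis = System.currentTimeMillis();
        long epochMicros = millisToMicros(epochMillis);
        System.out.println("micros = " + epochMicros);
        System.out.println("hours since epoch = " + hoursFromMicros(epochMicros));
    }
}
```

The overflow check matters only for values far outside normal date ranges, but it makes a bad input fail loudly rather than produce a wrong partition value.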

Airflow Integration

2021-03-01 Thread Gustavo Torres Torres
Hey folks, Lately I've been thinking about integration between Airflow & Iceberg for a smooth transition from Hive-based tables to Iceberg ones and would like to hear about your experience, specifically with Iceberg partition sensors in Airflow. From the way I see it, there are two ways to go

Re: Default TimeZone for unit tests

2021-03-01 Thread Owen O'Malley
In ORC, the timezone tests vary the default timezone through multiple values using the Java APIs. (They do restore the initial value when the test exits.) :) .. Owen
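
The approach described in this message (set the default timezone to several values and restore the initial one afterwards) can be sketched with the standard java.util.TimeZone API. This is not ORC's actual test code; the class name, the helper withDefaultTimeZone, and the list of zones are assumptions made for illustration.

```java
import java.util.TimeZone;
import java.util.function.Supplier;

public class WithDefaultTimeZone {
    // Hypothetical helper: run body with the JVM default timezone set to zoneId,
    // then restore the original default even if body throws.
    static <T> T withDefaultTimeZone(String zoneId, Supplier<T> body) {
        TimeZone original = TimeZone.getDefault();
        try {
            TimeZone.setDefault(TimeZone.getTimeZone(zoneId));
            return body.get();
        } finally {
            TimeZone.setDefault(original);
        }
    }

    public static void main(String[] args) {
        // Example zones chosen arbitrarily; a real test would run its assertions in the body.
        String[] zones = {"UTC", "America/Los_Angeles", "Asia/Kolkata"};
        for (String zone : zones) {
            String seen = withDefaultTimeZone(zone, () -> TimeZone.getDefault().getID());
            System.out.println(zone + " -> " + seen);
        }
    }
}
```

Because TimeZone.setDefault changes JVM-wide state, a helper like this is only safe when tests that depend on the default timezone do not run concurrently in the same JVM.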

Re: Default TimeZone for unit tests

2021-03-01 Thread Edgar Rodriguez
Hi folks, Thanks Peter for the quick fix! I do think it'd be a good idea to have this kind of coverage to some extent. Usually, a workflow some users follow is to only run locally the modules that they modify and rely on the CI to run the full check which takes longer, which makes room for these

Re: Default TimeZone for unit tests

2021-03-01 Thread Ryan Blue
I'm not sure it would be worth separating out the timezone tests to do this. I think we catch these problems pretty quickly with the number of users building in different zones. Is this something we want to spend time on?

Re: Default TimeZone for unit tests

2021-03-01 Thread Russell Spitzer
In the Spark Cassandra Connector we had a similar issue. We would specifically spawn test JVMs with different default local time zones to make sure we handled these use cases. I also would make our test dates ones on Gregorian calendar boundaries so being an hour off would result in a timestamp

Default TimeZone for unit tests

2021-03-01 Thread Peter Vary
Hi Team, Last weekend I caused a bit of a stir by pushing changes which had a green run on CI but were failing locally if the default TZ was different from UTC. Do we want to set the TZ of the CI tests to some random non-UTC TZ to catch these errors? Pros: We can catch tests which are

RE: Reading data from Iceberg table into Apache Arrow in Java

2021-03-01 Thread Mayur Srivastava
Hi Ryan, I’ve submitted a PR (https://github.com/apache/iceberg/pull/2286) for the vectorized Arrow reader. This is my first Iceberg pull request - I'm not fully aware of the contributing conventions of this repo, so let me know if any changes are needed in the PR. I've refactored some code