Moving forward with the timestamp proposal

2019-02-20 Thread Zoltan Ivanfi
Hi, Last December we shared a timestamp harmonization proposal with the Hive, Spark and Impala communities. This was followed by an extensive discussion in January that led to various updates and improvements to the proposal, as well as the creation of a new document for

Adding more timestamp types to on-disk storage formats

2019-01-17 Thread Zoltan Ivanfi
Hi, One piece of feedback I got on the SQL timestamp type harmonization proposal was that I should reach out to the file format communities as well. For this purpose I created a separate document from their perspective and sent it to the Avro, ORC, Parquet, Arrow, Kudu and Iceberg developer lists.

Re: proposal for expanded & consistent timestamp types

2019-01-08 Thread Zoltan Ivanfi
orks" and "you shouldn't have to care about the persistence format *or > which app created the data* > > What does Arrow do in this world, incidentally? > > > On 2 Jan 2019, at 11:48, Steve Loughran wrote: > > > > On 17 Dec 2018, at 17:44, Zoltan Iv

Updated proposal: Consistent timestamp types in Hadoop SQL engines

2018-12-19 Thread Zoltan Ivanfi
Dear All, I would like to thank every reviewer of the consistent timestamps proposal[1] for their time and valuable comments. Based on your feedback, I have updated the proposal. The changes include clarifications, fixes and other improvements as summarized at the end of the document, in the

Timestamp interoperability design doc available for review

2017-08-16 Thread Zoltan Ivanfi
Dear Spark Community, Based on earlier feedback from the Spark community, we would like to suggest a short-term fix for the timestamp interoperability problem[1] between different SQL-on-Hadoop engines. I created a design document[2] and would like to ask you to review it and let me know of any

Re: SQL TIMESTAMP semantics vs. SPARK-18350

2017-06-06 Thread Zoltan Ivanfi
aframe, what is the type of the ts column? > 3. if it's a valid dataframe, what are the semantics of the type of the ts > column? > > Suppose further that Spark SQL sets the timestamp_interp on hivelogs. Can > you answer the same three questions for each combination of > timest

Re: SQL TIMESTAMP semantics vs. SPARK-18350

2017-06-02 Thread Zoltan Ivanfi
guration as people identify solid use >> cases. >> >> Cheers, >> >> Michael >> >> >> >> On May 30, 2017, at 7:41 AM, Zoltan Ivanfi <z...@cloudera.com> wrote: >> >> Hi, >> >> If I remember correctly, the TIMESTAMP

Re: SQL TIMESTAMP semantics vs. SPARK-18350

2017-05-30 Thread Zoltan Ivanfi
es all the datetime conversions with timezone in mind (it doesn't >>>> ignore timezone if a timestamp string has timezone specified). The session >>>> local timezone change further pushes Spark to that direction, but the >>>> semantics has been with timezone befo

Re: SQL TIMESTAMP semantics vs. SPARK-18350

2017-05-25 Thread Zoltan Ivanfi
introduce a new timestamp without timezone type, and > have a config flag to specify which one (with tz or without tz) is the > default behavior. > > > > On Wed, May 24, 2017 at 5:46 PM, Zoltan Ivanfi <z...@cloudera.com> wrote: > >> Hi, >> >> Sorry if you re

SQL TIMESTAMP semantics vs. SPARK-18350

2017-05-24 Thread Zoltan Ivanfi
Hi, Sorry if you receive this mail twice; it seems that my first attempt did not make it to the list for some reason. I would like to start a discussion about SPARK-18350 before it gets released because it seems to be going in a different