Hi Tanu/Balaji,

I have not run into the issue mentioned here. AFAIK, the Date and Timestamp 
types should work fine. The Date logical type is represented as an INT in Avro 
(https://avro.apache.org/docs/current/spec.html#Date), which is why you see an 
integer ingested there. But it should not have any impact on querying, and 
Spark should be able to determine the Date from that.
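
To illustrate what that integer means: per the Avro spec, the value is the 
number of days since the Unix epoch (1970-01-01). Below is a minimal sketch of 
that mapping in plain Python. The helper names date_to_avro_int and 
avro_int_to_date are made up for this example and are not part of Avro, Hudi, 
or Spark.

```python
from datetime import date, timedelta

# The Avro "date" logical type counts days from this reference date.
EPOCH = date(1970, 1, 1)


def date_to_avro_int(d: date) -> int:
    """Hypothetical helper: encode a calendar date as days since the epoch."""
    return (d - EPOCH).days


def avro_int_to_date(days: int) -> date:
    """Hypothetical helper: decode days since the epoch back into a date."""
    return EPOCH + timedelta(days=days)


encoded = date_to_avro_int(date(2020, 7, 21))
print(encoded)                    # the integer you would see in the raw data
print(avro_int_to_date(encoded))  # the original date, recovered
```

So an integer in the stored data is expected and is not by itself a sign of 
data loss.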

In addition to the information requested by Gary, could you open a 
GitHub issue with details about the environment where you are running 
Hudi/Spark, and maybe also a small example that reproduces this issue?

Thanks,
Udit

On 7/21/20, 11:06 AM, "Gary Li" <[email protected]> wrote:

    Hi Tanu,

    This seems like a Spark-Parquet type conversion issue. I use the timestamp
    type and don’t have any issues with it.

    Would you try the following and provide more context?
    - Save your dataframe as Parquet instead of Hudi to see if the issue still
      persists.
    - Try the timestamp type.
    - Are you querying a MOR table using Spark SQL?

    Thanks,
    Gary

    On Tue, Jul 21, 2020 at 1:23 AM tanu dua <[email protected]> wrote:

    > Thanks. I am also struggling with all data types except String, with the
    > same decode exception. For example, I got the exception for both double
    > and int, and when I convert them to String everything works fine in
    > Spark SQL.
    >
    > On Tue, 21 Jul 2020 at 1:38 PM, Balaji Varadarajan
    > <[email protected]> wrote:
    >
    > >
    > > Gary/Udit,
    > > As you are familiar with this part of it, can you please answer this
    > > question?
    > > Thanks,
    > > Balaji.V
    > >
    > > On Monday, July 20, 2020, 08:18:16 AM PDT, tanu dua <[email protected]> wrote:
    > >
    > > Hi guys,
    > > May I know how you handle date and timestamp types in Hudi?
    > > When I set the DataType to Date in the StructType, it gets ingested as
    > > an int, but when I query using Spark SQL I get the following:
    > >
    > > https://issues.apache.org/jira/plugins/servlet/mobile#issue/SPARK-17557
    > >
    > > So I am not sure if I am the only one who faces this. Do I need to
    > > change to String?
    >
