Hi Jim,

I think it is the last thing you mentioned that make that magic happen. I
did not realize that standard lexicographical string compare will give the
same result as comparing timestamps but just remembered that this is how I
catalog my personal pictures to keep them in order (year/date/time)

so if timestamp is stored in a string column and I compare to another
string in the same format, no conversion occurs but they are still being
compared accurately. I will do some more testing, thanks for your help!

On Fri, Jan 18, 2019 at 11:54 AM Jim Apple <[email protected]> wrote:

> Try looking at the "predicates: ' line in the scan on your EXPLAIN output.
>
> Check out "conversions" in
> http://impala.apache.org/docs/build/html/topics/impala_timestamp.html
>
> One final note: if ingest_dt_tm is in the same format as '2019-01-18
> 10:42:25.795', then standard lexicographical string compare will give the
> same result as comparing timestamps.
>
> On Fri, Jan 18, 2019 at 8:04 AM Boris Tyukin <[email protected]>
> wrote:
>
>> Hello and Happy Friday!
>>
>> We decided to store date/timestamps as a string due to all the issues we
>> had with time zones and the way we want them to work consistently across
>> Impala, Hive, Kudu and Spark.
>>
>> In a query like below, I thought I would have to convert strings with
>> timestamps to timestamp explicitly but looks like Impala does implicit
>> conversion and our users won't need to worry about doing it. Is my
>> assumption correct?
>>
>> SELECT something
>> FROM table1
>> WHERE
>>   ingest_dt_tm > now() - interval 2 hours
>>
>> Query above is a simple one and I understand how Impala would "know" to
>> do implicit convertion.
>>
>> But I was really surprised that query below also worked properly because
>> in that case I compare a string column with another string:
>>
>> SELECT something
>> FROM table1
>> WHERE
>>   ingest_dt_tm > '2019-01-18 10:42:25.795'
>>
>> I thought we would need to do CAST('2019-01-18 10:42:25.795' as
>> timestamp) or something like that.
>>
>> Also from a performance perspective, would the statement below be equal:
>>
>> SELECT something
>> FROM table1
>> WHERE
>>   ingest_dt_tm > '2019-01-18 10:42:25.795'
>>
>> SELECT something
>> FROM table1
>> WHERE
>>   cast(ingest_dt_tm as timestamp)> '2019-01-18 10:42:25.795'
>>
>> SELECT something
>> FROM table1
>> WHERE
>>   ingest_dt_tm > cast('2019-01-18 10:42:25.795' as timestamp)
>>
>>

Reply via email to