Also, I forgot to mention, I'm using Hive v3.1.2.

On 2022/05/16 03:09:19 Julien Phalip wrote:
> Hi,
>
> I've noticed an odd behavior with the 'hive.io.file.readcolumn.names' conf
> property.
>
> Imagine a simple table "mytable" with two fields: "text" and "number".
>
> - If you run the query "SELECT * FROM mytable", then the
> "hive.io.file.readcolumn.names" has the value: "text,number". Makes sense
> so far.
> - If you run the query "SELECT text FROM mytable", then the
> "hive.io.file.readcolumn.names" has the value: "text". Still makes sense.
>
> However, if you add a predicate (WHERE clause), then the behavior of that
> property seems strange to me:
>
> - If you run the query "SELECT * FROM mytable WHERE number = 999", then the
> "hive.io.file.readcolumn.names" has the value: "text". The "number" column
> is missing from the property.
> - If you run the query "SELECT number FROM mytable WHERE number = 999",
> then the "hive.io.file.readcolumn.names" has the value: "" (empty string).
> The "number" column is still missing from the property.
>
> In other words, it looks like if a column is part of a predicate, then it
> is omitted from the "hive.io.file.readcolumn.names" property. Do you know
> why that is?
>
> I'm writing a custom StorageHandler and so I would need to know exactly
> what columns the user is requesting. Is there a way to consistently
> retrieve all the requested columns either from the configuration or from
> within the InputFormat class, even when there is a WHERE clause?
>
> Thanks,
>
> Julien
>
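
For illustration, here is a minimal sketch of parsing the property values
quoted above into a column list. It is only a sketch under stated
assumptions: a plain Map stands in for the Hadoop job configuration, and the
class and method names (ReadColumnNames, parse, fromConf) are hypothetical,
not part of Hive. Only the property name comes from the message above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration only; not part of Hive.
class ReadColumnNames {
    // Property name as quoted in the message above.
    static final String READ_COLUMN_NAMES = "hive.io.file.readcolumn.names";

    // Splits a comma-separated property value into column names.
    // A null or empty value yields an empty list rather than [""].
    static List<String> parse(String value) {
        List<String> columns = new ArrayList<>();
        if (value == null) {
            return columns;
        }
        for (String name : value.split(",")) {
            String trimmed = name.trim();
            if (!trimmed.isEmpty()) {
                columns.add(trimmed);
            }
        }
        return columns;
    }

    // Assumption: the configuration is exposed as simple key/value pairs.
    static List<String> fromConf(Map<String, String> conf) {
        return parse(conf.get(READ_COLUMN_NAMES));
    }
}
```

With the values reported above, "text,number" gives two columns and "" gives
an empty list, so a column that only appears in the WHERE clause would not
show up in the result of this lookup alone.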
