Hi Benj,

Creating partitions as in your first example won't work.
From the docs: "During partitioning, Drill creates separate files, but not
separate directories, for different partitions."
(https://drill.apache.org/docs/how-to-partition-data/)
Also, Drill doesn't write any additional metadata about partitioning. When
it reads Parquet files, it determines partitions using min/max values.
That means that if you want, for example, to partition by the first
letter, you'll need to create a corresponding column. Alternatively, you can
create the partitions manually as directories.
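
A rough, untested sketch of both options follows. The dfs.tmp workspace, the
source path `/data/people`, and the table names are placeholders, not
anything from your setup. The second option assumes Drill's implicit dir0
column, which exposes the first directory level under the queried path.

```sql
-- Option 1: store the first letter as a real column and partition by it.
-- dfs.tmp and `/data/people` are hypothetical names.
CREATE TABLE dfs.tmp.`mytable`
PARTITION BY (sname) AS
SELECT substr(name, 1, 1) AS sname, name, birthdate, birthcity
FROM dfs.`/data/people`;

-- A filter on the partition column lets Drill skip non-matching files.
SELECT name, birthdate
FROM dfs.tmp.`mytable`
WHERE sname = 'A';

-- Option 2: build one directory per letter by hand (repeat per letter),
-- so the letter lives in the directory name instead of in a column.
CREATE TABLE dfs.tmp.`mytable_by_dir/A` AS
SELECT name, birthdate, birthcity
FROM dfs.`/data/people`
WHERE substr(name, 1, 1) = 'A';

-- Query the parent directory and filter on the directory level.
SELECT name, birthdate
FROM dfs.tmp.`mytable_by_dir`
WHERE dir0 = 'A';
```

Option 1 keeps everything in one CTAS at the cost of an extra column in the
files. Option 2 avoids the extra column but needs one statement per
partition value.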

On Wed, Dec 5, 2018 at 10:07 PM <[email protected]> wrote:

> I would like to create a Parquet table partitioned on computed data
> (without having to store the result of the computation in the Parquet
> files). The goal is to optimize the table for the queries we typically
> expect.
>
> Imaginary example :
> CREATE TABLE `mytable`
> PARTITION BY (substr(name,1,1)) AS
> SELECT name, birthdate, birthcity
> ORDER BY birthdate;
>
> So, if I do that I obtain a VALIDATION ERROR: Partition column ... is not
> in the SELECT list of CTAS
>
> And the comment on the function
> "public static RelNode qualifyPartitionCol(RelNode input, List<String>
> partitionColumns)"
> confirms that it's currently not possible:
> " A partition column is resolved, either (1) the same column appear in the
> select list of CTAS or (2) CTAS has a * in select list"
>
> But what is the reason for this limitation?
> Is there any trick to do it right now, or can we expect a future
> change that would allow this?
>
> The best I can think of (with the data of the example) is
> CREATE TABLE `mytable`
> PARTITION BY (sname) AS
> SELECT substr(name,1,1) sname, name, birthdate, birthcity
> ORDER BY birthdate;
> and then querying each partition file to drop the unneeded column,
> like
> CREATE TABLE `mytable_2/partition_x` AS
> SELECT name, birthdate, birthcity
> ORDER BY birthdate;
> but that's not really satisfying...
>
> I would appreciate your comments,
> Regards,
>
> benj
>


-- 
Sincerely, Anton Gozhiy
[email protected]
