Re: UNSUPPORTED_OPERATION ERROR: Hash aggregate does not support schema changes
DRILL-4510 is open for the error message that you see. Thanks, Khurram From: Divya Gehlot <divya.htco...@gmail.com> Sent: Tuesday, August 15, 2017 7:41:39 AM To: user@drill.apache.org Subject: Re: UNSUPPORTED_OPERATION ERROR: Hash aggregate does not support schema changes Hi , As advised I set the config parameter alter system set `drill.exec.functions.cast_empty_string_to_null` = true; When I try to ran the query on `Capacity` column with aggregated values it threw below error to me : s:[ SYSTEM ERROR: IllegalStateException: Failure while reading vector. Expected vector class of org.apache.drill.exec.vector.NullableIntVector but was holding vector class org.apache.drill.exec.vector.NullableVarCharVector, field= Capacity Thanks, Divya On 14 August 2017 at 17:56, Khurram Faraaz <kfar...@mapr.com> wrote: > You should look at DRILL-3214 which is Open. And DRILL-4503 is open for > Error: UNSUPPORTED_OPERATION ERROR: Hash aggregate does not support schema > changes > > > And you can try setting this config parameter to see if it helps in your > case. > > > alter system set `drill.exec.functions.cast_empty_string_to_null` = true; > > > Thanks, > > Khurram > > > From: Divya Gehlot <divya.htco...@gmail.com> > Sent: Monday, August 14, 2017 3:06:57 PM > To: user@drill.apache.org > Subject: Re: UNSUPPORTED_OPERATION ERROR: Hash aggregate does not support > schema changes > > forgot to mention I tried this first but some post in Apache Drill JIRA > mentioned to use 0 instead of null to get rid of error . > unfortunately it did not work in my case :( > > > Thanks, > Divya > > On 14 August 2017 at 14:56, Khurram Faraaz <kfar...@mapr.com> wrote: > > > Can you please try this, since you mention column has empty values try > > using null instead of zero, because 0 (zero) is not null. > > > > > > CASE WHEN `Capacity` = '' THEN CAST(null AS integer) ELSE CAST(`Capacity` > > AS integer) > > > > END AS `Capacity` > > > > > > Thanks, > > > > Khurram > > > > > > From: Divya Gehlot <divya.htco...@gmail.com> > > Sent: Monday, August 14, 2017 12:14:26 PM > > To: user@drill.apache.org > > Subject: UNSUPPORTED_OPERATION ERROR: Hash aggregate does not support > > schema changes > > > > Hi, > > I am getting below error : > > > > > UNSUPPORTED_OPERATION ERROR: Hash aggregate does not support schema > > > changes Fragment 0:0 > > > > > > *Some background to understand the error scenario : * > > > > I have data saved in parquet files in multidirectory structure. > > One of the column of the table is empty values so I used below to handle > > null values > > > > > CASE WHEN `Capacity` = '' THEN CAST(0 AS INT) ELSE CAST(`Capacity` AS > > INT) > > > END AS `Capacity` > > > > > > My query does some aggregation like sum of the column values .My query > > looks like below : > > SELECT > > `COUNTRY`, > > `YEAR`, > > `MONTH`, > > `DAY`, > > `Capacity`, > > SUM(CAST(`UserCount` AS INT)) AS `Dailyusers` > > FROM `dfs`.`default`.`./parquetfiles/countries/` > > WHERE `YEAR` =2017 AND `MONTH` = 2 > > group by `COUNTRY`,`YEAR`,`MONTH`,`DAY`,`Office``Capacity` > > > > When I include `Capacity` column in above query it gives me below error : > > > > > UNSUPPORTED_OPERATION ERROR: Hash aggregate does not support schema > > > changes > > > > > > When I exclude the `Capacity` column the query runs fine and I could view > > the expected results . > > > > Few approaches/verification which I tried after browsing the internet > > regarding the error whic i encounter : > > 1 : alter session set `planner.enable_streamagg` = false; > > alter session set `planner.enable_hashjoin` = false; > > > > Still the same error > > > > > > > > > UNSUPPORTED_OPERATION ERROR: Hash aggregate does not support schema > > > changes Fragment > > > > > > 2 : somebody suggested that if some of your files are empty you will > > encounter the error but I checked and confirm that none of my parquet > file > > is empty. > > > > 3.I have same schema across all the files . > > > > 4. If I query the files individually there is no error and It works fine > . > > For instance if my query is > > SELECT > > `COUNTRY`, > > `YEAR`, > > `MONTH`, > > `DAY`, > > `Capacity`, > > SUM(CAST(`UserCount` AS INT)) AS `Dailyusers` > > FROM `dfs`.`default`.`./parquetfiles/countries/china/0_0_0.parquet` > > WHERE `YEAR` =2017 AND `MONTH` = 2 > > group by `COUNTRY`,`YEAR`,`MONTH`,`DAY`,`Office``Capacity` > > > > Above query works fine for all the countries which I have under countries > > directory. > > But if I query until directory countries its throws error . > > > > Appreciate the help! > > > > Thanks, > > Divya > > >
Re: UNSUPPORTED_OPERATION ERROR: Hash aggregate does not support schema changes
Hi , As advised I set the config parameter alter system set `drill.exec.functions.cast_empty_string_to_null` = true; When I try to ran the query on `Capacity` column with aggregated values it threw below error to me : s:[ SYSTEM ERROR: IllegalStateException: Failure while reading vector. Expected vector class of org.apache.drill.exec.vector.NullableIntVector but was holding vector class org.apache.drill.exec.vector.NullableVarCharVector, field= Capacity Thanks, Divya On 14 August 2017 at 17:56, Khurram Faraaz <kfar...@mapr.com> wrote: > You should look at DRILL-3214 which is Open. And DRILL-4503 is open for > Error: UNSUPPORTED_OPERATION ERROR: Hash aggregate does not support schema > changes > > > And you can try setting this config parameter to see if it helps in your > case. > > > alter system set `drill.exec.functions.cast_empty_string_to_null` = true; > > > Thanks, > > Khurram > > > From: Divya Gehlot <divya.htco...@gmail.com> > Sent: Monday, August 14, 2017 3:06:57 PM > To: user@drill.apache.org > Subject: Re: UNSUPPORTED_OPERATION ERROR: Hash aggregate does not support > schema changes > > forgot to mention I tried this first but some post in Apache Drill JIRA > mentioned to use 0 instead of null to get rid of error . > unfortunately it did not work in my case :( > > > Thanks, > Divya > > On 14 August 2017 at 14:56, Khurram Faraaz <kfar...@mapr.com> wrote: > > > Can you please try this, since you mention column has empty values try > > using null instead of zero, because 0 (zero) is not null. > > > > > > CASE WHEN `Capacity` = '' THEN CAST(null AS integer) ELSE CAST(`Capacity` > > AS integer) > > > > END AS `Capacity` > > > > > > Thanks, > > > > Khurram > > > > > > From: Divya Gehlot <divya.htco...@gmail.com> > > Sent: Monday, August 14, 2017 12:14:26 PM > > To: user@drill.apache.org > > Subject: UNSUPPORTED_OPERATION ERROR: Hash aggregate does not support > > schema changes > > > > Hi, > > I am getting below error : > > > > > UNSUPPORTED_OPERATION ERROR: Hash aggregate does not support schema > > > changes Fragment 0:0 > > > > > > *Some background to understand the error scenario : * > > > > I have data saved in parquet files in multidirectory structure. > > One of the column of the table is empty values so I used below to handle > > null values > > > > > CASE WHEN `Capacity` = '' THEN CAST(0 AS INT) ELSE CAST(`Capacity` AS > > INT) > > > END AS `Capacity` > > > > > > My query does some aggregation like sum of the column values .My query > > looks like below : > > SELECT > > `COUNTRY`, > > `YEAR`, > > `MONTH`, > > `DAY`, > > `Capacity`, > > SUM(CAST(`UserCount` AS INT)) AS `Dailyusers` > > FROM `dfs`.`default`.`./parquetfiles/countries/` > > WHERE `YEAR` =2017 AND `MONTH` = 2 > > group by `COUNTRY`,`YEAR`,`MONTH`,`DAY`,`Office``Capacity` > > > > When I include `Capacity` column in above query it gives me below error : > > > > > UNSUPPORTED_OPERATION ERROR: Hash aggregate does not support schema > > > changes > > > > > > When I exclude the `Capacity` column the query runs fine and I could view > > the expected results . > > > > Few approaches/verification which I tried after browsing the internet > > regarding the error whic i encounter : > > 1 : alter session set `planner.enable_streamagg` = false; > > alter session set `planner.enable_hashjoin` = false; > > > > Still the same error > > > > > > > > > UNSUPPORTED_OPERATION ERROR: Hash aggregate does not support schema > > > changes Fragment > > > > > > 2 : somebody suggested that if some of your files are empty you will > > encounter the error but I checked and confirm that none of my parquet > file > > is empty. > > > > 3.I have same schema across all the files . > > > > 4. If I query the files individually there is no error and It works fine > . > > For instance if my query is > > SELECT > > `COUNTRY`, > > `YEAR`, > > `MONTH`, > > `DAY`, > > `Capacity`, > > SUM(CAST(`UserCount` AS INT)) AS `Dailyusers` > > FROM `dfs`.`default`.`./parquetfiles/countries/china/0_0_0.parquet` > > WHERE `YEAR` =2017 AND `MONTH` = 2 > > group by `COUNTRY`,`YEAR`,`MONTH`,`DAY`,`Office``Capacity` > > > > Above query works fine for all the countries which I have under countries > > directory. > > But if I query until directory countries its throws error . > > > > Appreciate the help! > > > > Thanks, > > Divya > > >
Re: UNSUPPORTED_OPERATION ERROR: Hash aggregate does not support schema changes
You should look at DRILL-3214 which is Open. And DRILL-4503 is open for Error: UNSUPPORTED_OPERATION ERROR: Hash aggregate does not support schema changes And you can try setting this config parameter to see if it helps in your case. alter system set `drill.exec.functions.cast_empty_string_to_null` = true; Thanks, Khurram From: Divya Gehlot <divya.htco...@gmail.com> Sent: Monday, August 14, 2017 3:06:57 PM To: user@drill.apache.org Subject: Re: UNSUPPORTED_OPERATION ERROR: Hash aggregate does not support schema changes forgot to mention I tried this first but some post in Apache Drill JIRA mentioned to use 0 instead of null to get rid of error . unfortunately it did not work in my case :( Thanks, Divya On 14 August 2017 at 14:56, Khurram Faraaz <kfar...@mapr.com> wrote: > Can you please try this, since you mention column has empty values try > using null instead of zero, because 0 (zero) is not null. > > > CASE WHEN `Capacity` = '' THEN CAST(null AS integer) ELSE CAST(`Capacity` > AS integer) > > END AS `Capacity` > > > Thanks, > > Khurram > > > From: Divya Gehlot <divya.htco...@gmail.com> > Sent: Monday, August 14, 2017 12:14:26 PM > To: user@drill.apache.org > Subject: UNSUPPORTED_OPERATION ERROR: Hash aggregate does not support > schema changes > > Hi, > I am getting below error : > > > UNSUPPORTED_OPERATION ERROR: Hash aggregate does not support schema > > changes Fragment 0:0 > > > *Some background to understand the error scenario : * > > I have data saved in parquet files in multidirectory structure. > One of the column of the table is empty values so I used below to handle > null values > > > CASE WHEN `Capacity` = '' THEN CAST(0 AS INT) ELSE CAST(`Capacity` AS > INT) > > END AS `Capacity` > > > My query does some aggregation like sum of the column values .My query > looks like below : > SELECT > `COUNTRY`, > `YEAR`, > `MONTH`, > `DAY`, > `Capacity`, > SUM(CAST(`UserCount` AS INT)) AS `Dailyusers` > FROM `dfs`.`default`.`./parquetfiles/countries/` > WHERE `YEAR` =2017 AND `MONTH` = 2 > group by `COUNTRY`,`YEAR`,`MONTH`,`DAY`,`Office``Capacity` > > When I include `Capacity` column in above query it gives me below error : > > > UNSUPPORTED_OPERATION ERROR: Hash aggregate does not support schema > > changes > > > When I exclude the `Capacity` column the query runs fine and I could view > the expected results . > > Few approaches/verification which I tried after browsing the internet > regarding the error whic i encounter : > 1 : alter session set `planner.enable_streamagg` = false; > alter session set `planner.enable_hashjoin` = false; > > Still the same error > > > > > UNSUPPORTED_OPERATION ERROR: Hash aggregate does not support schema > > changes Fragment > > > 2 : somebody suggested that if some of your files are empty you will > encounter the error but I checked and confirm that none of my parquet file > is empty. > > 3.I have same schema across all the files . > > 4. If I query the files individually there is no error and It works fine . > For instance if my query is > SELECT > `COUNTRY`, > `YEAR`, > `MONTH`, > `DAY`, > `Capacity`, > SUM(CAST(`UserCount` AS INT)) AS `Dailyusers` > FROM `dfs`.`default`.`./parquetfiles/countries/china/0_0_0.parquet` > WHERE `YEAR` =2017 AND `MONTH` = 2 > group by `COUNTRY`,`YEAR`,`MONTH`,`DAY`,`Office``Capacity` > > Above query works fine for all the countries which I have under countries > directory. > But if I query until directory countries its throws error . > > Appreciate the help! > > Thanks, > Divya >
Re: UNSUPPORTED_OPERATION ERROR: Hash aggregate does not support schema changes
forgot to mention I tried this first but some post in Apache Drill JIRA mentioned to use 0 instead of null to get rid of error . unfortunately it did not work in my case :( Thanks, Divya On 14 August 2017 at 14:56, Khurram Faraazwrote: > Can you please try this, since you mention column has empty values try > using null instead of zero, because 0 (zero) is not null. > > > CASE WHEN `Capacity` = '' THEN CAST(null AS integer) ELSE CAST(`Capacity` > AS integer) > > END AS `Capacity` > > > Thanks, > > Khurram > > > From: Divya Gehlot > Sent: Monday, August 14, 2017 12:14:26 PM > To: user@drill.apache.org > Subject: UNSUPPORTED_OPERATION ERROR: Hash aggregate does not support > schema changes > > Hi, > I am getting below error : > > > UNSUPPORTED_OPERATION ERROR: Hash aggregate does not support schema > > changes Fragment 0:0 > > > *Some background to understand the error scenario : * > > I have data saved in parquet files in multidirectory structure. > One of the column of the table is empty values so I used below to handle > null values > > > CASE WHEN `Capacity` = '' THEN CAST(0 AS INT) ELSE CAST(`Capacity` AS > INT) > > END AS `Capacity` > > > My query does some aggregation like sum of the column values .My query > looks like below : > SELECT > `COUNTRY`, > `YEAR`, > `MONTH`, > `DAY`, > `Capacity`, > SUM(CAST(`UserCount` AS INT)) AS `Dailyusers` > FROM `dfs`.`default`.`./parquetfiles/countries/` > WHERE `YEAR` =2017 AND `MONTH` = 2 > group by `COUNTRY`,`YEAR`,`MONTH`,`DAY`,`Office``Capacity` > > When I include `Capacity` column in above query it gives me below error : > > > UNSUPPORTED_OPERATION ERROR: Hash aggregate does not support schema > > changes > > > When I exclude the `Capacity` column the query runs fine and I could view > the expected results . > > Few approaches/verification which I tried after browsing the internet > regarding the error whic i encounter : > 1 : alter session set `planner.enable_streamagg` = false; > alter session set `planner.enable_hashjoin` = false; > > Still the same error > > > > > UNSUPPORTED_OPERATION ERROR: Hash aggregate does not support schema > > changes Fragment > > > 2 : somebody suggested that if some of your files are empty you will > encounter the error but I checked and confirm that none of my parquet file > is empty. > > 3.I have same schema across all the files . > > 4. If I query the files individually there is no error and It works fine . > For instance if my query is > SELECT > `COUNTRY`, > `YEAR`, > `MONTH`, > `DAY`, > `Capacity`, > SUM(CAST(`UserCount` AS INT)) AS `Dailyusers` > FROM `dfs`.`default`.`./parquetfiles/countries/china/0_0_0.parquet` > WHERE `YEAR` =2017 AND `MONTH` = 2 > group by `COUNTRY`,`YEAR`,`MONTH`,`DAY`,`Office``Capacity` > > Above query works fine for all the countries which I have under countries > directory. > But if I query until directory countries its throws error . > > Appreciate the help! > > Thanks, > Divya >
Re: UNSUPPORTED_OPERATION ERROR: Hash aggregate does not support schema changes
Can you please try this, since you mention column has empty values try using null instead of zero, because 0 (zero) is not null. CASE WHEN `Capacity` = '' THEN CAST(null AS integer) ELSE CAST(`Capacity` AS integer) END AS `Capacity` Thanks, Khurram From: Divya GehlotSent: Monday, August 14, 2017 12:14:26 PM To: user@drill.apache.org Subject: UNSUPPORTED_OPERATION ERROR: Hash aggregate does not support schema changes Hi, I am getting below error : > UNSUPPORTED_OPERATION ERROR: Hash aggregate does not support schema > changes Fragment 0:0 *Some background to understand the error scenario : * I have data saved in parquet files in multidirectory structure. One of the column of the table is empty values so I used below to handle null values > CASE WHEN `Capacity` = '' THEN CAST(0 AS INT) ELSE CAST(`Capacity` AS INT) > END AS `Capacity` My query does some aggregation like sum of the column values .My query looks like below : SELECT `COUNTRY`, `YEAR`, `MONTH`, `DAY`, `Capacity`, SUM(CAST(`UserCount` AS INT)) AS `Dailyusers` FROM `dfs`.`default`.`./parquetfiles/countries/` WHERE `YEAR` =2017 AND `MONTH` = 2 group by `COUNTRY`,`YEAR`,`MONTH`,`DAY`,`Office``Capacity` When I include `Capacity` column in above query it gives me below error : > UNSUPPORTED_OPERATION ERROR: Hash aggregate does not support schema > changes When I exclude the `Capacity` column the query runs fine and I could view the expected results . Few approaches/verification which I tried after browsing the internet regarding the error whic i encounter : 1 : alter session set `planner.enable_streamagg` = false; alter session set `planner.enable_hashjoin` = false; Still the same error > UNSUPPORTED_OPERATION ERROR: Hash aggregate does not support schema > changes Fragment 2 : somebody suggested that if some of your files are empty you will encounter the error but I checked and confirm that none of my parquet file is empty. 3.I have same schema across all the files . 4. If I query the files individually there is no error and It works fine . For instance if my query is SELECT `COUNTRY`, `YEAR`, `MONTH`, `DAY`, `Capacity`, SUM(CAST(`UserCount` AS INT)) AS `Dailyusers` FROM `dfs`.`default`.`./parquetfiles/countries/china/0_0_0.parquet` WHERE `YEAR` =2017 AND `MONTH` = 2 group by `COUNTRY`,`YEAR`,`MONTH`,`DAY`,`Office``Capacity` Above query works fine for all the countries which I have under countries directory. But if I query until directory countries its throws error . Appreciate the help! Thanks, Divya