Here is some related weird behavior with Hive-partitioned directories used as dfs storage.
I first created a view:

create view tmp_view as select cast(substr(`dir0`, 6,4) as int) as `year`, cast(aaa as varchar(100)) as aaa from dfs.root.`/user/hive/warehouse/table` o;

select aaa from tmp_view where `year` between 2010 and 2012 limit 5;

returns the following 5 rows.

+--------+
| V571   |
| V571   |
| 8363   |
| V8281  |
| 59970  |
...

good. Then,

select aaa from tmp_view where `year` between 2010 and 2012 and aaa like '%V571%' limit 5;

returns no rows...

Sungwook

On Sun, Aug 23, 2015 at 5:23 PM, Sungwook Yoon <sy...@maprtech.com> wrote:

> So, I filed the issue here,
>
> https://issues.apache.org/jira/browse/DRILL-3692
>
> If more details are needed let me know.
>
> Sungwook
>
> On Sun, Aug 23, 2015 at 2:45 PM, Aman Sinha <asi...@maprtech.com> wrote:
>
>> Yes, I just realized that and was about to respond to my prior message.
>> I just tested with a directory structure similar to Sungwook's (where
>> directories are named with 'year=2012' format) and it works for me.
>> But I am on the current master branch.
>> In the original message 'Sometimes it picks up e.g., year=2010, but not
>> year=2012..' that clearly sounds like wrong result...
>> definitely file a JIRA with a repro.
>>
>> Aman
>>
>> On Sun, Aug 23, 2015 at 12:23 PM, Jacques Nadeau <jacq...@dremio.com> wrote:
>>
>> > The way that Sungwook is describing the issue, it has nothing to do with
>> > Hive. The files were generated via Hive but he is querying directly
>> > through the DFS schema.
>> >
>> > --
>> > Jacques Nadeau
>> > CTO and Co-Founder, Dremio
>> >
>> > On Sun, Aug 23, 2015 at 12:20 PM, Aman Sinha <asi...@maprtech.com> wrote:
>> >
>> > > Sungwook, do you have the latest master build which has the fix for
>> > > Hive partition pruning (DRILL-3121) ?
>> > >
>> > > On Sun, Aug 23, 2015 at 12:15 PM, Sungwook Yoon <sy...@maprtech.com> wrote:
>> > >
>> > > > Will do,
>> > > >
>> > > > Thanks,
>> > > >
>> > > > Sungwook
>> > > >
>> > > > On Sun, Aug 23, 2015 at 2:14 PM, Jacques Nadeau <jacq...@dremio.com> wrote:
>> > > >
>> > > > > It sounds like a bug. Can you file a jira?
>> > > > >
>> > > > > --
>> > > > > Jacques Nadeau
>> > > > > CTO and Co-Founder, Dremio
>> > > > >
>> > > > > On Sun, Aug 23, 2015 at 12:13 PM, Sungwook Yoon <sy...@maprtech.com> wrote:
>> > > > >
>> > > > > > Hi Jacques,
>> > > > > >
>> > > > > > This works well, no problem of accessing the partitioned dirs.
>> > > > > > (and actually pretty faster than accessing from one level above)
>> > > > > >
>> > > > > > Just the issues I asked about, when I access from the
>> > > > > > /user/hive/warehouse/table, it somehow does not recover every dir0.
>> > > > > >
>> > > > > > Sungwook
>> > > > > >
>> > > > > > On Sun, Aug 23, 2015 at 2:02 PM, Jacques Nadeau <jacq...@dremio.com> wrote:
>> > > > > >
>> > > > > > > I think Hsuan misunderstood your question.
>> > > > > > >
>> > > > > > > Can you let us know what you get if you query:
>> > > > > > >
>> > > > > > > select * from dfs.root.`/user/hive/warehouse/table/year=2012`
>> > > > > > >
>> > > > > > > --
>> > > > > > > Jacques Nadeau
>> > > > > > > CTO and Co-Founder, Dremio
>> > > > > > >
>> > > > > > > On Sun, Aug 23, 2015 at 7:07 AM, Sungwook Yoon <sy...@maprtech.com> wrote:
>> > > > > > >
>> > > > > > > > Hi,
>> > > > > > > >
>> > > > > > > > I am trying to use Hive parquet stored files partitioned by some column.
>> > > > > > > > So, the directory structure is partitioned with the column.
>> > > > > > > >
>> > > > > > > > The column is actually year.
>> > > > > > > > Let's say there are 5 years, so dir0 are like year=2010,
>> > > > > > > > year=2011,year=2012,year=2013,year=2014
>> > > > > > > >
>> > > > > > > > We did like following
>> > > > > > > > select * from dfs.root.`/user/hive/warehouse/table` d where d.dir0 = 'year=2012';
>> > > > > > > >
>> > > > > > > > I get nothing.
>> > > > > > > > Apparently, there are parquet files in the directory though.
>> > > > > > > >
>> > > > > > > > Sometimes it picks up e.g., year=2010, but not year=2012..
>> > > > > > > >
>> > > > > > > > Where am I going wrong with this?
>> > > > > > > >
>> > > > > > > > Thanks,
>> > > > > > > >
>> > > > > > > > Sungwook