Here is some related odd behavior when querying Hive-partitioned directories
through dfs storage.

I first created a view:

create view tmp_view as
select cast(substr(`dir0`, 6, 4) as int) as `year`,
       cast(aaa as varchar(100)) as aaa
from dfs.root.`/user/hive/warehouse/table` o;

select aaa from tmp_view where `year` between 2010 and 2012 limit 5;

returns the following 5 rows:
+--------+
| V571   |
| V571   |
| 8363   |
| V8281  |
| 59970  |

... good.

Then,

select aaa from tmp_view
where `year` between 2010 and 2012 and aaa like '%V571%' limit 5;

returns no rows, even though the first query already returned two rows with
aaa = 'V571' for the same year range.
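
To spell out what I expected, here is a rough Python sketch of the two expressions involved. It is not Drill code: sql_substr and sql_like are hypothetical helpers written only for this illustration, and they assume the usual SQL semantics (1-based SUBSTR; % matching any run of characters in LIKE). The five values are the ones returned by the first query above.

```python
import re


def sql_substr(s: str, start: int, length: int) -> str:
    """Model SQL SUBSTR, where `start` is 1-based (assumed semantics)."""
    return s[start - 1:start - 1 + length]


def sql_like(value: str, pattern: str) -> bool:
    """Minimal LIKE model: % matches any run of characters, _ matches one."""
    regex = "".join(
        ".*" if ch == "%" else "." if ch == "_" else re.escape(ch)
        for ch in pattern
    )
    return re.fullmatch(regex, value, flags=re.DOTALL) is not None


# dir0 holds the Hive-style directory name, e.g. 'year=2012'.
year = int(sql_substr("year=2012", 6, 4))

# Values returned by the first query; they already passed the year filter.
rows = ["V571", "V571", "8363", "V8281", "59970"]
matches = [r for r in rows if sql_like(r, "%V571%")]

print(year, matches)
```

Under these assumptions the year comes out as 2012 and two of the five rows match the pattern, so the second query should have returned at least those two rows.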

Sungwook



On Sun, Aug 23, 2015 at 5:23 PM, Sungwook Yoon <sy...@maprtech.com> wrote:

>
> So, I filed the issue here,
>
> https://issues.apache.org/jira/browse/DRILL-3692
>
> If more details are needed let me know.
>
> Sungwook
>
>
> On Sun, Aug 23, 2015 at 2:45 PM, Aman Sinha <asi...@maprtech.com> wrote:
>
>> Yes, I just realized that and was about to respond to my prior message.
>> I just tested with a directory structure similar to Sungwook's  (where
>> directories are named with 'year=2012' format) and it works for me.
>> But I am on the current master branch.
>> In the original message 'Sometimes it picks up e.g., year=2010, but not
>> year=2012..'   that clearly sounds like wrong result...
>> definitely file a JIRA with a repro.
>>
>> Aman
>>
>> On Sun, Aug 23, 2015 at 12:23 PM, Jacques Nadeau <jacq...@dremio.com>
>> wrote:
>>
>> > The way that Sungwook is describing the issue, it has nothing to do with
>> > Hive.  The files were generated via Hive but he is querying directly
>> > through the DFS schema.
>> >
>> > --
>> > Jacques Nadeau
>> > CTO and Co-Founder, Dremio
>> >
>> > On Sun, Aug 23, 2015 at 12:20 PM, Aman Sinha <asi...@maprtech.com>
>> wrote:
>> >
>> > > Sungwook, do you have the latest master build which has the fix for
>> Hive
>> > > partition pruning (DRILL-3121) ?
>> > >
>> > > On Sun, Aug 23, 2015 at 12:15 PM, Sungwook Yoon <sy...@maprtech.com>
>> > > wrote:
>> > >
>> > > > Will do,
>> > > >
>> > > > Thanks,
>> > > >
>> > > > Sungwook
>> > > >
>> > > >
>> > > > On Sun, Aug 23, 2015 at 2:14 PM, Jacques Nadeau <jacq...@dremio.com
>> >
>> > > > wrote:
>> > > >
>> > > > > It sounds like a bug. Can you file a jira?
>> > > > >
>> > > > > --
>> > > > > Jacques Nadeau
>> > > > > CTO and Co-Founder, Dremio
>> > > > >
>> > > > > On Sun, Aug 23, 2015 at 12:13 PM, Sungwook Yoon <
>> sy...@maprtech.com>
>> > > > > wrote:
>> > > > >
>> > > > > > Hi Jacques,
>> > > > > >
>> > > > > > This works well, no problem of accessing the partitioned dirs.
>> > > > > > (and actually pretty faster than accessing from one level above)
>> > > > > >
>> > > > > > Just the issues I asked about, when I access from the
>> > > > > > /user/hive/warehouse/table, it somehow does not recover every
>> dir0.
>> > > > > >
>> > > > > > Sungwook
>> > > > > >
>> > > > > >
>> > > > > > On Sun, Aug 23, 2015 at 2:02 PM, Jacques Nadeau <
>> > jacq...@dremio.com>
>> > > > > > wrote:
>> > > > > >
>> > > > > > > I think Hsuan misunderstood your question.
>> > > > > > >
>> > > > > > > Can you let us know what you get if you query:
>> > > > > > >
>> > > > > > > select * from dfs.root.`/user/hive/warehouse/table/year=2012`
>> > > > > > >
>> > > > > > >
>> > > > > > > --
>> > > > > > > Jacques Nadeau
>> > > > > > > CTO and Co-Founder, Dremio
>> > > > > > >
>> > > > > > > On Sun, Aug 23, 2015 at 7:07 AM, Sungwook Yoon <
>> > sy...@maprtech.com
>> > > >
>> > > > > > wrote:
>> > > > > > >
>> > > > > > > > Hi,
>> > > > > > > >
>> > > > > > > > I am trying to use Hive parquet stored files partitioned by
>> > some
>> > > > > > column.
>> > > > > > > > So, the directory structure is partitioned with the column.
>> > > > > > > >
>> > > > > > > > The column is actually year.
>> > > > > > > > Let's say there are 5 years, so dir0 are like year=2010,
>> > > > > > > > year=2011,year=2012,year=2013,year=2014
>> > > > > > > >
>> > > > > > > > We did like following
>> > > > > > > > select * from dfs.root.`/user/hive/warehouse/table` d where
>> > > d.dir0
>> > > > =
>> > > > > > > > 'year=2012';
>> > > > > > > >
>> > > > > > > > I get nothing.
>> > > > > > > > Apparently, there are parquet files in the directory though.
>> > > > > > > >
>> > > > > > > > Sometimes it picks up e.g., year=2010, but not year=2012..
>> > > > > > > >
>> > > > > > > > Where am I going wrong with this?
>> > > > > > > >
>> > > > > > > > Thanks,
>> > > > > > > >
>> > > > > > > > Sungwook
>> > > > > > > >
>> > > > > > >
>> > > > > >
>> > > > >
>> > > >
>> > >
>> >
>>
>
>
