I know it doesn't go right to the question of how to make Drill ignore things, but could you copy the data into a parallel tree, then rename it into the appropriate directory once the copy is done?
Or could that still cause a running query to fail?

On Thursday, June 30, 2016, John Omernik <[email protected]> wrote:

> I am doing query of source data that is two levels deep.
>
> tablename/p_day=2016-05-01/p_hour=1/file1.parquet
>
> I wasn't able to get wildcards at that level to work with dir0 etc.
>
> On Thu, Jun 30, 2016 at 12:39 AM, Ted Dunning <[email protected]> wrote:
>
> > Does it work to provide a wild card in your source spec?
> >
> > a la dfs.tdunning.`/user/tdunning/foo/data/*.parquet`
> >
> > ?
> >
> > On Wed, Jun 29, 2016 at 1:06 PM, John Omernik <[email protected]> wrote:
> >
> > > When the Hadoop FS client copies files (say parquet files) It adds a
> > > ._COPYING_ at the end of the file until it's complete. If that's there
> > > Drill fails (partial files etc).
> > >
> > > I know I can ignore files that start with . (or directories) but is there a
> > > good way to tell Drill to ignore files that are not *.parquet, or that have
> > > ._COPYING_ at the end of them?
> > >
> > > Thanks!
> > >
> > > John

--
Vince Gonzalez
Systems Engineer
212.694.3879
mapr.com
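For illustration, the copy-then-rename idea could be sketched roughly as below. This is only a sketch under assumptions: the helper names (`staged_copy_commands`, `publish`) and the staging path in the usage note are hypothetical, and it assumes the `hadoop` CLI is on the PATH and that the staging directory sits on the same filesystem as the table but outside the tree Drill scans. The `._COPYING_` temporary file then only ever appears in the staging directory. Whether a query that is already running can still fail when the file shows up is the open question above, and this does not answer it.

```python
import posixpath
import subprocess


def staged_copy_commands(local_file, staging_dir, final_dir):
    """Build the two `hadoop fs` commands: upload to staging, then rename.

    All three arguments are caller-supplied paths; nothing here is a Drill
    or Hadoop default.
    """
    name = posixpath.basename(local_file)
    staged = posixpath.join(staging_dir, name)
    final = posixpath.join(final_dir, name)
    return [
        # The partial ._COPYING_ file is created next to `staged`,
        # not inside the table directory.
        ["hadoop", "fs", "-put", local_file, staged],
        # Move the finished file into the partition directory.
        ["hadoop", "fs", "-mv", staged, final],
    ]


def publish(local_file, staging_dir, final_dir, run=subprocess.check_call):
    """Run the upload and the rename in order, stopping if either fails."""
    for cmd in staged_copy_commands(local_file, staging_dir, final_dir):
        run(cmd)
```

Usage would look something like `publish("/tmp/file1.parquet", "/staging/tablename", "tablename/p_day=2016-05-01/p_hour=1")`, where `/staging/tablename` is a made-up location. Passing a different `run` callable (for example `print`) shows the commands without executing them.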
