Arg, this problem is crazy. (I'll put this in the JIRA too.) After waiting a while and loading more data, I tried to refresh the table metadata using the dataadm user (basically the user who owns the data). Note that all directories and files are owned by dataadm:dataadm and the permissions are 770. This worked before, but this time, when I ran
REFRESH TABLE METADATA mytable;

I got:

"false| Error: 2126.29602.2546226 /data/prod/mytable/2015-11-12/.drill.parquet_metadata (Permission denied)"

12:44 This is the SAME shell where I ran it before, and I had loaded more data (note the directory in question was already loaded and was not touched). I then used the find command to remove all the .drill.parquet_metadata files and ran the REFRESH TABLE METADATA command again. This time the command worked. Great. I ran it again right after, and it ran successfully again.

12:35 Ran it a third time, and it worked.

12:37 Ran it a fourth time, and it worked. (Note: all the parquet_metadata files are owned by my drillbituser:drillbitgroup (in this case, mapr:mapr), despite the metadata operation being done by the data owner.)

12:39 Another process *running as dataadm* loaded a new day of data (2016-02-12). No other data was altered here.

12:40 Ran REFRESH TABLE METADATA a fifth time: got the error. Maybe it has to do with adding data? Error on 2015-11-12 again....

12:41 A new process loaded more data (2016-02-11 and 2016-02-10). The process completed successfully and is disabled at this time for troubleshooting (no more data being loaded).

12:42 Attempted REFRESH TABLE METADATA again: same error on 2015-11-12.

12:43 Removed all .drill.parquet_metadata files using the find command.

12:44 Ran REFRESH TABLE METADATA: this time it ran with success. Will now run and check without data loading. May have to do with data loading...

12:52 Ran REFRESH: Success

12:58 Ran REFRESH: Success

1:00 Forced reload of 2016-02-15.
Basically, this made it so the folder "2016-02-15" did not have a .drill.parquet_metadata file (while the other days did).

1:01 Ran REFRESH: Error: 2126.27460.2555888 /data/prod/mytable/2015-11-12/.drill.parquet_metadata (Permission denied)

(Same file; not sure why it picks on this file, nothing has changed there. I even validated that no files were modified since 12:58, when the parquet_metadata file was modified. All parquet files still have the same modified times as when they were loaded, Feb 9th.)

So, thoughts:

1. When running REFRESH TABLE METADATA, it checks to see if all the files in the subdirectories exist; if they don't, it starts to "do things".

2. The date 2015-11-12 probably keeps coming up because it's first in the .drill.parquet_metadata located in /mytable (not in the individual directories).

3. After the REFRESH failed, I checked some files. 2015-11-12/.drill.parquet_metadata was a 0-size file (like it was attempted to be rewritten and failed). Looking in 2016-11-13, the .drill.parquet_metadata file has data in it.

4. To test #3, I rm'd .drill.parquet_metadata from 2015-11-12 and ran the refresh command again. Interesting... when I do that, I get permission denied on the 2015-11-12 directory again. This time, instead of the file being owned by the drillbit user (and having the drillbit user's group, in this case mapr), I have a file of 0 bytes with "dataadm:datareaders" as the owner. That's interesting... shouldn't it be mapr:mapr (the drillbit user)?

So this seems to be the crux of the issue... what should happen here? Should all metadata operations be checked to see if the user issuing them has permissions, with the writes then happening as the drillbit user? Any other thoughts here?

On Mon, Feb 15, 2016 at 10:20 AM, John Omernik <[email protected]> wrote:

> So I am not sure what's happened here. The JIRA isn't filled out, but I can't seem to reproduce the problem. Was this stealth fixed?
> Based on some testing, even when the data directory is owned by a different user than the drillbit, the .parquet_metadata files are created as mapr:mapr with 755 permissions. And when it refreshes now, there are no errors. So maybe all fixed?
>
> Thanks
>
> On Sun, Feb 14, 2016 at 2:20 PM, John Omernik <[email protected]> wrote:
>
>> I'd like to revive this thread. Specifically, what should the expected behavior of the refresh metadata be when running with impersonation?
>>
>> Drill Bit User: mapr
>> Data User (owner): jdoe
>> Authenticated User: jdoe
>>
>> So if a base folder, mytable, has subdirectories of dates, 2015-01-01, 2015-01-02 etc., and all the data is owned by jdoe:datareaders, and the permissions are 750 on all directories and files, how SHOULD the REFRESH METADATA command be expected to operate if run in sqlline authenticated as jdoe? (What will the permissions on the metadata files be, etc.?)
>>
>> On Mon, Nov 30, 2015 at 10:16 AM, Jacques Nadeau <[email protected]> wrote:
>>
>>> > The output from Drill and the Markup interpreter on Jira apparently had a
>>> > family argument at Thanksgiving, and don't agree on all things...
>>>
>>> Made my morning :)
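
For anyone retracing the checks described in this thread, here is a minimal sketch of the find-based steps (inspect ownership and size of the metadata cache files, spot the 0-byte ones, and remove them all before another refresh). The exact find invocations used originally are not given in the thread, so the function names and options below are assumptions for illustration, not part of Drill. Note that -delete is a GNU/BSD find extension rather than POSIX.

```shell
#!/bin/sh
# Hypothetical helpers (names are illustrative, not part of Drill).
# Each takes the table root directory as its only argument.

# Show owner, group, size and path of every metadata cache file.
list_drill_metadata() {
    find "$1" -type f -name '.drill.parquet_metadata' -exec ls -l {} +
}

# Print only the zero-byte cache files, which may indicate a rewrite
# that was started and then failed.
empty_drill_metadata() {
    find "$1" -type f -name '.drill.parquet_metadata' -size 0
}

# Delete every cache file so the next REFRESH TABLE METADATA rebuilds them.
# The parquet data files themselves are not touched.
remove_drill_metadata() {
    find "$1" -type f -name '.drill.parquet_metadata' -delete
}
```

With the layout from this thread, that would be something like `list_drill_metadata /data/prod/mytable` to compare owners (mapr:mapr versus dataadm:datareaders), `empty_drill_metadata /data/prod/mytable` to find the 0-byte files, and `remove_drill_metadata /data/prod/mytable` before rerunning the refresh.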
