All files are owned by mapr:mapr? I have a setup where mapr is the user running the drillbit, but the directory is owned by another user: mapradm:mapradm on all files. (Permissions on directories and files appear to be rwxr-xr-x.) When I run REFRESH TABLE METADATA, the .drill.parquet_metadata file gets created as mapr:mapr with rwxr-xr-x.
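For anyone comparing their own setup against this one, here is a minimal sketch of how the ownership and group membership could be inspected on a node. It assumes a Linux host with GNU coreutils (`stat -c`); the helper names `in_group` and `show_perms` are hypothetical and are not part of Drill or MapR.

```shell
#!/usr/bin/env bash

# Hypothetical helper: succeeds if USER is a member of GROUP,
# based on the group names reported by `id -Gn`.
in_group() {
  local user="$1" group="$2" g
  for g in $(id -Gn "$user" 2>/dev/null); do
    [ "$g" = "$group" ] && return 0
  done
  return 1
}

# Hypothetical helper: print mode, owner:group and name for a path.
# Assumes GNU stat; BSD/macOS stat uses different flags.
show_perms() {
  stat -c '%A %U:%G %n' "$1"
}

# Example usage (user, group and path are placeholders for your own setup):
#   in_group mapr mapradm && echo "mapr is in mapradm" || echo "mapr is NOT in mapradm"
#   show_perms /path/to/table
#   show_perms /path/to/table/.drill.parquet_metadata
```

This only reports what the OS sees on that one node, so it would need to be run on every node that hosts a drillbit.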
So:

Drillbit user: mapr
Directory (and subdirectories/files) owner: mapradm:mapradm
Directory permissions (all files and folders under the main directory): rwxr-xr-x

I authenticated to Drill via sqlline as user mapradm (this user should be able to read and write just fine to all directories).

Now, one thing I did notice is that my mapr user was not in the mapradm group and therefore didn't have write permissions anywhere. When I fixed that on all nodes and then manually deleted the metadata files, things seemed to work. I wonder if that was my issue? Basically, the user running the drillbits needs to be able to write files (the .drill.parquet_metadata) or something bad will happen :)

I will do more testing. This may be a good candidate for some documentation work to understand what permissions are required to be able to query these.

On Wed, Nov 11, 2015 at 1:36 PM, Vince Gonzalez <vince.gonza...@gmail.com> wrote:

> Hi John, I tried this and didn't find any issues. Let me know if I didn't
> follow your reproduction faithfully.
>
> $ sqlline -u jdbc:drill: -n ec2-user -p mapr
> apache drill 1.2.0
> "drill baby drill"
> 0: jdbc:drill:> refresh table metadata dfs.`/tmp/flows`;
> +-------+------------------------------------------------------+
> |  ok   |                       summary                        |
> +-------+------------------------------------------------------+
> | true  | Successfully updated metadata for table /tmp/flows.  |
> +-------+------------------------------------------------------+
> 1 row selected (32.27 seconds)
> 0: jdbc:drill:> select srcIP,dstIP from dfs.`/tmp/flows` limit 12;
> +---------------+---------------+
> |     srcIP     |     dstIP     |
> +---------------+---------------+
> | 172.16.2.152  | 172.16.1.58   |
> | 172.16.1.58   | 172.16.2.152  |
> | 172.16.2.152  | 172.16.2.73   |
> | 172.16.2.152  | 172.16.2.73   |
> | 172.16.2.73   | 172.16.2.152  |
> | 172.16.2.152  | 172.16.2.73   |
> | 172.16.2.152  | 172.16.2.73   |
> | 172.16.2.152  | 172.16.2.73   |
> | 172.16.2.73   | 172.16.2.152  |
> | 172.16.2.73   | 172.16.2.152  |
> | 172.16.2.73   | 172.16.2.152  |
> | 172.16.2.152  | 172.16.2.73   |
> +---------------+---------------+
> 12 rows selected (5.654 seconds)
>
> And here's what my table structure looks like (as seen via MapR NFS):
>
> $ tree /mapr/vgonzalez.drill/tmp/flows/ | head -15
> /mapr/vgonzalez.drill/tmp/flows/
> └── 2015
>     └── 11
>         ├── 10
>         │   ├── 21
>         │   │   ├── 39
>         │   │   │   ├── 03
>         │   │   │   │   ├── _common_metadata
>         │   │   │   │   ├── _metadata
>         │   │   │   │   ├── part-r-00000-853882bd-66d8-4505-96ba-f0a282e374de.gz.parquet
>         │   │   │   │   └── _SUCCESS
>         │   │   │   └── 20
>         │   │   │       ├── _common_metadata
>         │   │   │       ├── _metadata
>         │   │   │       ├── part-r-00000-37a94549-8e56-46d5-be88-cb28e6d8bc35.gz.parquet
>
> My parquet was created in Spark, not Drill. Not sure if that's relevant.
>
> I have authentication and impersonation turned on, and the files are owned
> by mapr:mapr. Here's my drill-override.conf:
>
> drill.exec: {
>   cluster-id: "vgonzalez_drill-drillbits",
>   zk.connect: "ip-172-16-2-36.ec2.internal:5181,ip-172-16-2-37.ec2.internal:5181,ip-172-16-2-38.ec2.internal:5181"
> }
> drill.exec.impersonation: { enabled: true, max_chained_user_hops: 3 }
> drill.exec { security.user.auth { enabled: true, packages +=
> "org.apache.drill.exec.rpc.user.security", impl: "pam", pam_profiles: [
> "login","sudo","sshd","password-auth" ] } }
>
> On Tue, Nov 10, 2015 at 1:17 PM, John Omernik <j...@omernik.com> wrote:
>
> > Cool, looking forward to it.
> >
> > On Mon, Nov 9, 2015 at 7:21 PM, Vince Gonzalez <vince.gonza...@gmail.com> wrote:
> >
> > > Hey John, I have a secure cluster and some parquet files, I'll try this
> > > out and report back.
> > >
> > > On Monday, November 9, 2015, John Omernik <j...@omernik.com> wrote:
> > >
> > > > Has anyone been able to try/test this? I am curious if it's me only
> > > > issue or something more of bug so I can open a JIRA if needed.
> > > >
> > > > John
> > > >
> > > > On Fri, Nov 6, 2015 at 11:06 AM, John Omernik <j...@omernik.com> wrote:
> > > >
> > > > > If someone has authorization/authentication setup, to reproduce:
> > > > >
> > > > > Have a Parquet table with directories underneath the main (I have
> > > > > directories per day)
> > > > >
> > > > > Then issue REFRESH TABLE METADATA on the root of the table running an
> > > > > authenticated user other than the drill bit user. (I am using mapr, I
> > > > > used my user to run the query, and yes I have access to the data)
> > > > >
> > > > > Then run a normal query and see what the result is.
> > > > >
> > > > > John
> > > > >
> > > > > On Fri, Nov 6, 2015 at 10:22 AM, Neeraja Rentachintala <nrentachint...@maprtech.com> wrote:
> > > > >
> > > > >> This doesn't make sense and seems like a bug.
> > > > >> I think the right behavior is for the Drillbit to access the cache as
> > > > >> Drillbit user at the query time (there is no user level metadata cache
> > > > >> in Drill at this point).
> > > > >>
> > > > >> On Fri, Nov 6, 2015 at 6:57 AM, John Omernik <j...@omernik.com> wrote:
> > > > >>
> > > > >> > I ran REFRESH TABLE METADATA on a table, it completed successfully.
> > > > >> >
> > > > >> > When I tried a subsequent query, I get a IOException: Permission
> > > > >> > Denied on .drill.parquet_metadata.
> > > > >> >
> > > > >> > I am running drill with authentication. I ran the REFRESH TABLE
> > > > >> > METADATA as user X, it appears the .drill.parquet_metadata was
> > > > >> > created and owned by the user the drill bits are running as, and is
> > > > >> > created with -rwxr-x-r-x
> > > > >> >
> > > > >> > My question is this: So, I can see why the file is owned by the
> > > > >> > drill bit user, and the file is created with all can read
> > > > >> > permissions, but why am I getting a permission denied when user X is
> > > > >> > trying to run a query?