I take it back.

I went to run a query in the same session that had worked, and now I am
getting permission denied.

I do have a query running that creates new directories every 5 minutes;
however, those aren't the directories that are giving me permission denied.
Did you try running an aggregate query across all data? This is an
interesting one to track down. I'm not sure why I am getting access denied
now.

The .drill.parquet_metadata file in the directory that I am getting the
error on is owned by mapr:mapr and has rwxr-xr-x permissions. This tells
me that both the drillbit user (mapr) and the user I am logged in as in
sqlline (mapradm) should be able to read the file, so why do I get access
denied when running a query? Any assistance would be valuable here, since
there are some great performance gains from the metadata caching and I
don't want to miss out on them.
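
To sanity-check my reasoning about the mode bits, here is a minimal Python
sketch of the plain POSIX owner/group/other check. It is only a model: the
Entry class and the can / can_read_file helpers are names made up for
illustration, the directory mode 0o755 is my assumption for what I described
earlier, and it ignores anything MapR-FS or Drill impersonation may layer on
top of the mode bits.

```python
from dataclasses import dataclass

READ, WRITE, EXECUTE = 4, 2, 1


@dataclass
class Entry:
    """Hypothetical stand-in for a file or directory's ownership and mode."""
    owner: str
    group: str
    mode: int  # octal mode, e.g. 0o755 for rwxr-xr-x


def can(user, groups, entry, want):
    """Plain POSIX check: pick exactly one of the owner/group/other triads."""
    if user == entry.owner:
        bits = (entry.mode >> 6) & 7
    elif entry.group in groups:
        bits = (entry.mode >> 3) & 7
    else:
        bits = entry.mode & 7
    return (bits & want) == want


def can_read_file(user, groups, parent_dirs, file_entry):
    """Reading a file needs execute (search) on every parent directory
    plus read on the file itself."""
    return (all(can(user, groups, d, EXECUTE) for d in parent_dirs)
            and can(user, groups, file_entry, READ))


# Assumed layout: table directory owned by mapradm, cache file owned by mapr.
table_dir = Entry("mapradm", "mapradm", 0o755)
cache_file = Entry("mapr", "mapr", 0o755)

print(can_read_file("mapradm", {"mapradm"}, [table_dir], cache_file))
print(can_read_file("mapr", {"mapr"}, [table_dir], cache_file))
print(can("mapr", {"mapr"}, table_dir, WRITE))
```

Under this model both users can read the cache file, which is why the error
surprises me. It also suggests that whether the drillbit user can write into
the table directory is a separate question from whether anyone can read the
cache file afterwards.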

On Wed, Nov 11, 2015 at 2:18 PM, John Omernik <j...@omernik.com> wrote:

> All files are owned by mapr:mapr?
>
> I have a setup where mapr is the user running the drillbit, but then I
> have a directory that is owned by another user, mapradm:mapradm on all
> files. (Permissions on directories and files appear to be rwxr-x-r-x.) When
> I run REFRESH TABLE METADATA, the .drill.parquet_metadata file gets
> created as mapr:mapr with rwxr-xr-x.
>
> So
> Drillbit User:mapr
> Directory (and subdirectories/files) owner: mapradm:mapradm
> Directory permissions (all files and folder under main directory)
> rwxr-x-r-x
>
> I authenticated to drill via sqlline as user mapradm (this user should be
> able to read and write just fine to all directories).
>
> Now, one thing I did notice is that my mapr user was not in the mapradm
> group and therefore didn't have write permissions anywhere. When I fixed
> that on all nodes and then manually deleted the metadata files, things
> seemed to be working. I wonder if that was my issue?
>
> Basically, the user running the drillbits needs to be able to write files
> (the .drill.parquet_metadata) or something bad will happen :) I will do
> more testing. This may be a good candidate for some documentation work to
> explain what permissions are required to be able to query these.
>
>
>
>
> On Wed, Nov 11, 2015 at 1:36 PM, Vince Gonzalez <vince.gonza...@gmail.com>
> wrote:
>
>> Hi John, I tried this and didn't find any issues. Let me know if I didn't
>> follow your reproduction faithfully.
>>
>> $ sqlline -u jdbc:drill: -n ec2-user -p mapr
>> apache drill 1.2.0
>> "drill baby drill"
>> 0: jdbc:drill:> refresh table metadata dfs.`/tmp/flows`;
>> +-------+------------------------------------------------------+
>> |  ok   |                       summary                        |
>> +-------+------------------------------------------------------+
>> | true  | Successfully updated metadata for table /tmp/flows.  |
>> +-------+------------------------------------------------------+
>> 1 row selected (32.27 seconds)
>> 0: jdbc:drill:> select srcIP,dstIP from dfs.`/tmp/flows` limit 12;
>> +---------------+---------------+
>> |     srcIP     |     dstIP     |
>> +---------------+---------------+
>> | 172.16.2.152  | 172.16.1.58   |
>> | 172.16.1.58   | 172.16.2.152  |
>> | 172.16.2.152  | 172.16.2.73   |
>> | 172.16.2.152  | 172.16.2.73   |
>> | 172.16.2.73   | 172.16.2.152  |
>> | 172.16.2.152  | 172.16.2.73   |
>> | 172.16.2.152  | 172.16.2.73   |
>> | 172.16.2.152  | 172.16.2.73   |
>> | 172.16.2.73   | 172.16.2.152  |
>> | 172.16.2.73   | 172.16.2.152  |
>> | 172.16.2.73   | 172.16.2.152  |
>> | 172.16.2.152  | 172.16.2.73   |
>> +---------------+---------------+
>> 12 rows selected (5.654 seconds)
>>
>> And here's what my table structure looks like (as seen via MapR NFS):
>>
>> $ tree /mapr/vgonzalez.drill/tmp/flows/ | head -15
>> /mapr/vgonzalez.drill/tmp/flows/
>> └── 2015
>>     └── 11
>>         ├── 10
>>         │   ├── 21
>>         │   │   ├── 39
>>         │   │   │   ├── 03
>>         │   │   │   │   ├── _common_metadata
>>         │   │   │   │   ├── _metadata
>>         │   │   │   │   ├── part-r-00000-853882bd-66d8-4505-96ba-f0a282e374de.gz.parquet
>>         │   │   │   │   └── _SUCCESS
>>         │   │   │   └── 20
>>         │   │   │       ├── _common_metadata
>>         │   │   │       ├── _metadata
>>         │   │   │       ├── part-r-00000-37a94549-8e56-46d5-be88-cb28e6d8bc35.gz.parquet
>>
>> My parquet was created in Spark, not Drill. Not sure if that's relevant.
>>
>> I have authentication and impersonation turned on, and the files are owned
>> by mapr:mapr. Here's my drill-override.conf:
>>
>> drill.exec: {
>>   cluster-id: "vgonzalez_drill-drillbits",
>>   zk.connect: "ip-172-16-2-36.ec2.internal:5181,ip-172-16-2-37.ec2.internal:5181,ip-172-16-2-38.ec2.internal:5181"
>> }
>> drill.exec.impersonation: { enabled: true, max_chained_user_hops: 3 }
>> drill.exec { security.user.auth { enabled: true, packages +=
>> "org.apache.drill.exec.rpc.user.security", impl: "pam", pam_profiles: [
>> "login","sudo","sshd","password-auth" ] } }
>>
>>
>>
>>
>>
>> On Tue, Nov 10, 2015 at 1:17 PM, John Omernik <j...@omernik.com> wrote:
>>
>> > Cool, looking forward to it.
>> >
>> > On Mon, Nov 9, 2015 at 7:21 PM, Vince Gonzalez <vince.gonza...@gmail.com> wrote:
>> >
>> > > Hey John, I have a secure cluster and some parquet files, I'll try this
>> > > out and report back.
>> > >
>> > > On Monday, November 9, 2015, John Omernik <j...@omernik.com> wrote:
>> > >
>> > > > Has anyone been able to try/test this? I am curious whether it's an
>> > > > issue only for me or more of a bug, so I can open a JIRA if needed.
>> > > >
>> > > > John
>> > > >
>> > > > On Fri, Nov 6, 2015 at 11:06 AM, John Omernik <j...@omernik.com> wrote:
>> > > >
>> > > > > If someone has authorization/authentication set up, to reproduce:
>> > > > >
>> > > > > Have a Parquet table with directories underneath the main one (I
>> > > > > have directories per day).
>> > > > >
>> > > > > Then issue REFRESH TABLE METADATA on the root of the table as an
>> > > > > authenticated user other than the drillbit user. (I am using mapr, I
>> > > > > used my user to run the query, and yes, I have access to the data.)
>> > > > >
>> > > > > Then run a normal query and see what the result is.
>> > > > >
>> > > > > John
>> > > > >
>> > > > > On Fri, Nov 6, 2015 at 10:22 AM, Neeraja Rentachintala <nrentachint...@maprtech.com> wrote:
>> > > > >
>> > > > >> This doesn't make sense and seems like a bug.
>> > > > >> I think the right behavior is for the Drillbit to access the cache
>> > > > >> as the Drillbit user at query time (there is no user-level metadata
>> > > > >> cache in Drill at this point).
>> > > > >>
>> > > > >>
>> > > > >>
>> > > > >> On Fri, Nov 6, 2015 at 6:57 AM, John Omernik <j...@omernik.com> wrote:
>> > > > >>
>> > > > >> > I ran REFRESH TABLE METADATA on a table, and it completed
>> > > > >> > successfully.
>> > > > >> >
>> > > > >> > When I tried a subsequent query, I got an IOException: Permission
>> > > > >> > Denied on .drill.parquet_metadata.
>> > > > >> >
>> > > > >> > I am running Drill with authentication. I ran REFRESH TABLE
>> > > > >> > METADATA as user X. It appears the .drill.parquet_metadata file
>> > > > >> > was created and owned by the user the drillbits are running as,
>> > > > >> > and was created with -rwxr-x-r-x.
>> > > > >> >
>> > > > >> > My question is this: I can see why the file is owned by the
>> > > > >> > drillbit user, and the file is created with permissions that let
>> > > > >> > everyone read it, so why am I getting permission denied when
>> > > > >> > user X tries to run a query?
>> > > > >> >
>> > > > >>
>> > > > >
>> > > > >
>> > > >
>> > >
>> >
>>
>
>
