If my setup is not valid, can someone elaborate on what is required?  User 
impersonation?  Running kerberos between the client and the drillbits?

Again, hdfs requires kerberos but I don't need it for Drill itself.

Thanks,
Clark

On 7/24/20, 11:42 AM, "Updike, Clark" <clark.upd...@jhuapl.edu> wrote:
    
    Yes, I've read that page but it wasn't clear to me how much of it applied.
    I don't need kerberos auth from the client to the drillbits.  But the drillbits
    must use kerberos auth when interacting with hdfs.  By putting the principal
    and keytab info into the drillbit config (drill-override.conf), and not using
    impersonation, and no security.user settings, I thought that was what I was
    effectively doing. And it at least partially works since SHOW FILES works.
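    
    For reference, a minimal sketch of the setup described above, assuming the
    drill.exec security.auth.principal and security.auth.keytab keys named
    elsewhere in this thread. The principal and keytab path are placeholders,
    not values from the actual cluster, and impersonation and the
    security.user settings are deliberately left unset:

    ```
    drill.exec: {
      # cluster-id, zk.connect and other settings omitted
      security: {
        # Hypothetical principal and keytab path, for illustration only
        auth.principal: "drill/cluster@EXAMPLE.COM",
        auth.keytab: "/path/to/drill.keytab"
      }
    }
    ```

    Whether this alone is enough for the drillbits to read from a kerberized
    HDFS is exactly the open question here.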
    
    Is this not a valid setup?
    
    On 7/24/20, 11:23 AM, "Charles Givre" <cgi...@gmail.com> wrote:
    
        Hey Clark, 
        Have you gone through this:
        https://drill.apache.org/docs/configuring-kerberos-security/
        
        As Paul indicated, this does seem like the likely suspect as to why
        this isn't working or at least the next thing to verify.  I'm surprised you're
        able to connect at all. I would have expected you to get connection denied when
        you tried the SHOW FILES query if Kerberos was not configured correctly.
        
        -- C
        
        > On Jul 24, 2020, at 11:14 AM, Updike, Clark <clark.upd...@jhuapl.edu> wrote:
        > 
        > Using the CDH version of HDFS 2.6.0.
        > 
        > I was not able to find any errors on the Drill side besides what I
        > already provided from sqlline.  However, I did find an exception on some of the
        > datanodes (below).
        > 
        > Everything works fine using hdfs cli commands (ls, get, cat).
        > 
        > I have set up security.auth.principal and security.auth.keytab for
        > drill.exec in drill-override.conf.  That's what got SHOW FILES working.
        > However, I have not been doing kerberos auth when using Sqlline.
        > 
        > Is there any chance that SHOW FILES can work when Sqlline is not
        > authenticated using kerberos, but the actual query requires Sqlline kerberos
        > auth?  That might explain it if that's how it worked.  Note the only thing
        > running kerberos is HDFS (not using kerberos on the Drill parts).
        > 
        > STACKTRACE FROM DATANODE
        >     dn003:20003:DataXceiver error processing unknown operation  src: /xx.xx.xx.22:53154 dst: /xx.xx.xx.23:20003
        >     java.io.IOException: 
        >          at org.apache.hadoop.hdfs.protocol.datatransfer.sasl.DataTransferSaslUtil.readSaslMessage(DataTransferSaslUtil.java:217)
        >          at org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferServer.doSaslHandshake(SaslDataTransferServer.java:364)
        >          at org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferServer.getEncryptedStreams(SaslDataTransferServer.java:178)
        >          at org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferServer.receive(SaslDataTransferServer.java:110)
        >          at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:193)
        >          at java.lang.Thread.run(Thread.java:745)
        > 
        > Thanks,
        > Clark
        > 
        > On 7/23/20, 6:15 PM, "Paul Rogers" <par0...@gmail.com> wrote:
        > 
        >    Hi Clark,
        > 
        >    Security was going to be my next question. The stack trace didn't look like
        >    one where the file open would fail: the planner doesn't actually open a
        >    JSON file. There is no indication of the HDFS call that might have failed.
        >    Another question is: what version of HDFS are you using? I wonder if there
        >    is a conflict somewhere.
        > 
        >    Although the stack trace does not tell us which file-system call failed,
        >    the logs might. Can you check your Drill log file for entries at the time
        >    of failure? Is there additional information about the specific operation
        >    which failed?
        > 
        >    What happens if you try to download the file using the command line HDFS
        >    tools? Does that work? This test might verify that HDFS itself is sane and
        >    that the security settings work.
        > 
        >    Setting up Kerberos in Drill is documented on the web site. You probably
        >    went through the steps there to ensure Drill has the needed info?
        > 
        >    Thanks,
        > 
        >    - Paul
        > 
        >    On Thu, Jul 23, 2020 at 11:26 AM Updike, Clark <clark.upd...@jhuapl.edu> wrote:
        > 
        >> I should mention that this is a kerberized HDFS cluster.  I'm still not
        >> sure why the SHOW FILES would work but the query would not--but it could be
        >> behind the issue somehow.
        >> 
        >> On 7/23/20, 2:18 PM, "Updike, Clark" <clark.upd...@jhuapl.edu> wrote:
        >> 
        >>    No change unfortunately:
        >> 
        >>    apache drill> select * from hdfs.`root`.`/tmp/employee.json`;
        >>    Error: VALIDATION ERROR: From line 1, column 15 to line 1, column 18:
        >>    Object '/tmp/employee.json' not found within 'hdfs.root'
        >> 
        >>    On 7/23/20, 2:11 PM, "Paul Rogers" <par0...@gmail.com> wrote:
        >> 
        >>        Hi Clark,
        >> 
        >>        Try using `hdfs`.`root` rather than `hdfs.root`. Calcite wants to walk down
        >>        `hdfs` then `root`. There is no workspace called `hdfs.root`.
        >> 
        >>        Thanks,
        >> 
        >>        - Paul
        >> 
        >>        On Thu, Jul 23, 2020 at 8:58 AM Updike, Clark <clark.upd...@jhuapl.edu> wrote:
        >> 
        >>> Oops, sorry.  No luck there either unfortunately:
        >>> 
        >>> apache drill> SELECT * FROM hdfs.`/tmp/employee.json`;
        >>> Error: VALIDATION ERROR: From line 1, column 15 to line 1, column 18:
        >>> Object '/tmp/employee.json' not found within 'hdfs'
        >>> 
        >>> 
        >>> On 7/23/20, 11:52 AM, "Charles Givre" <cgi...@gmail.com> wrote:
        >>> 
        >>>    Oh.. I meant:
        >>> 
        >>>    SELECT *
        >>>    FROM hdfs.`/tmp/employee.json`
        >>> 
        >>>> On Jul 23, 2020, at 11:41 AM, Updike, Clark <clark.upd...@jhuapl.edu> wrote:
        >>>> 
        >>>> No change unfortunately...
        >>>> 
        >>>> $ hdfs dfs -ls hdfs://nn01:8020/tmp/employee.json
        >>>> -rw-r--r--   2 me supergroup     474630 2020-07-23 10:53 hdfs://nn01:8020/tmp/employee.json
        >>>> 
        >>>> apache drill> select * from hdfs.root.`hdfs://nn01:8020/tmp/employee.json`;
        >>>> Error: VALIDATION ERROR: From line 1, column 15 to line 1, column 18:
        >>>> Object 'hdfs://nn01:8020/tmp/employee.json' not found within 'hdfs.root'
        >>>> 
        >>>> 
        >>>> On 7/23/20, 11:30 AM, "Charles Givre" <cgi...@gmail.com> wrote:
        >>>> 
        >>>>   Hi Clark,
        >>>>   That's strange.  My initial thought is that this could be a
        >>>>   permission issue.  However, it might also be that Drill isn't finding the
        >>>>   file for some reason.
        >>>> 
        >>>>   Could you try:
        >>>> 
        >>>>   SELECT *
        >>>>   FROM hdfs.`<full hdfs path to file>`
        >>>> 
        >>>>   Best,
        >>>>   --- C
        >>>> 
        >>>> 
        >>>>> On Jul 23, 2020, at 11:23 AM, Updike, Clark <clark.upd...@jhuapl.edu> wrote:
        >>>>> 
        >>>>> This is in 1.17.  I can use SHOW FILES to list the file I'm
        >>>>> targeting, but I cannot query it:
        >>>>> 
        >>>>> apache drill> show files in hdfs.root.`/tmp/employee.json`;
        >>>>> +---------------+-------------+--------+--------+----------+------------+-------------+-------------------------+-------------------------+
        >>>>> |     name      | isDirectory | isFile | length |  owner   |   group    | permissions |       accessTime        |    modificationTime     |
        >>>>> +---------------+-------------+--------+--------+----------+------------+-------------+-------------------------+-------------------------+
        >>>>> | employee.json | false       | true   | 474630 | me       | supergroup | rw-r--r--   | 2020-07-23 10:53:15.055 | 2020-07-23 10:53:15.387 |
        >>>>> +---------------+-------------+--------+--------+----------+------------+-------------+-------------------------+-------------------------+
        >>>>> 1 row selected (3.039 seconds)
        >>>>> 
        >>>>> 
        >>>>> apache drill> select * from hdfs.root.`/tmp/employee.json`;
        >>>>> Error: VALIDATION ERROR: From line 1, column 15 to line 1, column 18:
        >>>>> Object '/tmp/employee.json' not found within 'hdfs.root'
        >>>>> [Error Id: 3b833622-4fac-4ecc-becd-118291cd8560 ] (state=,code=0)
        >>>>> 
        >>>>> The storage plugin uses the standard json config:
        >>>>> 
        >>>>>  "json": {
        >>>>>    "type": "json",
        >>>>>    "extensions": [
        >>>>>      "json"
        >>>>>    ]
        >>>>>  },
        >>>>> 
        >>>>> I can't see any problems on the HDFS side.  Full stack trace is below.
        >>>>> 
        >>>>> Any ideas what could be causing this behavior?
        >>>>> 
        >>>>> Thanks, Clark
        >>>>> 
        >>>>> 
        >>>>> 
        >>>>> FULL STACKTRACE:
        >>>>> 
        >>>>> apache drill> select * from hdfs.root.`/tmp/employee.json`;
        >>>>> Error: VALIDATION ERROR: From line 1, column 15 to line 1, column 18:
        >>>>> Object '/tmp/employee.json' not found within 'hdfs.root'
        >>>>> 
        >>>>> 
        >>>>> [Error Id: 69c8ffc0-4933-4008-a786-85ad623578ea ]
        >>>>> 
        >>>>> (org.apache.calcite.runtime.CalciteContextException) From line 1, column 15 to line 1, column 18: Object '/tmp/employee.json' not found within 'hdfs.root'
        >>>>>  sun.reflect.NativeConstructorAccessorImpl.newInstance0():-2
        >>>>>  sun.reflect.NativeConstructorAccessorImpl.newInstance():62
        >>>>>  sun.reflect.DelegatingConstructorAccessorImpl.newInstance():45
        >>>>>  java.lang.reflect.Constructor.newInstance():423
        >>>>>  org.apache.calcite.runtime.Resources$ExInstWithCause.ex():463
        >>>>>  org.apache.calcite.sql.SqlUtil.newContextException():824
        >>>>>  org.apache.calcite.sql.SqlUtil.newContextException():809
        >>>>>  org.apache.calcite.sql.validate.SqlValidatorImpl.newValidationError():4805
        >>>>>  org.apache.calcite.sql.validate.IdentifierNamespace.resolveImpl():127
        >>>>>  org.apache.calcite.sql.validate.IdentifierNamespace.validateImpl():177
        >>>>>  org.apache.calcite.sql.validate.AbstractNamespace.validate():84
        >>>>>  org.apache.calcite.sql.validate.SqlValidatorImpl.validateNamespace():995
        >>>>>  org.apache.calcite.sql.validate.SqlValidatorImpl.validateQuery():955
        >>>>>  org.apache.calcite.sql.validate.SqlValidatorImpl.validateFrom():3109
        >>>>>  org.apache.drill.exec.planner.sql.SqlConverter$DrillValidator.validateFrom():298
        >>>>>  org.apache.calcite.sql.validate.SqlValidatorImpl.validateFrom():3091
        >>>>>  org.apache.drill.exec.planner.sql.SqlConverter$DrillValidator.validateFrom():298
        >>>>>  org.apache.calcite.sql.validate.SqlValidatorImpl.validateSelect():3363
        >>>>>  org.apache.calcite.sql.validate.SelectNamespace.validateImpl():60
        >>>>>  org.apache.calcite.sql.validate.AbstractNamespace.validate():84
        >>>>>  org.apache.calcite.sql.validate.SqlValidatorImpl.validateNamespace():995
        >>>>>  org.apache.calcite.sql.validate.SqlValidatorImpl.validateQuery():955
        >>>>>  org.apache.calcite.sql.SqlSelect.validate():216
        >>>>>  org.apache.calcite.sql.validate.SqlValidatorImpl.validateScopedExpression():930
        >>>>>  org.apache.calcite.sql.validate.SqlValidatorImpl.validate():637
        >>>>>  org.apache.drill.exec.planner.sql.SqlConverter.validate():218
        >>>>>  org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.validateNode():665
        >>>>>  org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.validateAndConvert():199
        >>>>>  org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.getPlan():172
        >>>>>  org.apache.drill.exec.planner.sql.DrillSqlWorker.getQueryPlan():282
        >>>>>  org.apache.drill.exec.planner.sql.DrillSqlWorker.getPhysicalPlan():162
        >>>>>  org.apache.drill.exec.planner.sql.DrillSqlWorker.convertPlan():127
        >>>>>  org.apache.drill.exec.planner.sql.DrillSqlWorker.getPlan():92
        >>>>>  org.apache.drill.exec.work.foreman.Foreman.runSQL():590
        >>>>>  org.apache.drill.exec.work.foreman.Foreman.run():275
        >>>>>  java.util.concurrent.ThreadPoolExecutor.runWorker():1142
        >>>>>  java.util.concurrent.ThreadPoolExecutor$Worker.run():617
        >>>>>  java.lang.Thread.run():745
        >>>>> Caused By (org.apache.calcite.sql.validate.SqlValidatorException) Object '/tmp/employee.json' not found within 'hdfs.root'
        >>>>>  sun.reflect.NativeConstructorAccessorImpl.newInstance0():-2
        >>>>>  sun.reflect.NativeConstructorAccessorImpl.newInstance():62
        >>>>>  sun.reflect.DelegatingConstructorAccessorImpl.newInstance():45
        >>>>>  java.lang.reflect.Constructor.newInstance():423
        >>>>>  org.apache.calcite.runtime.Resources$ExInstWithCause.ex():463
        >>>>>  org.apache.calcite.runtime.Resources$ExInst.ex():572
        >>>>>  org.apache.calcite.sql.SqlUtil.newContextException():824
        >>>>>  org.apache.calcite.sql.SqlUtil.newContextException():809
        >>>>>  org.apache.calcite.sql.validate.SqlValidatorImpl.newValidationError():4805
        >>>>>  org.apache.calcite.sql.validate.IdentifierNamespace.resolveImpl():127
        >>>>>  org.apache.calcite.sql.validate.IdentifierNamespace.validateImpl():177
        >>>>>  org.apache.calcite.sql.validate.AbstractNamespace.validate():84
        >>>>>  org.apache.calcite.sql.validate.SqlValidatorImpl.validateNamespace():995
        >>>>>  org.apache.calcite.sql.validate.SqlValidatorImpl.validateQuery():955
        >>>>>  org.apache.calcite.sql.validate.SqlValidatorImpl.validateFrom():3109
        >>>>>  org.apache.drill.exec.planner.sql.SqlConverter$DrillValidator.validateFrom():298
        >>>>>  org.apache.calcite.sql.validate.SqlValidatorImpl.validateFrom():3091
        >>>>>  org.apache.drill.exec.planner.sql.SqlConverter$DrillValidator.validateFrom():298
        >>>>>  org.apache.calcite.sql.validate.SqlValidatorImpl.validateSelect():3363
        >>>>>  org.apache.calcite.sql.validate.SelectNamespace.validateImpl():60
        >>>>>  org.apache.calcite.sql.validate.AbstractNamespace.validate():84
        >>>>>  org.apache.calcite.sql.validate.SqlValidatorImpl.validateNamespace():995
        >>>>>  org.apache.calcite.sql.validate.SqlValidatorImpl.validateQuery():955
        >>>>>  org.apache.calcite.sql.SqlSelect.validate():216
        >>>>>  org.apache.calcite.sql.validate.SqlValidatorImpl.validateScopedExpression():930
        >>>>>  org.apache.calcite.sql.validate.SqlValidatorImpl.validate():637
        >>>>>  org.apache.drill.exec.planner.sql.SqlConverter.validate():218
        >>>>>  org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.validateNode():665
        >>>>>  org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.validateAndConvert():199
        >>>>>  org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.getPlan():172
        >>>>>  org.apache.drill.exec.planner.sql.DrillSqlWorker.getQueryPlan():282
        >>>>>  org.apache.drill.exec.planner.sql.DrillSqlWorker.getPhysicalPlan():162
        >>>>>  org.apache.drill.exec.planner.sql.DrillSqlWorker.convertPlan():127
        >>>>>  org.apache.drill.exec.planner.sql.DrillSqlWorker.getPlan():92
        >>>>>  org.apache.drill.exec.work.foreman.Foreman.runSQL():590
        >>>>>  org.apache.drill.exec.work.foreman.Foreman.run():275
        >>>>>  java.util.concurrent.ThreadPoolExecutor.runWorker():1142
        >>>>>  java.util.concurrent.ThreadPoolExecutor$Worker.run():617
        >>>>>  java.lang.Thread.run():745 (state=,code=0)
