Right, but do you need the rest of the config from the top of the default dfs
config? Here's what I assume to be the full config, taken from my 1.17 dfs
config (with the other formats deleted):
{
  "type": "file",
  "connection": "file:///",
  "config": null,
  "workspaces": {
    "tmp": {
      "location": "/tmp",
      "writable": true,
      "defaultInputFormat": null,
      "allowAccessOutsideWorkspace": false
    },
    "root": {
      "location": "/",
      "writable": false,
      "defaultInputFormat": null,
      "allowAccessOutsideWorkspace": false
    }
  },
  "formats": {
    "json": {
      "type": "json",
      "extensions": [
        "json"
      ]
    }
  },
  "enabled": true
}
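
For reference, with a workspace config like this the table path in a query is
resolved against the workspace location, so a query through the tmp workspace
would look roughly like the sketch below (assuming an employee.json sitting in
/tmp; the file name is just an example, not something from your cluster):

select * from dfs.tmp.`employee.json`;

An HDFS-backed plugin should look the same except for the connection string
(something like hdfs://<namenode>:<port>/) and the workspace locations.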
- Rafael
On Thu, Jul 23, 2020 at 11:37 AM Charles Givre <[email protected]> wrote:
> Rafael,
> Clark is using the filesystem plugin to query a Hadoop cluster. It seems
> weird that you can enumerate the files in a directory, but when you try to
> query one of those files, it breaks...
> -- C
>
>
>
> > On Jul 23, 2020, at 11:35 AM, Rafael Jaimes III <[email protected]>
> wrote:
> >
> > Hi all,
> >
> > It looks like the file is 644 already, which should be good.
> > I'm confused about why the schema is called hdfs. dfs is a pre-built schema
> > for HDFS and for querying flat files such as the .json you're trying to
> > read.
> > The default config for dfs also has a lot more content than what you
> > pasted. Can you use the default and try again?
> >
> > Hope this helps,
> > Rafael
> >
> >
> > On Thu, Jul 23, 2020 at 11:30 AM Charles Givre <[email protected]> wrote:
> >
> >> Hi Clark,
> >> That's strange. My initial thought is that this could be a permission
> >> issue. However, it might also be that Drill isn't finding the file for
> >> some reason.
> >>
> >> Could you try:
> >>
> >> SELECT *
> >> FROM hdfs.`<full hdfs path to file>`
> >>
> >> Best,
> >> --- C
> >>
> >>
> >>> On Jul 23, 2020, at 11:23 AM, Updike, Clark <[email protected]>
> >> wrote:
> >>>
> >>> This is in 1.17. I can use SHOW FILES to list the file I'm targeting,
> >> but I cannot query it:
> >>>
> >>> apache drill> show files in hdfs.root.`/tmp/employee.json`;
> >>> +---------------+-------------+--------+--------+----------+------------+-------------+-------------------------+-------------------------+
> >>> | name          | isDirectory | isFile | length | owner    | group      | permissions | accessTime              | modificationTime        |
> >>> +---------------+-------------+--------+--------+----------+------------+-------------+-------------------------+-------------------------+
> >>> | employee.json | false       | true   | 474630 | me       | supergroup | rw-r--r--   | 2020-07-23 10:53:15.055 | 2020-07-23 10:53:15.387 |
> >>> +---------------+-------------+--------+--------+----------+------------+-------------+-------------------------+-------------------------+
> >>> 1 row selected (3.039 seconds)
> >>>
> >>>
> >>> apache drill> select * from hdfs.root.`/tmp/employee.json`;
> >>> Error: VALIDATION ERROR: From line 1, column 15 to line 1, column 18: Object '/tmp/employee.json' not found within 'hdfs.root'
> >>> [Error Id: 3b833622-4fac-4ecc-becd-118291cd8560 ] (state=,code=0)
> >>>
> >>> The storage plugin uses the standard json config:
> >>>
> >>> "json": {
> >>> "type": "json",
> >>> "extensions": [
> >>> "json"
> >>> ]
> >>> },
> >>>
> >>> I can't see any problems on the HDFS side. Full stack trace is below.
> >>>
> >>> Any ideas what could be causing this behavior?
> >>>
> >>> Thanks, Clark
> >>>
> >>>
> >>>
> >>> FULL STACKTRACE:
> >>>
> >>> apache drill> select * from hdfs.root.`/tmp/employee.json`;
> >>> Error: VALIDATION ERROR: From line 1, column 15 to line 1, column 18: Object '/tmp/employee.json' not found within 'hdfs.root'
> >>>
> >>>
> >>> [Error Id: 69c8ffc0-4933-4008-a786-85ad623578ea ]
> >>>
> >>>   (org.apache.calcite.runtime.CalciteContextException) From line 1, column 15 to line 1, column 18: Object '/tmp/employee.json' not found within 'hdfs.root'
> >>>     sun.reflect.NativeConstructorAccessorImpl.newInstance0():-2
> >>>     sun.reflect.NativeConstructorAccessorImpl.newInstance():62
> >>>     sun.reflect.DelegatingConstructorAccessorImpl.newInstance():45
> >>>     java.lang.reflect.Constructor.newInstance():423
> >>>     org.apache.calcite.runtime.Resources$ExInstWithCause.ex():463
> >>>     org.apache.calcite.sql.SqlUtil.newContextException():824
> >>>     org.apache.calcite.sql.SqlUtil.newContextException():809
> >>>     org.apache.calcite.sql.validate.SqlValidatorImpl.newValidationError():4805
> >>>     org.apache.calcite.sql.validate.IdentifierNamespace.resolveImpl():127
> >>>     org.apache.calcite.sql.validate.IdentifierNamespace.validateImpl():177
> >>>     org.apache.calcite.sql.validate.AbstractNamespace.validate():84
> >>>     org.apache.calcite.sql.validate.SqlValidatorImpl.validateNamespace():995
> >>>     org.apache.calcite.sql.validate.SqlValidatorImpl.validateQuery():955
> >>>     org.apache.calcite.sql.validate.SqlValidatorImpl.validateFrom():3109
> >>>     org.apache.drill.exec.planner.sql.SqlConverter$DrillValidator.validateFrom():298
> >>>     org.apache.calcite.sql.validate.SqlValidatorImpl.validateFrom():3091
> >>>     org.apache.drill.exec.planner.sql.SqlConverter$DrillValidator.validateFrom():298
> >>>     org.apache.calcite.sql.validate.SqlValidatorImpl.validateSelect():3363
> >>>     org.apache.calcite.sql.validate.SelectNamespace.validateImpl():60
> >>>     org.apache.calcite.sql.validate.AbstractNamespace.validate():84
> >>>     org.apache.calcite.sql.validate.SqlValidatorImpl.validateNamespace():995
> >>>     org.apache.calcite.sql.validate.SqlValidatorImpl.validateQuery():955
> >>>     org.apache.calcite.sql.SqlSelect.validate():216
> >>>     org.apache.calcite.sql.validate.SqlValidatorImpl.validateScopedExpression():930
> >>>     org.apache.calcite.sql.validate.SqlValidatorImpl.validate():637
> >>>     org.apache.drill.exec.planner.sql.SqlConverter.validate():218
> >>>     org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.validateNode():665
> >>>     org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.validateAndConvert():199
> >>>     org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.getPlan():172
> >>>     org.apache.drill.exec.planner.sql.DrillSqlWorker.getQueryPlan():282
> >>>     org.apache.drill.exec.planner.sql.DrillSqlWorker.getPhysicalPlan():162
> >>>     org.apache.drill.exec.planner.sql.DrillSqlWorker.convertPlan():127
> >>>     org.apache.drill.exec.planner.sql.DrillSqlWorker.getPlan():92
> >>>     org.apache.drill.exec.work.foreman.Foreman.runSQL():590
> >>>     org.apache.drill.exec.work.foreman.Foreman.run():275
> >>>     java.util.concurrent.ThreadPoolExecutor.runWorker():1142
> >>>     java.util.concurrent.ThreadPoolExecutor$Worker.run():617
> >>>     java.lang.Thread.run():745
> >>>   Caused By (org.apache.calcite.sql.validate.SqlValidatorException) Object '/tmp/employee.json' not found within 'hdfs.root'
> >>>     sun.reflect.NativeConstructorAccessorImpl.newInstance0():-2
> >>>     sun.reflect.NativeConstructorAccessorImpl.newInstance():62
> >>>     sun.reflect.DelegatingConstructorAccessorImpl.newInstance():45
> >>>     java.lang.reflect.Constructor.newInstance():423
> >>>     org.apache.calcite.runtime.Resources$ExInstWithCause.ex():463
> >>>     org.apache.calcite.runtime.Resources$ExInst.ex():572
> >>>     org.apache.calcite.sql.SqlUtil.newContextException():824
> >>>     org.apache.calcite.sql.SqlUtil.newContextException():809
> >>>     org.apache.calcite.sql.validate.SqlValidatorImpl.newValidationError():4805
> >>>     org.apache.calcite.sql.validate.IdentifierNamespace.resolveImpl():127
> >>>     org.apache.calcite.sql.validate.IdentifierNamespace.validateImpl():177
> >>>     org.apache.calcite.sql.validate.AbstractNamespace.validate():84
> >>>     org.apache.calcite.sql.validate.SqlValidatorImpl.validateNamespace():995
> >>>     org.apache.calcite.sql.validate.SqlValidatorImpl.validateQuery():955
> >>>     org.apache.calcite.sql.validate.SqlValidatorImpl.validateFrom():3109
> >>>     org.apache.drill.exec.planner.sql.SqlConverter$DrillValidator.validateFrom():298
> >>>     org.apache.calcite.sql.validate.SqlValidatorImpl.validateFrom():3091
> >>>     org.apache.drill.exec.planner.sql.SqlConverter$DrillValidator.validateFrom():298
> >>>     org.apache.calcite.sql.validate.SqlValidatorImpl.validateSelect():3363
> >>>     org.apache.calcite.sql.validate.SelectNamespace.validateImpl():60
> >>>     org.apache.calcite.sql.validate.AbstractNamespace.validate():84
> >>>     org.apache.calcite.sql.validate.SqlValidatorImpl.validateNamespace():995
> >>>     org.apache.calcite.sql.validate.SqlValidatorImpl.validateQuery():955
> >>>     org.apache.calcite.sql.SqlSelect.validate():216
> >>>     org.apache.calcite.sql.validate.SqlValidatorImpl.validateScopedExpression():930
> >>>     org.apache.calcite.sql.validate.SqlValidatorImpl.validate():637
> >>>     org.apache.drill.exec.planner.sql.SqlConverter.validate():218
> >>>     org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.validateNode():665
> >>>     org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.validateAndConvert():199
> >>>     org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.getPlan():172
> >>>     org.apache.drill.exec.planner.sql.DrillSqlWorker.getQueryPlan():282
> >>>     org.apache.drill.exec.planner.sql.DrillSqlWorker.getPhysicalPlan():162
> >>>     org.apache.drill.exec.planner.sql.DrillSqlWorker.convertPlan():127
> >>>     org.apache.drill.exec.planner.sql.DrillSqlWorker.getPlan():92
> >>>     org.apache.drill.exec.work.foreman.Foreman.runSQL():590
> >>>     org.apache.drill.exec.work.foreman.Foreman.run():275
> >>>     java.util.concurrent.ThreadPoolExecutor.runWorker():1142
> >>>     java.util.concurrent.ThreadPoolExecutor$Worker.run():617
> >>>     java.lang.Thread.run():745 (state=,code=0)
> >>
> >>
>
>