Matúš Raček created DRILL-7819:
----------------------------------
Summary: Unable to query data saved in S3 via hive standalone service
Key: DRILL-7819
URL: https://issues.apache.org/jira/browse/DRILL-7819
Project: Apache Drill
Issue Type: Bug
Affects Versions: 1.18.0
Reporter: Matúš Raček
Hi,
I am implementing a POC of a data processing platform.
We are using the hive-metastore standalone service with a MinIO bucket as its
warehouse (s3a://spark/warehouse).
We have also connected PrestoSQL, through which we are able to create tables,
query them, etc.
But here comes the problem. I have configured the Drill Hive storage plugin:
```
{
  "type": "hive",
  "configProps": {
    "hive.metastore.uris": "thrift://hive-metastore:9083",
    "hive.metastore.sasl.enabled": "false",
    "fs.s3a.access.key": "minion",
    "fs.s3a.secret.key": "minio123"
  },
  "enabled": true
}
```
Drill is able to connect to the Hive metastore and return the list of tables
that were created via Presto, but it is unable to query them.
If I try to query a Presto-created table, e.g. `select * from test;`, an
exception occurs.
The error from the logs:
```
Please, refer to logs for more information.
[Error Id: 0f96e91e-c5f9-4aa5-8de1-a109c2130d0c on 5aee4cd314b5:31010]
    at org.apache.drill.exec.server.rest.RestQueryRunner.submitQuery(RestQueryRunner.java:181)
    at org.apache.drill.exec.server.rest.RestQueryRunner.run(RestQueryRunner.java:70)
    at org.apache.drill.exec.server.rest.QueryResources.submitQueryJSON(QueryResources.java:96)
    at org.apache.drill.exec.server.rest.QueryResources.submitQuery(QueryResources.java:114)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.glassfish.jersey.server.model.internal.ResourceMethodInvocationHandlerFactory$1.invoke(ResourceMethodInvocationHandlerFactory.java:81)
    at org.glassfish.jersey.server.model.internal.AbstractJavaResourceMethodDispatcher$1.run(AbstractJavaResourceMethodDispatcher.java:144)
    at org.glassfish.jersey.server.model.internal.AbstractJavaResourceMethodDispatcher.invoke(AbstractJavaResourceMethodDispatcher.java:161)
    at org.glassfish.jersey.server.model.internal.JavaResourceMethodDispatcherProvider$TypeOutInvoker.doDispatch(JavaResourceMethodDispatcherProvider.java:205)
    at org.glassfish.jersey.server.model.internal.AbstractJavaResourceMethodDispatcher.dispatch(AbstractJavaResourceMethodDispatcher.java:99)
    at org.glassfish.jersey.server.model.ResourceMethodInvoker.invoke(ResourceMethodInvoker.java:389)
    at org.glassfish.jersey.server.model.ResourceMethodInvoker.apply(ResourceMethodInvoker.java:347)
    at org.glassfish.jersey.server.model.ResourceMethodInvoker.apply(ResourceMethodInvoker.java:102)
    at org.glassfish.jersey.server.ServerRuntime$2.run(ServerRuntime.java:326)
    at org.glassfish.jersey.internal.Errors$1.call(Errors.java:271)
    at org.glassfish.jersey.internal.Errors$1.call(Errors.java:267)
    at org.glassfish.jersey.internal.Errors.process(Errors.java:315)
    at org.glassfish.jersey.internal.Errors.process(Errors.java:297)
    at org.glassfish.jersey.internal.Errors.process(Errors.java:267)
    at org.glassfish.jersey.process.internal.RequestScope.runInScope(RequestScope.java:317)
    at org.glassfish.jersey.server.ServerRuntime.process(ServerRuntime.java:305)
    at org.glassfish.jersey.server.ApplicationHandler.handle(ApplicationHandler.java:1154)
    at org.glassfish.jersey.servlet.WebComponent.serviceImpl(WebComponent.java:473)
    at org.glassfish.jersey.servlet.WebComponent.service(WebComponent.java:427)
    at org.glassfish.jersey.servlet.ServletContainer.service(ServletContainer.java:388)
    at org.glassfish.jersey.servlet.ServletContainer.service(ServletContainer.java:341)
    at org.glassfish.jersey.servlet.ServletContainer.service(ServletContainer.java:228)
    at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:848)
    at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1780)
    at org.apache.drill.exec.server.rest.header.ResponseHeadersSettingFilter.doFilter(ResponseHeadersSettingFilter.java:71)
    at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1767)
    at org.apache.drill.exec.server.rest.CsrfTokenValidateFilter.doFilter(CsrfTokenValidateFilter.java:55)
    at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1767)
    at org.apache.drill.exec.server.rest.CsrfTokenInjectFilter.doFilter(CsrfTokenInjectFilter.java:54)
    at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1767)
    at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:583)
    at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:224)
    at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1180)
    at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:513)
    at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)
    at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1112)
    at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
    at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134)
    at org.eclipse.jetty.server.Server.handle(Server.java:539)
    at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:333)
    at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:251)
    at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:283)
    at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:108)
    at org.eclipse.jetty.io.SelectChannelEndPoint$2.run(SelectChannelEndPoint.java:93)
    at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.executeProduceConsume(ExecuteProduceConsume.java:303)
    at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.produceConsume(ExecuteProduceConsume.java:148)
    at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.run(ExecuteProduceConsume.java:136)
    at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:671)
    at org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:589)
    at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.drill.exec.work.foreman.ForemanException: Unexpected exception during fragment initialization: Error while applying rule Prel.ScanPrule, args [rel#812:DrillScanRel.LOGICAL.ANY([]).[](table=[hivejdbc, test],groupscan=HiveScan [table=Table(dbName:default, tableName:test), columns=[`name`], numPartitions=0, partitions= null, inputDirectories=[s3a://spark/warehouse/test], confProperties={}])]
    at org.apache.drill.exec.work.foreman.Foreman.run(Foreman.java:301)
    at .......(:0)
Caused by: java.lang.RuntimeException: Error while applying rule Prel.ScanPrule, args [rel#812:DrillScanRel.LOGICAL.ANY([]).[](table=[hivejdbc, test],groupscan=HiveScan [table=Table(dbName:default, tableName:test), columns=[`name`], numPartitions=0, partitions= null, inputDirectories=[s3a://spark/warehouse/test], confProperties={}])]
    at org.apache.calcite.plan.volcano.VolcanoRuleCall.onMatch(VolcanoRuleCall.java:235)
    at org.apache.calcite.plan.volcano.VolcanoPlanner.findBestExp(VolcanoPlanner.java:633)
    at org.apache.calcite.tools.Programs$RuleSetProgram.run(Programs.java:327)
    at org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.transform(DefaultSqlHandler.java:405)
    at org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.convertToPrel(DefaultSqlHandler.java:436)
    at org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.getPlan(DefaultSqlHandler.java:174)
    at org.apache.drill.exec.planner.sql.DrillSqlWorker.getQueryPlan(DrillSqlWorker.java:283)
    at org.apache.drill.exec.planner.sql.DrillSqlWorker.getPhysicalPlan(DrillSqlWorker.java:163)
    at org.apache.drill.exec.planner.sql.DrillSqlWorker.convertPlan(DrillSqlWorker.java:128)
    at org.apache.drill.exec.planner.sql.DrillSqlWorker.getPlan(DrillSqlWorker.java:93)
    at org.apache.drill.exec.work.foreman.Foreman.runSQL(Foreman.java:593)
    at org.apache.drill.exec.work.foreman.Foreman.run(Foreman.java:274)
    ... 1 common frames omitted
Caused by: org.apache.drill.common.exceptions.DrillRuntimeException: Failed to get InputSplits
    at org.apache.drill.exec.store.hive.HiveMetadataProvider.getInputSplits(HiveMetadataProvider.java:185)
    at org.apache.drill.exec.store.hive.HiveScan.getInputSplits(HiveScan.java:291)
    at org.apache.drill.exec.store.hive.HiveScan.getMaxParallelizationWidth(HiveScan.java:200)
    at org.apache.drill.exec.planner.physical.ScanPrule.onMatch(ScanPrule.java:42)
    at org.apache.calcite.plan.volcano.VolcanoRuleCall.onMatch(VolcanoRuleCall.java:208)
    ... 12 common frames omitted
Caused by: org.apache.drill.common.exceptions.DrillRuntimeException: Failed to create input splits: s3a://spark/warehouse/test: getFileStatus on s3a://spark/warehouse/test: com.amazonaws.services.s3.model.AmazonS3Exception: Forbidden (Service: Amazon S3; Status Code: 403; Error Code: 403 Forbidden; Request ID: BA1B7458A34E7E73; S3 Extended Request ID: yDEg6FCn80j8Jp6lKkjrM5iTpsKYv+aYVX3Ab8i/CcfKtzEcVGjJrk0sc1G+V4dP2owME8PGZlE=), S3 Extended Request ID: yDEg6FCn80j8Jp6lKkjrM5iTpsKYv+aYVX3Ab8i/CcfKtzEcVGjJrk0sc1G+V4dP2owME8PGZlE=:403 Forbidden
    at org.apache.drill.exec.store.hive.HiveMetadataProvider.splitInputWithUGI(HiveMetadataProvider.java:282)
    at org.apache.drill.exec.store.hive.HiveMetadataProvider.getTableInputSplits(HiveMetadataProvider.java:145)
    at org.apache.drill.exec.store.hive.HiveMetadataProvider.getInputSplits(HiveMetadataProvider.java:176)
    ... 16 common frames omitted
Caused by: java.lang.Exception: s3a://spark/warehouse/test: getFileStatus on s3a://spark/warehouse/test: com.amazonaws.services.s3.model.AmazonS3Exception: Forbidden (Service: Amazon S3; Status Code: 403; Error Code: 403 Forbidden; Request ID: BA1B7458A34E7E73; S3 Extended Request ID: yDEg6FCn80j8Jp6lKkjrM5iTpsKYv+aYVX3Ab8i/CcfKtzEcVGjJrk0sc1G+V4dP2owME8PGZlE=), S3 Extended Request ID: yDEg6FCn80j8Jp6lKkjrM5iTpsKYv+aYVX3Ab8i/CcfKtzEcVGjJrk0sc1G+V4dP2owME8PGZlE=:403 Forbidden
    at org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:230)
    at org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:151)
    at org.apache.hadoop.fs.s3a.S3AFileSystem.s3GetFileStatus(S3AFileSystem.java:2239)
    at org.apache.hadoop.fs.s3a.S3AFileSystem.innerGetFileStatus(S3AFileSystem.java:2204)
    at org.apache.hadoop.fs.s3a.S3AFileSystem.getFileStatus(S3AFileSystem.java:2143)
    at org.apache.hadoop.fs.FileSystem.exists(FileSystem.java:1683)
    at org.apache.hadoop.fs.s3a.S3AFileSystem.exists(S3AFileSystem.java:3031)
    at org.apache.drill.exec.store.hive.HiveMetadataProvider.lambda$splitInputWithUGI$2(HiveMetadataProvider.java:258)
    at .......(:0)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
    at org.apache.drill.exec.store.hive.HiveMetadataProvider.splitInputWithUGI(HiveMetadataProvider.java:250)
    ... 18 common frames omitted
Caused by: com.amazonaws.services.s3.model.AmazonS3Exception: Forbidden (Service: Amazon S3; Status Code: 403; Error Code: 403 Forbidden; Request ID: BA1B7458A34E7E73; S3 Extended Request ID: yDEg6FCn80j8Jp6lKkjrM5iTpsKYv+aYVX3Ab8i/CcfKtzEcVGjJrk0sc1G+V4dP2owME8PGZlE=) (Service: null; Status Code: 0; Error Code: null; Request ID: null; S3 Extended Request ID: null)
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.handleErrorResponse(AmazonHttpClient.java:1640)
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeOneRequest(AmazonHttpClient.java:1304)
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper(AmazonHttpClient.java:1058)
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.doExecute(AmazonHttpClient.java:743)
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeWithTimer(AmazonHttpClient.java:717)
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.execute(AmazonHttpClient.java:699)
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.access$500(AmazonHttpClient.java:667)
    at com.amazonaws.http.AmazonHttpClient$RequestExecutionBuilderImpl.execute(AmazonHttpClient.java:649)
    at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:513)
    at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:4368)
    at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:4315)
    at com.amazonaws.services.s3.AmazonS3Client.getObjectMetadata(AmazonS3Client.java:1271)
    at org.apache.hadoop.fs.s3a.S3AFileSystem.lambda$getObjectMetadata$4(S3AFileSystem.java:1290)
    at org.apache.hadoop.fs.s3a.Invoker.retryUntranslated(Invoker.java:322)
    at org.apache.hadoop.fs.s3a.Invoker.retryUntranslated(Invoker.java:285)
    at org.apache.hadoop.fs.s3a.S3AFileSystem.getObjectMetadata(S3AFileSystem.java:1287)
    at org.apache.hadoop.fs.s3a.S3AFileSystem.s3GetFileStatus(S3AFileSystem.java:2224)
    ... 26 common frames omitted
```
The same problem also occurs with Spark-created tables, even though Spark is
able to read Presto-created tables and vice versa.
A direct connection to MinIO via the S3 file storage plugin works without a
problem:
```
{
  "type": "file",
  "connection": "s3a://data/",
  "config": {
    "fs.s3a.path.style.access": "true",
    "fs.s3a.connection.ssl.enabled": "false",
    "fs.s3a.connection.maximum": "100",
    "fs.s3a.access.key": "minio",
    "fs.s3a.secret.key": "minio123",
    "fs.s3a.endpoint": "http://minio:9000"
  },
  .....
```
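One difference I notice between the two plugin configs above: the working file plugin sets `fs.s3a.endpoint` and `fs.s3a.path.style.access`, while the Hive plugin's `configProps` does not (and its access key reads `minion` rather than `minio`). I have not yet confirmed that carrying these S3A properties over fixes the 403, but for completeness this is the Hive plugin variant I would try (endpoint and credentials taken from the working config above):

```
{
  "type": "hive",
  "configProps": {
    "hive.metastore.uris": "thrift://hive-metastore:9083",
    "hive.metastore.sasl.enabled": "false",
    "fs.s3a.endpoint": "http://minio:9000",
    "fs.s3a.path.style.access": "true",
    "fs.s3a.connection.ssl.enabled": "false",
    "fs.s3a.access.key": "minio",
    "fs.s3a.secret.key": "minio123"
  },
  "enabled": true
}
```

Without an explicit endpoint, my understanding is that the s3a client would try to resolve the bucket against AWS itself, which could plausibly produce exactly this kind of 403 Forbidden.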
There is a good chance that I am simply missing something, and I would be very
happy if you could help me.
--
This message was sent by Atlassian Jira
(v8.3.4#803005)