Hi Marc, I don't think any of the core team has used MinIO. Sounds like you are running Drill in Docker. So, the first question to others is: is anyone using Drill in Docker against plain old S3? If someone is, and does not hit the delay issues you describe, then we can narrow down the problem to something in your environment.
Can you test MinIO without Drill? Maybe create a Docker container with the AWS client, ssh into the container, and use the command line tools to download your nation.parquet file. Check if you also encounter the delay. If so, this tells us there is an environment issue. If the command line is fast, then perhaps we have a Drill issue.

The next step would be to enable logging. I don't know if we have detailed logging around file actions (open, read, close), so we'll have to check whether logging will give us the detail we need.

Thanks,
- Paul

On Tuesday, January 28, 2020, 4:04:07 AM PST, Marc Sole Fonte <ms...@iti.es> wrote:

Hello,

I'm currently trying to use Drill to query MinIO (S3 API), but I am having a lot of problems related to the time it takes (I get a lot of timeouts). Both services (one instance each) are running in Docker on my local computer. The problem is that the first query takes 40+ seconds and, after it has finished, queries take less than 1 second. I am querying a very small Parquet file.

For instance, these are two queries that I executed. The first query's planning took 27.08 seconds:

01/10/2020 13:42:04 anonymous SELECT N_NAME as COUNTRY FROM minio_jupyter.`nation.parquet` WHERE N_REGIONKEY = 2 Succeeded 5.421 sec 0eff029cf8dc
01/10/2020 13:37:27 anonymous SELECT N_NAME as COUNTRY FROM minio_jupyter.`nation.parquet` WHERE N_REGIONKEY = 2 Succeeded 33.508 sec 0eff029cf8dc

This is not an isolated case. It happens every time I try to use it. I run a new, clean Docker image each time. Also, if I try to execute the same query multiple times (because of the timeout), I get the same problem until the first query (48.296 s planning in this case) finishes. Sometimes I even get slow queries after that (3+ seconds).

01/28/2020 10:33:59 anonymous <http://localhost:9000/profiles/21cff1e7-a8a8-6128-d329-ac369bd69c32> SELECT N_NAME as COUNTRY FROM minio_jupyter.`nation.parquet` WHERE N_REGIONKEY = 2 Succeeded 3.494 sec 86acfa9818e1
01/28/2020 10:21:14 anonymous <http://localhost:9000/profiles/21cff4e5-fe98-2db8-c617-d15c96470235> SELECT N_NAME as COUNTRY FROM minio_jupyter.`nation.parquet` WHERE N_REGIONKEY = 2 Succeeded 4.595 sec 86acfa9818e1
01/28/2020 10:20:33 anonymous <http://localhost:9000/profiles/21cff50d-ae9c-1629-f80f-db5c3a253762> SELECT N_NAME as COUNTRY FROM minio_jupyter.`nation.parquet` WHERE N_REGIONKEY = 2 Succeeded 31.801 sec 86acfa9818e1
01/28/2020 10:20:16 anonymous <http://localhost:9000/profiles/21cff51e-e55f-456e-c399-51289fadb77a> SELECT N_NAME as COUNTRY FROM minio_jupyter.`nation.parquet` WHERE N_REGIONKEY = 2 Succeeded 49.098 sec 86acfa9818e1
01/28/2020 10:20:03 anonymous <http://localhost:9000/profiles/21cff52e-2792-79e8-48b5-f258e6efb02b> SELECT N_NAME as COUNTRY FROM minio_jupyter.`nation.parquet` WHERE N_REGIONKEY = 2 Succeeded 01 min 2.494 sec 86acfa9818e1

Thank you for your help,
Marc
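
The check suggested at the top of this thread (downloading nation.parquet outside Drill and seeing whether the first request is also slow) could be scripted instead of run by hand. Below is a minimal sketch, not something from the Drill project: it assumes the boto3 package is installed, and the endpoint URL, credentials, and bucket name in `main()` are hypothetical placeholders to replace with your own MinIO settings. If the first timing is large and the later ones are small, the delay is probably in the environment rather than in Drill.

```python
import time
from typing import Callable, List


def time_calls(fetch: Callable[[], object], repeats: int = 3) -> List[float]:
    """Call fetch() `repeats` times and return the wall-clock seconds of each call."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        fetch()
        timings.append(time.perf_counter() - start)
    return timings


def make_s3_fetch(endpoint_url: str, access_key: str, secret_key: str,
                  bucket: str, key: str) -> Callable[[], bytes]:
    """Build a function that downloads one object over the S3 API."""
    # Imported lazily so time_calls() can be used without boto3 installed.
    import boto3

    client = boto3.client(
        "s3",
        endpoint_url=endpoint_url,
        aws_access_key_id=access_key,
        aws_secret_access_key=secret_key,
    )
    return lambda: client.get_object(Bucket=bucket, Key=key)["Body"].read()


def main() -> None:
    # Every value here is a placeholder (assumption), not taken from the thread,
    # except the object name nation.parquet.
    fetch = make_s3_fetch(
        endpoint_url="http://minio:9000",    # hypothetical MinIO address
        access_key="YOUR_ACCESS_KEY",        # hypothetical credentials
        secret_key="YOUR_SECRET_KEY",
        bucket="your-bucket",                # hypothetical bucket name
        key="nation.parquet",
    )
    for attempt, seconds in enumerate(time_calls(fetch), start=1):
        print(f"download {attempt}: {seconds:.3f} s")
```

To use it, fill in the placeholders and call `main()` from inside a container on the same Docker network as MinIO, so that the network path matches the one Drill uses.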