Hi Navin,
You had mentioned your ECS solution in an earlier note. What are you using to
access data in your container? Is your ECS container running HDFS? Or, do you
have some other API?
Do you have Drill running in a container on ECS, or is that where your data is
located? It would be helpful if you could perhaps describe your setup in a bit
more detail so we can offer suggestions about where to look for an issue.
By the way: the query profile is often a good place to start. You'll find it
in the Drill Web Console. Looking at each operator, you can see how much memory
was used and how long things took. Specifically, look at the time taken by the
scan: is the slowness due to reading the data, or is some other part of the
query taking the time?
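If clicking through the Web Console gets tedious, the same profile can be pulled as JSON and summarized with a short script. What follows is only a rough sketch, not a definitive tool. It assumes the Web Console is reachable at http://localhost:8047 without authentication, that the profile is served at /profiles/<query id>.json, and that the JSON uses field names such as fragmentProfile, minorFragmentProfile, operatorProfile, processNanos, waitNanos and peakLocalMemoryAllocated. Please check those names against a profile from your own 1.17 install. The helper names fetch_profile and summarize_operators are made up for this example.

```python
import json
from collections import defaultdict
from urllib.request import urlopen


def fetch_profile(query_id, base_url="http://localhost:8047"):
    """Download one query profile as a dict.

    Assumption: the Web Console is at base_url and needs no login.
    """
    with urlopen(f"{base_url}/profiles/{query_id}.json") as resp:
        return json.load(resp)


def summarize_operators(profile):
    """Total process/wait time and peak memory per operator.

    Operators are keyed by (major fragment id, operator id, operator type)
    and summed across minor fragments. The field names are assumptions
    about the profile JSON layout; missing fields count as zero.
    """
    totals = defaultdict(
        lambda: {"process_ms": 0.0, "wait_ms": 0.0, "peak_mem": 0}
    )
    for major in profile.get("fragmentProfile", []):
        for minor in major.get("minorFragmentProfile", []):
            for op in minor.get("operatorProfile", []):
                key = (
                    major.get("majorFragmentId"),
                    op.get("operatorId"),
                    op.get("operatorType"),
                )
                row = totals[key]
                row["process_ms"] += op.get("processNanos", 0) / 1e6
                row["wait_ms"] += op.get("waitNanos", 0) / 1e6
                row["peak_mem"] = max(
                    row["peak_mem"], op.get("peakLocalMemoryAllocated", 0)
                )
    # Slowest operators (process + wait) first.
    return sorted(
        totals.items(),
        key=lambda kv: kv[1]["process_ms"] + kv[1]["wait_ms"],
        reverse=True,
    )
```

To use it, call summarize_operators(fetch_profile(query_id)) with the query ID shown on the Profiles page, then look at the first few entries. If the scan operators are at the top with most of their time in wait_ms, the time is probably going into reading the data rather than into the rest of the query.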
When you get the error, what is the stack trace? Is the error coming from some
particular HDFS client? In some particular operation?
Thanks,
- Paul
On Friday, March 27, 2020, 6:59:42 AM PDT, Navin Bhawsar wrote:
Hi,
We are facing a performance issue where an Apache Drill query on ECS times out
with the error below: "ConnectionPoolTimeoutException: Timeout waiting for
connection from pool"
However, the same query works fine on a single-node HDFS with an execution time
of 2.1 sec (planning = 0.483 s).
Parquet file size < 1.5 GB
Total Parquet files scanned = 8 (19 total in directory)
Apache Drill version 1.17
JDK 1.8.0_74
Total rows returned from query = 71000
There are 2 drillbits running in distributed mode.
13 GB default allocated per drillbit.
Any ideas why ECS performance is so bad compared with HDFS for Drill?
Please advise if Drill provides options to optimize ECS querying.
Please let me know if you need more details.
Thanks & Regards,
Navin