If the time is all spent in planning, that won't have much impact as query planning is not distributed.
It sounds like we may be taking a while to plan queries against Mongo when there are a large number of collections. Can you open a JIRA that we can take a look at?

--
Jacques Nadeau
CTO and Co-Founder, Dremio

On Mon, Feb 29, 2016 at 10:07 PM, Rifat Mahmud <rftm...@gmail.com> wrote:

> Is there a possibility that using distributed Drill will shorten the
> latency
>
> On Tue, Mar 1, 2016 at 11:47 AM, Rifat Mahmud <rftm...@gmail.com> wrote:
>
> > 7 seconds on a simple select * command on a table(collection) containing
> > 158 rows(documents).
> >
> > Yes, the mongodb database I am querying into does have 268 collections.
> >
> > On Tue, Mar 1, 2016 at 11:13 AM, Jacques Nadeau <jacq...@dremio.com>
> > wrote:
> >
> >> I haven't had a chance to look at the profile in detail yet. Do you see
> >> consistent behavior on multiple queries?
> >>
> >> Does your mongodb happen to have a large number of collections and/or
> >> databases? (Just guessing here).
> >>
> >> --
> >> Jacques Nadeau
> >> CTO and Co-Founder, Dremio
> >>
> >> On Sun, Feb 28, 2016 at 8:54 PM, Rifat Mahmud <rftm...@gmail.com> wrote:
> >>
> >> > Here is the json profile of the query: http://pastebin.com/tqang1Y0
> >> > Attached the screen shot of the query profile web view too.
> >> > I am using everything in default configuration for apache-drill-1.5.0,
> >> > just changed the MongoDB location from localhost to the remote IP in the
> >> > storage configuration.
> >> >
> >> > On Sun, Feb 28, 2016 at 3:16 PM, Jacques Nadeau <jacq...@dremio.com>
> >> > wrote:
> >> >
> >> >> Can you share what the profile looks like? Where is the time being spent?
> >> >> Unless something is really wrong (or these are gigabyte sized records),
> >> >> I'm guessing there is a configuration issue or bug you are hitting.
> >> >>
> >> >> --
> >> >> Jacques Nadeau
> >> >> CTO and Co-Founder, Dremio
> >> >>
> >> >> On Sun, Feb 28, 2016 at 12:04 AM, Rifat Mahmud <rftm...@gmail.com> wrote:
> >> >>
> >> >> > I am running embedded drill on a single 8 core, 16 GB RAM machine. I am
> >> >> > performing a join query(select * from t1, t2 where t1.a = t2.b) on a remote
> >> >> > MongoDB database. The tables(collections) contain 2 and 4 rows(documents)
> >> >> > only. The query is taking 27 seconds.
> >> >> > Can the query be made faster by using drill with Zookeeper cluster? And if
> >> >> > the answer is yes, by how many factors? Please, elaborate about the
> >> >> > real-timeliness of Apache Drill.