Created an issue: https://issues.apache.org/jira/browse/DRILL-4462
It seems like number of collection doesn't matter. The query takes same
amount of time(around 27 seconds) on database containing only 2 collection.

On Tue, Mar 1, 2016 at 10:17 PM, Jacques Nadeau <[email protected]> wrote:

> If the time is all spent in planning, that won't have much impact as query
> planning is not distributed.
>
> It sounds like we may be taking a while to plan queries against Mongo when
> there are a large number of collections. Can you open a JIRA that we can
> take a look at?
>
> --
> Jacques Nadeau
> CTO and Co-Founder, Dremio
>
> On Mon, Feb 29, 2016 at 10:07 PM, Rifat Mahmud <[email protected]> wrote:
>
> > Is there a possibility that using distributed Drill will shorten the
> > latency
> >
> > On Tue, Mar 1, 2016 at 11:47 AM, Rifat Mahmud <[email protected]> wrote:
> >
> > > 7 seconds on a simple select * command on a table(collection) containin
> > > 158 rows(documents).
> > >
> > > Yes, the mongodb database I am querying into does have 268 collections.
> > >
> > > On Tue, Mar 1, 2016 at 11:13 AM, Jacques Nadeau <[email protected]>
> > > wrote:
> > >
> > >> I haven't had a chance to look at the profile in detail yet. Do you
> see
> > >> consistent behavior on multiple queries?
> > >>
> > >> Does your mongodb happen to have a large number of collections and/or
> > >> databases? (Just guessing here).
> > >>
> > >>
> > >>
> > >> --
> > >> Jacques Nadeau
> > >> CTO and Co-Founder, Dremio
> > >>
> > >> On Sun, Feb 28, 2016 at 8:54 PM, Rifat Mahmud <[email protected]>
> > wrote:
> > >>
> > >> > Here is the json profile of the query: http://pastebin.com/tqang1Y0
> > >> > Attached the screen shot of the query profile web view too.
> > >> > I am using everything in default configuration for
> apache-drill-1.5.0,
> > >> > just changed the MongoDB location from localhost to the remote IP in
> > the
> > >> > storage configuration.
> > >> >
> > >> > On Sun, Feb 28, 2016 at 3:16 PM, Jacques Nadeau <[email protected]
> >
> > >> > wrote:
> > >> >
> > >> >> Can you share what the profile looks like? Where is the time being
> > >> spent?
> > >> >> Unless something is really wrong (or these are gigabyte sized
> > records),
> > >> >> I'm
> > >> >> guessing there is a configuration issue or bug you are hitting.
> > >> >>
> > >> >>
> > >> >> --
> > >> >> Jacques Nadeau
> > >> >> CTO and Co-Founder, Dremio
> > >> >>
> > >> >> On Sun, Feb 28, 2016 at 12:04 AM, Rifat Mahmud <[email protected]>
> > >> wrote:
> > >> >>
> > >> >> > I am running embedded drill on a single 8 core, 16 GB RAM
> machine.
> > I
> > >> am
> > >> >> > performing a join query(select * from t1, t2 where t1.a = t2.b)
> on
> > a
> > >> >> remote
> > >> >> > MongoDB database. The tables(collections) contain 2 and 4
> > >> >> rows(documents)
> > >> >> > only. The query is taking 27 seconds.
> > >> >> > Can the query be made faster by using drill with Zookeeper
> cluster?
> > >> And
> > >> >> if
> > >> >> > the answer is yes, by how many factors? Please, elaborate about
> the
> > >> >> > real-timeliness of Apache Drill.
> > >> >> >
> > >> >>
> > >> >
> > >> >
> > >>
> > >
> > >
> >
>

Reply via email to