That seems to be allowing longer queries (timeframe-wise). Interesting:
to_date(dir0) has no issues, while plain dir0 hits heap space issues... is
this expected? Is this a minor bug (one that would need a JIRA)?
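
For reference, here is roughly what the two versions of the view look like
(a sketch only: the view and field names are the placeholders used in this
thread, and the 'yyyy-MM-dd' format string is my assumption about how the
directory names parse):

  -- view that hits heap space issues: dir0 comes through as a varchar,
  -- so the where clause compares directory names as strings
  create or replace view view_table as
    select dir0 as src_date, field1, field2, field3 from `table`;

  -- view that seems fine: convert the partition directory to a DATE up front
  create or replace view view_table as
    select to_date(dir0, 'yyyy-MM-dd') as src_date, field1, field2, field3
    from `table`;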




On Tue, Mar 1, 2016 at 11:32 AM, Jason Altekruse <[email protected]>
wrote:

> To_date takes a long assumed to be a Unix timestamp, so the error you are
> getting here is from an implicit cast trying to turn the string into a long
> before converting it to a date. You can provide a second parameter to tell
> to_date how to parse these kinds of date strings.
>
> https://drill.apache.org/docs/data-type-conversion/#to_date
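>
> For example, something like this should parse the value from the error
> message (a minimal sketch; the format string follows the pattern syntax
> described in the docs linked above):
>
>   select to_date('2015-11-12', 'yyyy-MM-dd') from sys.version;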
>
> On Tue, Mar 1, 2016 at 9:17 AM, John Omernik <[email protected]> wrote:
>
> > In the view I have select to_date(dir0) as sub_date...
> >
> > When I run a query, I am getting "Error: SYSTEM ERROR:
> > NumberFormatException: 2015-11-12"
> >
> > *Even though I am using a where sub_date >= '2016-02-20', although I
> > think this has to do with the planning slowness I've spoken about.
> >
> >
> >
> > On Tue, Mar 1, 2016 at 10:14 AM, Jacques Nadeau <[email protected]>
> > wrote:
> >
> > > In the view.
> > >
> > > --
> > > Jacques Nadeau
> > > CTO and Co-Founder, Dremio
> > >
> > > On Tue, Mar 1, 2016 at 6:02 AM, John Omernik <[email protected]> wrote:
> > >
> > > > In the view or in the query?
> > > >
> > > > On Mon, Feb 29, 2016 at 9:05 PM, Jacques Nadeau <[email protected]>
> > > > wrote:
> > > >
> > > > > Can you try to convert src_date to a date type?
> > > > >
> > > > > --
> > > > > Jacques Nadeau
> > > > > CTO and Co-Founder, Dremio
> > > > >
> > > > > On Mon, Feb 29, 2016 at 10:28 AM, John Omernik <[email protected]>
> > > > > wrote:
> > > > >
> > > > > > I am running 6 drillbits. They were running with 20 GB of direct
> > > > > > memory and 4 GB of heap, and I altered them to run with 18 GB of
> > > > > > direct and 6 GB of heap, and I am still getting this error.
> > > > > >
> > > > > > I am running a query and trying to understand why so much heap
> > > > > > space is being used. The data is Parquet files, organized into
> > > > > > directories by date (2015-01-01, 2015-01-02, etc.):
> > > > > >
> > > > > > TABLE
> > > > > > ---> 2015-01-01
> > > > > > ---> 2015-01-02
> > > > > >
> > > > > > Etc
> > > > > >
> > > > > > This data isn't what I would call "huge": at most 500 MB per day,
> > > > > > with 69 Parquet files per day. While I do have the planning issue
> > > > > > related to lots of directories with lots of files (see other
> > > > > > emails), I don't think that is related here.
> > > > > >
> > > > > > I have a view that is basically select dir0 as src_date, field1,
> > > > > > field2, field3 from table, and then I run a query such as
> > > > > >
> > > > > > select src_date, count(1) from view_table
> > > > > > where src_date >= '2016-02-25'
> > > > > > group by src_date
> > > > > >
> > > > > > That will work.
> > > > > >
> > > > > > If I run
> > > > > >
> > > > > > select src_date, count(1) from view_table
> > > > > > where src_date >= '2016-02-01'
> > > > > > group by src_date
> > > > > >
> > > > > > That will hang, and eventually I will see a drillbit crash and
> > > > > > restart, and the error logs point to Java heap space issues. This
> > > > > > is the same with 4 GB or 6 GB of heap space.
> > > > > >
> > > > > > So my question is this...
> > > > > >
> > > > > > Given the data, how do I troubleshoot this and provide helpful
> > > > > > feedback? I am running the MapR 1.4 Developer Release right now.
> > > > > > This seems like an issue to me: why would a single query be able
> > > > > > to crash a node? Shouldn't the query be terminated? Even so, why
> > > > > > would 30 days of 500 MB of data (i.e. it would take 15 GB of
> > > > > > direct RAM per node, which is available, to load the ENTIRE data
> > > > > > set into RAM) crash given that sort of aggregation?
> > > > > >
> > > > >
> > > >
> > >
> >
>
