Also, I just validated the following sequence without changing the data:
initially I am able to run the query; as more data is added, at some
point the queries start failing with the IOOB error; as that keeps happening,
the drillbits eventually run out of memory and need to be restarted. After
the restart, the queries start working again. (Again, this is without
changing the data.)

Apparently something in this process is leaking memory that is never
released.  I will happily try the 1.3.0 release once I can get
a MapR package, and hopefully it addresses the issue; if not, we'll
have further troubleshooting to do.  (Is there a way to monitor memory?
I would assume it's on the metrics page, but which metrics should I
be paying attention to? There are lots of fields there, and I am not sure
what they all mean; some of them return negative numbers, which is
confusing.)
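
For what it's worth, here is a minimal sketch of how I might watch the memory numbers over time instead of eyeballing the page. It assumes the metrics page is backed by Codahale-style JSON with a top-level "gauges" map; that layout, the name hints, and the helper names are my own guesses for illustration, not something confirmed by the Drill docs.

```python
import json
from urllib.request import urlopen

# Assumption: the metrics JSON looks like
#   {"gauges": {"heap.used": {"value": 123}, ...}}
# These substrings are guesses at which gauge names relate to memory.
MEMORY_HINTS = ("heap", "direct", "memory", "allocator")


def memory_gauges(metrics):
    """Return {gauge name: numeric value} for gauges that look memory-related."""
    result = {}
    for name, body in metrics.get("gauges", {}).items():
        value = body.get("value") if isinstance(body, dict) else None
        # Skip non-numeric gauges (strings, lists, booleans, missing values).
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            continue
        if any(hint in name.lower() for hint in MEMORY_HINTS):
            result[name] = value
    return result


def negative_gauges(gauges):
    """Return the sorted names of gauges whose value is negative.

    The JVM reports -1 for a memory limit that is undefined
    (java.lang.management.MemoryUsage), which may explain some negative
    numbers; whether that applies to this page is an assumption.
    """
    return sorted(name for name, value in gauges.items() if value < 0)


def fetch_metrics(url):
    """Download and parse the metrics JSON from a drillbit (URL is assumed)."""
    with urlopen(url) as response:
        return json.load(response)
```

I would call it as memory_gauges(fetch_metrics("http://<drillbit-host>:8047/status/metrics")) on a timer and log the result, assuming that URL is where the JSON behind the metrics page lives; a value that climbs steadily between restarts would be the one to chase.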

John

On Thu, Nov 5, 2015 at 12:27 PM, John Omernik <[email protected]> wrote:

> Abdel -
>
> Thank you. I do understand it's a challenge for troubleshooting, and I
> apologize for that. I see you have a @maprtech email; are the binaries in
> the release built with MapRDB support? I need that for my MapR cluster,
> which is why I am waiting for a MapR build of 1.3.0.
>
> On Thu, Nov 5, 2015 at 11:44 AM, Abdel Hakim Deneche <
> [email protected]> wrote:
>
>> Hey John,
>>
>> If you want to, you can download the binaries for 1.3 release candidate
>> from [1] and see if you can reproduce the error. You just need to unzip the
>> folder and run "bin/drill-embedded".
>>
>> Without some data to reproduce the issue, it's really hard to come up with
>> an explanation.
>>
>> Thanks
>>
>> [1] http://people.apache.org/~jacques/apache-drill-1.3.0.rc0/
>>
>> On Thu, Nov 5, 2015 at 5:12 AM, John Omernik <[email protected]> wrote:
>>
>> > Hey Steven, I will look into that.  Based on your understanding of the
>> > problem, would DRILL-4006 still apply given these conditions?
>> >
>> > 1. When I query a directory of JSON files, it fails, flagging a
>> > specific JSON file as the culprit. When I remove that file, the query works,
>> > and when I query only that culprit JSON file, it works as well.
>> > 2. When the error occurs, if I restart my drillbits and run the query
>> > again, it seems to work. (This one baffles me.)
>> >
>> > I will look to try the 1.3 release. I am using the 1.2.1 release from MapR, so
>> > I may have to wait until they roll a package for easy install (I want to
>> > include their MapRDB support).
>> >
>> > MapR Team: If you have a current release with the DRILL-4006 fix incorporated
>> > and possibly the JDBC Storage Plugin fixes rolled in for testing, I'd love to
>> > give it a shot (non-supported, of course).
>> >
>> >
>> >
>> > On Thu, Nov 5, 2015 at 12:48 AM, Steven Phillips <[email protected]>
>> > wrote:
>> >
>> > > This looks like DRILL-4006, a fix for which just went in.
>> > >
>> > > https://issues.apache.org/jira/browse/DRILL-4006
>> > >
>> > >
>> > > On Wed, Nov 4, 2015 at 12:16 PM, John Omernik <[email protected]> wrote:
>> > >
>> > > > I am on MapR's 1.2.1 Package.
>> > > >
>> > > >
>> > > >
>> > > >
>> > > > On Wed, Nov 4, 2015 at 2:14 PM, Abdel Hakim Deneche <[email protected]>
>> > > > wrote:
>> > > >
>> > > > > One last thing, what version of Drill do you have installed ?
>> > > > >
>> > > > > On Wed, Nov 4, 2015 at 11:04 AM, John Omernik <[email protected]> wrote:
>> > > > >
>> > > > > > No, I don't think so.  I am running Drill in Marathon on Mesos, so my
>> > > > > > startup settings are all very static. In addition, the only session
>> > > > > > variable I changed was the json as text option at the session level, and
>> > > > > > I set it in both the pre drillbit reboot and the post drillbit
>> > > > > > reboot sessions (I need that to query the data).
>> > > > > >
>> > > > > > On Wed, Nov 4, 2015 at 12:46 PM, Abdel Hakim Deneche <
>> > > > > > [email protected]>
>> > > > > > wrote:
>> > > > > >
>> > > > > > > This is strange indeed. The error message you reported earlier doesn't
>> > > > > > > suggest a memory leak issue but rather a bug when reading a specific set of
>> > > > > > > data.
>> > > > > > > Could it be that you changed some session options, and you forgot to set
>> > > > > > > them again after you restarted the drillbits ?
>> > > > > > >
>> > > > > > > Thanks
>> > > > > > >
>> > > > > > > On Wed, Nov 4, 2015 at 10:37 AM, John Omernik <[email protected]> wrote:
>> > > > > > >
>> > > > > > > > So I pulled out the files that seemed to be causing this issue (I was
>> > > > > > > > up to two), and loaded my data.  (See my other posts on how I did that
>> > > > > > > > by loading into a folder prefixed by .)
>> > > > > > > >
>> > > > > > > > Anywho, my Drill cluster became unstable in general, and I was not able to
>> > > > > > > > run any queries until I bounced my drillbits.
>> > > > > > > >
>> > > > > > > > I did that, got my process working again, and went to try
>> > > > > > > > troubleshooting this problem again, and everything appears to be working
>> > > > > > > > well now.  I am stumped.   Could a memory leak have caused that error only
>> > > > > > > > on some files?  I am monitoring now to determine if the problem starts
>> > > > > > > > again, but that is REALLY strange to me. This seems out of character for
>> > > > > > > > Drill, both in my use of it and in how its memory handling has been
>> > > > > > > > explained to me.  If I get the error again, I'll make sure I set that
>> > > > > > > > option to get a full stack trace.
>> > > > > > > >
>> > > > > > > > John
>> > > > > > > >
>> > > > > > > > On Wed, Nov 4, 2015 at 12:13 PM, Abdel Hakim Deneche <
>> > > > > > > > [email protected]>
>> > > > > > > > wrote:
>> > > > > > > >
>> > > > > > > > > The error message "index: 9604, length: 4 (expected: range(0, 8192))"
>> > > > > > > > > suggests an error happened when Drill tried to access a memory buffer (most
>> > > > > > > > > likely while writing an int or float value)
>> > > > > > > > > This may be a bug actually exposed by that particular data record.
>> > > > > > > > >
>> > > > > > > > > You can try enabling verbose error logging before running the query again:
>> > > > > > > > >
>> > > > > > > > > set `exec.errors.verbose`=true;
>> > > > > > > > >
>> > > > > > > > > This should give us a nice stack trace about this error.
>> > > > > > > > >
>> > > > > > > > > Thanks
>> > > > > > > > >
>> > > > > > > > > On Wed, Nov 4, 2015 at 7:29 AM, John Omernik <[email protected]> wrote:
>> > > > > > > > >
>> > > > > > > > > > There are multiple fields in that record, including two lists. Both lists
>> > > > > > > > > > have data in them. (I am now running with json text mode because at times
>> > > > > > > > > > the first value is a JSON null, but in these cases that should be turned
>> > > > > > > > > > into "null" as a string, if I am understanding things correctly, and
>> > > > > > > > > > shouldn't be causing a problem.)
>> > > > > > > > > >
>> > > > > > > > > >
>> > > > > > > > > >
>> > > > > > > > > > On Wed, Nov 4, 2015 at 9:21 AM, Hsuan Yi Chu <[email protected]> wrote:
>> > > > > > > > > >
>> > > > > > > > > > > What is the data type for that record in line 2402? A list?
>> > > > > > > > > > >
>> > > > > > > > > > > Do you think it could be similar to this issue ?
>> > > > > > > > > > >
>> > > > > > > > > > > https://issues.apache.org/jira/browse/DRILL-4006
>> > > > > > > > > > >
>> > > > > > > > > > >
>> > > > > > > > > > >
>> > > > > > > > > > > On Wed, Nov 4, 2015 at 6:48 AM, John Omernik <[email protected]> wrote:
>> > > > > > > > > > >
>> > > > > > > > > > > > Hey all,
>> > > > > > > > > > > >
>> > > > > > > > > > > > I am working with JSON that is, on the whole, fairly clean. I am trying to
>> > > > > > > > > > > > load it into Parquet files, and the previous day's worth of data worked just
>> > > > > > > > > > > > fine, but today's data has something wrong with it and I can't figure out
>> > > > > > > > > > > > what it is. Unfortunately, I can't post the data, which I know makes this
>> > > > > > > > > > > > hard for the community to troubleshoot. Hopefully I can provide some info
>> > > > > > > > > > > > here, get some pointers on where to look, and then report back on how
>> > > > > > > > > > > > we could potentially improve the error messages.
>> > > > > > > > > > > >
>> > > > > > > > > > > > The error is below.
>> > > > > > > > > > > >
>> > > > > > > > > > > >
>> > > > > > > > > > > > Given the information reported, I am trying to figure out where I'd look to
>> > > > > > > > > > > > troubleshoot this. Obviously the file 02ffc306e877_my_load_1446640931.json
>> > > > > > > > > > > > is where I am looking to start.
>> > > > > > > > > > > >
>> > > > > > > > > > > > This file has 3000 lines (records of data), so it's somewhere in between.
>> > > > > > > > > > > >
>> > > > > > > > > > > > The index/length/expected range don't mean anything to me. I could use some
>> > > > > > > > > > > > help there, because I am not even sure what I am looking for.
>> > > > > > > > > > > >
>> > > > > > > > > > > > The record and/or Fragment... do those help me dig in?
>> > > > > > > > > > > >
>> > > > > > > > > > > > Since this is one record per line, I went to line 2402, but that record
>> > > > > > > > > > > > looks completely normal to me (like all the other ones). Since this is
>> > > > > > > > > > > > dense text, I may well be missing something, but is the record number the
>> > > > > > > > > > > > line number?
>> > > > > > > > > > > >
>> > > > > > > > > > > > Any other pointers I can use to troubleshoot this?
>> > > > > > > > > > > >
>> > > > > > > > > > > > Thanks!
>> > > > > > > > > > > >
>> > > > > > > > > > > > Error:
>> > > > > > > > > > > >
>> > > > > > > > > > > >
>> > > > > > > > > > > > Caused by: org.apache.drill.common.exceptions.UserRemoteException:
>> > > > > > > > > > > > DATA_READ ERROR: Error parsing JSON - index: 9604, length: 4 (expected:
>> > > > > > > > > > > > range(0, 8192))
>> > > > > > > > > > > >
>> > > > > > > > > > > >
>> > > > > > > > > > > >
>> > > > > > > > > > > > File
>> > > > > > > > > > > >
>> > > > > > > > > > > > /etl/dev/my-metadata/mysqspull/loads/2015-11-04/02ffc306e877_my_load_1446640931.json
>> > > > > > > > > > > >
>> > > > > > > > > > > > Record  2402
>> > > > > > > > > > > >
>> > > > > > > > > > > > Fragment 1:5
>> > > > > > > > > > > >
>> > > > > > > > > > >
>> > > > > > > > > >
>> > > > > > > > >
>> > > > > > > > >
>> > > > > > > > >
>> > > > > > > > > --
>> > > > > > > > >
>> > > > > > > > > Abdelhakim Deneche
>> > > > > > > > >
>> > > > > > > > > Software Engineer
>> > > > > > > > >
>> > > > > > > > >   <http://www.mapr.com/>
>> > > > > > > > >
>> > > > > > > > >
>> > > > > > > > > Now Available - Free Hadoop On-Demand Training
>> > > > > > > > > <http://www.mapr.com/training?utm_source=Email&utm_medium=Signature&utm_campaign=Free%20available>
>> > > > > > > > >
>> > > > > > > >
>> > > > > > >
>> > > > > > >
>> > > > > > >
>> > > > > > >
>> > > > > >
>> > > > >
>> > > > >
>> > > > >
>> > > > >
>> > > >
>> > >
>> >
>>
>>
>>
>>
>
>
