So I started having the issue again and was able to capture the verbose
stack dump. But in troubleshooting it, I found I have more questions than
before.
Basically, I had another offending file:
/etl/dev/my-metadata/mysqspull/loads/2015-11-04/3d0b961086b7_my_load_1446660723.json
When I removed that file from the directory I am doing my load in
(2015-11-04), the load worked fine.
I was like WOO this is a file issue.
Then I put it into a new directory named bad
and ran the same CREATE TABLE, but to a different destination and with
dir0 = 'bad'.
And it too worked without issue!
This confused me, because it indicates to me that it's not something about
the data that's messing up Drill; maybe it's how much data, or how many
files I am looking at?
Any thoughts on the stack trace or anything else you see would be helpful.
I am stumped.
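For reference, here is roughly how I have been poking at the suspect file
in isolation; treat it as a sketch rather than the exact statements I ran
(the LIMIT is arbitrary, and the path is just the offending file from
above):

-- Same session options as the failing run.
ALTER SESSION SET `store.json.all_text_mode` = true;
ALTER SESSION SET `exec.errors.verbose` = true;

-- Read the suspect file by itself, in place, without moving it anywhere.
-- If this succeeds while the full-directory CTAS fails, that points away
-- from the contents of this one file.
SELECT *
FROM dfs.etldev.`my-metadata/mysqspull/loads/2015-11-04/3d0b961086b7_my_load_1446660723.json`
LIMIT 10;

If the single-file read keeps working, my next step is to run the same
CTAS against progressively larger subsets of the directory to see whether
it's the number of files or the total volume that trips it up.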
Error with Verbose Stack dump
> ALTER SESSION set `store.json.all_text_mode` = true;
+-------+------------------------------------+
| ok | summary |
+-------+------------------------------------+
| true | store.json.all_text_mode updated. |
+-------+------------------------------------+
1 row selected (0.201 seconds)
> CREATE TABLE dfs.dev.`my-metadata/.2015-11-04` as
. . . . . . . . . . . . . . . . . . . . . . .> (
. . . . . . . . . . . . . . . . . . . . . . .> select
. . . . . . . . . . . . . . . . . . . . . . .> cast(total as int) as total, sha1, sha256, myhash, cast(`timestamp` as bigint) as `timestamp`, tags,
. . . . . . . . . . . . . . . . . . . . . . .> link, cast(positives as int) as positives, cast(positives_delta as int) as positives_delta, ssdeep, cast(size as int) as size, type, report,
. . . . . . . . . . . . . . . . . . . . . . .> cast(first_seen as timestamp) as `first_seen`, md5, cast(last_seen as timestamp) as `last_seen`,
. . . . . . . . . . . . . . . . . . . . . . .> name, source_country, source_id
. . . . . . . . . . . . . . . . . . . . . . .> from dfs.etldev.`my-metadata/mysqspull/loads/` where dir0 = '2015-11-04'
. . . . . . . . . . . . . . . . . . . . . . .> );
Error: DATA_READ ERROR: Error parsing JSON - index: 16320, length: 4 (expected: range(0, 4096))
File /etl/dev/my-metadata/mysqspull/loads/2015-11-04/3d0b961086b7_my_load_1446660723.json
Record 4081
Fragment 1:10
[Error Id: cf211a0a-2a1d-4860-bc1a-d9ff2657f973 on node2:31010]
(java.lang.IndexOutOfBoundsException) index: 16320, length: 4 (expected: range(0, 4096))
io.netty.buffer.DrillBuf.checkIndexD():189
io.netty.buffer.DrillBuf.chk():211
io.netty.buffer.DrillBuf.getInt():491
org.apache.drill.exec.vector.UInt4Vector$Accessor.get():364
org.apache.drill.exec.vector.complex.BaseRepeatedValueVector$BaseRepeatedMutator.startNewValue():237
org.apache.drill.exec.vector.complex.impl.RepeatedVarCharWriterImpl.setPosition():157
org.apache.drill.exec.vector.complex.impl.SingleListWriter.varChar():768
org.apache.drill.exec.vector.complex.fn.JsonReader.handleString():463
org.apache.drill.exec.vector.complex.fn.JsonReader.writeDataAllText():554
org.apache.drill.exec.vector.complex.fn.JsonReader.writeDataAllText():389
org.apache.drill.exec.vector.complex.fn.JsonReader.writeDataAllText():393
org.apache.drill.exec.vector.complex.fn.JsonReader.writeDataSwitch():239
org.apache.drill.exec.vector.complex.fn.JsonReader.writeToVector():179
org.apache.drill.exec.vector.complex.fn.JsonReader.write():145
org.apache.drill.exec.store.easy.json.JSONRecordReader.next():181
org.apache.drill.exec.physical.impl.ScanBatch.next():183
org.apache.drill.exec.record.AbstractRecordBatch.next():104
org.apache.drill.exec.record.AbstractRecordBatch.next():94
org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext():51
org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext():129
org.apache.drill.exec.record.AbstractRecordBatch.next():147
org.apache.drill.exec.record.AbstractRecordBatch.next():104
org.apache.drill.exec.record.AbstractRecordBatch.next():94
org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext():51
org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext():129
org.apache.drill.exec.record.AbstractRecordBatch.next():147
org.apache.drill.exec.record.AbstractRecordBatch.next():104
org.apache.drill.exec.record.AbstractRecordBatch.next():94
org.apache.drill.exec.physical.impl.WriterRecordBatch.innerNext():91
org.apache.drill.exec.record.AbstractRecordBatch.next():147
org.apache.drill.exec.physical.impl.BaseRootExec.next():83
org.apache.drill.exec.physical.impl.SingleSenderCreator$SingleSenderRootExec.innerNext():93
org.apache.drill.exec.physical.impl.BaseRootExec.next():73
org.apache.drill.exec.work.fragment.FragmentExecutor$1.run():258
org.apache.drill.exec.work.fragment.FragmentExecutor$1.run():252
java.security.AccessController.doPrivileged():-2
javax.security.auth.Subject.doAs():422
org.apache.hadoop.security.UserGroupInformation.doAs():1566
org.apache.drill.exec.work.fragment.FragmentExecutor.run():252
org.apache.drill.common.SelfCleaningRunnable.run():38
java.util.concurrent.ThreadPoolExecutor.runWorker():1142
java.util.concurrent.ThreadPoolExecutor$Worker.run():617
java.lang.Thread.run():745 (state=,code=0)
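Also, in case it helps narrow down reader vs. writer, this is the kind of
read-only pass I'm planning to try over the same directory (again just a
sketch, with the same where clause as the CTAS, to see whether the JSON
read alone fails without the Parquet write):

SELECT count(*)
FROM dfs.etldev.`my-metadata/mysqspull/loads/`
WHERE dir0 = '2015-11-04';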
On Wed, Nov 4, 2015 at 1:04 PM, John Omernik <[email protected]> wrote:
> No, I don't think so. I am running Drill in Marathon on Mesos, so my
> startup settings are all very static. In addition, the only session
> variable I changed was the JSON all-text option at the session level, and
> I was setting it in both the pre-drillbit-reboot and post-drillbit-reboot
> sessions (I need that to query the data).
>
> On Wed, Nov 4, 2015 at 12:46 PM, Abdel Hakim Deneche <[email protected]> wrote:
>
>> This is strange indeed. The error message you reported earlier doesn't
>> suggest a memory leak issue but rather a bug when reading a specific set
>> of data.
>> Could it be that you changed some session options and forgot to set
>> them again after you restarted the drillbits?
>>
>> Thanks
>>
>> On Wed, Nov 4, 2015 at 10:37 AM, John Omernik <[email protected]> wrote:
>>
>> > So I pulled out the (I was up to two) files that seemed to be causing
>> > this issue, and loaded my data. (See my other posts on how I did that
>> > by loading into a folder prefixed by ".")
>> >
>> > Anywho, my Drill cluster became unstable in general, and I was not able
>> > to run any queries until I bounced my drillbits.
>> >
>> > I did that, got my process working again, and went to try
>> > troubleshooting this problem again, and everything appears to be
>> > working well now. I am stumped. Could a memory leak have caused that
>> > error only on some files? I am monitoring now to determine if the
>> > problem starts again, but that is REALLY strange to me. This seems out
>> > of character for Drill, both in my use of it and in how its memory
>> > handling has been explained to me. If I get the error again, I'll make
>> > sure I set that option to get a full stack trace.
>> >
>> > John
>> >
>> > On Wed, Nov 4, 2015 at 12:13 PM, Abdel Hakim Deneche <[email protected]> wrote:
>> >
>> > > The error message "index: 9604, length: 4 (expected: range(0, 8192))"
>> > > suggests an error happened when Drill tried to access a memory buffer
>> > > (most likely while writing an int or float value).
>> > > This may be a bug actually exposed by that particular data record.
>> > >
>> > > You can try enabling verbose error logging before running the query
>> > > again:
>> > >
>> > > set `exec.errors.verbose`=true;
>> > >
>> > > This should give us a nice stack trace about this error.
>> > >
>> > > Thanks
>> > >
>> > > On Wed, Nov 4, 2015 at 7:29 AM, John Omernik <[email protected]> wrote:
>> > >
>> > > > There are multiple fields in that record, including two lists. Both
>> > > > lists have data in them. (Now I am running with JSON all-text mode
>> > > > because at times the first value is a JSON null, but in these cases
>> > > > that should be turned into "null" as a string, if I am understanding
>> > > > things correctly, and shouldn't be causing a problem.)
>> > > >
>> > > >
>> > > >
>> > > > > On Wed, Nov 4, 2015 at 9:21 AM, Hsuan Yi Chu <[email protected]> wrote:
>> > > >
>> > > > > What is the data type for that record in line 2402? A list?
>> > > > >
>> > > > > Do you think it could be similar to this issue?
>> > > > >
>> > > > > https://issues.apache.org/jira/browse/DRILL-4006
>> > > > >
>> > > > >
>> > > > >
>> > > > > On Wed, Nov 4, 2015 at 6:48 AM, John Omernik <[email protected]> wrote:
>> > > > >
>> > > > > > Hey all,
>> > > > > >
>> > > > > > I am working with JSON that is, on the whole, fairly clean. I am
>> > > > > > trying to load it into Parquet files. The previous day's worth of
>> > > > > > data worked just fine, but today's data has something wrong with
>> > > > > > it and I can't figure out what it is. Unfortunately, I can't post
>> > > > > > the data, which I know makes this hard for the community to
>> > > > > > troubleshoot. Hopefully I can provide some info here, get some
>> > > > > > pointers on where to look, and then report back on how we could
>> > > > > > potentially improve the error messages.
>> > > > > >
>> > > > > > The error is below.
>> > > > > >
>> > > > > > I am looking to figure out, given the information reported, where
>> > > > > > I'd look to troubleshoot this. Obviously the file
>> > > > > > 02ffc306e877_my_load_1446640931.json is where I am looking to
>> > > > > > start.
>> > > > > >
>> > > > > > This file has 3000 lines (records) of data, so it's somewhere in
>> > > > > > between.
>> > > > > >
>> > > > > > The index/length/expected range don't mean anything to me; I
>> > > > > > could use some help there, because I am not even sure what I am
>> > > > > > looking for.
>> > > > > >
>> > > > > > The record and/or Fragment... do those help me dig in?
>> > > > > >
>> > > > > > Since this is one record per line, I went to line 2402, but that
>> > > > > > record looks completely normal to me (like all the other ones).
>> > > > > > Since this is dense text I am obviously missing something, but is
>> > > > > > the record number the line number?
>> > > > > >
>> > > > > > Any other pointers I can use to troubleshoot this?
>> > > > > >
>> > > > > > Thanks!
>> > > > > >
>> > > > > > Error:
>> > > > > >
>> > > > > > Caused by: org.apache.drill.common.exceptions.UserRemoteException:
>> > > > > > DATA_READ ERROR: Error parsing JSON - index: 9604, length: 4
>> > > > > > (expected: range(0, 8192))
>> > > > > >
>> > > > > > File
>> > > > > > /etl/dev/my-metadata/mysqspull/loads/2015-11-04/02ffc306e877_my_load_1446640931.json
>> > > > > >
>> > > > > > Record 2402
>> > > > > >
>> > > > > > Fragment 1:5
>> > > > > >
>> > > > >
>> > > >
>> > >
>> > >
>> > >
>
>