Re: ***UNCHECKED*** Re: Query Error on PCAP over MapR FS
This stack trace makes it clear that this is a bug in the PCAP decoder, caused by a misunderstanding of how to force large files to be read in one batch on a single drillbit. Are there some real Drill experts out there who can provide hints about how to avoid this?

On Tue, Sep 12, 2017 at 5:03 AM, Takeo Ogawara wrote:
> Sorry, I paste plain texts.
>
> [stack trace for "SYSTEM ERROR: IllegalStateException: Bad magic number = 0a0d0d0a, Fragment 1:200, on node22:31010", quoted in full in the message below: the check fails in PacketDecoder.<init> (PacketDecoder.java:84), called from PcapRecordReader.setup (PcapRecordReader.java:104) during ScanBatch construction]
Re: ***UNCHECKED*** Re: Query Error on PCAP over MapR FS
On Tue, Sep 12, 2017 at 4:53 AM, Takeo Ogawara wrote:
>> Is it absolutely required to query large files like this? Would it be acceptable to split the file first by making a quick scan over it?
>
> No, loading the large file isn't strictly required.
> In fact, this large PCAP file was created by concatenating small PCAP files with the mergecap command,
> so there is no problem with feeding the small PCAP files into Drill.
>
> How can I analyze a number of PCAP files together?

Simply specify a directory instead of a file. If the directory contains PCAP files, then you will query those files as if they were one table. You can also specify a wildcard to query just some of the files.
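One thing worth checking on the mergecap angle: 0a0d0d0a is exactly the byte pattern that opens a pcapng section header, and recent mergecap versions write pcapng by default unless you pass -F pcap, so the merged file may not be classic pcap at all. A quick standalone sketch of that magic-number check (hypothetical helper name, not Drill code):

```python
import struct

# Known capture-file magic numbers, per the pcap/pcapng formats.
PCAP_MAGIC = 0xa1b2c3d4     # classic pcap, microsecond timestamps
PCAP_NS_MAGIC = 0xa1b23c4d  # classic pcap, nanosecond timestamps
PCAPNG_SHB = 0x0a0d0d0a     # pcapng Section Header Block type

def classify_capture(first4: bytes) -> str:
    """Classify a capture file from its first four bytes."""
    be = struct.unpack(">I", first4)[0]
    le = struct.unpack("<I", first4)[0]
    # Classic pcap magic can appear in either byte order.
    if be in (PCAP_MAGIC, PCAP_NS_MAGIC) or le in (PCAP_MAGIC, PCAP_NS_MAGIC):
        return "pcap"
    # The pcapng SHB magic is a byte palindrome: same in either order.
    if be == PCAPNG_SHB:
        return "pcapng"
    return "unknown"
```

If this reports pcapng for the merged file, re-merging with `mergecap -F pcap` should produce a file the Drill PCAP reader can at least open.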
Re: ***UNCHECKED*** Re: Query Error on PCAP over MapR FS
Sorry, I paste plain texts.

> 2017-09-11 15:06:52,390 [BitServer-2] WARN o.a.d.exec.rpc.control.WorkEventBus - A fragment message arrived but there was no registered listener for that message: profile {
> state: FAILED
> error {
> error_id: "bbf284b6-9da4-4869-ac20-fa100eed11b9"
> endpoint { address: "node22" user_port: 31010 control_port: 31011 data_port: 31012 version: "1.11.0" }
> error_type: SYSTEM
> message: "SYSTEM ERROR: IllegalStateException: Bad magic number = 0a0d0d0a\n\nFragment 1:200\n\n[Error Id: bbf284b6-9da4-4869-ac20-fa100eed11b9 on node22:31010]"
> exception {
> exception_class: "java.lang.IllegalStateException"
> message: "Bad magic number = 0a0d0d0a"
> stack_trace { class_name: "com.google.common.base.Preconditions" file_name: "Preconditions.java" line_number: 173 method_name: "checkState" is_native_method: false }
> stack_trace { class_name: "org.apache.drill.exec.store.pcap.decoder.PacketDecoder" file_name: "PacketDecoder.java" line_number: 84 method_name: "<init>" is_native_method: false }
> stack_trace { class_name: "org.apache.drill.exec.store.pcap.PcapRecordReader" file_name: "PcapRecordReader.java" line_number: 104 method_name: "setup" is_native_method: false }
> stack_trace { class_name: "org.apache.drill.exec.physical.impl.ScanBatch" file_name: "ScanBatch.java" line_number: 104 method_name: "<init>" is_native_method: false }
> stack_trace { class_name: "org.apache.drill.exec.store.dfs.easy.EasyFormatPlugin" file_name: "EasyFormatPlugin.java" line_number: 166 method_name: "getReaderBatch" is_native_method: false }
> stack_trace { class_name: "org.apache.drill.exec.store.dfs.easy.EasyReaderBatchCreator" file_name: "EasyReaderBatchCreator.java" line_number: 35 method_name: "getBatch" is_native_method: false }
> stack_trace { class_name: "org.apache.drill.exec.store.dfs.easy.EasyReaderBatchCreator" file_name: "EasyReaderBatchCreator.java" line_number: 28 method_name: "getBatch" is_native_method: false }
> stack_trace { class_name: "org.apache.drill.exec.physical.impl.ImplCreator" file_name: "ImplCreator.java" line_number: 156 method_name: "getRecordBatch" is_native_method: false }
> stack_trace { class_name: "org.apache.drill.exec.physical.impl.ImplCreator" file_name: "ImplCreator.java" line_number: 179 method_name: "getChildren" is_native_method: false }
> stack_trace { class_name: "org.apache.drill.exec.physical.impl.ImplCreator" file_name: "ImplCreator.java" line_number: 136 method_name: "getRecordBatch" is_native_method: false }
> stack_trace { class_name: "org.apache.drill.exec.physical.impl.ImplCreator" file_name: "ImplCreator.java" line_number: 179 method_name: "getChildren" is_native_method: false }
> stack_trace { class_name: "org.apache.drill.exec.physical.impl.ImplCreator" file_name: "ImplCreator.java" line_number: 136 method_name: "getRecordBatch" is_native_method: false }
> stack_trace { class_name: "org.apache.drill.exec.physical.impl.ImplCreator" file_name: "ImplCreator.java" line_number: 179 method_name: "getChildren" is_native_method: false }
> stack_trace { class_name: "org.apache.drill.exec.physical.impl.ImplCreator" file_name: "ImplCreator.java" line_number: 109 method_name: "getRootExec" is_native_method: false }
> stack_trace { class_name: "org.apache.drill.exec.physical.impl.ImplCreator" file_name: "ImplCreator.java" line_number: 87 method_name: "getExec" is_native_method: false }
> stack_trace { class_name: "org.apache.drill.exec.work.fragment.FragmentExecutor" file_name: "FragmentExecutor.java" line_number: 207 method_name: "run" is_native_method: false }
> stack_trace { class_name: "org.apache.drill.common.SelfCleaningRunnable" file_name: "SelfCleaningRunnable.java" line_number: 38 method_name: "run" is_native_method: false }
> stack_trace { class_name: "..." line_number: 0 }
Re: Query Error on PCAP over MapR FS
On Mon, Sep 11, 2017 at 11:23 AM, Takeo Ogawara wrote:
> ...
> 1. Query error when cluster-name is not specified
> ...
> With this setting, the following query failed.
> select * from mfs.`x.pcap` ;
> Error: DATA_READ ERROR: /x.pcap (No such file or directory)
>
> File name: /x.pcap
> Fragment 0:0
>
> [Error Id: 70b73062-c3ed-4a10-9a88-034b4e6d039a on node21:31010] (state=,code=0)
>
> But these queries passed.
> select * from mfs.root.`x.pcap` ;
> select * from mfs.`x.csv`;
> select * from mfs.root.`x.csv`;

As Andries mentioned, the problem here has to do with understanding how Drill manipulates paths; it has nothing to do with the PCAP capabilities. Usually, what I do is put entries into the configuration which point directly at the directory above my data, but I can't add anything to Andries' comment.

> 2. Large PCAP file
> A query on a very large PCAP file (larger than 100GB) failed with the following error message.
> Error: SYSTEM ERROR: IllegalStateException: Bad magic number = 0a0d0d0a
>
> Fragment 1:169
>
> [Error Id: 8882c359-c253-40c0-866c-417ef1ce5aa3 on node22:31010] (state=,code=0)
>
> This happens even on Linux FS, not only MapR FS.

Can you provide the stack trace from the drillbit that hit the problem? I suspect that this has to do with splitting of the PCAP file. Normally, it is assumed that parallelism will be achieved by having lots of smaller files, since it is difficult to jump into the middle of a PCAP file and get good results. Even if we disable splitting to avoid this error, you will have the complementary problem of slow queries due to single-threading. That doesn't seem very satisfactory either. A related problem is that splitting a PCAP file pretty much requires a single-threaded read of the file in question. The read doesn't need to process very much data, but it does need to touch the whole file. Is it absolutely required to query large files like this? Would it be acceptable to split the file first by making a quick scan over it?
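The quick scan proposed above only has to read each 16-byte per-record header and skip over the payloads, which is why it touches the whole file but processes almost none of it. A minimal sketch of that scan, assuming a classic (non-pcapng) capture held in memory; the helper name is hypothetical:

```python
import struct

def packet_offsets(buf: bytes):
    """Yield the byte offset of every per-packet record header in a
    classic pcap buffer. A single sequential pass like this is what
    splitting a large capture on packet boundaries would require."""
    # 24-byte global header: magic, version_major, version_minor,
    # thiszone, sigfigs, snaplen, network (link type)
    magic = struct.unpack("<I", buf[:4])[0]
    # Reading 0xa1b2c3d4 with "<" means the file was written little-endian.
    endian = "<" if magic == 0xa1b2c3d4 else ">"
    off = 24
    while off + 16 <= len(buf):
        yield off
        # Per-record header: ts_sec, ts_usec, incl_len, orig_len
        _, _, incl_len, _ = struct.unpack(endian + "IIII", buf[off:off + 16])
        off += 16 + incl_len  # skip the captured payload
```

A splitter would emit the offsets at roughly even intervals and cut the file there, so each piece starts on a packet boundary.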
Re: Querying MapR-DB JSON Tables not returning results when specifying columns or CF's
Do you think this is a regression? Can you try with Drill 1.11?

Thanks,
Padma

> On Sep 11, 2017, at 10:21 AM, Andries Engelbrecht wrote:
> [full message quoted below]
Re: Workaround for drill queries during node failure
Did you mean to say "we could not execute any queries"? Need more details about the configuration you have. When you say data is available on other nodes, is it because you have replication configured (assuming it is DFS)? What exactly are you trying, and what error do you see when you try to execute the query?

Thanks,
Padma

On Sep 11, 2017, at 9:40 AM, Kshitija Shinde wrote:

Hi, We have installed drill in distributed mode. While testing drillbit we have observed that if one of node is done then we could execute any queries against the drill even if data is available on other nodes. Is there any workaround for this? Thanks, Kshitija
Querying MapR-DB JSON Tables not returning results when specifying columns or CF's
Created a MapR-DB JSON table, but not able to query data specifying a column or CF's.

When doing a select * the data is returned, i.e.:

0: jdbc:drill:> select * from dfs.maprdb.`/sdc/nycbike` b limit 1;
| _id | age | arc | avg_speed_mph | bikeid | birth year | end station id | end station latitude | end station longitude | end station name | gender | start station id | start station latitude | start station longitude | start station name | start_date | starttime | stoptime | tripduration | tripid | usertype | station |
| 2017-04-01 00:00:58-25454 | 51.0 | 0.39 | 7.2 | 25454 | 1966.0 | 430 | 40.7014851 | -73.98656928 | York St & Jay St | M | 217 | 40.70277159 | -73.99383605 | Old Fulton St | 2017-04-01 | 2017-04-01 00:00:58 | 2017-04-01 00:04:14 | 195 | 2017-04-01 00:00:58-25454 | Subscriber | {"end station id":"430"} |
1 row selected (0.191 seconds)

However, trying to specify a column or CF name, nothing is returned.

Specify a column name:

0: jdbc:drill:> select bikeid from dfs.maprdb.`/sdc/nycbike` b limit 10;
+--+
|  |
+--+
+--+
No rows selected (0.067 seconds)

0: jdbc:drill:> select b.bikeid from dfs.maprdb.`/sdc/nycbike` b limit 1;
+--+
|  |
+--+
+--+
No rows selected (0.062 seconds)

Specify a CF name, same result:

0: jdbc:drill:> select b.station from dfs.maprdb.`/sdc/nycbike` b limit 1;
+--+
|  |
+--+
+--+
No rows selected (0.063 seconds)

This is Drill 1.10, and the user has full read/write/traverse permissions on the table.

Thanks
Andries
Workaround for drill queries during node failure
Hi,

We have installed drill in distributed mode. While testing drillbit we have observed that if one of node is done then we could execute any queries against the drill even if data is available on other nodes. Is there any workaround for this?

Thanks,
Kshitija
Re: Query Error on PCAP over MapR FS
Typically when you use the MapR-FS plugin you don't need to specify the cluster root path in the dfs workspace. Instead of "location": "/mapr/cluster3", use "location": "/"; the "connection": "maprfs:///" setting already points to the default MapR cluster root.

--Andries

On 9/11/17, 2:23 AM, "Takeo Ogawara" wrote:
> [original message quoted in full below]
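Putting that suggestion together with the "mfs" plugin from the original message, the workspace would look like this (a sketch only; the workspace name and other fields are carried over unchanged from the thread, and the plugin may have additional keys such as "formats" not shown here):

```json
{
  "type": "file",
  "enabled": true,
  "connection": "maprfs:///",
  "config": null,
  "workspaces": {
    "root": {
      "location": "/",
      "writable": false,
      "defaultInputFormat": null
    }
  }
}
```

With this, both `mfs.`x.pcap`` and `mfs.root.`x.pcap`` should resolve to the same file at the cluster root.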
Query Error on PCAP over MapR FS
Dear all,

I’m using the PCAP storage plugin over MapR FS (5.2.0) with Drill (1.11.0) compiled as follows.

$ mvn clean install -DskipTests -Pmapr

Some queries caused errors as follows. Does anyone know how to solve these errors?

1. Query error when cluster-name is not specified

The storage "mfs" setting is this:

> "type": "file",
> "enabled": true,
> "connection": "maprfs:///",
> "config": null,
> "workspaces": {
>   "root": {
>     "location": "/mapr/cluster3",
>     "writable": false,
>     "defaultInputFormat": null
>   }
> }

With this setting, the following query failed.

> select * from mfs.`x.pcap` ;
> Error: DATA_READ ERROR: /x.pcap (No such file or directory)
>
> File name: /x.pcap
> Fragment 0:0
>
> [Error Id: 70b73062-c3ed-4a10-9a88-034b4e6d039a on node21:31010] (state=,code=0)

But these queries passed.

> select * from mfs.root.`x.pcap` ;
> select * from mfs.`x.csv`;
> select * from mfs.root.`x.csv`;

2. Large PCAP file

A query on a very large PCAP file (larger than 100GB) failed with the following error message.

> Error: SYSTEM ERROR: IllegalStateException: Bad magic number = 0a0d0d0a
>
> Fragment 1:169
>
> [Error Id: 8882c359-c253-40c0-866c-417ef1ce5aa3 on node22:31010] (state=,code=0)

This happens even on Linux FS, not only MapR FS.

Thank you.