Hi Ted, 
I thought about that approach as well.  My concern was cluttering up the plugin 
with lots of columns, especially as we add different protocols.  However, if 
that's not a concern, I can have a go at it.  
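
A minimal sketch of the flat, nullable-column layout, to make the column-count
concern concrete. Everything here is an assumption for illustration: the
column names are made up, `FlatPacketRow` is a hypothetical helper, and none of
it is Drill's reader API or the PCAP plugin's actual schema.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: these names do not come from Drill or the PCAP plugin.
final class FlatPacketRow {
    // Every protocol-specific column the reader exposes, in output order.
    // Each new protocol appends its own columns to this list.
    static final List<String> COLUMNS =
        List.of("src_port", "dst_port", "tcp_flags", "udp_length", "dns_query");

    // Start with every column null, then fill in only the fields that were
    // actually decoded for this packet. Fields of other protocols stay null.
    static Map<String, Object> flatten(Map<String, Object> decodedFields) {
        Map<String, Object> row = new LinkedHashMap<>();
        for (String column : COLUMNS) {
            row.put(column, null);
        }
        for (Map.Entry<String, Object> field : decodedFields.entrySet()) {
            if (!row.containsKey(field.getKey())) {
                throw new IllegalArgumentException("unknown column: " + field.getKey());
            }
            row.put(field.getKey(), field.getValue());
        }
        return row;
    }
}
```

A UDP packet would then populate `src_port` and `udp_length` while `tcp_flags`
and `dns_query` stay null, so the row width grows with every protocol added.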

I was thinking the same thing about Kaitai Struct.  Would it be possible to
have a generic reader where you provide the schema and Drill maps it to
columns as appropriate?  That way you could use all the formats from the
Kaitai format gallery pretty much instantly.



> On Apr 23, 2019, at 5:08 PM, Ted Dunning <ted.dunn...@gmail.com> wrote:
> 
> Wow. Kaitai looks fabulous. It would be tempting to define a generic format
> that could use a kaitai spec to define the format of a file.
> 
> Regarding the map output, I think we solved the same problem in the PCAP
> parser itself by simply putting all of the fields at the top level and
> making them nullable. This means that the UDP stuff is null for TCP packets.
> 
> The same approach could be taken for other packets. If parsing is lazy,
> then reference to a parsed column would be required to trigger the parsing
> of a packet.
> 
> 
> 
> On Tue, Apr 23, 2019 at 10:52 AM Charles Givre <cgi...@gmail.com> wrote:
> 
>> Hi Ted,
>> The library that gave me the idea is Kaitai Struct.  The Java library
>> itself is released under the Apache or MIT license.  It can parse a number
>> of binary formats including DNS packets, ICMP and many others.  It accepts
>> a byte[] as input. I already wrote working code that reads it, but I’m not
>> sure how to output these results in Drill.
>> 
>> Sent from my iPhone
>> 
>>> On Apr 23, 2019, at 12:45, Ted Dunning <ted.dunn...@gmail.com> wrote:
>>> 
>>> I think this would be very useful, particularly if it is easy to add
>>> additional parsing methods.
>>> 
>>> When I started the pcap work, I couldn't find any libraries that combined
>>> what we needed in terms of function and license.
>>> 
>>>> On Tue, Apr 23, 2019, 9:34 AM Charles Givre <cgi...@gmail.com> wrote:
>>>> 
>>>> Hello all,
>>>> I saw a few open source libraries that parse actual packet content and was
>>>> interested in incorporating this into Drill's PCAP parser.  I was thinking
>>>> initially of writing this as a UDF, however, I think it would be much
>>>> better to include this directly in Drill.  What I was thinking was to
>>>> create a field called parsed_packet that would be a Drill Map.  The
>>>> contents of this field would vary depending on the type of packet.  For
>>>> instance, if it is a DNS packet, you get all the DNS info, ICMP etc...
>>>> Does the community think this is a good idea?   Also, given the structure
>>>> of the PCAP plugin, I'm not quite sure how to create a Map field with
>>>> variable contents.  Are there any examples that use the same architecture
>>>> as the PCAP plugin?
>>>> Thanks,
>>>> -- C
>> 
