If you want to store the addresses and manipulate them, the best format is
integers (or binary): that allows all the fast operations such as masking and
subnet querying, but producing a text representation requires a conversion.
It depends heavily on the use case, but converting from an integer to
PostgreSQL's inet or cidr is very straightforward. Alternatively, you can store
the addresses as text; the conversion is then done automatically on the
PostgreSQL side, but operations in Arrow such as comparison, hashing, or
sorting will be costly.
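
To illustrate why the integer form is convenient, here is a minimal sketch in
plain Python (standard library only, no Arrow involved). The helper names
ipv4_to_int, int_to_ipv4 and in_subnet are my own, chosen for illustration;
they are not part of Arrow or PostgreSQL. It assumes IPv4 addresses stored as
unsigned 32-bit integers and shows a subnet test done with a single mask and
compare, roughly what addr <<= inet '192.168.1.0/24' does in PostgreSQL.

```python
import ipaddress


def ipv4_to_int(addr: str) -> int:
    """Convert dotted-quad text to an unsigned 32-bit integer (hypothetical helper)."""
    return int(ipaddress.IPv4Address(addr))


def int_to_ipv4(value: int) -> str:
    """Convert an unsigned 32-bit integer back to dotted-quad text (hypothetical helper)."""
    return str(ipaddress.IPv4Address(value))


def in_subnet(addr_int: int, network_int: int, prefix_len: int) -> bool:
    """Return True if addr_int lies in network_int/prefix_len, using only a mask and a compare."""
    mask = (0xFFFFFFFF << (32 - prefix_len)) & 0xFFFFFFFF
    return (addr_int & mask) == (network_int & mask)


# A small "column" of addresses, stored as integers.
addrs = [ipv4_to_int(a) for a in
         ["192.168.1.17", "10.0.0.5", "192.168.1.200", "192.168.2.1"]]

# Subnet query: which addresses fall inside 192.168.1.0/24?
net = ipv4_to_int("192.168.1.0")
matches = [int_to_ipv4(a) for a in addrs if in_subnet(a, net, 24)]
print(matches)
```

The text conversion only happens at the edges (loading and printing); the
filter itself never touches a string.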

> -----Original Message-----
> From: Kohei KaiGai [mailto:kai...@heterodb.com]
> Sent: Monday, April 29, 2019 8:20 PM
> To: dev@arrow.apache.org
> Subject: How about inet4/inet6/macaddr data types?
> 
> Hello folks,
> 
> What are your opinions on supporting network address types in the Apache
> Arrow data format?
> Network addresses appear constantly in the logs generated in bulk by network
> equipment, and they are significant information when people analyze their
> historical logs.
> 
> I'm working on Apache Arrow format mapping on PostgreSQL.
> http://heterodb.github.io/pg-strom/arrow_fdw/
> 
> This extension allows reading Arrow files as if they were PostgreSQL tables,
> using a foreign table.
> Arrow data types are mapped to the relevant PostgreSQL data types according
> to the above documentation.
> 
> https://www.postgresql.org/docs/current/datatype-net-types.html
> PostgreSQL supports some network address types and operators.
> For example, we can use a qualifier like:
>   WHERE addr <<= inet '192.168.1.0/24'
> to find all the records in the subnet '192.168.1.0/24'.
> 
> Probably, these three data types are sufficient for most network
> logs: inet4, inet6 and macaddr.
> * inet4 is a 32-bit + optional 8-bit (for netmask) fixed-length array
> * inet6 is a 128-bit + optional 8-bit (for netmask) fixed-length array
> * macaddr is a 48-bit fixed-length array
> 
> I would rather not map the inetX types onto the variable-length Binary data
> type, because it needs a 32-bit offset to locate a 32- or 40-bit value, which
> is very inefficient, even though PostgreSQL allows mixing inet4/inet6 values
> in the same column.
> 
> Thanks,
> --
> HeteroDB, Inc / The PG-Strom Project
> KaiGai Kohei <kai...@heterodb.com>
