Re: [sqlite] IP address support (was: Consequences of lexicographic sorting of keys in SQLite4?)

2012-07-12 Thread Niall O'Reilly

On 11 Jul 2012, at 18:36, Valentin Davydov wrote:

> This is for IPv4 at least.

No.  This is for IPv4 ONLY.  That doesn't meet my needs.

/Niall

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] IP address support (was: Consequences of lexicographic sorting of keys in SQLite4?)

2012-07-11 Thread Nico Williams
On Wed, Jul 11, 2012 at 12:36 PM, Valentin Davydov
 wrote:
> Individual IP addresses are nicely supported in the form of unsigned
> integers, and prefixes/ranges - as contiguous ranges of such integers.
> For example, to determine whether given IP address belongs to a particular
> subnet, one can calculate "IP between NETWORK_MIN and NETWORK_MAX", which
> sqlite does quite efficiently. This is for IPv4 at least.

Using ranges instead of prefixes is fraught with peril.  One has to
have triggers to ensure that ranges start and end on proper
boundaries, and then checking for overlap, or sorting ranges is much
harder than with the bit string approach.  And, as you point out, this
only works for IPv4 (since v6 addresses are 128-bit but SQLite3 only
has 64-bit integers).

That said, it's easy to represent bit strings as BLOBs that sort
properly, so perhaps all we need is a set of user-defined functions
for converting between IP addresses in string form and bit strings
encoded as BLOBs.

Nico
--
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] IP address support (was: Consequences of lexicographic sorting of keys in SQLite4?)

2012-07-11 Thread Valentin Davydov
On Mon, Jul 02, 2012 at 05:05:25PM +0100, Niall O'Reilly wrote:
> 
> On 2 Jul 2012, at 16:13, Nico Williams wrote:
> 
> > That reminds me: it'd be nice to have a bit string type, since the
> > correct way to sort IPv4 CIDR blocks is as bit strings.
> 
>   Nice, definitely!
> 
> > This is also
> > a proper way to sort IPv6 blocks.  Alternatively, it'd be nice to have
> > native IP address types in SQLite4, as otherwise one has to jump
> > through hoops to handle IP addresses properly.
> 
>   Bit strings would be more general.
>   Native IP would remove a sometimes-asserted motivation for preferring
>   PostgreSQL.
> 
>   As I see it, ranges, as well as single addresses and CIDR prefixes, 
>   need to be supported, perhaps like the Perl Net::IP module does.

Individual IP addresses are nicely supported in the form of unsigned
integers, and prefixes/ranges - as contiguous ranges of such integers.
For example, to determine whether given IP address belongs to a particular
subnet, one can calculate "IP between NETWORK_MIN and NETWORK_MAX", which 
sqlite does quite efficiently. This is for IPv4 at least.

Valentin Davydov.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] IP address support (was: Consequences of lexicographic sorting of keys in SQLite4?)

2012-07-02 Thread Nico Williams
On Mon, Jul 2, 2012 at 1:14 PM, Simon Slavin  wrote:
> Your argument is for SQLite to allow users to implement their own affinities 
> (datatypes ?) must like SQLite3 allows users to implement collation 
> algorithms and functions.  But I don't think you're going to get something as 
> rarely used in SQLite4 when it's specially designed to be tiny and light.  
> SQLite stays that way by refusing to add rarely-used facilities.

No simple user-defined type is likely to allow for user-defined
literals for user-defined types.

What does matter, I think, is how commonly used some type might be.
IP addresses are very commonly used in relational DBs.

> But I still question whether you're engaged in premature optimization.  Would 
> things really be that much slower if you stored IPv4/6 as Higits ?  IPv4 
> stored as hex is readable, sortable, can be transferred from place to place 
> as an 8 character string and works perfectly with the standard distribution 
> of SQLite3 with no changes.  Implement conversion in- and out- functions 
> either as SQLite functions or in your programming language, as suits you.  
> You're concerned about the performance you lose by storing 10 octet strings 
> instead of having a custom type ?  Possibly if you're running a major ISP or 
> doing packet analysis and logging for one.  But you wouldn't be using SQLite 
> to do that anyway, you'd be running something with built-in caching and 
> redundancy.

Hex is not good enough for CIDR blocks (i.e., bit strings of length <=
sizeof(address-type)), unless you encode the bit string length mod 8
into the last byte, with appropriate zero-bit padding.  Then it's good
enough, but hardly user-friendly.

IP addresses are useful in databases that store configuration
information.  NICs, for example, must have such databases, and any
sufficiently large network is going to need one too.  I've seen a
number of home-grown Moira-like databases, and I maintained one called
UName*It (which used UniSQL/X underneath).  Any organization with
home-grown DNS management is going to need this, and any open source
and/or commercial DNS management tools will need this under the hood.
IP addresses are really quite common in databases.  This is why
Postgres has native IP address support.

Nico
--
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] IP address support (was: Consequences of lexicographic sorting of keys in SQLite4?)

2012-07-02 Thread Simon Slavin

On 2 Jul 2012, at 6:27pm, Nico Williams  wrote:

> A pair of built-in functions could take care of the user-friendliness
> aspect to some degree, but built-in literals for bit string and IPv4/6
> CIDR notation would so much more user-friendly...  IMO it's worth
> doing.  It's really quite common to store IP addresses in relational
> databases...

Your argument is for SQLite to allow users to implement their own affinities 
(datatypes ?) must like SQLite3 allows users to implement collation algorithms 
and functions.  But I don't think you're going to get something as rarely used 
in SQLite4 when it's specially designed to be tiny and light.  SQLite stays 
that way by refusing to add rarely-used facilities.

But I still question whether you're engaged in premature optimization.  Would 
things really be that much slower if you stored IPv4/6 as Higits ?  IPv4 stored 
as hex is readable, sortable, can be transferred from place to place as an 8 
character string and works perfectly with the standard distribution of SQLite3 
with no changes.  Implement conversion in- and out- functions either as SQLite 
functions or in your programming language, as suits you.  You're concerned 
about the performance you lose by storing 10 octet strings instead of having a 
custom type ?  Possibly if you're running a major ISP or doing packet analysis 
and logging for one.  But you wouldn't be using SQLite to do that anyway, you'd 
be running something with built-in caching and redundancy.

Simon.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] IP address support (was: Consequences of lexicographic sorting of keys in SQLite4?)

2012-07-02 Thread Nico Williams
On Mon, Jul 2, 2012 at 12:11 PM, Niall O'Reilly  wrote:
> On 2 Jul 2012, at 17:52, Nico Williams wrote:
>> So an IPv4 CIDR block like 10.2.93.128/25 would encode as x'0A025D81'
>> and 10.2.93.128/26 as x'0A025D82', and so on, with 10.2.93.128/32
>> encoded as x'0A025D8000' (that's 5 bytes).  That is, IPv4 addresses
>> would require one more byte than usual.
>
> You're missing some cases which I would find indispensible.
> I have a trip tomorrow.  I may be able to use the plane time
> to think about your examples above and to put together some
> complementary ones of my own.

Well, encoding bit strings (e.g., IP CIDR notation) as BLOBs is not at
all user-friendly.  But it does work for sorting.

A pair of built-in functions could take care of the user-friendliness
aspect to some degree, but built-in literals for bit string and IPv4/6
CIDR notation would so much more user-friendly...  IMO it's worth
doing.  It's really quite common to store IP addresses in relational
databases...

Nico
--
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] IP address support (was: Consequences of lexicographic sorting of keys in SQLite4?)

2012-07-02 Thread Niall O'Reilly

On 2 Jul 2012, at 17:52, Nico Williams wrote:

> So an IPv4 CIDR block like 10.2.93.128/25 would encode as x'0A025D81'
> and 10.2.93.128/26 as x'0A025D82', and so on, with 10.2.93.128/32
> encoded as x'0A025D8000' (that's 5 bytes).  That is, IPv4 addresses
> would require one more byte than usual.

You're missing some cases which I would find indispensible.
I have a trip tomorrow.  I may be able to use the plane time
to think about your examples above and to put together some
complementary ones of my own.

/Niall

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] IP address support (was: Consequences of lexicographic sorting of keys in SQLite4?)

2012-07-02 Thread Nico Williams
So an IPv4 CIDR block like 10.2.93.128/25 would encode as x'0A025D81'
and 10.2.93.128/26 as x'0A025D82', and so on, with 10.2.93.128/32
encoded as x'0A025D8000' (that's 5 bytes).  That is, IPv4 addresses
would require one more byte than usual.

I'm not sure that we can justify the extra complexity for a native bit
string type, but fwiw the literal syntax for it should probably be the
same as blob, with a / at the end.  Still, native bit
string support would be handy.

Nico
--
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] IP address support (was: Consequences of lexicographic sorting of keys in SQLite4?)

2012-07-02 Thread Nico Williams
Ah, if you encode any bit string as a BLOB such that it ends in 3 bits
that encode the length of the string mod 8, and with 7 - length of
string mod 8 preceding zero-valued bits then you get a result that
should sort [lexicographically] correctly, no?

So bit string would be a trivial extension of BLOB, and possibly no
special support should be required in SQLite (3 or 4).

Nico
--
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] IP address support (was: Consequences of lexicographic sorting of keys in SQLite4?)

2012-07-02 Thread Nico Williams
The key is to come up with a bit string encoding in bytes that is
suitable for use in table keys -- they have to sort correctly when
sorted lexicographically.  The encoding should be reasonably
efficient; one byte per-bit, for example, would be too inefficient
(though in a pinch much better than no bit string support).  I'm
thinking of a 7 bits / byte encoding, with the remaining indicating
the last byte of the encoding, which byte would contain up to 4 bits
of the bit string plus 3 bits to encode the length of the bit string
mod 8.

Nico
--
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] IP address support (was: Consequences of lexicographic sorting of keys in SQLite4?)

2012-07-02 Thread Niall O'Reilly

On 2 Jul 2012, at 16:13, Nico Williams wrote:

> That reminds me: it'd be nice to have a bit string type, since the
> correct way to sort IPv4 CIDR blocks is as bit strings.

Nice, definitely!

> This is also
> a proper way to sort IPv6 blocks.  Alternatively, it'd be nice to have
> native IP address types in SQLite4, as otherwise one has to jump
> through hoops to handle IP addresses properly.

Bit strings would be more general.
Native IP would remove a sometimes-asserted motivation for preferring
PostgreSQL.

As I see it, ranges, as well as single addresses and CIDR prefixes, 
need to be supported, perhaps like the Perl Net::IP module does.

With some care over the encoding, a natural ordering arises which
places nested prefixes, ranges, and individual addresses in the
"right" order.  This would eliminate as much as possible of the
hoop-jumping.

I'll try to put together some examples of as illustrations.

/Niall

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users