Re: [pmacct-discussion] bgp_as_path_radius / pgsql and bgp_aggregate
On Mon, 10 Dec 2018 16:10:23 + Paolo Lucente wrote: > Thanks for your input - always very interesting. I sketched a new note > in the SQL docs basing on the thread so far: > > https://github.com/pmacct/pmacct/commit/e4c594a9f10040b53d4523f7edecf455a20a9151 > > Hope it looks right enough. Looks ok to me. I can't speak to the performance implications of fixed v.s. variable length string columns in MySQL. And it probably depends on the back-end storage engine anyway. What you have leaves everything to do with performance up to the end-user, which is fine. You might consider altering sql/pmacct-create-db.pgsql so that the create database is: CREATE DATABASE pmacct TEMPLATE=template0 LC_COLLATE='C'; This should work from Postgres 8.4, released 2009-07-01 and out of support July 24, 2014. RHEL 6 includes Postgres 8.4 (or so I'm told here https://developers.redhat.com/blog/2018/04/06/red-hat-open-source-data-bases-beta-adds-postgresql-10-mongodb-3-6-updates-mysql-5-7/ ). RHEL 5 is no longer supported. It uses template0 because in theory template1 (the default) could have been modified by the local db admin to contain data which is not compatible with the C collation. In practice template1 is probably never different from template0. Most likely, the OS is using UTF8. And the PG cluster will be created using the OS default. So the db will be able to contain UTF8 characters, which I presume is what you want, but sorting/indexing will be in UTF8 codepoint order. Much faster than a dictionary sort. While you're at it, a note in pmacct-create-db.pgsql to increase "shared_buffers" in postgresql.conf might also be useful. The default is 128M, which is a very small amount of memory these days. Something like: -- NOTE: Database performance is _much_ improved with minimal -- tuning. Start by allocating more RAM to the database. -- Increase the "shared_buffers" setting in -- postgresql.conf to as much as 25% of RAM. -- Consider also setting "effective_cache_size". It should -- be larger than shared_buffers. "checkpoint_segments" -- should probably be at least 32, much more if write -- performance is an issue. For details see: --https://wiki.postgresql.org/wiki/Tuning_Your_PostgreSQL_Server > On Sun, Dec 09, 2018 at 07:13:34PM -0600, Karl O. Pinc wrote: > > On Fri, 7 Dec 2018 16:42:31 + > > Paolo Lucente wrote: > > > > > You could make the field variable-length - optimizing space and > > > avoiding you the try & error of finding the sweet spot size for > > > the as path field at the expense of more computing. > > > > I would not expect a Postgres TEXT column, which is variable > > length, to be significantly more CPU intensive than a > > fixed length CHAR column. Regards, Karl Free Software: "You don't pay back, you pay forward." -- Robert A. Heinlein ___ pmacct-discussion mailing list http://www.pmacct.net/#mailinglists
Re: [pmacct-discussion] bgp_as_path_radius / pgsql and bgp_aggregate
Hi Karl, Thanks for your input - always very interesting. I sketched a new note in the SQL docs basing on the thread so far: https://github.com/pmacct/pmacct/commit/e4c594a9f10040b53d4523f7edecf455a20a9151 Hope it looks right enough. Paolo On Sun, Dec 09, 2018 at 07:13:34PM -0600, Karl O. Pinc wrote: > On Fri, 7 Dec 2018 16:42:31 + > Paolo Lucente wrote: > > > You could make the field variable-length - optimizing space and > > avoiding you the try & error of finding the sweet spot size for the > > as path field at the expense of more computing. > > I would not expect a Postgres TEXT column, which is variable > length, to be significantly more CPU intensive than a > fixed length CHAR column. > > From the PG docs regarding the various types which store strings: > > --- > There is no performance difference among these three types, apart from > increased storage space when using the blank-padded type, and a few extra CPU > cycles to check the length when storing into a length-constrained column. > While character(n) has performance advantages in some other database systems, > there is no such advantage in PostgreSQL; in fact character(n) is usually the > slowest of the three because of its additional storage costs. In most > situations text or character varying should be used instead. > --- > > So, when using Postgres, there's no particular reason to > limit string column lengths. > > What is important with Postgres is the collation sequence > used by the database. UTF-8 collation support slows down sorting > significantly, and so affects indexes and so forth. > If you need UTF-8 characters you can do that, but > sort them by byte-code values if you care about performance. > If necessary collation can even be set down to the column > level in the newest PG. What's easiest is to avoid UTF-8 > entirely and set the locale for the whole db to "C". > This gives you an entire byte to store each character > and characters sorted by byte-code value. > > https://www.postgresql.org/docs/11/charset.html > > > On Fri, Dec 07, 2018 at 10:46:32AM +0100, Fabien VINCENT wrote: > > > Dear List, > > > > > > I've an issue when nfacctd try to push to pgsql database : > > > > > > PGSQL log file : > > > > > > ERROR: value too long for type character(80) > > > CONTEXT: > > > > > > COPY flow _*_, line 74771, column as_path_src: "14061 > > > {46652,421001,4210010200,4210010201,4210010202,4210010297,4210010400,4210010402,4210010499..." > > > > > > > > > In my nfacctd config file I've : > > > > > > bgp_aspath_radius: 10 > > > > > > because as_path_src is set to CHAR(80). But seems BGP aggregates > > > break the rules ? > > > > > > Is there anyway to limit / cut down BGP aggregates in column > > > as_path_src ? > > Karl > Free Software: "You don't pay back, you pay forward." > -- Robert A. Heinlein ___ pmacct-discussion mailing list http://www.pmacct.net/#mailinglists
Re: [pmacct-discussion] bgp_as_path_radius / pgsql and bgp_aggregate
On Fri, 7 Dec 2018 16:42:31 + Paolo Lucente wrote: > You could make the field variable-length - optimizing space and > avoiding you the try & error of finding the sweet spot size for the > as path field at the expense of more computing. I would not expect a Postgres TEXT column, which is variable length, to be significantly more CPU intensive than a fixed length CHAR column. >From the PG docs regarding the various types which store strings: --- There is no performance difference among these three types, apart from increased storage space when using the blank-padded type, and a few extra CPU cycles to check the length when storing into a length-constrained column. While character(n) has performance advantages in some other database systems, there is no such advantage in PostgreSQL; in fact character(n) is usually the slowest of the three because of its additional storage costs. In most situations text or character varying should be used instead. --- So, when using Postgres, there's no particular reason to limit string column lengths. What is important with Postgres is the collation sequence used by the database. UTF-8 collation support slows down sorting significantly, and so affects indexes and so forth. If you need UTF-8 characters you can do that, but sort them by byte-code values if you care about performance. If necessary collation can even be set down to the column level in the newest PG. What's easiest is to avoid UTF-8 entirely and set the locale for the whole db to "C". This gives you an entire byte to store each character and characters sorted by byte-code value. https://www.postgresql.org/docs/11/charset.html > On Fri, Dec 07, 2018 at 10:46:32AM +0100, Fabien VINCENT wrote: > > Dear List, > > > > I've an issue when nfacctd try to push to pgsql database : > > > > PGSQL log file : > > > > ERROR: value too long for type character(80) > > CONTEXT: > > > > COPY flow _*_, line 74771, column as_path_src: "14061 > > {46652,421001,4210010200,4210010201,4210010202,4210010297,4210010400,4210010402,4210010499..." > > > > > > In my nfacctd config file I've : > > > > bgp_aspath_radius: 10 > > > > because as_path_src is set to CHAR(80). But seems BGP aggregates > > break the rules ? > > > > Is there anyway to limit / cut down BGP aggregates in column > > as_path_src ? Karl Free Software: "You don't pay back, you pay forward." -- Robert A. Heinlein ___ pmacct-discussion mailing list http://www.pmacct.net/#mailinglists