Re: [PERFORM] One long transaction or multiple short transactions?

2015-10-08 Thread Laurent Martelli


On 08/10/2015 01:40, Carlo wrote:


>> How many cores do you have on that machine?

Test whether limiting the number of simultaneous feeds, e.g. bringing their 
number down to half of your normal connections, has the same positive 
effect.


<<

I am told 32 cores on a LINUX VM. The operators have tried limiting 
the number of threads. They feel that the number of connections is 
optimal. However, under the same conditions they noticed a sizable 
boost in performance if the same import was split into two successive 
imports which had shorter transactions.


I am just looking to see if there is any reason to think that lock 
contention (or anything else) over longer vs. shorter single-row-write 
transactions under the same conditions might explain this.


I don't think inserts can cause contention on the server. Inserts do not 
lock tables during the transaction. You may have contention on sequences, 
but it won't vary with transaction size.
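
For illustration, the two patterns being compared come down to something like 
this (minimal psycopg2 sketch; the table name and data are made up):

    import psycopg2

    conn = psycopg2.connect("dbname=test")   # hypothetical connection string
    cur = conn.cursor()
    rows = [(i, "payload %d" % i) for i in range(100000)]

    BATCH = 10000   # commit every BATCH rows;
                    # BATCH = len(rows) gives one single long transaction
    for n, row in enumerate(rows, 1):
        cur.execute("INSERT INTO import_t (id, payload) VALUES (%s, %s)", row)
        if n % BATCH == 0:
            conn.commit()
    conn.commit()   # commit any remainder
    conn.close()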


Have you checked the resource usage (CPU, memory) on the client side?

How do you insert rows? Do you use the plain postgres API?

Regards,
Laurent


Re: [PERFORM] One long transaction or multiple short transactions?

2015-10-08 Thread k...@rice.edu
On Thu, Oct 08, 2015 at 05:43:11PM -0400, Carlo wrote:
> -Original Message-
> From: k...@rice.edu [mailto:k...@rice.edu] 
> Sent: October 8, 2015 1:00 PM
> To: Carlo
> Cc: pgsql-performance@postgresql.org
> Subject: Re: [PERFORM] One long transaction or multiple short transactions?
> 
> On Thu, Oct 08, 2015 at 11:08:55AM -0400, Carlo wrote:
> > >> Sounds like a locking problem
> > 
> > This is what I am trying to get at. The reason that I am not 
> > addressing hardware or OS configuration concerns is that this is not 
> > my environment, but my client's. The client is running my import 
> > software and has a choice of how long the transactions can be. They 
> > are going for long transactions, and I am trying to determine whether 
> > there is a penalty for single long transactions over a configuration 
> > which would allow for more successive short transactions. (keep in mind
> > all reads and writes are single-row).
> > 
> > There are other people working on hardware and OS configuration, and 
> > that's why I don't want to get into a general optimization discussion 
> > because the client is concerned with just this question.
> > 
> 
> On October 8, 2015 1:00 PM Ken wrote:
> > Hi Carlo,
> 
> > Since the read/writes are basically independent, which is what I take your
> > "single-row" comment to mean, by batching them you are balancing two
> > opposing factors. First, larger batches allow you to consolidate I/O and
> > other resource requests to make them more efficient per row. Second, larger
> > batches require more locking as the number of rows updated grows. It may
> > very well be the case that by halving your batch size the system can
> > process them more quickly than a single batch that is twice the size.
> 
> Just to clarify, one transaction of this type may take longer to commit than
> two successive transactions of half the size?
> 

Yes, but where the optimum batch size lies should be determined by testing.
Just vary the batch size and note where the performance is at a maximum.
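
For example, a rough harness for that kind of test might look like this 
(psycopg2 sketch; the table name and candidate batch sizes are arbitrary):

    import time
    import psycopg2

    conn = psycopg2.connect("dbname=test")   # hypothetical connection string
    cur = conn.cursor()
    rows = [(i, "payload %d" % i) for i in range(200000)]

    for batch in (1000, 5000, 10000, 50000, len(rows)):
        cur.execute("TRUNCATE import_t")     # start each run from the same state
        conn.commit()
        start = time.time()
        for n, row in enumerate(rows, 1):
            cur.execute("INSERT INTO import_t (id, payload) VALUES (%s, %s)", row)
            if n % batch == 0:
                conn.commit()
        conn.commit()
        print("batch size %6d: %.1f s" % (batch, time.time() - start))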

Regards,
Ken


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] One long transaction or multiple short transactions?

2015-10-08 Thread Carlo
-Original Message-
From: k...@rice.edu [mailto:k...@rice.edu] 
Sent: October 8, 2015 1:00 PM
To: Carlo
Cc: pgsql-performance@postgresql.org
Subject: Re: [PERFORM] One long transaction or multiple short transactions?

On Thu, Oct 08, 2015 at 11:08:55AM -0400, Carlo wrote:
> >> Sounds like a locking problem
> 
> This is what I am trying to get at. The reason that I am not 
> addressing hardware or OS configuration concerns is that this is not 
> my environment, but my client's. The client is running my import 
> software and has a choice of how long the transactions can be. They 
> are going for long transactions, and I am trying to determine whether 
> there is a penalty for single long transactions over a configuration 
> which would allow for more successive short transactions. (keep in mind
> all reads and writes are single-row).
> 
> There are other people working on hardware and OS configuration, and 
> that's why I don't want to get into a general optimization discussion 
> because the client is concerned with just this question.
> 

On October 8, 2015 1:00 PM Ken wrote:
> Hi Carlo,

> Since the read/writes are basically independent, which is what I take your
> "single-row" comment to mean, by batching them you are balancing two
> opposing factors. First, larger batches allow you to consolidate I/O and
> other resource requests to make them more efficient per row. Second, larger
> batches require more locking as the number of rows updated grows. It may
> very well be the case that by halving your batch size the system can
> process them more quickly than a single batch that is twice the size.

Just to clarify, one transaction of this type may take longer to commit than
two successive transactions of half the size?



-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] One long transaction or multiple short transactions?

2015-10-08 Thread k...@rice.edu
On Thu, Oct 08, 2015 at 11:08:55AM -0400, Carlo wrote:
> >> Sounds like a locking problem
> 
> This is what I am trying to get at. The reason that I am not addressing
> hardware or OS configuration concerns is that this is not my environment,
> but my client's. The client is running my import software and has a choice
> of how long the transactions can be. They are going for long transactions,
> and I am trying to determine whether there is a penalty for single long
> transactions over a configuration which would allow for more successive
> short transactions. (keep in mind all reads and writes are single-row). 
> 
> There are other people working on hardware and OS configuration, and that's
> why I don't want to get into a general optimization discussion because the
> client is concerned with just this question.
> 

Hi Carlo,

Since the read/writes are basically independent, which is what I take your
"single-row" comment to mean, by batching them you are balancing two
opposing factors. First, larger batches allow you to consolidate I/O and
other resource requests to make them more efficient per row. Second, larger
batches require more locking as the number of rows updated grows. It may
very well be the case that by halving your batch size the system can
process them more quickly than a single batch that is twice the size.

Regards,
Ken


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] One long transaction or multiple short transactions?

2015-10-08 Thread Carlo
>> Sounds like a locking problem

This is what I am trying to get at. The reason that I am not addressing
hardware or OS configuration concerns is that this is not my environment,
but my client's. The client is running my import software and has a choice
of how long the transactions can be. They are going for long transactions,
and I am trying to determine whether there is a penalty for single long
transactions over a configuration which would allow for more successive
short transactions. (keep in mind all reads and writes are single-row). 

There are other people working on hardware and OS configuration, and that's
why I don't want to get into a general optimization discussion because the
client is concerned with just this question.

-Original Message-
From: Graeme B. Bell [mailto:graeme.b...@nibio.no] 
Sent: October 8, 2015 4:55 AM
To: Carlo
Cc: pgsql-performance@postgresql.org
Subject: Re: [PERFORM] One long transaction or multiple short transactions?

Sounds like a locking problem, but assuming you aren't Sherlock Holmes and
simply want to get the thing working as soon as possible: 

Stick a fast SSD in there (whether you stay on VM or physical). If you have
enough I/O, you may be able to solve the problem with brute force.
SSDs are a lot cheaper than your time. 

Suggest you forward this to your operators: a talk I have about optimising
multi-threaded work in postgres:  

  http://graemebell.net/foss4gcomo.pdf (Slides: "Input/Output" in the
middle of the talk and also the slides at the end labelled "For Techies")

Graeme Bell

p.s. You mentioned a VM. Consider making the machine physical and not VM.
You'll get a performance boost and remove the risk of DB corruption from
untrustworthy VM fsyncs. One day there will be a power cut or O/S crash
during these writes, and with a VM you've a reasonable chance of nuking
your DB because VM virtualised storage often doesn't honour fsync (for
performance reasons), but it's fundamental to correct operation of PG. 



> On 08 Oct 2015, at 01:40, Carlo  wrote:
> 
> 
> I am told 32 cores on a LINUX VM. The operators have tried limiting the
number of threads. They feel that the number of connections is optimal.
However, under the same conditions they noticed a sizable boost in
performance if the same import was split into two successive imports which
had shorter transactions.
>  




-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] large object write performance

2015-10-08 Thread Bram Van Steenlandt



On 08-10-15 at 15:10, Graeme B. Bell wrote:

http://initd.org/psycopg/docs/usage.html#large-objects


"Psycopg large object support *efficient* import/export with file system files 
using the lo_import() and lo_export() libpq functions.”

See *


I was under the impression they meant that the lobject was using lo_import and 
lo_export.
I can't seem to find how to use lo_import and lo_export, I searched google and came 
to the conclusion the lobject was the way to go.

>>> x.lo_import()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'psycopg2._psycopg.connection' object has no attribute 'lo_import'

Bram,

I recommend posting this as a question on a python/psycopg mailing list, for 
advice.
You are probably not the first person to encounter it.

Graeme Bell




Hi,

I tried \lo_import with psql and it's not faster, the same 5.5 megabytes/sec.

Bram


--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] large object write performance

2015-10-08 Thread Graeme B. Bell

>> 
>> http://initd.org/psycopg/docs/usage.html#large-objects
>> 
>> 
>> "Psycopg large object support *efficient* import/export with file system 
>> files using the lo_import() and lo_export() libpq functions.”
>> 
>> See *
>> 
> I was under the impression they meant that the lobject was using lo_import 
> and lo_export.
> I can't seem to find how to use lo_import and lo_export, I searched google and 
> came to the conclusion the lobject was the way to go.
> >>> x.lo_import()
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
> AttributeError: 'psycopg2._psycopg.connection' object has no attribute 'lo_import'

Bram,

I recommend posting this as a question on a python/psycopg mailing list, for 
advice.
You are probably not the first person to encounter it.

Graeme Bell



-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] large object write performance

2015-10-08 Thread Bram Van Steenlandt



On 08-10-15 at 14:10, Graeme B. Bell wrote:

On 08 Oct 2015, at 13:50, Bram Van Steenlandt  wrote:

1. The part is "fobj = lobject(db.db,0,"r",0,fpath)", I don't think there is 
anything there

Re: lobject

http://initd.org/psycopg/docs/usage.html#large-objects

"Psycopg large object support *efficient* import/export with file system files 
using the lo_import() and lo_export() libpq functions.”

See *
I was under the impression they meant that the lobject was using 
lo_import and lo_export.
I can't seem to find how to use lo_import and lo_export, I searched google 
and came to the conclusion the lobject was the way to go.

>>> x.lo_import()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'psycopg2._psycopg.connection' object has no attribute 
'lo_import'


>>> from psycopg2.extensions import lo_import
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ImportError: cannot import name lo_import

Also:
http://initd.org/psycopg/docs/connection.html

lobject([oid[, mode[, new_oid[, new_file[, lobject_factory]]]]])

    Return a new database large object as a lobject instance.

    See Access to PostgreSQL large objects for an overview.

    Parameters:

      * oid – The OID of the object to read or write. 0 to create a new
        large object and have its OID assigned automatically.
      * mode – Access mode to the object, see below.
      * new_oid – Create a new object using the specified OID. The function
        raises OperationalError if the OID is already in use. Default is 0,
        meaning assign a new one automatically.
      * new_file – The name of a file to be imported in the database
        (using the lo_import() function).
      * lobject_factory – Subclass of lobject to be instantiated.
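
So, per the parameter list above, passing new_file should already go through 
lo_import(); the alternative is to create the large object and stream it in 
chunks yourself, roughly like this (sketch only; the helper name and chunk 
size are made up):

    # Stream a local file into a new large object in chunks (sketch).
    # 'conn' is an open psycopg2 connection; the chunk size is arbitrary.
    def upload_chunked(conn, path, chunk_size=1024 * 1024):
        lob = conn.lobject(0, "wb")          # 0 = let the server assign the OID
        with open(path, "rb") as f:
            while True:
                chunk = f.read(chunk_size)
                if not chunk:
                    break
                lob.write(chunk)
        oid = lob.oid
        lob.close()
        conn.commit()
        return oid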




lobject seems to default to string handling in Python.
That’s going to be slow.
Try using lo_import / export?

Graeme Bell





Re: [PERFORM] large object write performance

2015-10-08 Thread Graeme B. Bell

> On 08 Oct 2015, at 13:50, Bram Van Steenlandt  wrote:
>>> 1. The part is "fobj = lobject(db.db,0,"r",0,fpath)", I don't think there 
>>> is anything there

Re: lobject

http://initd.org/psycopg/docs/usage.html#large-objects

"Psycopg large object support *efficient* import/export with file system files 
using the lo_import() and lo_export() libpq functions.”

See *

lobject seems to default to string handling in Python.
That’s going to be slow.
Try using lo_import / export?

Graeme Bell

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] large object write performance

2015-10-08 Thread Bram Van Steenlandt



On 08-10-15 at 13:37, Graeme B. Bell wrote:



Like this?

gmirror (iozone -s 4 -a /dev/mirror/gm0s1e) = 806376 (faster drives)
zfs uncompressed (iozone -s 4 -a /datapool/data) = 650136
zfs compressed (iozone -s 4 -a /datapool/data) = 676345


If you can get the complete tables (as in the images on the blog post) with 
random performance compared to sequential etc, different block sizes, that 
would be very interesting.




Like this?

Command line used: iozone -a /dev/mirror/gm0s1e
Output is in Kbytes/sec
Time Resolution = 0.01 seconds.
Processor cache size set to 1024 Kbytes.
Processor cache line size set to 32 bytes.
File stride size set to 17 * record size.
                                                            random   random     bkwd   record   stride
      KB  reclen    write  rewrite     read   reread      read    write     read  rewrite     read   fwrite frewrite    fread  freread
      64       4   321554   913649  2689580  2892445   2133730   874936  1879725   913649  1933893  1013707  1066042  2006158  2561267
      64       8   603489  1722886  4897948  5860307   3791156  1947927  3022727  1879725  3791156  1828508  1768283  3541098  4564786
      64      16  1083249  3363612  7940539  9006179   7100397  3541098  3958892  3165299  7100397  3588436  3363612  4897948  6421025
      64      32  1392258  5283570 12902017 10821524  12902017  4897948  5860307  4564786  7100397  4207076  4274062  5389653  7940539
      64      64  2772930  5735102 20962191 20962191  15972885  7940539  7940539  7940539 12902017  7100397  5860307  4274062  9318832
      [remaining rows for file sizes 128 kB through 1024 kB truncated]

Re: [PERFORM] large object write performance

2015-10-08 Thread Bram Van Steenlandt



On 08-10-15 at 13:13, Graeme B. Bell wrote:

1. The part is "fobj = lobject(db.db,0,"r",0,fpath)", I don't think there is 
anything there

Can you include the surrounding code please (e.g. setting up the db connection) 
so we can see what’s happening, any sync/commit type stuff afterwards.

connect:
        self.db = psycopg2.connect(
            dbname=self.confighelp.get("dbname"),
            user=self.confighelp.get("dbuser"),
            password=self.confighelp.get("dbpassword"),
            host=self.confighelp.get("dbhost"),
            port=int(self.confighelp.get("dbport")),
            sslmode=self.confighelp.get("dbsslmode"))



upload:
        self.statusupdate("Backing up %s (%s)" % (fpath, nicesizeprint(size)))

        starttime = datetime.datetime.now()
        try:
            fobj = lobject(db.db, 0, "r", 0, fpath)
        except psycopg2.OperationalError, e:
            if e.__str__().find("could not open file") > -1:
                badfiles.append([fpath, str(e).rstrip("\n").rstrip("\r")])
                self.statusupdate("Can't backup %s" % fpath)
            else:
                self.emsg = str(e)
                return False
        except Exception, e:
            self.emsg = str(e)
            return False
        else:
            cursor.execute("insert into ${table} (set,path,modtime,size,file,basepath) values (%s,%s,%s,%s,%s,%s)".replace("${table}", tablename), [bset, urpath, modtime, size, fobj.oid, path])
            db.commit()



2. Gigabit ethernet; the scp copy I did was over the network to that hard drive 
using
scp FreeBSD-10.1-RELEASE-amd64-dvd1.iso x.x.x.x:/datapool/db/test

3. I agree, but if scp can write to the drive at 37MB/sec, I should be able to 
achieve more than 8MB/sec.
I can indeed speed up the ZFS but it's more the difference between scp and my 
script that bugs me.

It is either being caused by

a) your script
b) postgres working in a different way to scp

To solve the first you need to send us more of your script
To solve the second, it may be possible to reconfigure postgres but you may 
have to reconfigure your OS or hardware to be more suitable for the type of 
thing postgres does.

Put simply, scp does the absolute minimum of work to put the data onto the disk 
without any safety measures.
Postgres is doing other things in the background - analyzing things, keeping a 
synchronous log for rollback etc. Crucially it’s using its own internal 
storage format.
You’re comparing chalk with cheese and expecting them to taste quite similar.
I agree; my question is more: what can I do to make it easier for 
postgresql, can I turn things off that will speed things up?
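
One session-level setting that can legitimately be relaxed for a bulk load like 
this is synchronous_commit; it only risks losing the most recent commits on a 
crash (which a re-run of the backup would redo), not corrupting the database. 
A minimal sketch, reusing the script's self.db connection:

    # Relax commit durability for this backup session only (sketch).
    # A crash can lose the very last commits; it does not risk corruption.
    cur = self.db.cursor()
    cur.execute("SET synchronous_commit TO off")
    self.db.commit()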


If you try the advice I gave + read the blog post, about configuring ZFS to be 
friendly to the type of activity postgres likes to do, you may see some 
improvement.

If the problem is your script you’ll need to send a greater amount of the code 
so it can be examined.


4.
dd bs=1M count=256 if=/dev/zero of=./test
256+0 records in
256+0 records out
268435456 bytes transferred in 5.401050 secs (49700605 bytes/sec)

good


5. a tgz file with scp is 33.8MB/sec.

(you can probably speed that up by changing to a lightweight compression 
algorithm)


6. The server is running all the time; speed varies, it's between 5 and 
8MB/sec actually (depending also on the number of clients performing a backup).

How is the connection to postgres being made, incidentally?

Graeme.








On 08-10-15 at 11:45, Graeme B. Bell wrote:

Seems a bit slow.

1. Can you share the script (the portion that does the file transfer) to the 
list? Maybe you’re doing something unusual there by mistake.
Similarly the settings you’re using for scp.

2. What’s the network like?
For example, what if the underlying network is only capable of 10MB/s peak, and 
scp is using compression and the files are highly compressible?
Have you tried storing zip or gzip’d versions of the file into postgres? 
(that’s probably a good idea anyway)

3. ZFS performance can depend on available memory and use of caches (memory + 
L2ARC for reading, ZIL cache for writing).
Maybe put an intel SSD in there (or a pair of them) and use it as a ZIL cache.

4. Use dd to measure the write performance of ZFS doing a local write to the 
machine. What speed do you get?

5. Transfer a zip’d file over the network using scp. What speed do you get?

6. Is your postgres running all the time or do you start it before this test? 
Perhaps check if any background tasks are running when you use postgres - 
autovacuum, autoanalyze etc.

Graeme Bell


On 08 Oct 2015, at 11:17, Bram Van Steenlandt  wrote:

Hi,

I use postgresql often but I'm not very familiar with how it works internally.

I've made a small script to backup files from different computers to a 
postgresql database.

Re: [PERFORM] large object write performance

2015-10-08 Thread Graeme B. Bell
>> 
>> 
> Like this?
> 
> gmirror (iozone -s 4 -a /dev/mirror/gm0s1e) = 806376 (faster drives)
> zfs uncompressed (iozone -s 4 -a /datapool/data) = 650136
> zfs compressed (iozone -s 4 -a /datapool/data) = 676345


If you can get the complete tables (as in the images on the blog post) with 
random performance compared to sequential etc, different block sizes, that 
would be very interesting.



-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] large object write performance

2015-10-08 Thread Bram Van Steenlandt



On 08-10-15 at 13:21, Graeme B. Bell wrote:

First the database was on a partition where compression was enabled; I changed 
it to an uncompressed one to see if it makes a difference, thinking maybe the 
CPU couldn't handle the load.

It made little difference in my case.

My regular gmirror partition seems faster:
dd bs=8k count=25600 if=/dev/zero of=./test
25600+0 records in
25600+0 records out
209715200 bytes transferred in 1.513112 secs (138598612 bytes/sec)

the zfs compressed partition also goes faster:
dd bs=8k count=25600 if=/dev/zero of=./test
25600+0 records in
25600+0 records out
209715200 bytes transferred in 0.979065 secs (214199479 bytes/sec)
but this one didn't really go that fast in my test (maybe 10%)


Please can you run iozone and look for low random write performance with small 
blocks?  (4k)
http://www.slashroot.in/linux-file-system-read-write-performance-test

Like this?

gmirror (iozone -s 4 -a /dev/mirror/gm0s1e) = 806376 (faster drives)
zfs uncompressed (iozone -s 4 -a /datapool/data) = 650136
zfs compressed (iozone -s 4 -a /datapool/data) = 676345


Also, please can you CC the list on your replies to my on-list emails?

Graeme Bell






--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] large object write performance

2015-10-08 Thread Graeme B. Bell

>> First the database was on a partition where compression was enabled; I 
>> changed it to an uncompressed one to see if it makes a difference, thinking 
>> maybe the CPU couldn't handle the load.
> It made little difference in my case.
> 
> My regular gmirror partition seems faster:
> dd bs=8k count=25600 if=/dev/zero of=./test
> 25600+0 records in
> 25600+0 records out
> 209715200 bytes transferred in 1.513112 secs (138598612 bytes/sec)
> 
> the zfs compressed partition also goes faster:
> dd bs=8k count=25600 if=/dev/zero of=./test
> 25600+0 records in
> 25600+0 records out
> 209715200 bytes transferred in 0.979065 secs (214199479 bytes/sec)
> but this one didn't really go that fast in my test (maybe 10%)


Please can you run iozone and look for low random write performance with small 
blocks?  (4k)
http://www.slashroot.in/linux-file-system-read-write-performance-test

Also, please can you CC the list on your replies to my on-list emails?

Graeme Bell



-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] large object write performance

2015-10-08 Thread Graeme B. Bell

> On 08 Oct 2015, at 11:17, Bram Van Steenlandt  wrote:
> 
> The database (9.2.9) on the server (freebsd10) runs on a zfs mirror.
> If I copy a file to the mirror using scp I get 37MB/sec
> My script achieves something like 7 or 8MB/sec on large (+100MB) files.


This may help - a great blog article about ZFS with postgres and how you can use 
zfs compression to boost i/o performance substantially.
If your machine is making a lot of smaller writes in postgres (as opposed to 
presumably large writes by scp) then this may alleviate things a bit.

https://www.citusdata.com/blog/64-zfs-compression

Graeme Bell

p.s. Apologies for top-posting on my previous message.



-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] large object write performance

2015-10-08 Thread Graeme B. Bell
Seems a bit slow.

1. Can you share the script (the portion that does the file transfer) to the 
list? Maybe you’re doing something unusual there by mistake.
Similarly the settings you’re using for scp. 

2. What’s the network like?
For example, what if the underlying network is only capable of 10MB/s peak, and 
scp is using compression and the files are highly compressible?
Have you tried storing zip or gzip’d versions of the file into postgres? 
(that’s probably a good idea anyway - see the sketch after this list)

3. ZFS performance can depend on available memory and use of caches (memory + 
L2ARC for reading, ZIL cache for writing).
Maybe put an intel SSD in there (or a pair of them) and use it as a ZIL cache. 

4. Use dd to measure the write performance of ZFS doing a local write to the 
machine. What speed do you get?

5. Transfer a zip’d file over the network using scp. What speed do you get?

6. Is your postgres running all the time or do you start it before this test? 
Perhaps check if any background tasks are running when you use postgres - 
autovacuum, autoanalyze etc. 
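
Regarding point 2, a minimal gzip-before-store sketch (standard library only; 
the helper name is made up, and the resulting temporary .gz file would then be 
handed to lobject()/lo_import as before):

    import gzip
    import shutil
    import tempfile

    def gzip_to_tempfile(path):
        # Compress 'path' into a temporary .gz file and return the temp file's
        # name; that name can then be passed to lobject()/lo_import as before.
        tmp = tempfile.NamedTemporaryFile(suffix=".gz", delete=False)
        with open(path, "rb") as src:
            with gzip.GzipFile(fileobj=tmp, mode="wb") as dst:
                shutil.copyfileobj(src, dst)
        tmp.close()
        return tmp.name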

Graeme Bell

> On 08 Oct 2015, at 11:17, Bram Van Steenlandt  wrote:
> 
> Hi,
> 
> I use postgresql often but I'm not very familiar with how it works internally.
> 
> I've made a small script to backup files from different computers to a 
> postgresql database.
> Sort of a versioning networked backup system.
> It works with large objects (oid in table, linked to large object), which I 
> import using psycopg
> 
> It works well but slow.
> 
> The database (9.2.9) on the server (freebsd10) runs on a zfs mirror.
> If I copy a file to the mirror using scp I get 37MB/sec
> My script achieves something like 7 or 8MB/sec on large (+100MB) files.
> 
> I've never used postgresql for something like this, is there something I can 
> do to speed things up?
> It's not a huge problem as it's only the initial run that takes a while 
> (after that, most files are already in the db).
> Still it would be nice if it would be a little faster.
> CPU is mostly idle on the server; the filesystem is running at 100%.
> This is a separate postgresql server (I've used freebsd profiles to have 2 
> postgresql servers running) so I can change this setup so it will work better 
> for this application.
> 
> I've read different suggestions online but I'm unsure which is best, they all 
> speak of files which are only a few Kb, not 100MB or bigger.
> 
> ps. english is not my native language
> 
> thx
> Bram
> 
> 
> -- 
> Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-performance



-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


[PERFORM] large object write performance

2015-10-08 Thread Bram Van Steenlandt

Hi,

I use postgresql often but I'm not very familiar with how it works internally.

I've made a small script to backup files from different computers to a 
postgresql database.

Sort of a versioning networked backup system.
It works with large objects (oid in table, linked to large object), 
which I import using psycopg


It works well but slow.

The database (9.2.9) on the server (freebsd10) runs on a zfs mirror.
If I copy a file to the mirror using scp I get 37MB/sec
My script achieves something like 7 or 8MB/sec on large (+100MB) files.

I've never used postgresql for something like this, is there something I 
can do to speed things up?
It's not a huge problem as it's only the initial run that takes a while 
(after that, most files are already in the db).

Still it would be nice if it would be a little faster.
CPU is mostly idle on the server; the filesystem is running at 100%.
This is a separate postgresql server (I've used freebsd profiles to have 
2 postgresql servers running) so I can change this setup so it will work 
better for this application.


I've read different suggestions online but I'm unsure which is best, 
they all speak of files which are only a few Kb, not 100MB or bigger.


ps. english is not my native language

thx
Bram


--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] One long transaction or multiple short transactions?

2015-10-08 Thread Graeme B. Bell
Sounds like a locking problem, but assuming you aren’t Sherlock Holmes and 
simply want to get the thing working as soon as possible: 

Stick a fast SSD in there (whether you stay on VM or physical). If you have 
enough I/O, you may be able to solve the problem with brute force.
SSDs are a lot cheaper than your time. 

Suggest you forward this to your operators: a talk I have about optimising 
multi-threaded work in postgres:  

  http://graemebell.net/foss4gcomo.pdf (Slides: “Input/Output” in the 
middle of the talk and also the slides at the end labelled “For Techies")

Graeme Bell

p.s. You mentioned a VM. Consider making the machine physical and not VM. 
You’ll get a performance boost and remove the risk of DB corruption from 
untrustworthy VM fsyncs. One day there will be a power cut or O/S crash during 
these writes, and with a VM you’ve a reasonable chance of nuking your DB 
because VM virtualised storage often doesn’t honour fsync (for performance 
reasons), but it’s fundamental to correct operation of PG. 



> On 08 Oct 2015, at 01:40, Carlo  wrote:
> 
> 
> I am told 32 cores on a LINUX VM. The operators have tried limiting the 
> number of threads. They feel that the number of connections is optimal. 
> However, under the same conditions they noticed a sizable boost in 
> performance if the same import was split into two successive imports which 
> had shorter transactions.
>  



-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance