Hi Mate,

Thanks for the detailed response. This will help others in the same
situation

Dave Cramer

da...@postgresintl.com
www.postgresintl.com


On Thu, 6 Sep 2018 at 05:03, Mate Varga <m...@matevarga.net> wrote:

> Hi,
>
> summarizing:
> we had a table that had an OID column, referencing an object in
> pg_largeobject. This was mapped to a (Java) entity with a byte array field,
> annotated with @Lob. The problem was that we were fetching thousands of
> these entities in one go, and LOB fetching is not batched by Hibernate/JDBC
> (so each row is fetched separately). Because we were abusing LOBs (they
> were small, often less than 10 kB), we have chosen to move the binary blobs
> from the LO table to a simple bytea column. So the entity that had a byte
> array field mapped to an OID column now has a byte array field mapped to a
> bytea column, and we have manually moved data from the LO table to the
> bytea column. Now Hibernate/JDBC fetches all the content we need in
> batches. Random benchmark: fetching 20k rows used to take 7 seconds (250
> msec query execution time, 6.7 sec for transfer) and now it takes 1.5
> seconds (250 msec query + 1.3 sec transfer).
>
> Regards,
> Mate
>
> On Thu, Sep 6, 2018 at 10:56 AM Dave Cramer <p...@fastcrypt.com> wrote:
>
>> Hi
>>
>> Can you be more explicit how you fixed the problem ?
>>
>> Thanks
>> Dave Cramer
>>
>> da...@postgresintl.com
>> www.postgresintl.com
>>
>>
>> On Thu, 6 Sep 2018 at 03:46, Mate Varga <m...@matevarga.net> wrote:
>>
>>> After inlining the data, performance issues have been solved. Thanks for
>>> the help.
>>>
>>> On Mon, Sep 3, 2018 at 9:57 PM Mate Varga <m...@matevarga.net> wrote:
>>>
>>>> Thanks,
>>>> 1) we'll try to move stuff out from LOBs
>>>> 2) we might raise a PR for the JDBC driver
>>>>
>>>> Mate
>>>>
>>>> On Mon, 3 Sep 2018, 19:35 Dave Cramer, <p...@fastcrypt.com> wrote:
>>>>
>>>>>
>>>>>
>>>>> On Mon, 3 Sep 2018 at 13:00, Mate Varga <m...@matevarga.net> wrote:
>>>>>
>>>>>> More precisely: when fetching 10k rows, JDBC driver just does a large
>>>>>> bunch of socket reads. With lobs, it's ping-pong: one read, one write per
>>>>>> lob...
>>>>>>
>>>>>>
>>>>> Ok, this is making more sense. In theory we could fetch them all but
>>>>> since they are LOB's we could run out of memory.
>>>>>
>>>>> Not sure what to tell you at this point. I'd entertain a PR if you
>>>>> were motivated.
>>>>>
>>>>> Dave Cramer
>>>>>
>>>>> da...@postgresintl.com
>>>>> www.postgresintl.com
>>>>>
>>>>>
>>>>>
>>>>>>
>>>>>> On Mon, Sep 3, 2018 at 6:54 PM Mate Varga <m...@matevarga.net> wrote:
>>>>>>
>>>>>>> So I have detailed profiling results now. Basically it takes very
>>>>>>> long that for each blob, the JDBC driver reads from the socket then it
>>>>>>> creates the byte array on the Java side. Then it reads the next blob, 
>>>>>>> etc.
>>>>>>> I guess this takes many network roundtrips.
>>>>>>>
>>>>>>> On Mon, Sep 3, 2018 at 5:58 PM Dave Cramer <p...@fastcrypt.com> wrote:
>>>>>>>
>>>>>>>>
>>>>>>>> On Mon, 3 Sep 2018 at 10:48, Mate Varga <m...@matevarga.net> wrote:
>>>>>>>>
>>>>>>>>> That's 1690 msec (1.69 seconds, and that is how long it takes to
>>>>>>>>> fetch 20k (small-ish) rows without LOBs (LOBs are a few lines below 
>>>>>>>>> on the
>>>>>>>>> screenshot)
>>>>>>>>>
>>>>>>>>
>>>>>>>> that sound high as well!
>>>>>>>>
>>>>>>>> Something isn't adding up..
>>>>>>>>
>>>>>>>>
>>>>>>>> Dave Cramer
>>>>>>>>
>>>>>>>> da...@postgresintl.com
>>>>>>>> www.postgresintl.com
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Mon, Sep 3, 2018 at 4:40 PM Dave Cramer <p...@fastcrypt.com>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> the one you have highlighted ~1.69ms
>>>>>>>>>>
>>>>>>>>>> Dave Cramer
>>>>>>>>>>
>>>>>>>>>> da...@postgresintl.com
>>>>>>>>>> www.postgresintl.com
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Mon, 3 Sep 2018 at 10:38, Mate Varga <m...@matevarga.net> wrote:
>>>>>>>>>>
>>>>>>>>>>> Which frame do you refer to?
>>>>>>>>>>>
>>>>>>>>>>> On Mon, Sep 3, 2018 at 3:57 PM Dave Cramer <p...@fastcrypt.com>
>>>>>>>>>>> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Not sure why reading from a socket is taking 1ms ?
>>>>>>>>>>>>
>>>>>>>>>>>> Dave Cramer
>>>>>>>>>>>>
>>>>>>>>>>>> da...@postgresintl.com
>>>>>>>>>>>> www.postgresintl.com
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On Mon, 3 Sep 2018 at 09:39, Mate Varga <m...@matevarga.net>
>>>>>>>>>>>> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>
>>>>>>>>>>>>> https://imgur.com/a/ovsJPRv -- I've uploaded the profiling
>>>>>>>>>>>>> info (as an image, sorry). It seems this is a JDBC-level problem. 
>>>>>>>>>>>>> I
>>>>>>>>>>>>> understand that the absolute timing is not meaningful at all 
>>>>>>>>>>>>> because you
>>>>>>>>>>>>> don't know how large the resultset is, but I can tell that this 
>>>>>>>>>>>>> is only a
>>>>>>>>>>>>> few thousands rows + few thousand largeobjects, each largeobject 
>>>>>>>>>>>>> is around
>>>>>>>>>>>>> 1 kByte. (Yes I know this is not a proper use of LOBs -- it's a 
>>>>>>>>>>>>> legacy db
>>>>>>>>>>>>> structure that's hard to change.)
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks.
>>>>>>>>>>>>> Mate
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Mon, Sep 3, 2018 at 11:52 AM Mate Varga <m...@matevarga.net>
>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Hey,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> we'll try to test this with pure JDBC versus hibernate.
>>>>>>>>>>>>>> Thanks!
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Mon, Sep 3, 2018 at 11:48 AM Dave Cramer <p...@fastcrypt.com>
>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Mon, 3 Sep 2018 at 03:55, Mate Varga <m...@matevarga.net>
>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Basically there's a class with a byte[] field, the class is
>>>>>>>>>>>>>>>> mapped to table T and the byte field is annotated with @Lob so 
>>>>>>>>>>>>>>>> it goes to
>>>>>>>>>>>>>>>> the pg_largeobject table.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Ah, so hibernate is in the mix. I wonder if that is causing
>>>>>>>>>>>>>>> some challenges ?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> The DB is on separate host but relatively close to the app,
>>>>>>>>>>>>>>>> and I can reproduce the problem locally as well. One 
>>>>>>>>>>>>>>>> interesting bit is
>>>>>>>>>>>>>>>> that turning of SSL between the app and PSQL speeds up things 
>>>>>>>>>>>>>>>> by at least
>>>>>>>>>>>>>>>> 50%.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Ah, one addition -- the binary objects are encrypted, so
>>>>>>>>>>>>>>>> their entropy is very high.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Any chance you could write a simple non-hibernate test code
>>>>>>>>>>>>>>> to time the code ?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Dave Cramer
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> dave.cra...@crunchydata.ca
>>>>>>>>>>>>>>> www.crunchydata.ca
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Mate
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Sun, Sep 2, 2018 at 12:55 AM Dave Cramer <
>>>>>>>>>>>>>>>> p...@fastcrypt.com> wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On Fri, 31 Aug 2018 at 10:15, Mate Varga <m...@matevarga.net>
>>>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> I see -- we could try that, though we're mostly using an
>>>>>>>>>>>>>>>>>> ORM (Hibernate) to do this. Thanks!
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> On Fri, Aug 31, 2018 at 3:57 PM Dmitry Igrishin <
>>>>>>>>>>>>>>>>>> dmit...@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> пт, 31 авг. 2018 г. в 16:35, Mate Varga <m...@matevarga.net
>>>>>>>>>>>>>>>>>>> >:
>>>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>>>> > Hi,
>>>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>>>> > we're fetching binary data from pg_largeobject table.
>>>>>>>>>>>>>>>>>>> The data is not very large, but we ended up storing it 
>>>>>>>>>>>>>>>>>>> there. If I'm
>>>>>>>>>>>>>>>>>>> copying the data to a file from the psql console, then it 
>>>>>>>>>>>>>>>>>>> takes X time
>>>>>>>>>>>>>>>>>>> (e.g. a second), fetching it through the JDBC driver takes 
>>>>>>>>>>>>>>>>>>> at least 10x
>>>>>>>>>>>>>>>>>>> more. We don't see this difference between JDBC and 
>>>>>>>>>>>>>>>>>>> 'native' performance
>>>>>>>>>>>>>>>>>>> for anything except largeobjects (and bytea columns, for 
>>>>>>>>>>>>>>>>>>> the record).
>>>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>>>> > Does anyone have any advice about whether this can be
>>>>>>>>>>>>>>>>>>> tuned or what the cause is?
>>>>>>>>>>>>>>>>>>> I don't know what a reason of that, but I think it's
>>>>>>>>>>>>>>>>>>> reasonable and
>>>>>>>>>>>>>>>>>>> quite simple to call lo_import()/lo_export() via JNI.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Can't imagine that's any faster. The driver simply
>>>>>>>>>>>>>>>>> implements the protocol
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Do you have any code to share ? Any other information ?
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Is the JDBC connection significantly further away network
>>>>>>>>>>>>>>>>> wise ?
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Dave Cramer
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> da...@postgresintl.com
>>>>>>>>>>>>>>>>> www.postgresintl.com
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>

Reply via email to