Welcome, Matt!

One of the others is probably best qualified to answer your question, but
I'll chime in early with a couple of questions. The performance of merging
depends on many factors, including type of sketch and sketch size. I'm
assuming from the link you posted that you are dealing with Theta sketches,
for count unique operations. Can you confirm that? If so, what's the logK
you are using? What is the sketch size? Do you happen to know what
proportion of your sketches are in estimation mode vs exact mode?

Will


Will Lauer

Senior Principal Architect, Audience & Advertising Reporting
Data Platforms & Systems Engineering

M 508 561 6427
1908 S. First St
Champaign, IL 61822




On Fri, Jul 9, 2021 at 12:02 PM Matthew Farkas <[email protected]> wrote:

> Hi,
>
> My name is Matt and I'm a data engineer at Spotify. I'm trying out
> Data Sketches with Postgres and running into some performance issues. I'm
> seeing merge times much slower than what I'm seeing in the docs here
> <https://datasketches.apache.org/docs/Theta/ThetaMergeSpeed.html>
> (millions of sketches/sec).
>
> In my case, I've pre-computed many sketches, inserted them into PG, then
> I'm running queries in PG and doing the merging there. My hunch is that
> there's something wrong with my Postgres configs, which I've tried tweaking
> extensively but haven't been able to improve query time.
>
> My question is if anyone knows what type of performance can be expected in
> Postgres and if anyone has any examples/tips in general from their
> implementations.
>
> Also, this is my first message to this list, so please let me know if I
> should be directing it anywhere else!
>
> Thanks!!
> Matt
>
>
>
> *Matthew Z. Farkas*
>
> Data Science @ Spotify
> MS Northwestern University, BS Georgia Tech
>
> m: (770) 337-2709
> e: [email protected]
>
>
> <https://www.linkedin.com/in/matthewzfarkas>
>
