Re: scaling up from t1n to 60 million records

Adrian Klaver Tue, 19 May 2026 07:45:28 -0700

On 5/19/26 7:27 AM, Martin Mueller wrote:

I use Postgres with a GUI frontend (Aquafold) as a very large spreadsheet on steroids that analyzes rare or defective spellings in a corpus of 65,000 texts and 1.5 billion words. I typically extract data from the corpus with python scripts, turn them into tables and load them into the database.
On my Mac with 32 GB of memory performance is OK with queries that typically within seconds extract data rows from tables with up to ten million rows. If the result set is large, I suspect that most of the machine's time is spent displaying result sets. I have used indexing sparingly. While it helps, the time savings often don't matter much.


This is going to need more information:

1) Postgres version.

2) The table schema including indexes.

3) An example of the query.

4) Where you are measuring the time.

5) The client you are displaying the results in.
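As a rough illustration of how items 1 through 4 could be collected, here is a minimal Python sketch that only builds the SQL statements to run (in psql, Aquafold, or a script). The function name `diagnostic_queries`, the table name `tokens`, and the sample query are hypothetical placeholders, not anything from the original setup. Note that `EXPLAIN (ANALYZE, BUFFERS)` actually executes the query and reports server-side time, so comparing it with the time seen in the client gives a hint about how much is spent on display.

```python
def diagnostic_queries(table: str, sample_query: str) -> dict[str, str]:
    """Build SQL statements that gather version, schema, indexes and timing.

    `table` and `sample_query` are placeholders supplied by the caller.
    """
    # Only accept a plain identifier, since the name is pasted into SQL text.
    if not table.isidentifier():
        raise ValueError(f"not a simple table name: {table!r}")

    query = sample_query.strip().rstrip(";")
    return {
        # 1) Postgres version
        "version": "SELECT version();",
        # 2) Table schema: columns and their types
        "columns": (
            "SELECT column_name, data_type "
            "FROM information_schema.columns "
            f"WHERE table_name = '{table}' "
            "ORDER BY ordinal_position;"
        ),
        # 2) Indexes defined on the table
        "indexes": (
            "SELECT indexname, indexdef "
            f"FROM pg_indexes WHERE tablename = '{table}';"
        ),
        # 3) and 4) The example query, timed on the server side
        "timing": f"EXPLAIN (ANALYZE, BUFFERS) {query};",
    }


if __name__ == "__main__":
    # Hypothetical table and query, for illustration only.
    stmts = diagnostic_queries(
        "tokens", "SELECT * FROM tokens WHERE spelling = 'teh'"
    )
    for name, sql in stmts.items():
        print(f"-- {name}\n{sql}\n")
```

Pasting the output of these statements into a reply would cover most of the list above; item 5 (the client) still has to be stated separately.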

I am thinking about scaling up to a table with about 60 million rows. Are there things to do or watch out for? Or should I proceed on the assumption that 60 million records are within scope and that the added time cost is roughly linear?
Martin Mueller

Professor emeritus of English and Classics

Northwestern University



--
Adrian Klaver
[email protected]
