On 5/19/26 7:27 AM, Martin Mueller wrote:
I use Postgres with a GUI frontend (Aquafold) as a very large
spreadsheet on steroids that analyzes rare or defective spellings in a
corpus of 65,000 texts and 1.5 billion words. I typically extract data
from the corpus with python scripts, turn them into tables and load them
into the database.
On my Mac with 32 GB of memory, performance is OK: queries typically
extract data rows within seconds from tables with up to ten
million rows. If the result set is large, I suspect that most of the
machine's time is spent displaying result sets. I have used indexing
sparingly. While it helps, the time savings often don't matter much.
This is going to need more information:
1) Postgres version.
2) The table schema including indexes.
3) An example of the query.
4) Where you are measuring the time.
5) The client you are displaying the results in.
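
For item 4, one rough way to separate database time from display time is to run the same query from a Python script, where no GUI rendering is involved. The sketch below is only an illustration under stated assumptions: time_query and explain_statement are hypothetical helper names, the table and column names are invented, and the demo uses the standard-library sqlite3 module purely so that it runs without a server. Against Postgres one would pass a connection from a DB-API driver such as psycopg2 instead.

```python
import sqlite3
import time


def time_query(conn, sql, params=None):
    """Run sql on a DB-API connection and return (row_count, seconds).

    The elapsed time covers executing the query and fetching every row
    into Python. It excludes any time a GUI client spends rendering.
    """
    cur = conn.cursor()
    start = time.perf_counter()
    if params is None:
        cur.execute(sql)
    else:
        cur.execute(sql, params)
    rows = cur.fetchall()
    elapsed = time.perf_counter() - start
    cur.close()
    return len(rows), elapsed


def explain_statement(sql):
    """Wrap a query in Postgres's EXPLAIN (ANALYZE, BUFFERS).

    Note that EXPLAIN ANALYZE really executes the wrapped query and
    reports the server-side plan and timings.
    """
    return "EXPLAIN (ANALYZE, BUFFERS) " + sql.strip().rstrip(";")


if __name__ == "__main__":
    # Stand-in database; "words" and "spelling" are made-up names.
    demo = sqlite3.connect(":memory:")
    demo.execute("CREATE TABLE words (spelling TEXT)")
    demo.executemany(
        "INSERT INTO words VALUES (?)",
        [("colour",), ("color",), ("coulour",)],
    )
    count, seconds = time_query(demo, "SELECT spelling FROM words")
    print(f"{count} rows fetched in {seconds:.6f} s")
    # Statement to run against Postgres for server-side timing:
    print(explain_statement("SELECT spelling FROM words;"))
```

If the time measured this way is small but the GUI still feels slow on large result sets, that would point toward the client's display step. For items 1 and 2, SELECT version(); reports the server version, and \d followed by the table name in psql shows the columns and indexes.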
I am thinking about scaling up to a table with about 60 million rows. Are
there things to do or watch out for? Or should I proceed on the
assumption that 60 million records are within scope and that the
added time cost is roughly linear?
Martin Mueller
Professor emeritus of English and Classics
Northwestern University
--
Adrian Klaver
[email protected]