On 5/19/26 7:27 AM, Martin Mueller wrote:
I use Postgres with a GUI frontend (Aquafold) as a very large spreadsheet on steroids that analyzes rare or defective spellings in a corpus of 65,000 texts and 1.5 billion words. I typically extract data from the corpus with Python scripts, turn the results into tables, and load them into the database.


On my Mac with 32 GB of memory, performance is OK: queries typically take seconds to extract rows from tables with up to ten million rows. If the result set is large, I suspect that most of the machine's time is spent displaying it. I have used indexing sparingly. While it helps, the time savings often don't matter much.

This is going to need more information:

1) Postgres version.

2) The table schema including indexes.

3) An example of the query.

4) Where you are measuring the time.

5) The client you are displaying the results in.
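
For item 4, one way to see where the time goes is to time the query outside the GUI and compare that with what the GUI shows. What follows is only a minimal sketch, not a recommended tool: the function name time_query, the table spellings, and its columns are made up for illustration, and the demo uses Python's built-in sqlite3 as a stand-in so that it runs without a server. With Postgres you would pass a connection from a DB-API driver instead and use that driver's parameter style. How the time splits between execute and fetch depends on the driver, since some drivers transfer the whole result set during execute.

```python
import sqlite3
import time


def time_query(conn, sql, params=()):
    """Time execute and fetch separately for one query on a DB-API connection."""
    cur = conn.cursor()
    t0 = time.perf_counter()
    cur.execute(sql, params)   # send the query (some drivers also pull all rows here)
    t1 = time.perf_counter()
    rows = cur.fetchall()      # bring every row into Python; nothing is displayed
    t2 = time.perf_counter()
    cur.close()
    return {
        "execute_seconds": t1 - t0,
        "fetch_seconds": t2 - t1,
        "row_count": len(rows),
    }


def demo():
    # Stand-in database (assumption): in-memory SQLite with a hypothetical table.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE spellings (word TEXT, freq INTEGER)")
    conn.executemany(
        "INSERT INTO spellings VALUES (?, ?)",
        ((f"w{i}", i % 100) for i in range(10000)),
    )
    # sqlite3 uses "?" placeholders; Postgres drivers typically use "%s".
    result = time_query(
        conn, "SELECT word, freq FROM spellings WHERE freq < ?", (10,)
    )
    conn.close()
    return result


if __name__ == "__main__":
    print(demo())
```

If the numbers from a script like this are much smaller than the wait in the GUI client, the difference is probably display time rather than query time.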



I am thinking about scaling up to a table with about 60 million rows. Are there things to do or watch out for? Or should I proceed on the assumption that 60 million records are within scope and that the added time cost is roughly linear?

Martin Mueller

Professor emeritus of English and Classics

Northwestern University



--
Adrian Klaver
[email protected]

