> Hello 'Liters!
>
> I'd like to move to SQLite from Postgres, but have two quick questions.
>
> I'm a social scientist looking to manipulate a large dataset (5 billion
> transactions, 700gb). I do not need multiple connections, and will only
> ever run one query at a time.
>
> I started in Postgres, but discovered that in Windows one cannot increase
> some of the per-query RAM memory caps above 2gb (I know -- I would love to
> switch operating systems, but it's beyond my control).  So I'm thinking of
> moving to SQLite.

So that I may understand more fully:

1. If you perform a SELECT * FROM XXX_5B_ROW_TABLE, are you unable to
   retrieve that result set because it exceeds the 2GB per-query RAM
   cap on Windows?

2. Or is it that, in trying to stuff that result set into a data
   structure, you exceed the RAM allocation cap?

>
> Before I make the move, I was hoping you kind people could answer two quick
> questions for me:
>   -- am I going to have problems using all 16gb of ram on my Windows 8
> machine for data manipulations if I switch to SQLite? Or will SQLite set me
> free?

If it is 2. above, I do not see how moving to SQLite is going to help
you. As indicated below, no tool is going to allow you to manipulate
datasets that large entirely in memory. I have found in processing data
that most software makes this mistake.

The proper approach is to process that data in chunks for your
visualization or algorithm.
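A minimal sketch of that chunked approach with Python's sqlite3 module:
iterate over the result set a batch at a time with fetchmany() instead
of materializing all rows at once. The table, columns, and data here
are a small hypothetical stand-in for a 5-billion-row table.

```python
import sqlite3

# Hypothetical stand-in data: a small "transactions" table in memory.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE transactions (user_id INTEGER, amount REAL)")
conn.executemany(
    "INSERT INTO transactions VALUES (?, ?)",
    [(i % 100, float(i)) for i in range(10_000)],
)

CHUNK_SIZE = 1_000
cur = conn.execute("SELECT user_id, amount FROM transactions")

# Running aggregate computed chunk by chunk -- at most CHUNK_SIZE
# rows are ever held in RAM, no matter how large the table is.
total = 0.0
count = 0
while True:
    rows = cur.fetchmany(CHUNK_SIZE)
    if not rows:
        break
    for _user_id, amount in rows:
        total += amount
        count += 1

average = total / count
print(count, average)
conn.close()
```

The same pattern works for visualization: feed each chunk to the
plotting or statistics step, then discard it before fetching the next.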

>   -- Is there any reason I should NOT use SQLite for manipulation of large
> datasets like this (for example, pulling out unique pairs of transaction
> participants, averages across users, etc.)? All the literature I can find
> talks about SQL database choices for people setting up databases that will
> be queried by lots of people, and I just can't find any input for people
> like me who just want a data manipulation tool for data that's too big to
> read into RAM and manipulate with the usual suspects (R, Stata, Matlab,
> etc.).
>
> Thanks all!
> Nick

I have been working on a routine for the MyJSQLView project that can
pull a query from PostgreSQL and create a local database, either in
memory or on file (say, SQLite), so that processing can take place for
analysis.

The main reason this came about is that, in processing large datasets
from a networked RDBMS, it became apparent that analysis may be sped up
by storing the query result locally in a memory or file database.
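A rough sketch of that idea: stream a query result from a networked
RDBMS into a local SQLite database in batches, then run the analysis
against the local copy. Here a second in-memory SQLite database stands
in for the remote server; in practice the source cursor would come from
the remote database's own client driver, and the table and column names
are hypothetical.

```python
import sqlite3

# Stand-in for the remote RDBMS, populated with hypothetical data.
remote = sqlite3.connect(":memory:")
remote.execute("CREATE TABLE sales (id INTEGER, amount REAL)")
remote.executemany("INSERT INTO sales VALUES (?, ?)",
                   [(i, i * 1.5) for i in range(500)])

# Local analysis database (could be a file instead of :memory:).
local = sqlite3.connect(":memory:")
local.execute("CREATE TABLE sales (id INTEGER, amount REAL)")

# Stream the remote result set across in batches and bulk-insert
# locally, so the full result never sits in memory at once.
src = remote.execute("SELECT id, amount FROM sales")
while True:
    batch = src.fetchmany(100)
    if not batch:
        break
    local.executemany("INSERT INTO sales VALUES (?, ?)", batch)
local.commit()

copied = local.execute("SELECT COUNT(*), SUM(amount) FROM sales").fetchone()
print(copied)
```

Subsequent queries then hit the local SQLite copy instead of going back
over the network for each analysis pass.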

Dana M. Proctor
MyJSQLView Project Manager

_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
