Re: [sqlite] SQLITE_MAX_VARIABLE_NUMBER and .import for very wide file

2008-12-31 Thread Webb Sprague
See below.

On Sun, Dec 28, 2008 at 11:56 PM, Chris Wedgwood wrote:
> On Sun, Dec 28, 2008 at 11:49:34PM -0800, Webb Sprague wrote:
>
>> I am sure there is a better way to deal with 12K rows by 2500 columns,
>> but I can't figure it out
>
> 2500 columns sounds like a nightmare to deal with
>
> could you perhaps explain that data layout a little?
>

It is a download of a huge longitudinal survey
(www.bls.gov/nls/nlsy79.htm) that has been converted out of the
proprietary format into SAS, and now I want to convert it into a
single SQLITE database per wave.  I will wind up connecting people by
ID across the waves to show patterns of moving etc...
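
[Editorial sketch, not part of the original message.] Connecting respondents across waves by ID could look roughly like the following. The table names (wave_1979, wave_1980) and the columns (id, residence) are hypothetical, and the sketch assumes the waves are tables in one database file; with one file per wave, ATTACH DATABASE would be needed first.

```python
import sqlite3

# In-memory database with two tiny, made-up waves (illustrative data only).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE wave_1979 (id INTEGER PRIMARY KEY, residence INTEGER);
    CREATE TABLE wave_1980 (id INTEGER PRIMARY KEY, residence INTEGER);
    INSERT INTO wave_1979 VALUES (1, 1), (2, 1), (3, 4);
    INSERT INTO wave_1980 VALUES (1, 1), (2, 3), (3, 4);
""")

# Join the waves on respondent ID and keep people whose residence code changed.
movers = conn.execute("""
    SELECT a.id, a.residence, b.residence
    FROM wave_1979 AS a
    JOIN wave_1980 AS b ON a.id = b.id
    WHERE a.residence <> b.residence
    ORDER BY a.id
""").fetchall()

print(movers)
```

With the sample rows above, only respondent 2 changes residence code between the two waves.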

For each wave/table, each row contains integers that code
for information about a single respondent, such as age, whether
employed in June (either zero or one), whether employed in July,
etc...  Since the NLSY doesn't do multiple tables, this is very much
NOT normalized.  What the codes mean is described in a separate
codebook (-5 = missing data, 1=living at home, etc).
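
[Editorial sketch, not part of the original message.] One way to make the integer codes readable is to load the codebook as its own table and join against it. The codebook table layout and the residence column are assumptions for illustration; only the two code meanings (-5 and 1) come from the text above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical slice of one wave: a single coded variable.
    CREATE TABLE data_stuff (id INTEGER PRIMARY KEY, residence INTEGER);
    -- Hypothetical codebook layout: one row per (variable, code) pair.
    CREATE TABLE codebook (variable TEXT, code INTEGER, label TEXT);
    INSERT INTO data_stuff VALUES (1, 1), (2, -5), (3, 1);
    INSERT INTO codebook VALUES ('residence', -5, 'missing data'),
                                ('residence', 1, 'living at home');
""")

# Translate each stored code into its codebook label.
labelled = conn.execute("""
    SELECT d.id, c.label
    FROM data_stuff AS d
    LEFT JOIN codebook AS c
      ON c.variable = 'residence' AND c.code = d.residence
    ORDER BY d.id
""").fetchall()

# NULLIF turns the missing-data code into NULL, which COUNT(expr) skips.
answered = conn.execute(
    "SELECT COUNT(NULLIF(residence, -5)) FROM data_stuff"
).fetchone()[0]

print(labelled, answered)
```

Mapping -5 to NULL before aggregating keeps the missing-data code from being treated as a real value.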

There is a separate table for each wave (1979, 1980, ... 2006).

I have managed (just now) to get it working with a hacked version of
SQLITE.  Here is a meaningless query, just to confirm:

sqlite> select W0072400, count(*) as c  from data_stuff group by
W0072400 order by c desc limit 5;
0,9204
-5,2513
100,293
1,80
3,43
CPU Time: user 0.917062 sys 0.364962
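
[Editorial sketch, not part of the original message.] The same frequency count can be run from a script, which may be convenient when only a small extract is passed on to other tools. The table and column names are taken from the query above; the six data values are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE data_stuff (W0072400 INTEGER)")

# Made-up values standing in for one survey variable.
values = [0, 0, 0, -5, -5, 100]
conn.executemany("INSERT INTO data_stuff VALUES (?)", [(v,) for v in values])

# Count how often each code occurs, most frequent first.
counts = conn.execute("""
    SELECT W0072400, COUNT(*) AS c
    FROM data_stuff
    GROUP BY W0072400
    ORDER BY c DESC
    LIMIT 5
""").fetchall()

for code, n in counts:
    print(f"{code},{n}")
```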

Like I say, I may be going about it all wrong, but I can't run the
proprietary software on my Mac, and SQL makes me comfortable.  I hope
to pull out the data I want via SQL (a processed 1% of the total),
then run statistical analyses and graphics with R.

I am describing all this in hopes there is another quantitative
sociologist out there using SQLITE!

TIA
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] SQLITE_MAX_VARIABLE_NUMBER and .import for very wide file

2008-12-29 Thread Chris Wedgwood
On Sun, Dec 28, 2008 at 11:49:34PM -0800, Webb Sprague wrote:

> I am sure there is a better way to deal with 12K rows by 2500 columns,
> but I can't figure it out

2500 columns sounds like a nightmare to deal with

could you perhaps explain that data layout a little?


[sqlite] SQLITE_MAX_VARIABLE_NUMBER and .import for very wide file

2008-12-28 Thread Webb Sprague
Hi All,

What are the ramifications of increasing SQLITE_MAX_VARIABLE_NUMBER,
probably to ?  I am trying to import a csv file from the National
Longitudinal Study of Youth 79, and .import errors out, though
creating the table worked ok.

I am sure there is a better way to deal with 12K rows by 2500 columns,
but I can't figure it out.  The special application for working
with NLSY only runs on Windows!  So I am trying to process SAS files
until I get to work next week.
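
[Editorial sketch, not part of the original message.] An alternative to raising compile-time limits is to reshape the wide file into a "long" table with one row per (respondent, variable) pair, so that every INSERT binds only three variables however many columns the CSV has. This is a different technique from the one asked about. The function name import_long, the table name responses, and the ID column name are hypothetical, and the sketch assumes every cell is an integer.

```python
import csv
import io
import sqlite3


def import_long(conn, csv_file, id_column):
    """Load a wide CSV into a three-column long table (hypothetical schema)."""
    reader = csv.DictReader(csv_file)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS responses ("
        "respondent_id INTEGER, variable TEXT, value INTEGER)"
    )

    def rows():
        # Emit one (id, variable name, value) tuple per non-ID cell.
        for record in reader:
            rid = int(record[id_column])
            for name, raw in record.items():
                if name != id_column:
                    yield (rid, name, int(raw))

    # Each INSERT binds three variables, independent of the CSV width.
    conn.executemany("INSERT INTO responses VALUES (?, ?, ?)", rows())
    conn.commit()


# Build a made-up CSV: 2 respondents by 2500 variables.
names = [f"V{i:04d}" for i in range(2500)]
lines = [",".join(["ID"] + names)]
for rid in (1, 2):
    lines.append(",".join([str(rid)] + [str(rid * 10)] * 2500))
sample = io.StringIO("\n".join(lines))

conn = sqlite3.connect(":memory:")
import_long(conn, sample, "ID")
total = conn.execute("SELECT COUNT(*) FROM responses").fetchone()[0]
print(total)
```

The trade-off is that queries spanning several variables need self-joins or conditional aggregation, so the long layout suits extraction of a few variables better than row-wise analysis.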

TIA