[SQL] Merge overlapping time-periods

2011-06-15 Thread Jira, Marcel
Hi!

Although I try for some time, I am not able to write an SQL-Query that can do 
the following:

I have a very big table (let's call it "mytable") with information like this:

ID  BEG  END
1   2000-01-01   2000-03-31
1   2000-04-01   2000-05-31
1   2000-04-15   2000-07-31
1   2000-09-01   2000-10-31
2   2000-02-01   2000-03-15
2   2000-01-15   2000-03-31
2   2000-04-01   2000-04-15
3   2000-06-01   2000-06-15
3   2000-07-01   2000-07-15

There's an ID and time periods defined by a start value (BEG) and an end value 
(END)

I want to merge all periods belonging to the same ID, iff their time periods 
are overlapping or in a direct sequence.

Therefore the result should somehow look like this:


ID  BEG  END

1   2000-01-01   2000-07-31

1   2000-09-01   2000-10-31

2   2000-01-15   2000-03-31

2   2000-04-01   2000-04-15

3   2000-06-01   2000-06-15

3   2000-07-01   2000-07-15

I tried using "WITH RECURSIVE" but I didn't succeed.

My server is PostgreSQL 8.4. Unfortunately I can't do anything like update or 
install some fancy module...

Thank you for your help!

Best regards,

Marcel Jira


[SQL] Select random lines of a table using a probability distribution

2011-07-13 Thread Jira, Marcel
Hi!

Let's consider I have a table like this

idqualificationgenderageincome

I'd like to select (for example 100) lines of this table by random, but the 
random mechanism has to follow a certain probability distribution.

I want to use this procedure to construct a test group for another selection.

Example:

I filter all lines having the qualification "plumber".
I get 50 different ids consisting of 40 males, 10 females and a certain age 
distribution.

I also get some information concerning the income of the plumbers.

Now I want to know if the income is more influenced by the gender and age 
distribution or by the qualification "plumber".

Therefore I would like to select a test group (of 50 or more) without any 
plumbers. This test group has to follow the same age and gender distribution.

Then I would be able to compare this groups income statistics with the plumbers 
income statistics.

Is this possible (and doable with reasonable effort) in PostgreSQL?

Thank you in advance.

Best regards,

Marcel Jira

? ~~~ * ~~~
? Mag. Marcel Jira
? Institut für Sozialpolitik, Wirtschaftsuniversität Wien
? +43 1 313 36-5890
? UZA IV, D 317
? http://www.wu.ac.at/sozialpolitik/team/wimi/jira
? ~~~ * ~~~



[SQL] RPostgreSQL Win32 package

2011-10-24 Thread Jira, Marcel
Hi!

I'd like to use the statistics software GNU R (http://www.r-project.org/) as a 
PostgreSQL client. Currently I use RpgSQL but this package seems to be fitting 
better for smaller projects. As my project is about really big amounts of data, 
the package RPostgreSQL (http://code.google.com/p/rpostgresql/) seems to be a 
better choice.

Unfortunately I only found the RPostgreSQL source but couldn't find a package 
for Win XP (32 bit). As my user rights are limited, I am also not able to 
compile the package myself.

Therefore I wanted to kindly ask if somebody has already compiled RPostgreSQL 
for Win 32 and would be so nice to provide it as a an R package for me.

Thank you in advance,

Marcel Jira