Re: [PERFORM] [HACKERS] OSS database needed for testing
Lamar, > I do still want to get CDBS in a PostgreSQL setup, with automatic nightly > import, at some point in time. Just probably not as quickly as Josh needs a > dataset to crank on. Oh, I don't know. I expect setting this up to take several weeks. And if we do the CDBS database as part of the test pack, then you can get my help without paying the $175/hour I normally charge . -- -Josh Berkus Aglio Database Solutions San Francisco ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: [PERFORM] [HACKERS] OSS database needed for testing
On Friday 04 April 2003 14:54, Merlin Moncure wrote: > I can tell you, though; the land mobile database is much more > complicated. Getting it to run decently on pc hardware is a significant > engineering challenge. Then it sounds like it's a better fit for Josh's requirements. > ill-fated DTV rollout and the failed AM stereo. It's a conservative > industry. Tell me about it. Yet we get IBOCand the 30,000 translator apps in the one week window... Anyway, those topics more correctly belong to [EMAIL PROTECTED]; rather off-topic here. I do still want to get CDBS in a PostgreSQL setup, with automatic nightly import, at some point in time. Just probably not as quickly as Josh needs a dataset to crank on. -- Lamar Owen WGCR Internet Radio 1 Peter 4:11 ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])
Re: [PERFORM] [HACKERS] OSS database needed for testing
> I'm quite familiar with RadioSoft. Can't afford any of the software; > familiar > with the products... :-) > > I've been putting together open source tools to do much of the same stuff. > With the release of the FCC's Fortran source, I've been able to do > virtually > everything I need to do. > > But while the LMR dataset is larger, the MB dataset is just as varied. > I'm > interested in both, however. Peter's my father! At RadioSoft I worked over that same fortran code you talked about. We ported most of it to C. Of course, I would prefer not to write any software that would compete with my father's company, free or no :) I can tell you, though; the land mobile database is much more complicated. Getting it to run decently on pc hardware is a significant engineering challenge. OTOH, the broadcast database isn't too bad. For example, the last time I checked there were about 40k tv stations in the cdbs, compared with about 4m frequencies in the land mobile private. Also the lm database has a much wider scope of function. The requirements for broadcast have changed (like the nature of broadcast technology itself) very little in the last 50 years, except for the ill-fated DTV rollout and the failed AM stereo. It's a conservative industry. Merlin ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])
Re: [PERFORM] [HACKERS] OSS database needed for testing
On Friday 04 April 2003 14:23, Merlin Moncure wrote: > Up until about 6 months ago, I worked at a company called RadioSoft. > They are a provider of high quality database, engineering, and GIS > software. The company has its roots as source of engineering tools for > broadcast engineers. They currently offer several products and services > (including online web based database services), some of which are based > on postgres, some not. You might consider checking them out. I'm quite familiar with RadioSoft. Can't afford any of the software; familiar with the products... :-) I've been putting together open source tools to do much of the same stuff. With the release of the FCC's Fortran source, I've been able to do virtually everything I need to do. But while the LMR dataset is larger, the MB dataset is just as varied. I'm interested in both, however. -- Lamar Owen WGCR Internet Radio 1 Peter 4:11 ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [PERFORM] [HACKERS] OSS database needed for testing
Lamar, > Also check out the cdbs files (which contain the broadcast stuff as well as > more) at /pub/Bureaus/Mass_Media/Databases/cdbs/ (which I would be more > interested in doing, since I am a broadcast engineer by profession) Hey, if you're willing to do the text --> postgres conversions, I'll use whichever tables you want ... -- -Josh Berkus Aglio Database Solutions San Francisco ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [PERFORM] [HACKERS] OSS database needed for testing
Lamar Owen wrote: > Also check out the cdbs files (which contain the broadcast stuff as well > as > more) at /pub/Bureaus/Mass_Media/Databases/cdbs/ (which I would be more > interested in doing, since I am a broadcast engineer by profession) > -- Up until about 6 months ago, I worked at a company called RadioSoft. They are a provider of high quality database, engineering, and GIS software. The company has its roots as source of engineering tools for broadcast engineers. They currently offer several products and services (including online web based database services), some of which are based on postgres, some not. You might consider checking them out. RadioSoft's flagship product, ComStudy, is the #1 tool for broadcast engineers on the market. I happen to be intimately familiar with the cdbs. I suggested the land mobile stuff because it is (much) bigger and (much) more complicated, but mostly draws on the same concepts, like haat, etc. You might get a kick out of this project. The last project I did before leaving there was an online database of directional patterns in xml format. I also made up a xml schema for directional patterns hoping to get some standardization in that regard. You can see that in the free section of the RadioSoft web page. Merlin ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: [PERFORM] [HACKERS] OSS database needed for testing
On Friday 04 April 2003 11:47, Merlin Moncure wrote: > The location of the data of interest is at > /pub/Bureaus/Wireless/Databases/uls/. > wireless services. This includes most two way systems and point to > multipoint (microwave) but not broadcast (AM, FM, TV) and not advanced > radio. Also check out the cdbs files (which contain the broadcast stuff as well as more) at /pub/Bureaus/Mass_Media/Databases/cdbs/ (which I would be more interested in doing, since I am a broadcast engineer by profession) -- Lamar Owen WGCR Internet Radio 1 Peter 4:11 ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [PERFORM] [HACKERS] OSS database needed for testing
Josh Berkus wrote: > Cool. I'll tackle this in a week or two. Right now, I'm being paid to > convert a client's data and that'll keep me busy through the weekend ... I would suggest downloading the data now. I can help get you started with the create table statements and the import scripts. There are not very many ways to get the data in a reasonable timeframe: the spi functions or the copy command are a good place to start. Do not bother with running stuff through insert queries: take my word for it, it just won't work. Of course, if you use copy, you have to pre-format. Be aware that you will have many gigabytes (like more than 20) of data before you are done. Whatever you decide to do, document the process: the difficulty of getting large amounts of data into postgres quickly and easily has been a historical complaint of mine. Using mysql, it was a snap to get the data in but using *that* database I really felt it couldn't handle this much data. I can also get you started with some example queries that should be quite a challenge to set up to run quickly. After that, it's your ballgame. Merlin ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [PERFORM] [HACKERS] OSS database needed for testing
Merlin, > The fcc FTP site is ftp.fcc.gov > > The location of the data of interest is at > /pub/Bureaus/Wireless/Databases/uls/. Cool. I'll tackle this in a week or two. Right now, I'm being paid to convert a client's data and that'll keep me busy through the weekend ... -- -Josh Berkus Aglio Database Solutions San Francisco ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faqs/FAQ.html
Re: [PERFORM] [HACKERS] OSS database needed for testing
The fcc FTP site is ftp.fcc.gov The location of the data of interest is at /pub/Bureaus/Wireless/Databases/uls/. There are zip files (pipe delimited) in complete and the daily changed files in daily. Theres lots of info in documentation which includes excel spreadsheets of the schema. These will have to be converted to sql statemtents. The ULS is the database system that holds the data for Fixed and Mobile wireless services. This includes most two way systems and point to multipoint (microwave) but not broadcast (AM, FM, TV) and not advanced radio. The database is really a database of applications. It contains application data submitted by wireless applicants. There are two families of tables, prefixed with 'a' and 'l'. The 'a' tables stand for application records that are pending being granted by the fcc. The 'l' tables have received licenses and may or may not be operating. Combined, the 'a' and 'l' zipfiles represent a specific service. For example, 'a_micro' and 'l_micro' contain the applications and licensed data for microwave systems. The different services have slightly different layouts because they have different requirements. I strongly suggest looking at LMcomm and LMpriv first. These are the fixed land mobile systems, and 90% of the entire database. They also have identical layouts. There are a great deal of files in each zipfile, but here are the most interesting: hd: header data ad: application detail an: antenna data lo: location data fr: frequency data em: emission data There are others. I can help you write meaningful queries that are quite complex and will require optimization techniques. Merlin ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [PERFORM] [HACKERS] OSS database needed for testing
Jeff, Mlw, > Absolutely. We could just use one large state or several small ones and > let folks download the whole thing if they wanted. Using that technique > you could control the size of the test quite closely and still make > something potentially quite valuable as a contribution beyond the bench. Hold on a second. The FCC database is still a better choice because it is more complex with a carefully defined schema. The Tiger database would be good for doing tests of type 1 and 3, but not for tests of types 2 and 4. It would certainly be interesting to use the Tiger database as the basis for an additional type of test: 6) Very Large Data Set: querying, then updating, 300+ selected rows from a 2,000,000 + row table. ... but I still see the FCC database as our best candidate for the battery of tests 1-5. -- Josh Berkus Aglio Database Solutions San Francisco ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])
Re: [PERFORM] [HACKERS] OSS database needed for testing
On Thu, Apr 03, 2003 at 17:19:13 -0500, mlw <[EMAIL PROTECTED]> wrote: > > I have a set of scripts, SQL table defs, a small C program, along with a > set of field with files that loads it into PGSQL using the "copy from > stdin" It works fairly well, but takes a good long time to load it all. > > Should I put it in the download section of my website? Yes. I would be interested in looking at it even if I don't use exactly the same way to do stuff. Taking a logn time to load the data into the database isn't a big deal for me. reading through the tiger (and FIPS) data documentation it seemed like there might be some gotchas in unusual cases and I am not sure the google contest program really handled things right so I would like to see another implementation. I am also interested in the data model as that will save me some time. ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: [PERFORM] [HACKERS] OSS database needed for testing
Bruno Wolff III wrote: On Thu, Apr 03, 2003 at 13:26:01 -0500, [EMAIL PROTECTED] wrote: I don't know that it meets your criteria, but. I have a set of scripts and a program that will load the US Census TigerUA database into PostgreSQL. The thing is absolutely freak'n huge. I forget which, but it is either 30g or 60g of data excluding indexes. Are the data model or the loading scripts available publicly? I have the tiger data and a program that uses it to convert addresses to latitude and longitude, but I don't really like the program and was thinking about trying to load the data into a database and do queries against the database to find location. I have a set of scripts, SQL table defs, a small C program, along with a set of field with files that loads it into PGSQL using the "copy from stdin" It works fairly well, but takes a good long time to load it all. Should I put it in the download section of my website? ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])
Re: [PERFORM] [HACKERS] OSS database needed for testing
On Thu, Apr 03, 2003 at 13:26:01 -0500, [EMAIL PROTECTED] wrote: > I don't know that it meets your criteria, but. > > I have a set of scripts and a program that will load the US Census TigerUA > database into PostgreSQL. The thing is absolutely freak'n huge. I forget > which, but it is either 30g or 60g of data excluding indexes. Are the data model or the loading scripts available publicly? I have the tiger data and a program that uses it to convert addresses to latitude and longitude, but I don't really like the program and was thinking about trying to load the data into a database and do queries against the database to find location. ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faqs/FAQ.html
[HACKERS] OSS database needed for testing
Folks, Please pardon the cross-posting. A small group of us on the Performance list were discussing the first steps toward constructing a comprehensive Postgresql installation benchmarking tool, mostly to compare different operating systems and file systemsm but later to be used as a foundation for a tuning wizard. To do this, we need one or more real (not randomly generated*) medium-large database which is or can be BSD-licensed (data AND schema). This database must have: 1) At least one "main" table with 12+ columns and 100,000+ rows (each). 2) At least 10-12 additional tables of assorted sizes, at least half of which should have Foriegn Key relationships to the main table(s) or each other. 3) At least one large text or varchar field among the various tables. In addition, the following items would be helpful, but are not required: 4) Views, triggers, and functions built on the database 5) A query log of database activity to give us sample queries to work with. 6) Some complex data types, such as geometric, network, and/or custom data types. Thanks for any leads you can give me! (* To forestall knee-jerk responses: Randomly generated data does not look or perform the same as real data in my professional opinion, and I'm the one writing the test scripts.) -- -Josh Berkus Aglio Database Solutions San Francisco ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [HACKERS] OSS database needed for testing
Josh Berkus wrote: > 1) At least one "main" table with 12+ columns and 100,000+ rows (each). > 2) At least 10-12 additional tables of assorted sizes, at least half of > which > should have Foriegn Key relationships to the main table(s) or each other. > 3) At least one large text or varchar field among the various tables. > > In addition, the following items would be helpful, but are not required: > 4) Views, triggers, and functions built on the database > 5) A query log of database activity to give us sample queries to work > with. > 6) Some complex data types, such as geometric, network, and/or custom data > types. > Might I recommend the FCC database of transmitters. Its publicly available via anonymous FTP, medium largish with tables running 100k -> 1m+ records, and demonstrates many interesting test cases. For example, lat/lon spatial queries (RTree vs. GIST) can be tested with a decent volume. Also it demonstrates a good example of the use of schemas. Email me if you want info. Format is pipe delimited (non quoted), and data turnover is < 1% a week. Merlin ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])