Dan Sugalski wrote:
> getting back a full row as an array, getting back a full
row as a hash, and stuff like that. Nothing fancy, and nothing that high-level, but enough to work the basics without quite as manual work as the current libpg requires.
OK.
I am at the point now where I need to know what type of format you want the data to come out in.
We have the following options although some of them will be impractical in production. I can drop the data into any type of structure currently available to Parrot at least I am pretty sure I can.
I can create the entire dataset and return this out to the caller as a hash of arrays or some other structure. For large datasets this will be completey impractical but I am highlighting it as an option for testing or possibly avoiding multiple calls between Parrot and Any Old Language (AOL).
We can call a funtion to return the data in any format you want ie a single record per call gets passed back. This method will probably be the closest to most libraries in circulation and is the one that makes most sense to me. It could be extended to pass back N records depending on what the caller wants, this might be faster than making lots ot AOL calls to Parrot but would involve some more work on our part.
For later use would it make it easier for people at a higher abstraction
if some metadata gets passed about the data ie the very first row
returned contains an array of types subsequent calls will return. Perl
is lovely the way it converts types but this might not be very practical
for other languages that are a bit more strict about stuff like this. At
the moment I am using "strings" for all the data coming from the
database but this wont work for everyone. This needs to be decided now to avoid a re-write later. It would make my life easier if the guys at the top where to deal with type conversion but I am not sure this is good choice.
The following is what I have come up with to date as far as accessing data in Postgres is concerned. There is very little error handling in the library at the moment which is something that needs to be addressed but I can start work on that as soon as the API has been agreed on.
I am fishing for some feedback to see if this is suitable or if it needs to be changed. The following code is an example of extracting 10,000 rows with field names and types. The types are integers which are local to Postgres so we probably need to come up with an agreed format for type identifiers.
1 .pcc_sub _MAIN prototyped 2 .param pmc argv
The first lib is the standard pasm lib that ships with the parrot source. The second i simply a lib I have created to hold some function declarations etc.
3 .include "/home/parrot/parrot/library/postgres.pasm" 4 .include "/home/parrot/lib/postgreslib.imc" 5 .local string dbstring 6 dbstring = "host=host dbname=Forum user=u password=pass" 7 .local int answer 8 print "Entering Connect\n"
The call to connect makes whatever calls etc required to get a connection to the database.
9 10 .pcc_begin prototyped 11 .arg dbstring 12 .pcc_call connect 13 retconnect: 14 .result CONN 15 .result answer 16 .result message 17 .pcc_end
The MetaData hash contains various meta data about the connection ie filed types and names.
18 19 .local PerlHash MetaData 20 MetaData = new PerlHash 21 MetaData = global "MetaData" 22 23 .PRINT("Connection Message = ", message, " \n") 24 .PRINT("Connection state = ", answer, " \n") 25 eq -1, answer, fail 26 eq 1, answer, go 27 fail: 28 .PRINT("\n", message, "\n") 29 end 30 go: 31 32 .local string query 33 query = "select * from parrot" 34 35 print "Entering Send Query \n" 36 .pcc_begin prototyped 37 .arg CONN 38 .arg query 39 .pcc_call pqsendquery 40 pqsendquery: 41 .result message 42 .pcc_end
The pqgetresult call will populate the MetaData hash with details of the call.
43 44 .PRINT("Execution = ", message, "\n") 45 .pcc_begin prototyped 46 .arg CONN 47 .pcc_call pqgetresult 48 retrecords: 49 .pcc_end 50 .local int rowcounter 51 rowcounter = MetaData["ROWCOUNT"] 52 eq -1, rowcounter, finished 53
The following bit of code is here to test that fieldnames and types have been filled correctly.
54 55 .local int Oid_type 56 .local int onfield 57 onfield = 1 58 .local PerlArray TupleData 59 TupleData = new PerlArray 60 TupleData = MetaData["FIELDDATA"] 61 .local int fnum 62 63 fnum = MetaData["NFIELDS"] 64 65 .local string field_name 66 .local PerlArray FieldData 67 FieldData = new PerlArray 68 gettype: 69 FieldData = TupleData[onfield] 70 inc onfield 71 field_name = FieldData[0] 72 Oid_type = FieldData[1] 73 .PRINT("Field Name = ",field_name, "\n") 74 .PRINT("Field Type = ",Oid_type, "\n") 75 if onfield <= fnum goto gettype 76 77 .local PerlString value 78 value = new PerlString 79 rowcounter = 0
Once all the necessary calls and connections have been made we can then fetch individual rows of data. These come out in a PerlArray at the moment.
80 getnext: 81 onfield = 0 82 .pcc_begin prototyped 83 .pcc_call fetch 84 nextrow: 85 .result record 86 .result answer 87 .pcc_end 88 inc rowcounter 89 nextfield: 90 91 value = record[onfield] 92 .PRINT("", value, "#") 93 inc onfield 94 if onfield <= fnum goto nextfield 95 print "\n" 96 ne 0, answer, getnext 97 98 finished: 99 .PQclear(MetaData["RESULT"]) 100 .PQresultErrorMessage(MetaData["RESULT"], message) 101 .PRINT("\n\n Rows",rowcounter , "\n\n") 102 end 103 .end 104 ######################################## 105 .include "/home/parrot/lib/pgreslib.imc"
I am not to sure if this is the sort of thing that was wanted or not. If not let me know what needs to be changed and I will change it.
Harry
The following is the table that I am testing this against. There are
only very few of the basic types here although for what I have done at the moment the types have no real affect. This table is loaded with 10000 records (not realistic data).
Table "public.test" Column | Type | Modifiers ------------+-----------------------------+--------------- _key | integer | not null _bigint8 | bigint | _bool | boolean | _char | character(10) | _varchar | character varying(100) | _float8 | double precision | _int | integer | _float4 | real | _int2 | smallint | _text | text | _timestamp | timestamp without time zone | default now() Indexes: parrot_pkey primary key btree (_key)
For the speed freaks doing "select * from test"
real 0m0.997s user 0m0.630s sys 0m0.010s
Displaying all 10000 records to screen as follows
9996 9176 t a Varchar here 9176 9176 9176 9176 smallint <- Text here -> timestamp 2004-01-11 16:45:28.79144 9997 2182 t a Varchar here 2182 2182 2182 2182 smallint <- Text here -> timestamp 2004-01-11 16:45:28.79379 9998 4521 t a Varchar here 4521 4521 4521 4521 smallint <- Text here -> timestamp 2004-01-11 16:45:28.79614 9999 4152 t a Varchar here 4152 4152 4152 4152 smallint <- Text here -> timestamp 2004-01-11 16:45:28.79849
real 0m4.189s user 0m0.570s sys 0m0.280s
Any requests, pointers, advice, abuse or general chit chat welcome.
Harry Jackson