[HACKERS] xlogdump behaviour translating dropped relations

2006-07-26 Thread Diogo Biazus
I'm not sure how xlogdump should behave when translating OIDs of dropped relations. Right now I'm just showing the OID and skipping the translation in these cases, but maybe it's a good idea to give the user a hint (printing "dropped").

Any advice?

-- Diogo Biazus - [EMAIL PROTECTED]
Móvel Consultoria
http://www.movelinfo.com.br
http://www.postgresql.org.br


Re: [HACKERS] xlogdump behaviour translating dropped relations

2006-07-26 Thread Diogo Biazus
On 7/26/06, Tom Lane [EMAIL PROTECTED] wrote:
> Diogo Biazus [EMAIL PROTECTED] writes:
> > I'm not sure how xlogdump should behave when translating OIDs of dropped relations.
>
> I'm not sure it should be making any attempt to translate anything.
> What makes you think the OIDs even refer to the current database?

I'm getting a new database connection based on the dbNode of the current xlog record, and I expect that the user supplied a connection to the backend that originated the xlog files. Is this not going to work?

I know that we can have some problems with the db user (if he does not have access to all databases), but we can advise the user about that in the docs.

-- Diogo Biazus - [EMAIL PROTECTED], Móvel Consultoria
http://www.movelinfo.com.br
http://www.postgresql.org.br


Re: [HACKERS] xlogdump behaviour translating dropped relations

2006-07-26 Thread Diogo Biazus
On 7/26/06, Tom Lane [EMAIL PROTECTED] wrote:
> Diogo Biazus [EMAIL PROTECTED] writes:
> > On 7/26/06, Tom Lane [EMAIL PROTECTED] wrote:
> > > I'm not sure it should be making any attempt to translate anything.
> > > What makes you think the oids even refer to the current database?
> > I'm getting a new database connection based on the dbNode of the current xlog record. And I expect that the user supplied a connection to the backend that originated the xlog files. Is this not going to work?
>
> No. I for one need this thing to analyze WAL dumps from databases I don't even have access to, let alone have up-and-running locally. It's really not going to be acceptable for the program to fail if it can't have access to a database matching what it sees in the log. The dropped-relation problem is just the first manifestation you happen to have run into of why that's an unacceptable restriction, but there are other reasons.

Agreed, but this is only an option, and not even the default behavior, so the program could still be used to analyze logs without access to the database. But if you do have access to the cluster, it might be useful to translate names when you are looking for some specific operation in order to recover a previous database state.

And if it isn't, what other functionality would be good to help users determine a target xid for recovery?

-- Diogo Biazus - [EMAIL PROTECTED], Móvel Consultoria
http://www.movelinfo.com.br
http://www.postgresql.org.br



[HACKERS] xlogdump enhancements

2006-07-14 Thread Diogo Biazus
I'm working on Tom's xlogdump tool to add some functionality. IMHO some useful improvements would be (with an idea of implementation for each):

- Have an option to output only the transactions, with their status and some aggregate data (transaction size). When the user passes a -t parameter, instead of printing each record's info the program uses a linked list to accumulate the total length and status info for each xid. The status begins as NOT COMMITTED and can change to COMMITTED or ABORTED. If the user passes all the files in the pg_xlog directory as parameters, he knows the current status of each transaction.
- Find loser transactions (not committed by the end of the log). I thought of doing this as part of the transactions-only option (see above).
- Have a filter to get a specific rmid. There's always grep, but for Windows users it might be interesting to filter records by rmid. There could be a flag like -r (from rmname) or -o (operation) where the user can pass a parameter like xact, heap, etc., and the output records are filtered accordingly.
- Option to translate OIDs to names given a database connection. I receive parameters to open a connection (the usual -h, -p and -U) and at startup I try to open it. If the user passes the connection parameters, the application automatically enters translate-OIDs mode; then each data-returning function tries to get the object names from the connected database.
- Extract the exact SQL statement in cases of xlog generated by insert/update/delete. This is the one where I need help from you the most; I'm not certain how to implement it. But as far as I can see, I'll need to get the object's structure from the system catalog to decode the statement, so this option will be available only if you have a connection to the database. Is this right?
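A minimal sketch of the -t accumulation idea above, in Python rather than the tool's C, with the record fields assumed for illustration (not actual xlogdump code):

```python
# Illustrative sketch: accumulate per-xid status and total record length
# while scanning WAL records, as the proposed -t option would.
# The (xid, rmname, info, total_len) record shape is an assumption.
from collections import OrderedDict

NOT_COMMITTED, COMMITTED, ABORTED = "NOT COMMITTED", "COMMITTED", "ABORTED"

def summarize(records):
    """records: iterable of (xid, rmname, info, total_len) tuples."""
    txns = OrderedDict()  # xid -> [status, accumulated total_len]
    for xid, rmname, info, total_len in records:
        entry = txns.setdefault(xid, [NOT_COMMITTED, 0])
        entry[1] += total_len
        if rmname == "XACT":  # commit/abort records decide the final status
            entry[0] = COMMITTED if info == "commit" else ABORTED
    return txns

def losers(txns):
    # Any xid still NOT COMMITTED after the last segment is a "loser".
    return [xid for xid, (status, _) in txns.items() if status == NOT_COMMITTED]
```

Feeding it every segment in pg_xlog order would leave exactly the loser transactions in the NOT COMMITTED state, covering the second bullet as well.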
I'd like to have opinions on the subject. If somebody could give me some advice on how to start the last feature (decoding the statements), I'd appreciate it very much.

-- Diogo Biazus - [EMAIL PROTECTED], Móvel Consultoria
http://www.movelinfo.com.br
http://www.postgresql.org.br








Re: [HACKERS] xlogdump enhancements

2006-07-14 Thread Diogo Biazus
On 7/14/06, Jonah H. Harris [EMAIL PROTECTED] wrote:
> On 7/14/06, Martijn van Oosterhout kleptog@svana.org wrote:
> > If you really want to tackle this the hard way, find some other program that does it. Here's one written in Perl that can decode most tuples, but not all. It fails because it doesn't recognise all the types.
>
> Yep Diogo, Martijn is correct. You have to reassemble the tuple and then generate the appropriate SQL statement for it. For this, you'll definitely need the catalog information to resolve relations, type handling, etc.

Thanks for the info; I'm taking a look at that program. It's good to have a starting point :-)

-- Diogo Biazus - [EMAIL PROTECTED], Móvel Consultoria
http://www.movelinfo.com.br
http://www.postgresql.org.br



Re: [HACKERS] xlog viewer prototype and new proposal

2006-07-08 Thread Diogo Biazus
On 7/7/06, Martijn van Oosterhout kleptog@svana.org wrote:
> Something I've been thinking of while reading this thread. One big disadvantage of doing it in the backend is that your methods of returning data are limited. Your resultset can only return one type.
>
> For example, if you start decoding all the different types of xlog packets, you're going to get different information for each. To display that as the output of a function you're going to have to munge them into a common format. An external program does not suffer this limitation.
>
> In the future it may be worthwhile making a library that can be used by both an external program and the postgres backend, but really, that seems a lot less work than doing the actual decoding itself...
>
> Hope this helps,

Sure, but in the backend design that could be handled with functions for the common data, plus other functions to extract the specific data of the different operations. But I was hoping to get more input about the functionality expected in the standalone tool: what could improve the existing xlogdump?

-- Diogo Biazus - [EMAIL PROTECTED], Móvel Consultoria
http://www.movelinfo.com.br
http://www.postgresql.org.br


Re: [HACKERS] xlog viewer prototype and new proposal

2006-07-07 Thread Diogo Biazus
On 07 Jul 2006 09:58:29 -0400, Greg Stark [EMAIL PROTECTED] wrote:
> Diogo Biazus [EMAIL PROTECTED] writes:
> > I exposed the idea of bringing the xlogdump functionality to a backend module. The main drawback is the use case where the database is down. But access to a failed cluster isn't impossible, just a little bit more difficult, requiring another cluster to be initialized.
>
> Does that mean you're planning to not use the backend's system tables at all? You'll look at the database cluster under analysis to get all that information? If so then that removes a lot of the objections to running in a backend. You're basically just using the backend as a convenient context for manipulating and storing table-like data.
>
> It also seems to remove a lot of the motivation for doing it in the backend. You're not going to get any advantages on the implementation side in that case.

Yes, that's correct.

> > - I already have a database connection in cases where I want to translate oids to names.
>
> You can't do that if you want to allow people to initialize a new cluster to analyze a downed cluster.

Sure; it would be possible to make the translations only in cases where the backend is the same one generating the xlogs.

> > - I can connect directly to the postgresql server if I want to query xlogs on a remote machine (don't need remote access to the system).
> > - Easier to integrate with existing admin tools, like PgAdmin.
>
> These are unconvincing to non-windows people. In any case a stand-alone program could always have a postgres module tacked on to call out to it.

Sure, that's one of the solutions I was thinking about, and now I see it can be the best one: using just a backend interface to call the standalone tool. What I still don't know is: is it better to make this interface simply call the program and read its output, or to use a set of shared macros/functions?

> That's the main reason I think a stand-alone module makes more sense. You can always take a stand-alone module and stick an interface to it into the server. You can't take code meant to run in the server and build a stand-alone environment to run it.

Sure. In the last part of the proposal I suggested some improvements to the stand-alone tool; any other ideas?

-- Diogo Biazus - [EMAIL PROTECTED], Móvel Consultoria
http://www.movelinfo.com.br
http://www.postgresql.org.br



[HACKERS] xlog viewer prototype and new proposal

2006-07-06 Thread Diogo Biazus
I've worked on a prototype (attached to this email) of the SRF function, and I can query the xlog files for some useful info. I know that the error codes are still incorrect and the tests are missing, but this is only a proof of concept.

Examples of usage. Query for committed transactions on the xlog segment:

postgres=# SELECT * FROM xlogviewer('/usr/local/pgsql/data/pg_xlog/00010003') WHERE trim(infoname) = 'commit';
 rmid | xid | rmname | info | len | total_len | infoname
------+-----+--------+------+-----+-----------+----------
    1 |   4 | XACT   |    0 |  12 |        40 | commit
    1 |   5 | XACT   |    0 |  12 |        40 | commit
...

Query for the total length of transactions in the segment:

postgres=# SELECT xid, sum(total_len) FROM xlogviewer('/usr/local/pgsql/data/pg_xlog/00010003') GROUP BY xid;
 xid  | sum
------+------
  499 | 6905
 4497 | 1460
...

I'll summarize some relevant points of our previous thread (which can be viewed at http://archives.postgresql.org/pgsql-hackers/2006-06/msg01069.php):

I exposed the idea of bringing the xlogdump functionality to a backend module. The main drawback is the use case where the database is down. But access to a failed cluster isn't impossible, just a little bit more difficult, requiring another cluster to be initialized. I don't think this is a no-go for the design; of course, for the case where the database is down it is still better to have a standalone tool. So there were suggestions to code something that could be used both on the frontend and the backend. Tom said it would be difficult to use the same functions on backend and frontend; I think I could use the same library and pass pointers to memory-allocation and error-reporting functions.

Advantages of the SRF design:
- Using SQL to aggregate and transform data in any way from the logs.
- It's easier for the DBA in the other use cases, where the cluster is still active.
- I already have a database connection in cases where I want to translate OIDs to names.
- I can connect directly to the postgresql server if I want to query xlogs on a remote machine (no need for remote access to the system).
- Easier to integrate with existing admin tools, like PgAdmin.

In any case I need to start ASAP, to have something useful by the end of the Google SoC. So if the way to go is the standalone program, I think I can enhance it by adding:
- an option to translate OIDs to names given a database connection
- finding loser transactions (not committed by the end of the log)
- an option to output only the transactions, with their status and some aggregate data (transaction size)
- a CSV output (to read its data in other programs, including the backend)

There's one functionality I would like to work on, but I don't know how difficult it would be to accomplish in time:
- Extract the exact operation done in cases of xlog generated by insert/update/delete.

-- Diogo Biazus - [EMAIL PROTECTED], Móvel Consultoria
http://www.movelinfo.com.br
http://www.postgresql.org.br
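As an aside on the CSV-output item proposed above, a minimal sketch of what that could look like; the column names follow the prototype's output, everything else (function name, record shape) is assumed for illustration:

```python
# Illustrative sketch: emit scanned xlog records as CSV so that other
# programs -- including the backend -- can read them.
# The column set mirrors the prototype's SRF output; the record tuples
# themselves are hypothetical.
import csv
import io

COLUMNS = ["rmid", "xid", "rmname", "info", "len", "total_len", "infoname"]

def records_to_csv(records):
    """records: iterable of tuples matching COLUMNS; returns CSV text."""
    out = io.StringIO()
    writer = csv.writer(out)
    writer.writerow(COLUMNS)  # header row first
    for rec in records:
        writer.writerow(rec)
    return out.getvalue()
```

A file produced this way could then be loaded into a table on another cluster for SQL analysis.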


xlogviewer.tar.gz
Description: GNU Zip compressed data



Re: [HACKERS] xlog viewer proposal

2006-06-25 Thread Diogo Biazus
Alright, I'm working on a fast prototype using the SRF.

On 6/23/06, Simon Riggs [EMAIL PROTECTED] wrote:
> On Fri, 2006-06-23 at 10:59 -0300, Diogo Biazus wrote:
> > On 6/23/06, Simon Riggs [EMAIL PROTECTED] wrote:
> > > > - give more flexibility for managing the xlogs remotely
> > > Not sure what you mean.
> > > > - I think it's faster to implement and to have a working and usable tool.
> > > Why do you think that? It sounds like you've got more work since you effectively need to rewrite the _desc routines.
> > Yes, but I don't need to worry about program output, and I have the backend's memory management and error handling.
>
> I'd suggest doing a quick prototype to allow us to evaluate which architecture would be preferable. I'm torn between the good-idea and the safe-minimal-but-definitely-in-8.2 option.
>
> --
> Simon Riggs
> EnterpriseDB http://www.enterprisedb.com

-- Diogo Biazus - [EMAIL PROTECTED], Móvel Consultoria
http://www.movelinfo.com.br
http://www.postgresql.org.br


Re: [HACKERS] xlog viewer proposal

2006-06-23 Thread Diogo Biazus
On 6/23/06, Simon Riggs [EMAIL PROTECTED] wrote:
> > - give more flexibility for managing the xlogs remotely
> Not sure what you mean.
> > - I think it's faster to implement and to have a working and usable tool.
> Why do you think that? It sounds like you've got more work since you effectively need to rewrite the _desc routines.

Yes, but I don't need to worry about program output, and I have the backend's memory management and error handling.

> > And there is one option to minimize the problem in the failed cluster case: the wrapper program could give the option to initdb a temporary area when no connection is given, creating a backend just to analyze a set of xlogs.
>
> It seems a reasonable assumption that someone reading PostgreSQL logs would have access to another PostgreSQL cluster. It obviously needs to work when the server that originated the logs is unavailable, but that does not mean that all PostgreSQL systems are unavailable. There's no need to try to wrap initdb - just note that people would have to have access to a PostgreSQL system.

Yes, that's what I thought; wrapping initdb isn't needed, but it would make things easier for a newbie.

> > Other option is to start by the standalone tool and create a wrapper function inside postgresql that would just call this external program and extract data from the xlogs using this program's output (with some option to output all data in a CSV format).
>
> I think this idea is a good one, but we must also consider whether it can be done effectively within the time available. Is this a "can do now" or a "want to do in future"?

I think it could be done; I already have some code ready for calling external programs from within the database. It would be one of the enhancements in case we choose the standalone path.

> The alternative of reinforcing xlogdump needs to be considered more fully now and quickly, so coding can begin as soon as possible.
> - Diogo: what additional things can you make xlogdump do?

I could add options to display the data of the bkp blocks, add the missing rmids (RM_HASH_ID, RM_GIST_ID, RM_SEQ_ID), make an option to query only the transaction info (xids and their status), and make a contrib module that calls xlogdump and parses its output.

> - Tom: can you say more about what you'd like to see from a tool, to help Diogo determine the best way forward. What value can he add if you have already written the tool?
>
> Some other considerations: The biggest difficulty is finding loser transactions - ones that have not yet committed by the end of the log. You need to do this in both cases if you want to allow transaction state to be determined precisely for 100% of transactions; otherwise you might have to have an Unknown transaction state in addition to the others.

Yes, this is one thing we have to do in any case.

> What nobody has mentioned is that connecting to a db to look up table names from OIDs is only possible if that db knows about the set of tables the log files refer to. How would we be certain that the OID-to-tablename match would be a reliable one?

Good question. It seems to me that the only case where this has a trivial answer is when you're inside the backend querying the current xlog directory. I'm still thinking about the solution for the other cases (inside or outside the backend).

-- Diogo Biazus - [EMAIL PROTECTED], Móvel Consultoria
http://www.movelinfo.com.br
http://www.postgresql.org.br



Re: [HACKERS] xlog viewer proposal

2006-06-23 Thread Diogo Biazus
On 6/23/06, Diogo Biazus [EMAIL PROTECTED] wrote:
> On 6/23/06, Simon Riggs [EMAIL PROTECTED] wrote:
> > > - give more flexibility for managing the xlogs remotely
> > Not sure what you mean.

I can connect to the server if I want to query xlogs on a remote machine. If I depend on a standalone tool that reads the filesystem, I'll need some kind of ssh access. Sometimes the DBA needs the info in the xlogs but doesn't have (or want) access to the OS. It's also easier to create a script that collects statistics about xlogs across machine clusters.

-- Diogo Biazus - [EMAIL PROTECTED], Móvel Consultoria
http://www.movelinfo.com.br
http://www.postgresql.org.br


[HACKERS] xlog viewer proposal

2006-06-22 Thread Diogo Biazus
I'm developing the Summer of Code project to create an xlog viewer. The tool we want to create is a DBA tool for inspecting the xlog files, looking for specific operations, statistics of database usage, and the status of transactions.

Some use cases:
* Some user made a mistake and committed it to the database. Now the DBA wants to find out the exact xid to restore to a point in time where this mistake was yet to happen.
* A long-running transaction changed lots of data and we want to undo that. Show the most recent transactions, with their commit times and the total size of WAL for each transaction, so we can see the big transaction's xid.
* How many write transactions per second am I getting?

The idea I've been discussing with Simon Riggs is to create a set of functions that can be called from within the database. It seems that we would need to extract two relations from the log files: transactions and xlog entries. The functions that extract the entries could easily look into any xlog segment passed as a parameter and return a relation containing those entries. But the functions that extract the transactions would need to look all the way to the last segment of the xlog to know the status of every transaction.

The function to extract the xlog entries would read the XLogRecord structures (with all associated data), get a TupleDesc with get_call_result_type(), and return a tuple for each XLogRecord. Another problem is how to present the specific data returned by each operation. We can make functions to present this data in a human-readable format, like get_heap_data(), get_btree_data(), etc. For example:

SELECT get_btree_data(data) FROM xlogviewer_file(xxx) WHERE operation = 'BTREE';
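The per-operation presentation could follow a simple dispatch pattern: one formatter per resource manager, with a raw fallback for the rest. An illustrative sketch in Python (the real implementation would be C functions in the backend; the rmgr names and record fields here are assumptions):

```python
# Illustrative dispatch sketch, mirroring the proposed
# get_heap_data()/get_btree_data() functions. The field names inside
# each record dict are hypothetical, not actual WAL record layouts.

def format_heap(rec):
    return "heap: rel %s block %s" % (rec["relfilenode"], rec["block"])

def format_btree(rec):
    return "btree: rel %s level %s" % (rec["relfilenode"], rec["level"])

FORMATTERS = {
    "HEAP": format_heap,
    "BTREE": format_btree,
}

def format_data(operation, rec):
    # Fall back to a raw representation for rmgrs we don't decode yet.
    fmt = FORMATTERS.get(operation)
    return fmt(rec) if fmt else "raw: %r" % (rec,)
```

New resource managers would then only need a new entry in the dispatch table, leaving the common record-scanning code untouched.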
Another problem is how to separate implicit ABORTs from explicit ones. It seems it will be necessary to have separate functions for extracting transaction info. As I wrote above, these functions would have to read the xlogs all the way to the present moment to know which transactions are implicitly aborted and which ones are still active.

This design gives lots of flexibility: we can use all the power of SQL to query the xlog files. The one drawback is that you have to use a working backend to inspect the xlog, but in cases where the database cluster is lost you could always use another cluster. It would be easy to create a wrapper program (like createdb and createuser) to connect to a database and return the xlog info. Another advantage is the possibility of querying remote xlogs through a pg connection; it makes remote managing easier, and machine-cluster managing easier too (you don't have to make ssh accounts on all nodes or map a remote filesystem).

Oracle has a similar tool called LogMiner (there's an article about it at http://www.oracle.com/technology/oramag/oracle/05-jul/o45dba.html). This postgresql xlogviewer would also be good for people migrating from oracle. Besides, creating it as a separate program would imply having useful functions related to backend data (the xlogs) not available to other backend modules, and it would be easier to end up with redundant code. I've read some emails about there already being duplicate code for extracting text out of xlogs (the xlog debug functions).

Why a contrib module? Because it sounds safer to me to create a contrib module, and it seems that I won't need to change the existing backend code. So all the code I'm planning to write is new, and I won't need changes in the backend. Another bonus is not bloating the backend, leaving this feature to be installed by those who really need it. Afterwards it can be integrated into the backend if needed.

Given this design I would create some functions like (plus the data-formatting functions):

xlogviewer_file(char *file_path)
    Return all entries in a given xlog segment.
xlogviewer_file_offset(char *file_path, uint32 offset)
    Return all entries in a given xlog segment, starting from an offset.
xlogviewer_dir(char *directory_path)
    Return all entries in all xlog segments inside a directory.
xlogviewer_transactions(char *file_path)
    Return all transactions from the directory containing the segment passed as parameter, starting from this segment.

One example of its use:

SELECT * FROM xlogviewer_file('0001000F') xlog WHERE xlog.operation = 'BTREE';

The resultset would be something like:

 xlog_record | previous_xlog_record | xid | operation | data
-------------+----------------------+-----+-----------+------
 0/0220      | 0/01F0               |   4 | BTREE     |

We could also query for a list of committed transactions:

SELECT xlog.xid, count(1), sum(xlog.record_size)
FROM xlogviewer_file('0001000F') xlog
WHERE xlog.operation = 'XACT' AND xlog.info = 'COMMIT'
GROUP BY xlog.xid;

-- Diogo Biazus - [EMAIL PROTECTED], Móvel Consultoria
http://www.movelinfo.com.br
http://www.postgresql.org.br


Re: [HACKERS] xlog viewer proposal

2006-06-22 Thread Diogo Biazus
Agreed, the project must choose one path as the starting point, but the two options can both be pursued in the long run. I still think that, as a starting point, the functions inside the database are a good option. The reasons are:
- using SQL to aggregate and transform data in any way from the logs.
- it's easier for the DBA in the other use cases, where the cluster is still active.
- it gives more flexibility for managing the xlogs remotely.
- I think it's faster to implement and to have a working and usable tool.

And there is one option to minimize the problem in the failed-cluster case: the wrapper program could give the option to initdb a temporary area when no connection is given, creating a backend just to analyze a set of xlogs. After this summer project I could go on and try to use parts of this code to implement a really standalone tool.

Another option is to start with the standalone tool and create a wrapper function inside postgresql that would just call this external program and extract data from the xlogs using the program's output (with some option to output all data in a CSV format).

On 6/22/06, Tom Lane [EMAIL PROTECTED] wrote:
> Jonah H. Harris [EMAIL PROTECTED] writes:
> > I think it should certainly be able to run on its own, but it wouldn't be that hard to extend the functions so that they were usable from within the database or vice-versa.
>
> Yes it would. The most obvious point is that memory management and error handling conventions inside the backend are quite different from what you'd expect to employ in a standalone program. Also the means you'd use for consulting the system catalogs (in that option to provide readable names for OIDs) are entirely different. I think asking for support of both environments is a good way to ensure that this summer project doesn't get finished :-(
>
> regards, tom lane

-- Diogo Biazus - [EMAIL PROTECTED], Móvel Consultoria
http://www.movelinfo.com.br
http://www.postgresql.org.br


Re: [HACKERS] xlog viewer proposal

2006-06-22 Thread Diogo Biazus
> Diogo, are you working from my old xlogdump hack? If so, what version? I can send you the latest off-list. I add stuff to it periodically when I need it, and I don't think I've published it lately.

Yup, I've got a version that was posted here some time ago. If you could send me the latest version I would be very glad.

-- Diogo Biazus - [EMAIL PROTECTED], Móvel Consultoria
http://www.movelinfo.com.br
http://www.postgresql.org.br



Re: [HACKERS] [GENERAL] Anyone want to assist with the translation of the Advocacy

2002-10-04 Thread Diogo Biazus



Hi everyone,

Have just put together a prototype page to show off the multi-lingual
capabilities that the Advocacy sites' infrastructure has:

http://advocacy.postgresql.org/?lang=de

The text was translated to german via Altavista's Babelfish, so it's
probably only about 80% accurate, but it conveys the concept.

Is anyone interested in translating the English version to other
languages?  All Latin based languages should be fine (German, French,
Italian, Spanish, Portuguese, Turkish, Greek, etc).

If there's strong interest, then an interface to let volunteer
translators do it easily can be constructed over the next fortnight or
so.

:-)

Regards and best wishes,

Justin Clift

  

Justin,

I would be glad to translate it to Brazilian Portuguese.
Here we have a lot of companies starting to use PostgreSQL, including 
the one where I work and some of our clients.
It would be very nice to have this site translated.

Cheers,

-- 
Diogo de Oliveira Biazus
[EMAIL PROTECTED]
Ikono Sistemas e Automação
http://www.ikono.com.br


