Re: [GENERAL] Transparent i18n?

2005-07-07 Thread Karsten Hilbert
On Mon, Jul 04, 2005 at 03:27:59PM -0300, David Pratt wrote:

 I am also going to look at Karsten's material shortly to see how his system 
 works
I am still away from the net but here is how to find the
description in our Wiki:

Go to user support, user guide, scroll down to the developers
guide, go to backend I18N.

Please point out anything you find difficult to figure out.

Karsten
-- 
GPG key ID E4071346 @ wwwkeys.pgp.net
E167 67FD A291 2BEA 73BD  4537 78B9 A9F9 E407 1346



Re: [GENERAL] Transparent i18n?

2005-07-07 Thread David Pratt
Many thanks, Karsten. I got a system working with arrays yesterday but
will still be examining your code. I guess the next challenge is to see
how well the multidimensional array can be searched. I could create
indexes on an expression to retrieve the translation for a specific key,
since each element of the multidimensional array is itself an array
holding a translation, i.e. the ISO language code and the text of the
translation.
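
One possibility (untested, index name invented) is an expression index on a
fixed slot, say if English always sits in the first inner array; lookups by
ISO code rather than by position would need a helper function instead:

CREATE INDEX multi_language_en_idx
    ON multi_language ((lang_code_and_text[1][2]));

SELECT id
  FROM multi_language
 WHERE lang_code_and_text[1][2] = 'the brown cow';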


It is pretty light and quick.  I am open to examining anything that 
will help me learn more about doing this well.


Regards,
David.







Re: [GENERAL] Transparent i18n?

2005-07-04 Thread Karsten Hilbert
On Sat, Jul 02, 2005 at 05:00:50PM -0300, David Pratt wrote:

 http://savannah.gnu.org/cgi-bin/viewcvs/gnumed/gnumed/gnumed/server/sql/gmI18N.sql?rev=1.20&content-type=text/vnd.viewcvs-markup
 Many thanks Karsten for some insight into how you are handling this.
David,

if you go to the Developers Corner in our Wiki at

 http://salaam.homeunix.com/twiki/bin/view/Gnumed/WebHome

you'll find an explanation of how we use this. Feel free to
ask for comments if that doesn't suffice.

(I am offline so can't give the precise URL.)

Karsten
-- 
GPG key ID E4071346 @ wwwkeys.pgp.net
E167 67FD A291 2BEA 73BD  4537 78B9 A9F9 E407 1346



Re: [GENERAL] Transparent i18n?

2005-07-04 Thread David Pratt

Many thanks, Karsten.  I am going to look at your example closely.

Regards
David






Re: [GENERAL] Transparent i18n?

2005-07-04 Thread Greg Stark

I wonder if you could make an SQL type that used text[] as its storage
format but had an output function that displayed the correct text for the
current locale, where the current locale is something you set by calling a
function at the beginning of the transaction.

Do pg_dump and all the important things use the send/receive functions
rather than the input/output functions, so that even though this output
function loses information it wouldn't cause serious problems?

You would still need a way to retrieve all the languages for the cases like
administrative interfaces for updating the information. I'm not entirely
convinced this would be any better than the alternative of retrieving all of
them by default and having a function to retrieve only the correct language.
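
For what it's worth, that alternative could be sketched in plain SQL without
a custom type, something like this (untested; locale_map and localized are
made-up names, and the array positions are assumed fixed):

CREATE TABLE locale_map (
    lang_code text PRIMARY KEY,
    arr_pos   integer NOT NULL
);
INSERT INTO locale_map VALUES ('en', 1);
INSERT INTO locale_map VALUES ('fr', 2);

-- pick one language out of a text[] column by its code
CREATE OR REPLACE FUNCTION localized(text[], text) RETURNS text AS $$
    SELECT $1[arr_pos] FROM locale_map WHERE lang_code = $2;
$$ LANGUAGE sql STABLE;

-- e.g. against the states example further down the thread:
-- SELECT abbrev, localized(state_name, 'fr') FROM states;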

-- 
greg




Re: [GENERAL] Transparent i18n?

2005-07-04 Thread David Pratt
Hi Greg. Not sure about this one since I have never made my own type.
Do you mean something like an IP-to-country lookup to guess the locale?
If so, I am using an IP-to-country table to look up the IP from the
request and get the country, so the language can be passed automatically
and the proper language displayed (but I need some translation work done
first before I can activate this). I will also use this table for
blacklisting and other things, so it is multi-purpose.


I have got a good part of what I wanted working so far. I am just
working on the language update/delete trigger, since there does not
appear to be a direct way of surgically removing a specific element from
an array in Postgres, unless I have missed something. For example, if I
knew Spanish was the 3rd array in a multidimensional array of, say, 10
lang/translation arrays, there is no way to remove just that one without
rewriting the array and updating the field (which is what I am hoping to
complete today).
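
Something along these lines might work for rebuilding the array without one
language (rough, untested sketch; drop_lang is just a made-up helper name,
and it assumes a non-empty two-dimensional array):

CREATE OR REPLACE FUNCTION drop_lang(arr text[][], lang text)
RETURNS text[][] AS $$
DECLARE
    result text[][];
    i      integer;
BEGIN
    FOR i IN array_lower(arr, 1) .. array_upper(arr, 1) LOOP
        IF arr[i][1] <> lang THEN
            -- keep this lang/translation row by appending its slice
            IF result IS NULL THEN
                result := arr[i:i][1:2];
            ELSE
                result := result || arr[i:i][1:2];
            END IF;
        END IF;
    END LOOP;
    RETURN result;
END;
$$ LANGUAGE plpgsql IMMUTABLE;

-- e.g. strip Spanish out of every stored translation array:
-- UPDATE multi_language
--    SET lang_code_and_text = drop_lang(lang_code_and_text, 'es');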


So my language update/delete trigger needs to scan the array for the
lang/translation to delete, update the language key for each remaining
language from a reference field, rewrite the array without the deleted
lang/translation, and then update the field with the rewritten array.
That sounds worse than it really is, since the multidimensional array
containing the lang/translation arrays is the same length for every row,
and you perform this by iterating with a loop through the records in the
multi_language table. Further, each translation can be compared by key
(for me this is the ISO language code). Also, realistically, how many
times do you need to add and drop languages? The number of languages in
use for me will likely never exceed, say, 20. So this process, even with
large numbers of multi-language fields, should not be that problematic,
even if you had a few thousand text fields you wanted translations
available for. I think you would still be looking at milliseconds to
perform this. This will be an AFTER trigger (after deletion). I guess I
will see what performance is like when I am finished - so far it is
pretty fast for adding.
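
In rough outline the trigger could look something like this (untested
sketch; languages is my table of languages in use, and iso_code / arr_key
are assumed column names for its ISO code and stored array position):

CREATE OR REPLACE FUNCTION on_language_delete() RETURNS trigger AS $$
BEGIN
    -- strip the deleted language from every stored translation array
    UPDATE multi_language
       SET lang_code_and_text = drop_lang(lang_code_and_text, OLD.iso_code);
    -- close the gap in the stored array positions of the remaining languages
    UPDATE languages
       SET arr_key = arr_key - 1
     WHERE arr_key > OLD.arr_key;
    RETURN OLD;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER trg_language_delete
    AFTER DELETE ON languages
    FOR EACH ROW EXECUTE PROCEDURE on_language_delete();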


You also have a sensible structure for multi_language fields, where each
one references the multi_language table by id (normalized) with
referential integrity (something I was seeking). The only thing not
normalized is the translations themselves, which is okay with me since
the array structure is dynamic and the keys give you exactly what you
want. I am also going to look at Karsten's material shortly to see how
his system works, but I am interested in following through with the
array approach I started first, since I am happy with what I am seeing.


Regards,
David






Re: [GENERAL] Transparent i18n?

2005-07-04 Thread Oleg Bartunov

Hi there,

sorry if I am just misunderstanding, but we have contrib/hstore available from
http://www.sai.msu.su/~megera/postgres/gist/
which could be used for storing as many languages as you need.
It's a sort of Perl hash.
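
A rough sketch (untested) of what that could look like for translations;
the table and column names here are made up, and hstore ships in contrib
(on recent versions: CREATE EXTENSION hstore):

CREATE TABLE translated_labels (
    id  serial PRIMARY KEY,
    txt hstore
);

INSERT INTO translated_labels (txt)
    VALUES ('en=>"the brown cow", fr=>"la vache brun"');

-- fetch one language by its code, or drop one language in place:
SELECT txt -> 'fr' FROM translated_labels WHERE id = 1;
UPDATE translated_labels SET txt = delete(txt, 'fr') WHERE id = 1;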

Oleg


Regards,
Oleg
_
Oleg Bartunov, sci.researcher, hostmaster of AstroNet,
Sternberg Astronomical Institute, Moscow University (Russia)
Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(095)939-16-83, +007(095)939-23-83



Re: [GENERAL] Transparent i18n?

2005-07-04 Thread Greg Stark
Oleg Bartunov <oleg@sai.msu.su> writes:

 Hi there,
 
 sorry if I am just misunderstanding, but we have contrib/hstore available from
 http://www.sai.msu.su/~megera/postgres/gist/
 which could be used for storing as many languages as you need.
 It's a sort of Perl hash.

Huh. That's pretty neat. I don't really need it since I can just assign fixed
array indexes for each locale and use arrays. But for someone who has to
support lots of different sets of locales it could be useful. Or for someone
who has to index these columns using GiST.

-- 
greg




Re: [GENERAL] Transparent i18n?

2005-07-02 Thread Greg Stark

David Pratt [EMAIL PROTECTED] writes:

 It was suggested that I look at an array.  

I think that was me. I tried not to say there's only one way to do it. Only
that I chose to go this way and I think it has worked a lot better for me.
Having the text right there in the column saves a *lot* of work dealing with
the tables. Especially since many tables would have multiple localized
strings.

 I think my table will be pretty simple:
 CREATE TABLE multi_language (
   id                  SERIAL,
   lang_code_and_text  TEXT[][]
 );
 
 So records would look like:
 
 1, {{'en','the brown cow'},{'fr','la vache brun'}}
 2, {{'en','the blue turkey'},{'fr','la dandon bleu'}}

That's a lot more complicated than my model.

Postgres doesn't have any functions for handling arrays like these as
associative arrays like you might want. And as you've discovered it's not so
easy to ship the whole array to your client where it might be easier to work
with.


I just have things like (hypothetically):

CREATE TABLE states (
  abbrev        text,
  state_name    text[],
  state_capitol text[]
)

And then in my application code data layer I mark all internationalized
columns and the object that handles creating the actual select automatically
includes a [$lang_id] after every column in that list.

The list of languages supported and the mapping of languages to array
positions is fixed. I can grow it later but I can't reorganize them. This is
fine for me since pretty much everything has exactly two languages. 
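
So a generated query comes out looking something like this (illustration
only; here the session's $lang_id happens to map to array position 2):

SELECT abbrev,
       state_name[2]    AS state_name,
       state_capitol[2] AS state_capitol
  FROM states;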

-- 
greg




Re: [GENERAL] Transparent i18n?

2005-07-02 Thread Karsten Hilbert
 SELECT lang_code_and_text[1][1] AS language_code,
  lang_code_and_text[1][2] AS text
 FROM multi_language;

The way we do that in GNUmed:

 select lookup_val, _(lookup_val) from lookup_table where ...;

If you want to know how see here:

 
http://savannah.gnu.org/cgi-bin/viewcvs/gnumed/gnumed/gnumed/server/sql/gmI18N.sql?rev=1.20content-type=text/vnd.viewcvs-markup
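
In rough outline the idea is a translation table plus a lookup function
along these lines (only a sketch, not the actual gmI18N.sql schema - see
the link above for the real thing; the names here are invented):

CREATE TABLE i18n_translations (
    lang  text NOT NULL,
    orig  text NOT NULL,
    trans text NOT NULL,
    PRIMARY KEY (lang, orig)
);

-- the language each database user currently wants to see
CREATE TABLE i18n_curr_lang (
    db_user name PRIMARY KEY,
    lang    text NOT NULL
);

CREATE OR REPLACE FUNCTION _(text) RETURNS text AS $$
    SELECT coalesce(
        (SELECT t.trans
           FROM i18n_translations t, i18n_curr_lang c
          WHERE c.db_user = current_user
            AND t.lang    = c.lang
            AND t.orig    = $1),
        $1);  -- fall back to the untranslated string
$$ LANGUAGE sql STABLE;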

Feel free to ask for clarification.

Karsten
-- 
GPG key ID E4071346 @ wwwkeys.pgp.net
E167 67FD A291 2BEA 73BD  4537 78B9 A9F9 E407 1346



Re: [GENERAL] Transparent i18n?

2005-07-02 Thread David Pratt

Many thanks Karsten for some insight into how you are handling this.

Regards,
David







Re: [GENERAL] Transparent i18n?

2005-07-02 Thread David Pratt
Hi Greg. Well, I'm kind of halfway there, but I think what I am doing could
work out.


I have an iso_languages table, a languages table for the languages in use,
and a multi_language table for storing the values of my text fields. I
choose my languages from iso_languages. Any table that needs a
multi_language field gets one by id, with referential integrity against the
multi_language table id, since this is a direct relationship. Thanks for
the idea of using an array, BTW. Referential integrity could not work with
my first model.


I am taking the array text and parsing the result in Python to get the key
positions. This is possible with a query using the string_array function
and getting the text from any multi_language field. Then I put the result
into a dictionary, get its length, and add one to get the new key value
that is used when a new language is added. Using this key, an array is
appended to the existing array in each row of the multi_language table (in
the lang_code_and_text field). So the length of the main array in the
multidimensional array grows by one inner array per language for each
record in the multi_language table.

I can also seek out the English (en) value so that I will be able to use
the English text as the default for the new language when inserting a new
array for that language into the lang_code_and_text array. For example, if
Spanish (es) is added, the new key is 3, so after the insert each record
would look something like this:

1, {{'en','the brown cow'},{'fr','la vache brun'},{'es','the brown cow'}}
2, {{'en','the blue turkey'},{'fr','la dandon bleu'},{'es','the blue turkey'}}
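
The per-record append itself should just be an array concatenation,
something like this (untested; it assumes English is always the first inner
array):

UPDATE multi_language
   SET lang_code_and_text =
       lang_code_and_text || ARRAY[['es', lang_code_and_text[1][2]]];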


In my forms, I am using a template to display entry fields for each
language in use. The English text will be the default for newly added
languages, so there is something in these fields to start with, and it
should update properly based on the correct key values. In my languages
table, I am storing the current key position for each language used in my
app. I have an i18n layer for Zope, and based on the language code I will
pass the language id so you see the right language in both the interface
and the data.

When updating or deleting records, I will be making a trigger to remove
the array that represents a translation after the update. Then it has to
update my languages table to provide the updated key values for my
languages. I am working on my first functions and triggers in plpgsql.
This is where I may need help from the list if I get stuck, but so far so
good!

Well, so far so good, but it's not finished yet. Does anyone have any
comments on scalability? I don't really see a problem, since there is not
really any risk of my needing more than 10 - 15 languages or so, max, out
of the maybe 300 languages in the world. I think 15 entries in an array is
very small, so I can't see any reason for this not to work well.




 I think my table will be pretty simple:
 CREATE TABLE multi_language (
     id                  SERIAL,
     lang_code_and_text  TEXT[][]
 );

 So records would look like:

 1, {{'en','the brown cow'},{'fr','la vache brun'}}
 2, {{'en','the blue turkey'},{'fr','la dandon bleu'}}


 That's a lot more complicated than my model.

 Postgres doesn't have any functions for handling arrays like these as
 associative arrays like you might want. And as you've discovered it's not
 so easy to ship the whole array to your client where it might be easier to
 work with.



Yes. This is a bit complicating since if they were there it would be 
really

nice to work with arrays.



 I just have things like (hypothetically):

 CREATE TABLE states (
   abbrev        text,
   state_name    text[],
   state_capitol text[]
 )

 And then in my application code data layer I mark all internationalized
 columns and the object that handles creating the actual select
 automatically includes a [$lang_id] after every column in that list.

 The list of languages supported and the mapping of languages to array
 positions is fixed. I can grow it later but I can't reorganize them. This
 is fine for me since pretty much everything has exactly two languages.


That is pretty cool. The only advantage of what I am doing is that you
will be able to add languages at any time, and there will be no huge load
on Postgres as far as I can tell, since the multi_language table has only
two fields and one record for each multi-language field referenced from my
other tables, and calls to it are direct by id. I think this should work,
but it is a puzzler for sure!

Regards,
David



[GENERAL] Transparent i18n?

2005-06-30 Thread Steve - DND
I've recently been trying to implement some i18n functionality as simply as
possible into my application. I have a lot of lookup values and such in the
DB that need to be translated, and I would rather not do it in the calling
client.

A friend and I put our heads together and came up with what seemed like a
ridiculously elegant way of doing it using PG's SELECT rules and some kind
of connection session information. I got the connection session stuff
figured out thanks to Richard Huxton. Apparently, though, SELECT rules can
only be used to create views, and can't actually modify the data returned
by a table?

I was trying to do something along the lines of a translations table,
which houses all translations for a given string. My lookup tables would
then reference the specific strings they need. I was then going to place a
SELECT rule on the lookup table that would perform a join (based on the
lookup word and the current session language) and return the translation
in the lookup value field.

Is my only option to use a separate view? What are some techniques the rest
of you use for storing translation info in the DB?
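
Concretely, what I have in mind looks roughly like this, with a view doing
the join instead of a rule (sketch only, untested; all names are invented):

CREATE TABLE translations (
    string_id integer NOT NULL,
    lang      text    NOT NULL,
    txt       text    NOT NULL,
    PRIMARY KEY (string_id, lang)
);

-- an example lookup table whose values reference translatable strings
CREATE TABLE order_status (
    id        serial  PRIMARY KEY,
    string_id integer NOT NULL
);

-- the per-connection session language, one row per database user
CREATE TABLE session_lang (
    db_user name PRIMARY KEY,
    lang    text NOT NULL
);

CREATE VIEW order_status_i18n AS
SELECT os.id, t.txt AS status_label
  FROM order_status os
  JOIN session_lang  s ON s.db_user   = current_user
  JOIN translations  t ON t.string_id = os.string_id
                      AND t.lang      = s.lang;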

Thanks,
Steve





Re: [GENERAL] Transparent i18n?

2005-06-30 Thread David Pratt
Hi Steve. I have been puzzling over a similar issue - not i18n for the
interface but for text data - and trying to sort out a solution, so I will
be interested to hear additional advice as well. When I wrote to the list a
couple of weeks back (look for my posting around the 17th) I was looking at
doing something with a normalized structure, but now I don't think that is
going to work that well. It was suggested that I look at an array, so I am
looking at a multidimensional array to do this. I am just reading up on
Postgres support for arrays.


I think my table will be pretty simple:
CREATE TABLE multi_language (
    id                  SERIAL,
    lang_code_and_text  TEXT[][]
);

So records would look like:

1, {{'en','the brown cow'},{'fr','la vache brun'}}
2, {{'en','the blue turkey'},{'fr','la dandon bleu'}}

I have another table with language codes, i.e. en, fr, etc. When languages
are added, I would just append to the array for the whole table. The
trouble for me is more about getting the data out of Postgres, because
retrieving the raw array will be incompatible syntax for Python and I would
have to manipulate the results. Quite frankly, I want this to be done in
Postgres so I only have to retrieve query results. If I can't, it would be
a pain unless I can think of something else, because the issue is going to
be the keys and values in my languages table working with the array.


For example, if I have a serial table containing my languages and add two
entries, English and French, I would then have two elements in my array,
and it wouldn't be so bad because I could use the id as a key to get the
value back out through a query. But say I delete French (and remove the
second element in the entries of my table) and add Spanish: now I have a
language id of 3 and two elements in my array, and the two can't be matched
up properly any more. In Python, associative arrays are called dictionaries
and you can easily grab an element by doing something like
lang_code_and_text['en'] to get the value of the en (English) key.


I was hoping you could pull the multi-language text out of the array with a
text key instead of a numeric index, but it appears Postgres will only let
you do it by index or get results from slices, as far as I can tell. Maybe
someone else on the list has some advice to offer here.


i.e.

SELECT lang_code_and_text[1][1] AS language_code,
       lang_code_and_text[1][2] AS text
  FROM multi_language;
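
For the record, a small plpgsql helper could give that kind of key-style
access on top of the positional arrays, something like this (untested;
lang_text is just a made-up name):

CREATE OR REPLACE FUNCTION lang_text(arr text[][], code text)
RETURNS text AS $$
DECLARE
    i integer;
BEGIN
    FOR i IN array_lower(arr, 1) .. array_upper(arr, 1) LOOP
        IF arr[i][1] = code THEN
            RETURN arr[i][2];  -- the translation stored under this code
        END IF;
    END LOOP;
    RETURN NULL;               -- no entry for this language code
END;
$$ LANGUAGE plpgsql IMMUTABLE;

-- SELECT id, lang_text(lang_code_and_text, 'en') FROM multi_language;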


Regards,
David
