Re: [SQL] UTF8 encoding and non-text data types

2008-01-15 Thread Gregory Stark
Joe [EMAIL PROTECTED] writes:

 Tom Lane wrote:
 Oh?  Interesting.  But even if we wanted to teach Postgres about that,
 wouldn't there be a pretty strong risk of getting confused by Arabic's
 right-to-left writing direction?  Wouldn't be real helpful if the entry
 came out as 4321 when the user wanted 1234.  Definitely seems like
 something that had better be left to the application side, where there's
 more context about what the string means.
   
 The Arabic language is written right-to-left, except ... when it comes to
 numbers.

I don't think that matters anyways. Unicode strings are always in logical
order, not display order. Displaying the string in the right order is up to
the display engine in the Unicode world-view.
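
As a rough illustration of the logical-order point, here is a minimal Python sketch (Python rather than the Perl used in the thread; the variable names are made up for illustration). It assumes the user typed the Extended Arabic-Indic digits for 1234, and it checks that the stored string holds the most significant digit first, whatever a display engine later does with it.

```python
import unicodedata

# Extended Arabic-Indic digits one to four (U+06F1..U+06F4), as stored in memory.
entry = "\u06f1\u06f2\u06f3\u06f4"

# Logical (storage) order: index 0 is the first digit of the number.
first_digit = unicodedata.digit(entry[0])

# int() accepts any Unicode decimal digit and reads the string in logical order.
value = int(entry)

# Bidirectional classes, which a display engine uses to lay the text out.
digit_class = unicodedata.bidirectional(entry[0])
letter_class = unicodedata.bidirectional("\u0634")  # ARABIC LETTER SHEEN

print(first_digit, value, digit_class, letter_class)
```

The digits carry a numeric bidirectional class while the letter carries a right-to-left one, which is how a renderer can keep numbers left-to-right inside right-to-left text without the stored order changing.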

I'm not sure what to think about this though. It may be that Arabic notation
is close enough that it would be straightforward (IIRC decimal notation was
invented in the Arabic world after all). But other writing systems have some
pretty baroque notations which would be far more difficult to convert.

If anything I would expect this kind of conversion to live in the same place
as things like roman numerals or other more flexible formatting.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Ask me about EnterpriseDB's 24x7 Postgres support!

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [SQL] UTF8 encoding and non-text data types

2008-01-15 Thread John Hasler
Joe writes:
 The Arabic language is written right-to-left, except ... when it comes to
 numbers.

Perhaps they read their numbers right to left but use a little-endian
notation.
-- 
John Hasler 
[EMAIL PROTECTED]
Elmwood, WI USA



Re: [SQL] UTF8 encoding and non-text data types

2008-01-14 Thread Medi Montaseri
Thanks Steve,

Actually I do not insert text data into my numeric field.
As I mentioned, given
create table t1 ( name text, cost decimal )
I would like to insert numeric data into column cost, because then I
can later benefit from numerical operators like SUM, AVG, etc.

More specifically, I am using HTML, Perl and PG. From the HTML point of
view, a text field is just a string. So my user would enter 12345, but
expressed in UTF8. Perl would get this and use DBI to insert it into PG.

What I am experiencing now is that the DB reports an error saying I am trying
to insert incorrect data into column cost, which is numeric, and the data is
coming in from HTML in UTF8.

Maybe I have to convert it to ASCII numbers in Perl before inserting them
into PG.

Thanks
Medi





Re: [SQL] UTF8 encoding and non-text data types

2008-01-14 Thread Tom Lane
Medi Montaseri [EMAIL PROTECTED] writes:
 More specifically, I am using HTML, Perl and PG. From the HTML point of
 view, a text field is just a string. So my user would enter 12345, but
 expressed in UTF8. Perl would get this and use DBI to insert it into PG.

 What I am experiencing now is that the DB reports an error saying I am trying
 to insert incorrect data into column cost, which is numeric, and the data is
 coming in from HTML in UTF8.

 Maybe I have to convert it to ASCII numbers in Perl before inserting them
 into PG.

Uh, there is *no* difference between the ASCII and UTF8 representations
of decimal digits, nor of any other character that would be allowed in
input for a decimal field.  I can't tell what your problem really is,
but you have certainly misunderstood or misexplained it.
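
The byte-level claim is easy to check. The following is a small Python sketch, not anything from the poster's Perl code; the set of characters is an illustrative guess at what plain decimal input can contain.

```python
# Characters that can appear in plain decimal input (an illustrative set).
numeric_chars = "0123456789.+-eE"

ascii_bytes = numeric_chars.encode("ascii")
utf8_bytes = numeric_chars.encode("utf-8")

# Every ASCII character is encoded in UTF-8 as the same single byte.
same_bytes = ascii_bytes == utf8_bytes

# The user's example value is the same five bytes in either encoding.
sample = "12345".encode("utf-8")

print(same_bytes, sample)
```

So if the server rejects the input, the cause has to be something other than "the digits are in UTF8".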

regards, tom lane



Re: [SQL] UTF8 encoding and non-text data types

2008-01-14 Thread dmp

Hi Steve,
Have you tried converting to a decimal type, or using a cast, for the cost field?
If you are gathering this data from a text field, placing it in a variable of
type string, and then using that variable in the insert statement, it may be
rejected because it is not type decimal. This has been my experience when
getting input data from users' text fields and placing it in the db.

dana.




Re: [SQL] UTF8 encoding and non-text data types

2008-01-14 Thread dmp

Sorry this should have been addressed to Medi
dana.




Re: [SQL] UTF8 encoding and non-text data types

2008-01-14 Thread Steve Midgley



Hi Medi,

I agree that you should convert your values in Perl before handing to 
DBI. I'm not familiar with DBI but presumably if you're sending it UTF8 
values it's attempting to quote them or do something with them, that a 
numeric field in Pg can't handle. Can you trap/monitor the exact sql 
statement that is generated by DBI and sent to Pg? That would help a 
lot in knowing what it is doing, but I suspect if you just convert your 
numbers from the HTML/UTF8 source values into actual Perl numeric 
values and then ship to DBI you'll be better off. And you'll get some 
input validation for free.
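
A minimal sketch of that idea in Python (the thread's application is Perl with DBI, so this is an analogy, not the poster's code). The helper name `parse_cost` and the sample rows are hypothetical, and `sqlite3` is used only as a stand-in so the example runs without a PostgreSQL server; a PostgreSQL driver would use its own placeholder syntax.

```python
import sqlite3
from decimal import Decimal, InvalidOperation


def parse_cost(raw: str) -> Decimal:
    """Turn a form field into a Decimal, or raise ValueError if it is not a number."""
    try:
        value = Decimal(raw.strip())
    except InvalidOperation:
        raise ValueError(f"not a number: {raw!r}") from None
    if not value.is_finite():
        # Decimal also parses "NaN" and "Infinity"; a cost should be neither.
        raise ValueError(f"not a finite number: {raw!r}")
    return value


# sqlite3 is only a stand-in here so the sketch runs without a server.
conn = sqlite3.connect(":memory:")
conn.execute("create table t1 (name text, cost decimal)")

# Hypothetical form input: one Western-digit value, one Extended Arabic-Indic value.
form_rows = [("tewt", "12.5"), ("\u0634\u062f", "\u06f1\u06f2\u06f3\u06f4")]
for name, raw_cost in form_rows:
    cost = parse_cost(raw_cost)
    # Placeholders keep the value out of the SQL text; str(cost) is ASCII digits.
    conn.execute("insert into t1 (name, cost) values (?, ?)", (name, str(cost)))

total = conn.execute("select sum(cost) from t1").fetchone()[0]
print(total)
```

Parsing first means a value such as an unconverted HTML character reference is rejected in the application with a clear error, instead of reaching the SQL parser.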


I hope this helps,

Steve


Re: [SQL] UTF8 encoding and non-text data types

2008-01-14 Thread Medi Montaseri
Here are my traces from the Perl CGI code. I'll include two samples, one in ASCII
and one in UTF, so we know what to expect.

Here is the actual SQL statement being executed in Perl and DBI. I do not quote
the numerical value; it is just provided to DBI raw.

insert into t1 (c1, cost) values ('tewt', 1234)
this works fine
insert into t1 (c1, cost) values ('&#1588;&#1583;',
&#1777;&#1778;&#1779;&#1780;)
 DBD::Pg::db do failed: ERROR:  syntax error at or near ";" at character 59,

And the PG log itself is very similar and says
ERROR:  syntax error at or near ";" at character 59

Char 59, by the way, is the first occurrence of the semicolon, as in &#1;, which
is being caught by PG parser.
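
The reported position can be reproduced with a short Python sketch. It assumes that each reference in the failing statement began with `&` (as HTML numeric character references do) and that the statement was sent on a single line with one space after the comma; both are assumptions about the original input, not something the trace states.

```python
# The failing statement, reconstructed on one line under the assumptions above.
stmt = (
    "insert into t1 (c1, cost) values "
    "('&#1588;&#1583;', &#1777;&#1778;&#1779;&#1780;)"
)

# Find the end of the quoted string literal; semicolons inside it are harmless.
open_quote = stmt.index("'")
close_quote = stmt.index("'", open_quote + 1)

# First semicolon after the literal, converted to a 1-based character position.
position = stmt.index(";", close_quote) + 1

print(position)
```

Under those assumptions the first semicolon outside the string literal is the one that ends the first reference in the cost value, which matches the position in the error message.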

Medi





Re: [SQL] UTF8 encoding and non-text data types

2008-01-14 Thread Steve Midgley

At 12:43 PM 1/14/2008, Medi Montaseri wrote:
Here are my traces from the Perl CGI code. I'll include two samples, one in
ASCII and one in UTF, so we know what to expect.

Here is the actual SQL statement being executed in Perl and DBI. I do not
quote the numerical value; it is just provided to DBI raw.

insert into t1 (c1, cost) values ('tewt', 1234)
this works fine
insert into t1 (c1, cost) values ('&#1588;&#1583;',
&#1777;&#1778;&#1779;&#1780;)
 DBD::Pg::db do failed: ERROR:  syntax error at or near ";" at
character 59,

And the PG log itself is very similar and says
ERROR:  syntax error at or near ";" at character 59

Char 59, by the way, is the first occurrence of the semicolon, as in &#1;,
which is being caught by the PG parser.


Medi


Hi Medi,

That structure for numeric values is never going to work, as best as I 
understand Postgres (and other sql pipes). You have to convert those 
UTF chars to straight numeric format. Hopefully that solves your 
problem? I hope it's not too hard for you to get at the code which is 
sending the numbers as UTF?


Steve




Re: [SQL] UTF8 encoding and non-text data types

2008-01-14 Thread Tom Lane
Medi Montaseri [EMAIL PROTECTED] writes:
 insert into t1 (c1, cost) values ('tewt', 1234)
 this works fine
 insert into t1 (c1, cost) values ('&#1588;&#1583;',
 &#1777;&#1778;&#1779;&#1780;)
  DBD::Pg::db do failed: ERROR:  syntax error at or near ";" at character 59,

Well, you've got two problems there.  The first and biggest is that
&#NNN; is an HTML notation, not a SQL notation; no SQL database is going
to think that that string in its input is a representation of a single
Unicode character.  The other problem is that even if this did happen,
code points 1777 and nearby are not digits; they're something or other
in Arabic, apparently.  So I think you've got a problem in your Unicode
conversions as well as a notational problem.
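
To make the notational half of this concrete, here is a hedged Python sketch that decodes the HTML references in the application before anything reaches SQL. It assumes the form values arrived as `&#NNNN;` references; `html.unescape` is Python's standard-library decoder, and the variable names are illustrative only.

```python
import html
import unicodedata

# Form values as they appear in the failing statement (leading "&" assumed).
raw_name = "&#1588;&#1583;"
raw_cost = "&#1777;&#1778;&#1779;&#1780;"

# Decode the HTML numeric character references into real Unicode characters.
name = html.unescape(raw_name)
cost_text = html.unescape(raw_cost)

# Ask the Unicode character database whether each decoded character is a decimal digit.
all_digits = all(unicodedata.category(ch) == "Nd" for ch in cost_text)

# If they all are, int() can turn the text into an ordinary number.
cost = int(cost_text) if all_digits else None

print(len(name), all_digits, cost)
```

On the second point, the Unicode database does classify these particular code points as decimal digits, as a later reply in the thread points out, so after decoding the value can be converted to a plain number in the application.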

regards, tom lane



Re: [SQL] UTF8 encoding and non-text data types

2008-01-14 Thread Tom Lane
Joe [EMAIL PROTECTED] writes:
 Tom Lane wrote:
 Well, you've got two problems there.  The first and biggest is that
 &#NNN; is an HTML notation, not a SQL notation; no SQL database is going
 to think that that string in its input is a representation of a single
 Unicode character.  The other problem is that even if this did happen,
 code points 1777 and nearby are not digits; they're something or other
 in Arabic, apparently.
 
 Precisely. 1777 through 1780 decimal equate to code points U+06F1 
 through U+06F4, which correspond to the Arabic numerals 1 through 4.

Oh?  Interesting.  But even if we wanted to teach Postgres about that,
wouldn't there be a pretty strong risk of getting confused by Arabic's
right-to-left writing direction?  Wouldn't be real helpful if the entry
came out as 4321 when the user wanted 1234.  Definitely seems like
something that had better be left to the application side, where there's
more context about what the string means.

regards, tom lane



Re: [SQL] UTF8 encoding and non-text data types

2008-01-14 Thread Joe

Tom Lane wrote:

Medi Montaseri [EMAIL PROTECTED] writes:
  

insert into t1 (c1, cost) values ('tewt', 1234)
this works fine
insert into t1 (c1, cost) values ('&#1588;&#1583;',
&#1777;&#1778;&#1779;&#1780;)
 DBD::Pg::db do failed: ERROR:  syntax error at or near ";" at character 59,



Well, you've got two problems there.  The first and biggest is that
&#NNN; is an HTML notation, not a SQL notation; no SQL database is going
to think that that string in its input is a representation of a single
Unicode character.  The other problem is that even if this did happen,
code points 1777 and nearby are not digits; they're something or other
in Arabic, apparently.
  
Precisely. 1777 through 1780 decimal equate to code points U+06F1 
through U+06F4, which correspond to the Arabic numerals 1 through 4.
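
The decimal-to-code-point mapping can be checked with a few lines of Python. This is only a verification sketch; the list name `rows` is arbitrary. Note that the Unicode character names call these "Extended Arabic-Indic" digits.

```python
import unicodedata

rows = []
for code_point in range(1777, 1781):
    ch = chr(code_point)
    rows.append(
        (
            code_point,
            f"U+{code_point:04X}",      # hexadecimal code point notation
            unicodedata.name(ch),        # official Unicode character name
            unicodedata.digit(ch),       # numeric value of the digit
        )
    )

for row in rows:
    print(*row)
```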


Joe



Re: [SQL] UTF8 encoding and non-text data types

2008-01-14 Thread Joe

Tom Lane wrote:

Oh?  Interesting.  But even if we wanted to teach Postgres about that,
wouldn't there be a pretty strong risk of getting confused by Arabic's
right-to-left writing direction?  Wouldn't be real helpful if the entry
came out as 4321 when the user wanted 1234.  Definitely seems like
something that had better be left to the application side, where there's
more context about what the string means.
  
The Arabic language is written right-to-left, except ... when it comes 
to numbers.


http://www2.ignatius.edu/faculty/turner/arabic/anumbers.htm

I agree that it's application specific.  The HTML/Perl script ought to 
convert to Western numerals.
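
One possible shape for that conversion, sketched in Python instead of Perl. The function name `to_western_digits` is hypothetical. It maps any Unicode decimal digit to its ASCII counterpart and leaves every other character alone, so locale-specific decimal separators would still need separate handling.

```python
import unicodedata


def to_western_digits(text: str) -> str:
    """Replace every Unicode decimal digit in text with the matching ASCII digit."""
    converted = []
    for ch in text:
        if unicodedata.category(ch) == "Nd":
            # unicodedata.digit gives the digit's value as an int from 0 to 9.
            converted.append(str(unicodedata.digit(ch)))
        else:
            converted.append(ch)
    return "".join(converted)


# Extended Arabic-Indic digits one to four, as in the failing insert.
western = to_western_digits("\u06f1\u06f2\u06f3\u06f4")
print(western)
```

Doing this per character, instead of reversing or reordering the string, relies on the digits already being stored most significant first.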


Joe



Re: [SQL] UTF8 encoding and non-text data types

2008-01-13 Thread Steve Midgley



Hi Medi,

I have only limited experience in this area, but it sounds like you're
sending your numbers as strings? In your example:

create table t1 ( name text, cost decimal );

insert into t1 (name, cost) values ('name1', '1');

I can't think of how else you're sending numeric values as UTF8. I know
that Pg will accept numbers as strings and convert internally (that has
worked for me in some object relational environments where I don't
choose to cope with data types), but I think it would be better if you
simply didn't send your numeric data in quotations, whether as UTF8 or
ASCII. If you don't have control over this layer (that quotes your
values), then I'd say converting to ASCII would solve the problem. But
better to convert to numeric and not ship quoted strings at all.


I may be totally off-base and missing something fundamental and I'm 
very open to correction (by anyone), but that's what I can see here.


Best regards,

Steve




[SQL] UTF8 encoding and non-text data types

2008-01-12 Thread Medi Montaseri
I understand PG supports UTF-8 encoding and I have successfully inserted
Unicode text into columns. I was wondering about other data types such as
numbers, decimals, and dates.

That is, say I have a table t1 with
create table t1 ( name text, cost decimal )
I can insert UTF8 text into this table with no problem.
But if my application attempts to insert numbers encoded in UTF8, then I
get a wrong datatype error.

Is the solution for the application layer (not the database) to convert the
non-text UTF8 numbers to ASCII and then insert them into the database?

Thanks
Medi