Re: strange results after setting utf8 -subj in openssl ca command

2012-07-30 Thread Pica Pica Contact
Look at this example:
$openssl x509 -subject -nameopt oneline,-esc_msb,utf8 -noout -in 13/13_cert.pem
... CN = 13#ტესტერიN13
$openssl x509 -subject -noout -in 13/13_cert.pem
... CN=13#\xE1\x83\xA2\xE1\x83\x94\xE1\x83\xA1\xE1\x83\xA2\xE1\x83\x94\xE1\x83\xA0\xE1\x83\x98N13

This certificate was signed by openssl ca without changing the subject, 
and openssl req did not use BMPString and UCS-2 in this case. The CN string 
contains Georgian letters, but the numbers are in ASCII, so it is in fact UTF-8.
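
One way to confirm that the escaped bytes in the second output really are UTF-8 is to decode them by hand. The following is a minimal Python sketch, not part of the original exchange; the byte string is copied from the output above and nothing here calls OpenSSL. The names cn_bytes and cn_text are only illustrative.

```python
# CN value as printed by `openssl x509 -subject` (escapes copied verbatim).
cn_bytes = (
    b"13#\xE1\x83\xA2\xE1\x83\x94\xE1\x83\xA1\xE1\x83\xA2"
    b"\xE1\x83\x94\xE1\x83\xA0\xE1\x83\x98N13"
)

# ASCII characters stay one byte each; every Georgian letter takes three bytes.
cn_text = cn_bytes.decode("utf-8")
print(cn_text)
```

If the value had been a BMPString, the digits would each be preceded by a zero byte and this decode would not produce readable text.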

So why does openssl ca decide to use the BMPString format? It looks like 
1-byte code strings can be used without violating the ASN.1 standard.




__
OpenSSL Project                                 http://www.openssl.org
User Support Mailing List                    openssl-users@openssl.org
Automated List Manager                           majord...@openssl.org


Re: strange results after setting utf8 -subj in openssl ca command

2012-07-30 Thread Dr. Stephen Henson
On Sun, Jul 29, 2012, Dave Thompson wrote:

 
 Note that X.509 certs (and ASN.1 generally) don't actually support 
 UTF8. They support several 1-byte codes (some now obsolete), BMPString 
 which is 2-byte UCS-2, and UniversalString which is 4-byte UCS-4.
 I believe OpenSSL selects the smallest of these into which the 
 specified (Unicode) codepoints fit, which in this case is UCS-2.
 

There is a UTF8String type which has been about for some time. OpenSSL for
certificate requests uses the smallest of a set of types determined by the
string_mask option in openssl.cnf. This is set to utf8only in OpenSSL 1.0.0
and later. 

Steve.
--
Dr Stephen N. Henson. OpenSSL project core developer.
Commercial tech support now available see: http://www.openssl.org


RE: strange results after setting utf8 -subj in openssl ca command

2012-07-30 Thread Dave Thompson
 From: owner-openssl-us...@openssl.org On Behalf Of Pica Pica Contact
 Sent: Monday, 30 July, 2012 13:47

 Look at this example: snip
 This certificate was signed by openssl ca without changing subject, 
 and openssl req did not use BMPString and UCS-2 in this 
 case. CN string contains Georgian  letters but numbers are in 
 ASCII so it is UTF-8 in fact.
 
You're probably right. (To be positive, I'd check the req directly, 
not the x509 into which it is copied, because the copy *could* change 
the encoding as long as it doesn't change the canonical value. But 
I'd be surprised if it did. OTOH I've been surprised before.)

On rechecking I am reminded there *is* an ASN.1 type UTF8String, which 
I had forgotten when I answered before. Sorry for the misstatement.

 So why openssl ca decides to use BMPString format? Looks 
 like 1-byte code strings can be used without violating ASN.1 standard
 
So that is a valid question. (Well, pedantically UTF8 is a variable-byte 
code, not a 1-byte code, but it's clear what you mean.)
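
To make the "variable-byte" point concrete, here is a small Python sketch (an illustration added to the thread, with hypothetical variable names) that counts how many bytes one character from each script in the example CN occupies in UTF-8, compared with the fixed two bytes of UCS-2:

```python
# One character from each script that appears in the example subject.
samples = ["3", "#", "т", "ტ", "中"]

# UTF-8 uses 1 to 4 bytes per character; UCS-2 (BMPString) always uses 2.
utf8_widths = {ch: len(ch.encode("utf-8")) for ch in samples}
ucs2_widths = {ch: len(ch.encode("utf-16-be")) for ch in samples}

for ch in samples:
    print(f"U+{ord(ch):04X}: UTF-8 {utf8_widths[ch]} byte(s), UCS-2 {ucs2_widths[ch]} bytes")
```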

I've definitely looked at some code, but I don't remember exactly where  
(or when), that chooses based on the chars needed, something like: 
if all are printable use PrintableString, 
else if all are 1-byte use GeneralString, 
else if all are 2-byte/BMP use BMPString, else use UniversalString. 
I'm guessing logic like that was used, and it wouldn't choose UTF8 
even though UTF8 can represent all Unicode. You'll probably have to 
read the source or debug, unless someone else chips in.
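
The selection logic recalled above can be sketched roughly as follows. This is Python written only to illustrate the recollection; it is not OpenSSL's code, the function name guess_string_type is hypothetical, and the real implementation may consider other types and a configurable mask.

```python
# Characters allowed in an ASN.1 PrintableString (note: '#' is not among them).
PRINTABLE = set(
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789 '()+,-./:=?"
)


def guess_string_type(text):
    """Pick the narrowest type that fits, following the recollection above."""
    if all(ch in PRINTABLE for ch in text):
        return "PrintableString"
    if all(ord(ch) <= 0xFF for ch in text):
        return "GeneralString"
    if all(ord(ch) <= 0xFFFF for ch in text):
        return "BMPString"
    return "UniversalString"


print(guess_string_type("JohnSmith"))
print(guess_string_type("3#тестტესტ中国"))
```

Under this logic a single Cyrillic, Georgian or CJK character is enough to push the whole string to BMPString, which would match the observed result.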

If you don't need all the features of 'ca', like database and CRLs, 
you could try 'x509 -req -CA*' and see if it's different on this point.
That is a separate implementation of nearly-identical function.




RE: strange results after setting utf8 -subj in openssl ca command

2012-07-29 Thread Dave Thompson
 From: owner-openssl-us...@openssl.org On Behalf Of Pica Pica Contact
 Sent: Saturday, 28 July, 2012 14:41

 My application uses X.509 certificates with commonName field 
 set to following format:
 
 number#UserName,

 Everything is ok when UserName is in ascii, but when I sign 
 new certificates using snip: ca ... -subj ... -utf8
 and subject contains non-ASCII characters in UTF-8 encoding, 
 the resulting certificate's CN looks this way:
 
 $ openssl x509 -in 3.pem -subject  -noout
 
 subject= 
 /CN=\x003\x000\x000\x000\x000\x00#\x04B\x045\x04A\x04B\x10\xE2
 \x10\xD4\x10\xE1\x10\xE2N-V\xFD
 
 Looks like string 3 is literally encoded as a sequence 
 of bytes with corresponding decimal values, not as sequence 
 of ASCII codes for characters 3, 0, 0,...

Nope. \xHH is exactly two hex digits for one byte. You have:
  '\x00' '3' '\x00' '0' ... '\x00' '#' '\x04' 'B' '\x04' '5' ... 
That is obviously the UCS-2 (BMPString) encoding of:
U+0033=digit3 U+0030=digit0,repeated4times U+0023=NumberSign 
U+0442=Cyrillic.SmallTE U+0435 U+0441 U+0442 U+10E2=Georgian.LetterTar 
U+10D4 U+10E1 U+10E2 U+4E2D=CJK.something U+56FD=CJK.something 
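
The same decoding can be checked mechanically. A minimal Python sketch, assuming the escaped text from the x509 output is pasted unchanged into a bytes literal (Python's \xHH escapes also take exactly two hex digits); the names raw, code_points and text are illustrative:

```python
# CN value exactly as shown by `openssl x509 -subject`.
raw = (
    b"\x003\x000\x000\x000\x000\x00#\x04B\x045\x04A\x04B"
    b"\x10\xE2\x10\xD4\x10\xE1\x10\xE2N-V\xFD"
)

# BMPString is big-endian UCS-2: read the bytes two at a time.
code_points = [
    f"U+{int.from_bytes(raw[i:i + 2], 'big'):04X}" for i in range(0, len(raw), 2)
]
text = raw.decode("utf-16-be")

print(code_points)
print(text)
```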

Note that X.509 certs (and ASN.1 generally) don't actually support 
UTF8. They support several 1-byte codes (some now obsolete), BMPString 
which is 2-byte UCS-2, and UniversalString which is 4-byte UCS-4.
I believe OpenSSL selects the smallest of these into which the 
specified (Unicode) codepoints fit, which in this case is UCS-2.

 After adding -nameopt oneline,-esc_msb,utf8 result looks fine
 
That should translate the Unicode to UTF8 and output it, and 
assuming your terminal handles UTF8 then yes it will be good

 I call X509_NAME_oneline() function inside my application to 
 get CN string, and application fails to convert number from 
 CN field to integer, because X509_NAME_oneline() returns 
 /CN=\x003\x000\x000\x000\x000\x00# instead of CN=3#
 
I'm pretty sure _oneline is what x509 -text without -nameopt uses.

 Probably I should use X509_NAME_print_ex(),
 
Or if you only want CN, you could get the raw CN item and its value 
out of the name structure which in OpenSSL is STACK_OF(X509_NAME_ENTRY).

 but I have doubts if this string encoding is correct and how 
 it would work with other software. For example, certtool from 
 GnuTLS outputs subject string in this way:
 $ certtool -i --infile 3.pem
 
 ...skipped...
 
     Subject: 
 CN=#003300300030003000300023044204350441044210e210d410e110e24e2d56fd
 ...skipped...
 
That apparently is dumping the UCS-2 bytes. Compare to above.
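
For comparison, a short Python sketch (illustrative only, not something certtool provides) that turns the hex string from the certtool output back into text by reading it as big-endian UCS-2:

```python
# Hex string printed by certtool after "CN=#".
hex_dump = "003300300030003000300023044204350441044210e210d410e110e24e2d56fd"

# Interpreting the bytes as big-endian UCS-2 recovers the subject text.
decoded = bytes.fromhex(hex_dump).decode("utf-16-be")
print(decoded)
```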

 There are no such problems in openssl req, I can set UTF8 
 strings with numbers in certificate requests and resulting 
 certificate is ok for me, but I need to ignore subject from 
 certificate requests and set my own value
 
 
 Is it possible to fix openssl ca command somehow to encode 
 numbers in UTF8 strings as strings, not numbers?

'ca' can only encode ASN.1 strings in the ways defined by ASN.1.
You must decode them accordingly.
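
What "decode them accordingly" can mean for the number#UserName format, as a hedged Python sketch: it assumes the application already has the raw CN value and the name of its ASN.1 string type (how it obtains them from OpenSSL is not shown), and CODECS and parse_cn are hypothetical names.

```python
# Python codec for each ASN.1 string type this sketch handles.
CODECS = {
    "PrintableString": "ascii",
    "IA5String": "ascii",
    "UTF8String": "utf-8",
    "BMPString": "utf-16-be",        # UCS-2; utf-16-be also accepts surrogate pairs
    "UniversalString": "utf-32-be",  # UCS-4
}


def parse_cn(string_type, value):
    """Decode a raw CN value by its ASN.1 type, then split 'number#UserName'."""
    text = value.decode(CODECS[string_type])
    number, _, user_name = text.partition("#")
    return int(number), user_name


print(parse_cn("BMPString", "12345#JohnSmith".encode("utf-16-be")))
print(parse_cn("UTF8String", "12345#JohnSmith".encode("utf-8")))
```

Decoding by type first means the integer conversion sees the same text whichever encoding the CA chose.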




strange results after setting utf8 -subj in openssl ca command

2012-07-28 Thread Pica Pica Contact


My application uses X.509 certificates with the commonName field set to the 
following format:

number#UserName,

for example

12345#JohnSmith

Everything is OK when UserName is in ASCII, but when I sign new certificates 
using this command, for example:

openssl ca -config ca_config.txt  -subj /CN=3#тестტესტ中国 -utf8 -batch 
-notext -out 3.pem -in /tmp/CSR-file

and subject contains non-ASCII characters in UTF-8 encoding, the resulting 
certificate's CN looks this way:

$ openssl x509 -in 3.pem -subject  -noout

subject= 
/CN=\x003\x000\x000\x000\x000\x00#\x04B\x045\x04A\x04B\x10\xE2\x10\xD4\x10\xE1\x10\xE2N-V\xFD

It looks like the string 3 is literally encoded as a sequence of bytes with the 
corresponding decimal values, not as a sequence of ASCII codes for the 
characters 3, 0, 0,...

After adding -nameopt oneline,-esc_msb,utf8 the result looks fine:

$ openssl x509 -in 0/0_cert.pem -subject -nameopt oneline,-esc_msb,utf8 -noout

subject= CN = 3#тестტესტ中国


I call the X509_NAME_oneline() function inside my application to get the CN 
string, and the application fails to convert the number from the CN field to an 
integer, because X509_NAME_oneline() returns 
/CN=\x003\x000\x000\x000\x000\x00# instead of CN=3#

Probably I should use X509_NAME_print_ex(),

but I have doubts whether this string encoding is correct and how it would work 
with other software. For example, certtool from GnuTLS outputs the subject 
string in this way:
$ certtool -i --infile 3.pem

...skipped...

    Subject: 
CN=#003300300030003000300023044204350441044210e210d410e110e24e2d56fd
...skipped...

There are no such problems with openssl req: I can set UTF8 strings with 
numbers in certificate requests and the resulting certificate is OK for me, but 
I need to ignore the subject from certificate requests and set my own value.


Is it possible to fix openssl ca command somehow to encode numbers in UTF8 
strings as strings, not numbers?