Re: Codereview request for 7183053: Optimize DoubleByte charset for String.getBytes()/new String(byte[])

2012-07-17 Thread Alan Bateman

On 16/07/2012 19:42, Xueming Shen wrote:

Hi Alan,

I gave the continue -else if  a try, it appears the server vm 
obviously likes
the continue approach (it is consistent with what we experienced in 
the past
when we did similar approach for ascii, in which we separate/singled 
the ascii

path out). So I guess we probably want to keep the continue here.

Here are the data. dbcs2_new is to replace the continue with else if 
and the

dbcs2 is the continue.

http://cr.openjdk.java.net/~sherman/7183053/dbcs2_new
http://cr.openjdk.java.net/~sherman/7183053/dbcs2

-Sherman
Thanks for checking, I had expected it would ultimately get compiled to 
the same code but it seems not.


In that case the changes in the webrev look fine to me.

-Alan


Re: Codereview request for 7183053: Optimize DoubleByte charset for String.getBytes()/new String(byte[])

2012-07-16 Thread Xueming Shen

Hi Alan,

I gave the continue -else if  a try, it appears the server vm 
obviously likes
the continue approach (it is consistent with what we experienced in 
the past
when we did similar approach for ascii, in which we separate/singled the 
ascii

path out). So I guess we probably want to keep the continue here.

Here are the data. dbcs2_new is to replace the continue with else if and the
dbcs2 is the continue.

http://cr.openjdk.java.net/~sherman/7183053/dbcs2_new
http://cr.openjdk.java.net/~sherman/7183053/dbcs2

-Sherman

On 07/13/2012 10:09 AM, Xueming Shen wrote:
In JDK7,  the decoder and encoder implementation of most of our 
single-byte charsets
and UTF-8 charset are optimized to implement the internal interfce 
sun.nio.cs.ArrayDecoder/
Encoder to provide a fastpath for String.getBytes(...) and new 
String(byte[]...) operations. I

have an old blog regarding this optimization at

https://blogs.oracle.com/xuemingshen/entry/faster_new_string_bytes_cs

This rfe, as the followup for above changes, is to implement 
ArrayDe/Encoder for most
of the sun.nio.cs.ext.DoubleByte based double-byte charsets. Here is 
the webrev


http://cr.openjdk.java.net/~sherman/7183053/webrev
I've taken a pass over this and it's great to see 
DoubleByte.Decoder/Encoder implementing 
sun.nio.cs.ArrayDecoder/Encoder. The results looks good too, a small 
number of regressions (Big5 at len=32 for example) but this is a 
micro benchmark and I'm sure there are fluctuations. I don't see 
anything obviously wrong with the EBCDIC changes I'd need a history 
book to remember how the shifts between DBCS and SBCS. I think our 
tests our good for this area so I'm happy. One minor nit is the 
continue in both encode methods, I think it would be cleaner to use 
else if (bb ... instead.



The continue might make the vm happy, but this is the code I did last 
Oct, so I might be

wrong. Will give a couple run later with else





Re: Codereview request for 7183053: Optimize DoubleByte charset for String.getBytes()/new String(byte[])

2012-07-14 Thread Ulf Zibis

Am 13.07.2012 23:59, schrieb Dalibor Topic:

Tim is going through then issues in Bugzilla and closing the already fixed 
ones, and migrating the rest over to bug.sun.com in preparation for the the 
JIRA migration, afaik.


So if I understand Joe correct, the patches form bugs.openjdk.java.net have been copied to the 
appropriate bug.sun.com bugs, and later will be again publicly visible after migration to JIRA. So 
bugs.openjdk.java.net will be no more used as an interim solution to file patches.

Please correct me if I'm wrong.


I'm in a kind of vacation from OpenJDK development. One reason is the missing 
appreciation from official side. All attempts to at least have an account on 
http://cr.openjdk.java.net/, for easier publishing webrevs, have ended in *no 
answer* regardless my activity:
See [Who Sent It?] on: 
http://markmail.org/search/+list:net.java.openjdk.core-libs-dev

If this patch is eventually headed for JDK 7 Updates, I could grant you an 
Author role there, if you wish.


Thanks, that would be great. Please use the email of above From: field, not my 
old GMX address.

-Ulf



Re: Codereview request for 7183053: Optimize DoubleByte charset for String.getBytes()/new String(byte[])

2012-07-13 Thread Alan Bateman

On 11/07/2012 00:11, Xueming Shen wrote:

Hi,

In JDK7,  the decoder and encoder implementation of most of our 
single-byte charsets
and UTF-8 charset are optimized to implement the internal interfce 
sun.nio.cs.ArrayDecoder/
Encoder to provide a fastpath for String.getBytes(...) and new 
String(byte[]...) operations. I

have an old blog regarding this optimization at

https://blogs.oracle.com/xuemingshen/entry/faster_new_string_bytes_cs

This rfe, as the followup for above changes, is to implement 
ArrayDe/Encoder for most
of the sun.nio.cs.ext.DoubleByte based double-byte charsets. Here is 
the webrev


http://cr.openjdk.java.net/~sherman/7183053/webrev
I've taken a pass over this and it's great to see 
DoubleByte.Decoder/Encoder implementing sun.nio.cs.ArrayDecoder/Encoder. 
The results looks good too, a small number of regressions (Big5 at 
len=32 for example) but this is a micro benchmark and I'm sure there are 
fluctuations. I don't see anything obviously wrong with the EBCDIC 
changes I'd need a history book to remember how the shifts between DBCS 
and SBCS. I think our tests our good for this area so I'm happy. One 
minor nit is the continue in both encode methods, I think it would be 
cleaner to use else if (bb ... instead.


I see in TestStringCoding.java that you've commented out the test that 
goes over the buffer limit - would I be correct to say that this isn't 
an issue and this happens with DB charsets today?


Ulf - you've got several patches to the double byte charsets and I 
wonder if you have cycles to try Sherman's patch with jdk8 to see if 
there is any more to be gained?


-Alan.



Re: Codereview request for 7183053: Optimize DoubleByte charset for String.getBytes()/new String(byte[])

2012-07-13 Thread Xueming Shen

On 07/13/2012 05:19 AM, Alan Bateman wrote:

On 11/07/2012 00:11, Xueming Shen wrote:

Hi,

In JDK7,  the decoder and encoder implementation of most of our 
single-byte charsets
and UTF-8 charset are optimized to implement the internal interfce 
sun.nio.cs.ArrayDecoder/
Encoder to provide a fastpath for String.getBytes(...) and new 
String(byte[]...) operations. I

have an old blog regarding this optimization at

https://blogs.oracle.com/xuemingshen/entry/faster_new_string_bytes_cs

This rfe, as the followup for above changes, is to implement 
ArrayDe/Encoder for most
of the sun.nio.cs.ext.DoubleByte based double-byte charsets. Here is 
the webrev


http://cr.openjdk.java.net/~sherman/7183053/webrev
I've taken a pass over this and it's great to see 
DoubleByte.Decoder/Encoder implementing 
sun.nio.cs.ArrayDecoder/Encoder. The results looks good too, a small 
number of regressions (Big5 at len=32 for example) but this is a micro 
benchmark and I'm sure there are fluctuations. I don't see anything 
obviously wrong with the EBCDIC changes I'd need a history book to 
remember how the shifts between DBCS and SBCS. I think our tests our 
good for this area so I'm happy. One minor nit is the continue in both 
encode methods, I think it would be cleaner to use else if (bb ... 
instead.


The continue might make the vm happy, but this is the code I did last 
Oct, so I might be

wrong. Will give a couple run later with else



I see in TestStringCoding.java that you've commented out the test that 
goes over the buffer limit - would I be correct to say that this isn't 
an issue and this happens with DB charsets today?




This is also true for utf-8 I did last year, but utf-8 is excluded at 
the beginning of the test. For
SB, it takes the advantage that the output char[] should always be the 
same as the length
of the input bytes, so this can be checked at the very beginning 
together. For mb, to check
both sp and dp slow down the de/encoding (vm obviously does not like too 
many ifs). Given
this is an internal interface used exclusively by StringCoding, in which 
it has already
calculated the max buf to feed in, I think this is something that can be 
optimized.


-Sherman


Ulf - you've got several patches to the double byte charsets and I 
wonder if you have cycles to try Sherman's patch with jdk8 to see if 
there is any more to be gained?


-Alan.





Re: Codereview request for 7183053: Optimize DoubleByte charset for String.getBytes()/new String(byte[])

2012-07-13 Thread Ulf Zibis

Am 13.07.2012 14:19, schrieb Alan Bateman:


Ulf - you've got several patches to the double byte charsets and I wonder if you have cycles to 
try Sherman's patch with jdk8 to see if there is any more to be gained?


First, most of those patches have been closed from official side, don't really 
understand why:
https://bugs.openjdk.java.net/buglist.cgi?query_format=advancedshort_desc_type=allwordssubstrshort_desc=long_desc_type=substringlong_desc=bug_file_loc_type=allwordssubstrbug_file_loc=status_whiteboard_type=allwordssubstrstatus_whiteboard=emailassigned_to1=1emailtype1=substringemail1=emailassigned_to2=1emailreporter2=1emailqa_contact2=1emailcc2=1emailtype2=substringemail2=ulf.zibisbugidtype=includebug_id=chfieldfrom=chfieldto=Nowchfieldvalue=cmdtype=doitorder=Importancefield0-0-0=nooptype0-0-0=noopvalue0-0-0=

Yes, I have many work prepared for performance tuning upon charsets, but I do not have JDK-8 
environment installed at the moment.


I'm in a kind of vacation from OpenJDK development. One reason is the missing appreciation from 
official side. All attempts to at least have an account on http://cr.openjdk.java.net/, for easier 
publishing webrevs, have ended in *no answer* regardless my activity:

See [Who Sent It?] on: 
http://markmail.org/search/+list:net.java.openjdk.core-libs-dev

-Ulf



Re: Codereview request for 7183053: Optimize DoubleByte charset for String.getBytes()/new String(byte[])

2012-07-13 Thread Ulf Zibis

Am 13.07.2012 14:19, schrieb Alan Bateman:


Ulf - you've got several patches to the double byte charsets and I wonder if 
you have cycles to
try Sherman's patch with jdk8 to see if there is any more to be gained?


Additionally, I'm wondering, that EUC_TW is not part of the patch.
I've worked on that for some months last year, gaining an improvement of ~30 % but have no final 
result though.


-Ulf



Re: Codereview request for 7183053: Optimize DoubleByte charset for String.getBytes()/new String(byte[])

2012-07-13 Thread Alan Bateman

On 13/07/2012 20:22, Ulf Zibis wrote:


First, most of those patches have been closed from official side, 
don't really understand why:
https://bugs.openjdk.java.net/buglist.cgi?query_format=advancedshort_desc_type=allwordssubstrshort_desc=long_desc_type=substringlong_desc=bug_file_loc_type=allwordssubstrbug_file_loc=status_whiteboard_type=allwordssubstrstatus_whiteboard=emailassigned_to1=1emailtype1=substringemail1=emailassigned_to2=1emailreporter2=1emailqa_contact2=1emailcc2=1emailtype2=substringemail2=ulf.zibisbugidtype=includebug_id=chfieldfrom=chfieldto=Nowchfieldvalue=cmdtype=doitorder=Importancefield0-0-0=nooptype0-0-0=noopvalue0-0-0= 

Sorry, I don't know, looks like Tim has been doing some cleanup. From a 
brief scan then at least some of these were fixed or closed a long time 
ago.


You do need author role to push webrevs to cr.openjdk.java.net, I had 
thought were had this already but it seems not.


-Alan.


Re: Codereview request for 7183053: Optimize DoubleByte charset for String.getBytes()/new String(byte[])

2012-07-13 Thread Dalibor Topic
On 7/13/12 9:22 PM, Ulf Zibis wrote:
 Am 13.07.2012 14:19, schrieb Alan Bateman:

 Ulf - you've got several patches to the double byte charsets and I wonder if 
 you have cycles to try Sherman's patch with jdk8 to see if there is any more 
 to be gained?
 
 First, most of those patches have been closed from official side, don't 
 really understand why:
 https://bugs.openjdk.java.net/buglist.cgi?query_format=advancedshort_desc_type=allwordssubstrshort_desc=long_desc_type=substringlong_desc=bug_file_loc_type=allwordssubstrbug_file_loc=status_whiteboard_type=allwordssubstrstatus_whiteboard=emailassigned_to1=1emailtype1=substringemail1=emailassigned_to2=1emailreporter2=1emailqa_contact2=1emailcc2=1emailtype2=substringemail2=ulf.zibisbugidtype=includebug_id=chfieldfrom=chfieldto=Nowchfieldvalue=cmdtype=doitorder=Importancefield0-0-0=nooptype0-0-0=noopvalue0-0-0=

Tim is going through then issues in Bugzilla and closing the already fixed 
ones, and migrating the rest over to bug.sun.com in preparation for the the 
JIRA migration, afaik.

 I'm in a kind of vacation from OpenJDK development. One reason is the missing 
 appreciation from official side. All attempts to at least have an account on 
 http://cr.openjdk.java.net/, for easier publishing webrevs, have ended in *no 
 answer* regardless my activity:
 See [Who Sent It?] on: 
 http://markmail.org/search/+list:net.java.openjdk.core-libs-dev

If this patch is eventually headed for JDK 7 Updates, I could grant you an 
Author role there, if you wish.

cheers,
dalibor topic
-- 
Oracle http://www.oracle.com
Dalibor Topic | Principal Product Manager
Phone: +494089091214 tel:+494089091214 | Mobile: +491737185961 
tel:+491737185961
Oracle Java Platform Group

ORACLE Deutschland B.V.  Co. KG | Nagelsweg 55 | 20097 Hamburg

ORACLE Deutschland B.V.  Co. KG
Hauptverwaltung: Riesstr. 25, D-80992 München
Registergericht: Amtsgericht München, HRA 95603
Geschäftsführer: Jürgen Kunz

Komplementärin: ORACLE Deutschland Verwaltung B.V.
Hertogswetering 163/167, 3543 AS Utrecht, Niederlande
Handelsregister der Handelskammer Midden-Niederlande, Nr. 30143697
Geschäftsführer: Alexander van der Ven, Astrid Kepper, Val Maher

Green Oracle http://www.oracle.com/commitment Oracle is committed to 
developing practices and products that help protect the environment




Re: Codereview request for 7183053: Optimize DoubleByte charset for String.getBytes()/new String(byte[])

2012-07-13 Thread Joe Darcy

Hello,

On 7/13/2012 2:59 PM, Dalibor Topic wrote:

On 7/13/12 9:22 PM, Ulf Zibis wrote:

Am 13.07.2012 14:19, schrieb Alan Bateman:

Ulf - you've got several patches to the double byte charsets and I wonder if 
you have cycles to try Sherman's patch with jdk8 to see if there is any more to 
be gained?

First, most of those patches have been closed from official side, don't really 
understand why:
https://bugs.openjdk.java.net/buglist.cgi?query_format=advancedshort_desc_type=allwordssubstrshort_desc=long_desc_type=substringlong_desc=bug_file_loc_type=allwordssubstrbug_file_loc=status_whiteboard_type=allwordssubstrstatus_whiteboard=emailassigned_to1=1emailtype1=substringemail1=emailassigned_to2=1emailreporter2=1emailqa_contact2=1emailcc2=1emailtype2=substringemail2=ulf.zibisbugidtype=includebug_id=chfieldfrom=chfieldto=Nowchfieldvalue=cmdtype=doitorder=Importancefield0-0-0=nooptype0-0-0=noopvalue0-0-0=

Tim is going through then issues in Bugzilla and closing the already fixed 
ones, and migrating the rest over to bug.sun.com in preparation for the the 
JIRA migration, afaik.


That is correct.  I've asked Tim to go over the Bugzilla issues as part 
of the effort to move the JDK to JIRA:


https://blogs.oracle.com/darcy/entry/moving_monarchs_dragons

Bugzilla issues will be closed out if appropriate and patches, etc. 
moved from Bugzilla into the legacy bugtraq system so that they are then 
moved forward as a consequence of the bugtraq to JIRA migration.


HTH,

-Joe


Codereview request for 7183053: Optimize DoubleByte charset for String.getBytes()/new String(byte[])

2012-07-10 Thread Xueming Shen

Hi,

In JDK7,  the decoder and encoder implementation of most of our 
single-byte charsets
and UTF-8 charset are optimized to implement the internal interfce 
sun.nio.cs.ArrayDecoder/
Encoder to provide a fastpath for String.getBytes(...) and new 
String(byte[]...) operations. I

have an old blog regarding this optimization at

https://blogs.oracle.com/xuemingshen/entry/faster_new_string_bytes_cs

This rfe, as the followup for above changes, is to implement 
ArrayDe/Encoder for most
of the sun.nio.cs.ext.DoubleByte based double-byte charsets. Here is the 
webrev


http://cr.openjdk.java.net/~sherman/7183053/webrev

The results of the non -scientific benchmark StrCodingBenchmarkDB 
running on client
and server vm on my linux machine are included in docs_c (client) and 
docs_s(server)

below.

http://cr.openjdk.java.net/~sherman/7183053/dbcs_c
http://cr.openjdk.java.net/~sherman/7183053/dbcs_s

Numbers are the time spent on the decoding/encoding operations, the 
smaller the

better.

Thanks,
-Sherman