[ 
https://issues.apache.org/jira/browse/SOLR-7770?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14655084#comment-14655084
 ] 

Uwe Schindler commented on SOLR-7770:
-------------------------------------

Hi,
https://bugs.openjdk.java.net/browse/JDK-8130845 gives the following:

In fact, the parsing of weekday or month names in the root locale was a bug in 
earlier Java versions. According to Unicode, the root locale has month names 
like "M01", "M02",... - but no English month names. The same applies to weekdays.

Using the root locale is fine for parsing ISO-formatted dates, but some of the 
formats are clearly "English", e.g. the "Cookie" or 
{{java.util.Date#toString()}} format. In Solr we should therefore change those 
SimpleDateFormats that parse English names to use {{Locale.ENGLISH}}.
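
A minimal sketch of that change, not the actual Solr patch: the class and 
method names below are hypothetical, and the pattern is only assumed to match 
the {{java.util.Date#toString()}} layout seen in the failing test. The 
English-style format gets {{Locale.ENGLISH}}, while the purely numeric ISO 
format keeps {{Locale.ROOT}}.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

// Hypothetical illustration class; not part of Solr.
public class EnglishDateParsing {

    // Assumed pattern for the java.util.Date#toString() layout,
    // e.g. "Tue Mar 09 13:44:49 GMT+07:00 2010".
    static final String DATE_TO_STRING_PATTERN = "EEE MMM dd HH:mm:ss zzz yyyy";

    // The format contains English weekday and month names, so the parser
    // is bound to Locale.ENGLISH instead of Locale.ROOT.
    static Date parseEnglish(String text) throws ParseException {
        SimpleDateFormat fmt =
            new SimpleDateFormat(DATE_TO_STRING_PATTERN, Locale.ENGLISH);
        return fmt.parse(text);
    }

    // A purely numeric ISO format has no names to localize, so
    // Locale.ROOT remains appropriate here.
    static Date parseIso(String text) throws ParseException {
        SimpleDateFormat fmt =
            new SimpleDateFormat("yyyy-MM-dd'T'HH:mm:ss'Z'", Locale.ROOT);
        fmt.setTimeZone(TimeZone.getTimeZone("UTC"));
        return fmt.parse(text);
    }

    public static void main(String[] args) throws ParseException {
        Date fromEnglish = parseEnglish("Tue Mar 09 13:44:49 GMT+07:00 2010");
        Date fromIso = parseIso("2010-03-09T06:44:49Z");
        // Both strings describe the same instant.
        System.out.println(fromEnglish.equals(fromIso));
    }
}
```

A new SimpleDateFormat is created per call because the class is not 
thread-safe; sharing one instance would need external synchronization.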

In JDK 9 they fixed the problem, but our code is still not 100% correct. I 
checked the CLDR locale data: the root locale in fact has no month names, only 
those "pseudo names". If we do not fix this, it may break again in later 
versions, or for people using ICU SPIs for time zones or locales.

I will provide a patch later for those date formats that use English names (I 
am currently on vacation, so don't hurry!). We should fix this in 5.3.

> Date field problems using ExtractingRequestHandler and java 9 (b71)
> -------------------------------------------------------------------
>
>                 Key: SOLR-7770
>                 URL: https://issues.apache.org/jira/browse/SOLR-7770
>             Project: Solr
>          Issue Type: Bug
>            Reporter: Hoss Man
>
> Tracking bug to note that the (Tika based) ExtractingRequestHandler will not 
> work properly with jdk9 starting with build71.
> This first manifested itself with failures like this from the tests...
> {noformat}
>    [junit4]   2> NOTE: reproduce with: ant test  
> -Dtestcase=ExtractingRequestHandlerTest
> -Dtests.method=testArabicPDF -Dtests.seed=232D0A5404C2ADED 
> -Dtests.multiplier=3 -Dtests.slow=true
> -Dtests.locale=en_JM -Dtests.timezone=Etc/GMT-7 -Dtests.asserts=true 
> -Dtests.file.encoding=UTF-8
>    [junit4] ERROR   0.58s | ExtractingRequestHandlerTest.testArabicPDF <<<
>    [junit4]    > Throwable #1: org.apache.solr.common.SolrException: Invalid 
> Date String:'Tue Mar 09 13:44:49
> GMT+07:00 2010'
> {noformat}
> Workaround noted by Uwe...
> {quote}
> The test passes on JDK 9 b71 with:
> -Dargs="-Djava.locale.providers=JRE,SPI"
> This reenabled the old Locale data. I will add this to the build parameters 
> of policeman Jenkins to stop this from
> failing. To me it looks like the locale data somehow is not able to correctly 
> parse weekdays and/or timezones. I
> will check this out tomorrow and report a bug to the OpenJDK people. There is 
> something fishy with CLDR locale data.
> There are already some bugs open, so work is not yet finished (e.g. sometimes 
> it uses wrong timezone shortcuts,...)
> {quote}



