[ https://issues.apache.org/jira/browse/SOLR-9080?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
David Smiley updated SOLR-9080:
-------------------------------
    Attachment: SOLR_9080_DateMath_should_not_use_Calendar_API.patch

When I changed the year to 1234 in most test methods, the test (erroneously) reported that all was well, for either or both of two reasons: the test itself uses SimpleDateFormat (which uses Calendar) as its source of truth, and it called directly into the Calendar-based utility methods of DateMathParser instead of constructing a String date math expression. This patch addresses both of those issues in the test, and changes most years in the test to 1234. I left some where they were because specific dates were used to craft time zone offset functionality.

And I fixed DateMathParser itself, which was kinda fun. I removed or made non-public some things that weren't being used outside of itself or the test.
* Note that a {{Locale}} is no longer needed/used in this API, and it's doubtful it ever had an effect before, at least based on a comment about impacting when a day of the week starts (who cares?). Only the DIH DateFormatEvaluator passes something other than Locale.ROOT: it uses Locale.ENGLISH with the ability to pick something else, and it's not evident that this is tested.

This patch is probably not the final patch, as I want to change DateMathParser's API in a way that will affect some callers. I'm inclined to think DateMathParser should not be something constructed -- it just needs static methods. I'd also switch away from java.util.TimeZone to java.time.ZoneId in the API. Maybe a separate issue for such things.

> DateMath is broken before the year 1582
> ---------------------------------------
>
>                 Key: SOLR-9080
>                 URL: https://issues.apache.org/jira/browse/SOLR-9080
>             Project: Solr
>          Issue Type: Bug
>            Reporter: David Smiley
>            Assignee: David Smiley
>             Fix For: 6.0
>
>         Attachments: SOLR_9080_DateMath_should_not_use_Calendar_API.patch
>
>
> In Solr 6.0, dates are parsed using the Java 8 java.time API.
> They were formerly parsed using java.text.SimpleDateFormat, which uses
> java.util.GregorianCalendar. I've learned that the java.time API does _not_
> switch to a different algorithm at the Gregorian Change Date (year 1582)
> whereas GregorianCalendar does. A ramification of this is that the
> milliseconds-from-epoch value differs between the APIs for dates prior
> to this year. Each API round-trips with itself, but the two do not
> round-trip with each other prior to this date. Thus, anyone indexing
> historical dates must re-index when moving to Solr 6.
> What was _not_ changed in the parsing code was Solr's date-math logic -- it
> still uses the Calendar API. This works for dates after 1582, but for
> earlier dates it introduces discrepancies. Here's an example showing
> weird behavior:
> http://localhost:8983/solr/techproducts/select?facet.range.end=1400-01-01T00:00:00Z&facet.range.gap=%2B10YEARS&facet.range.start=1300-01-01T00:00:00Z/YEAR&facet.range=manufacturedate_dt&facet=on&indent=on&q=*:*&rows=0&wt=json
> Note that the year 1300, rounded down to the year, becomes 1299 January 8th
> (weird in and of itself) and that subsequent gaps start on the 9th.
> {noformat}
>     "counts":[
>         "1299-01-08T00:00:00Z",0,
>         "1309-01-09T00:00:00Z",0,
>         "1319-01-09T00:00:00Z",0,
>         ...
> {noformat}
> This weirdness shows itself for units at the year or month level, but not
> below that (from what I'm seeing). In other words, if facet.range.gap is at
> this granularity, or the date math syntax is otherwise used to round or add
> a year or month, there will be issues like this. Otherwise there doesn't
> seem to be an issue.
> I think the solution is clearly to switch the date math code to use the
> java.time API.
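
For illustration, the discrepancy described in the issue can be reproduced outside Solr with a minimal sketch. The class and method names below are hypothetical; this is not the code in DateMathParser or in the attached patch. It assumes a GregorianCalendar with the default 1582 cutover, and simply rounds an instant down to the start of its year in UTC and then adds whole years, once via the Calendar API and once via java.time.

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.ZonedDateTime;
import java.time.temporal.ChronoUnit;
import java.util.Calendar;
import java.util.GregorianCalendar;
import java.util.Locale;
import java.util.TimeZone;

// Hypothetical demo class; not part of Solr or of the attached patch.
final class DateMathRoundingSketch {

  /** Rounds down to the start of the year (UTC) and adds years using java.util.Calendar. */
  public static Instant viaCalendar(Instant t, int yearsToAdd) {
    // Default GregorianCalendar: Julian rules before the October 1582 cutover.
    GregorianCalendar c = new GregorianCalendar(TimeZone.getTimeZone("UTC"), Locale.ROOT);
    c.setTimeInMillis(t.toEpochMilli());
    c.set(Calendar.MONTH, Calendar.JANUARY);
    c.set(Calendar.DAY_OF_MONTH, 1);
    c.set(Calendar.HOUR_OF_DAY, 0);
    c.set(Calendar.MINUTE, 0);
    c.set(Calendar.SECOND, 0);
    c.set(Calendar.MILLISECOND, 0);
    c.add(Calendar.YEAR, yearsToAdd);
    return Instant.ofEpochMilli(c.getTimeInMillis());
  }

  /** Same operation using java.time, which applies ISO (proleptic Gregorian) rules throughout. */
  public static Instant viaJavaTime(Instant t, int yearsToAdd) {
    ZonedDateTime z = t.atZone(ZoneOffset.UTC)
        .withDayOfYear(1)
        .truncatedTo(ChronoUnit.DAYS)
        .plusYears(yearsToAdd);
    return z.toInstant();
  }

  public static void main(String[] args) {
    Instant start = Instant.parse("1300-01-01T00:00:00Z");
    for (int gap = 0; gap <= 20; gap += 10) {
      // Instant.toString() formats with ISO rules, as Solr 6 does when printing dates.
      System.out.println("Calendar:  " + viaCalendar(start, gap));
      System.out.println("java.time: " + viaJavaTime(start, gap));
    }
  }
}
```

With the Calendar variant, the instant 1300-01-01T00:00:00Z is first interpreted as a Julian date in late December 1299, so rounding to the year lands on Julian 1299-01-01, which prints as 1299-01-08 under ISO rules. That matches the first facet bucket in the issue's output. The java.time variant stays on 1300-01-01.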