: I started a quick hack to cut over DateFormatUtil's formatting to this
: one-liner:  DateTimeFormatter.ISO_INSTANT.format(d.toInstant());    and
        ...
: I'd love to just cut over to this but there are some slight differences we
: would see and I want to get people's opinion if any of these differences
: are a blocker:

For some historical context: the current format was chosen back in the 
days when the *only* way to get data in or out of Solr was XML, and the 
format was picked to be consistent with the best standard available at the 
time for representing moments in time in XML.

When we use javabin for communicating with clients, we use serialized 
versions of the Date objects that deserialize back to Date objects on the 
other side of the wire; and likewise if we added thrift, or avro, or 
whatever support for communicating with clients, i would be 100% in favor 
of following whatever standards those formats have for communicating 
moment in time info -- but even in all of those cases where there may be a 
protocol specific representation for moments in time over the wire, it 
still seems very important/useful for having a standard (default) format 
for dealing with dates represented as *strings* over the wire -- 
particularly when dealing with query parsing.


In general, i'm a *little* concerned about how tweaking the 
parsing/formatting of "dates as strings" will affect existing 
clients/users that are using generic/custom parsing/formatting code in 
various languages (however small those tweaks may be)...

: * Milliseconds are 0 padded to 3 digits if the milliseconds is non-zero.
: Thus 30 milliseconds will have ".030" added on.  Our current formatting
: code does ".03".

...probably won't impact many existing users since they already 
have to be prepared to parse up to 3 decimals, and (i'm assuming) the new 
parser you're suggesting we use in solr will be forgiving if they send a 
string with fewer.  but if someone was obeying the existing spec 
religiously they may have errors if we return "*.900"
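
To make the difference concrete, here is a small sketch of how the JDK's 
ISO_INSTANT behaves for fractional seconds. The class and method names 
(MillisPaddingDemo, format, parseMillis) are made up for illustration and 
are not part of Solr or DateFormatUtil; only the java.time calls are real.

```java
import java.time.Instant;
import java.time.format.DateTimeFormatter;

// Hypothetical demo class, not Solr code.
final class MillisPaddingDemo {

    // Format epoch millis the way the proposed one-liner would.
    static String format(long epochMillis) {
        return DateTimeFormatter.ISO_INSTANT.format(Instant.ofEpochMilli(epochMillis));
    }

    // Parse an ISO instant string back to epoch millis.
    static long parseMillis(String text) {
        return Instant.parse(text).toEpochMilli();
    }

    public static void main(String[] args) {
        System.out.println(format(30));   // 1970-01-01T00:00:00.030Z
        System.out.println(format(900));  // 1970-01-01T00:00:00.900Z
        System.out.println(format(0));    // 1970-01-01T00:00:00Z (no fraction at all)
        // Parsing accepts a shorter fraction such as the old ".03" form.
        System.out.println(parseMillis("1970-01-01T00:00:00.03Z")); // 30
    }
}
```

So input in the old style should still parse; it's only the output that 
changes shape.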

: * Dates with years after '9999' (i.e. 10000 and beyond, >= 5 digit years)
: *must* have a leading '+' -- it is formatted with a "+" and if such a year
: is parsed it *must* have a "+" or there is an exception.  Currently a '+'
: would yield an exception and there is no "+" emitted.

...this could also easily be problematic for some people in practice.
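
A quick sketch of the sign behavior being described, using only java.time. 
The class and helper names (YearSignDemo, parses) are invented for the 
example:

```java
import java.time.Instant;
import java.time.LocalDateTime;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;

// Hypothetical demo class, not Solr code.
final class YearSignDemo {

    // Midnight UTC on Jan 1 of year 10000, the first 5 digit year.
    static final Instant YEAR_10000 =
        LocalDateTime.of(10000, 1, 1, 0, 0).toInstant(ZoneOffset.UTC);

    static String format(Instant when) {
        return DateTimeFormatter.ISO_INSTANT.format(when);
    }

    // True if Instant.parse accepts the text, false if it throws.
    static boolean parses(String text) {
        try {
            Instant.parse(text);
            return true;
        } catch (DateTimeParseException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(format(YEAR_10000));              // +10000-01-01T00:00:00Z
        System.out.println(parses("+10000-01-01T00:00:00Z")); // true
        System.out.println(parses("10000-01-01T00:00:00Z"));  // false: the '+' is required
    }
}
```

Any client that builds or splits these strings by hand would have to learn 
about the '+' in both directions.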

: * Of course as mentioned, currently we don't support negative years
: (resulting in invisible errors mostly); we'd get this for free.

+1


: dissent on my proposal.  If there's a real reason to keep something
: consistent with current behavior, I'm sure we could complicate things
: further but it'd be great to simply use ISO_INSTANT exactly.

My personal preference would be to at least have some ability to enable a 
backcompat mode...

1) switch to the new syntax as you describe by default
2) support a request param option to force the old parsing code to be used 
(or perhaps just *emulated* by stripping off any leading + and trailing 
0 when formatting, and adding them if needed when parsing) ... this would 
be a fallback option for clients whose existing code is brittle, and 
wouldn't have to be something we optimize heavily -- people would be 
encouraged to switch to the new format ASAP.
3) after backporting to 6x, remove support for the request param override 
in master (so it's not supported at all starting in 7x)
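
For what it's worth, the "emulated" variant in (2) could be a thin wrapper 
around ISO_INSTANT along these lines. This is only a rough sketch under my 
own assumptions: LegacyDateCompat, formatLegacy and parseLegacy are 
hypothetical names, nothing here is existing Solr code, and it assumes the 
old format differs only in the leading '+' and the trailing zeros.

```java
import java.time.Instant;
import java.time.format.DateTimeFormatter;

// Hypothetical backcompat shim, not existing Solr code.
final class LegacyDateCompat {

    // Format with ISO_INSTANT, then strip the leading '+' and any
    // trailing zeros in the fraction to mimic the old output.
    static String formatLegacy(Instant when) {
        String s = DateTimeFormatter.ISO_INSTANT.format(when);
        if (s.startsWith("+")) {
            s = s.substring(1);
        }
        if (s.indexOf('.') >= 0) {
            // Only touch zeros inside a fraction, never the seconds field.
            s = s.replaceFirst("0+Z$", "Z");
        }
        return s;
    }

    // Accept old style input: add the '+' that ISO_INSTANT requires for
    // unsigned years longer than 4 digits. Short fractions already parse.
    static Instant parseLegacy(String text) {
        boolean signed = text.startsWith("+") || text.startsWith("-");
        int firstDash = text.indexOf('-', 1); // end of the year field
        if (!signed && firstDash > 4) {
            text = "+" + text;
        }
        return Instant.parse(text);
    }

    public static void main(String[] args) {
        System.out.println(formatLegacy(Instant.ofEpochMilli(30)));  // 1970-01-01T00:00:00.03Z
        System.out.println(parseLegacy("10000-01-01T00:00:00Z"));    // +10000-01-01T00:00:00Z
    }
}
```

Since it is just string fix-ups on top of the new code path, it would be 
cheap to delete again when the override goes away.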

...i think as a baseline, that would be awesome -- but I wonder if there 
are bigger questions we should be asking, and a better overall "API" 
for how date formatting/parsing rules are chosen on a per-request basis?

ie: If we're going to change the rules of date parsing/formatting, 
should we rethink *all* the rules, rather than just tweaking one detail of 
rules that were written solely with communicating over XML in mind?


I mean ... i'm way out of the loop on the state of the art in java.time.*, but 
even w/o knowing what all is supported/recommended there, i have to wonder 
if this would be a good opportunity to add general support for a 
'datetime.fmt' request param that can be multivalued to specify an ordered 
set of parsers to be used any time solr needs to parse a String specified 
by the client as a moment in time -- and using the first fmt specified any 
time a moment in time needs to be formatted to return to the user as a string?
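
Very roughly, and purely as a sketch of the idea: the ordered list could be 
held as a chain of DateTimeFormatter instances, trying each in turn when 
parsing and using the first when formatting. DateTimeFmtChain is a 
hypothetical class name, 'datetime.fmt' is only the param name floated 
above, and how the param values would map to formatters is left open.

```java
import java.time.Instant;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical holder for the formatters named by 'datetime.fmt' values.
final class DateTimeFmtChain {

    private final List<DateTimeFormatter> formatters;

    DateTimeFmtChain(List<DateTimeFormatter> formatters) {
        if (formatters.isEmpty()) {
            throw new IllegalArgumentException("at least one format is required");
        }
        this.formatters = new ArrayList<>(formatters);
    }

    // Try each formatter in the order given; the first that succeeds wins.
    Instant parse(String text) {
        DateTimeParseException last = null;
        for (DateTimeFormatter f : formatters) {
            try {
                return f.parse(text, Instant::from);
            } catch (DateTimeParseException e) {
                last = e; // remember the failure and try the next format
            }
        }
        throw last;
    }

    // Output always uses the first format specified.
    String format(Instant when) {
        return formatters.get(0).format(when);
    }
}
```

A request that listed ISO_INSTANT first and, say, a 
DateTimeFormatter.ofPattern("uuuu-MM-dd HH:mm:ss").withZone(ZoneOffset.UTC) 
second would then accept either style on input but always get ISO_INSTANT 
strings back.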



-Hoss
http://www.lucidworks.com/
