Re: [Haskell-cafe] Fwd: Data.Time

2011-07-02 Thread Yitzchak Gale
Ashley, heads up! I am CCing you on this message because
I think a problem has been found with Data.Time.Format.
Please see below.

Daniel Patterson wrote:
 I've found most of the time library to be quite useful, but
 the parsing to be worthless

The API for parsing and rendering time in Data.Time is
based on the standard API for that in C - like the libraries
in most languages. It's pretty standard stuff.

I'm sure it can be improved upon though. If you have a useful
alternative time parsing library, please release it on Hackage.

 Specifically, the formatTime function, if it ever strips out
 padding (by zeros or spaces), results in a time that is unparseable.
 The fact that formatTime and parseTime are not capable of being
 inverses of each other seems like a major flaw,

Well, it is possible to render dates and times in a way
that loses information, so you can't expect those to
be complete inverses of each other.

 when you think that this is not a parseable date:
 2011/1/30 (because the month must be padded by zeros).

Hmm, that does seem wrong. The C API allows that to be
parsed using the format %Y/%m/%d, since the leading zero
for %m and %d are optional when parsing.
See, for example,
http://pubs.opengroup.org/onlinepubs/009695399/functions/strptime.html

I am including Ashley as a CC on this. Perhaps he will
fix it.

It looks like the function parseValue is wrong for the
following format letters: CdHIjmMSuwW.
For each of those, decimal n should replaced by
something that requires at least 1 digit and allows up to n.
Then the various instances of ParseTime should be
checked, but they should be OK because they all seem
to use read.

Thanks,
Yitz

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Fwd: Data.Time

2011-07-02 Thread Ashley Yakeley

On 2011-07-02 13:22, Yitzchak Gale wrote:

Ashley, heads up! I am CCing you on this message because
I think a problem has been found with Data.Time.Format.


Thanks Yitzchak.


when you think that this is not a parseable date:
2011/1/30 (because the month must be padded by zeros).


Hmm, that does seem wrong. The C API allows that to be
parsed using the format %Y/%m/%d, since the leading zero
for %m and %d are optional when parsing.
See, for example,
http://pubs.opengroup.org/onlinepubs/009695399/functions/strptime.html


This was fixed in time-1.2.0.5. From the haddock for parseTime:

Supports the same %-codes as formatTime, including %-, %_ and %0 
modifiers.


With ghci:

Prelude System.Locale Data.Time parseTime defaultTimeLocale 
%Y/%-m/%-d 2011/1/30 :: Maybe Day

Loading package old-locale-1.0.0.2 ... linking ... done.
Loading package time-1.2.0.5 ... linking ... done.
Just 2011-01-30

--
Ashley Yakeley

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Fwd: Data.Time

2011-07-02 Thread Joey Hess
Yitzchak Gale wrote:
 The API for parsing and rendering time in Data.Time is
 based on the standard API for that in C - like the libraries
 in most languages. It's pretty standard stuff.
 
 I'm sure it can be improved upon though. If you have a useful
 alternative time parsing library, please release it on Hackage.

In the perl world, there is a Date::Parse library that I have
found very useful in many situations. It can parse any date format
you throw at it, fairly reliably[1].

The closest I have in Haskell is the attached module, which I used to parse
all dates in all Usenet posts between 1981 and 1991. While small portions
are specific to Usenet messages it does have quite a lot of generic
parsing code, and perhaps most usefully, a large number of date format
strings, and I have thought about making a library if I find time.
(Or anyone is welcome to take it and do so.)

It's based on Data.Time.Parse instead of Data.Time.Format.parseTime,
probably because I didn't notice the latter when I was writing it.

-- 
see shy jo

[1] m/d/y vs d/m/y being the main weakness of course
{- Date parsing. The really horribly ugly bit, as there were few standards
 - on the early usenet. -}

module MsgTime where

import Data.Time.Parse
import Data.String.Utils
import Data.Maybe
import Data.Time.LocalTime
import Data.Time.Clock.POSIX
import Data.Char

import Msg

type TimeZoneParser = String - Maybe (TimeZone, String)
type DateParser = String - Maybe (LocalTime, String)

data TimeParser = TimeParser {
	dateparser :: DateParser,
	timezoneparser :: TimeZoneParser,
	example :: String
}

instance Show TimeParser where
	show (TimeParser _ _ e) = TimeParser  ++ e

extractTime :: Msg - Either String POSIXTime
extractTime m
	| null date = case parseTime mid of
		Left _ - Left missing Date header
		Right t - Right t
	| otherwise = case parseTime date of
		Left e - Left e
		Right t - 
			if t = earliestSane
then Right t
else parseTime dateR -- try this instead
	where
		date = getHeader m (Header Date)
		dateR = getHeader m (Header Date-Received)
		mid = getHeader m (Header Message-Id)

parseTime :: String - Either String POSIXTime
parseTime s
	| null matches = Left $ cannot parse:  ++ s
	| otherwise = Right $ matches !! 0
	where
		matches = catMaybes $ map (applyParser s) parsers

{- Applies a date parser and a timezone parser to the string, only
 - succeeding if the entire string is consumed. -}
applyParser :: String - TimeParser - Maybe POSIXTime
applyParser s (TimeParser dp zp _) =
	case dp (strip s) of
		Nothing - Nothing
		Just (t, s') -
			case zp (map toLower $ strip s') of
Just (z, []) - Just $ toPOSIXTime $ ZonedTime t z
_ - Nothing

earliestSane :: POSIXTime
earliestSane = toPOSIXTime $
	ZonedTime (fst $ fromJust $ strptime fmt cutoff) Data.Time.LocalTime.utc
	where
		cutoff = 1980-01-01 01:01:01
		fmt = %Y-%m-%d %T

{- All the date formats you can shake a stick at.. and then some! -}
parsers :: [TimeParser]
parsers = 
	[ p anyzone %d %b %y %T		15 Jun 88 02:27:41 GMT
	, p anyzone %a, %d %b %y %T		Thu, 22 Jun 89 20:02:03 GMT
	, p anyzone %a, %d-%b-%y %T		Thu, 15-Jun-89 18:01:56 EDT
	, p anyzone %d %b %y %T		8 Jan 90 14:07:27 -0400
	, p anyzone %d %b %y %H:%M		4 Oct 89 19:56 GMT
	, p anyzone %a, %d %b %y %H:%M	Thu, 23 May 91 02:13 PDT
	, p anyzone %a, %d %b %Y %T		Thu, 23 May 1991 07:07:00 -0400
	, p anyzone %a, %d %b %Y %H:%M	Sat, 18 May 1991 17:28 CDT
	, p anyzone %d %b %Y %T		11 Apr 1991 12:02:01 GMT
	, p anyzone %d-%b-%y %H:%M		24-Mar-90 14:22 CST
	, p anyzone %d %b %y, %T		22 May 91, 16:31:37 EST
	, p anyzone %a, %d %b T  %T		Fri, 8 Feb T  09:49:39 EST
	, p anyzone %d %b %Y %H:%M		30 June 1991 17:15 -0400

	-- special cases
	, p (tzconst est) %a %b %d %T EST %Y	Tue Jan 11 12:44:36 EST 1983
	, p (tzconst est) %a %b %d %T EST %y	Tue Jan 11 12:44:36 EST 83
	, p (tzconst edt) %a %b %d %T EDT %Y	Tue Jan 11 12:44:36 EDT 1983
	, p (tzconst edt) %a %b %d %T EDT %y	Tue Jan 11 12:44:36 EDT 83
	, p (tzconst utc) %a %b %d %T GMT %Y	Thu Nov  1 23:14:37 GMT 1990
	, p (tzconst pdt) %d %b %y %T -7	11 Jun 91 15:41:21 -7

	-- dates with no timezone specified are guessed
	, p nozone %d %b %y %T		9 Jan 90 09:33:59
	, p nozone %d %b %Y %T		10 APR 1990 05:25:28
	, p nozone %a %b %d %T %Y		Fri Feb  6 00:19:47 1981
	, p nozone %a %b %d %T %y		Fri Feb  6 00:19:47 81
	, p nozone %Y-%m-%d %T		1981-11-12 18:31:01
	, p nozone %y-%m-%d %T		81-11-12 18:31:01
	, p nozone %a, %d %b %y %T		Sat, 13 Apr 91 08:37:57
	, p nozone %a, %d %b %Y %T		Sun, 16 Jun 1991 13:23:02
	, p nozone %d %b, %Y %T		1 May, 1991 00:00:00
	, p nozone %d %b %y %H:%M		8 Jan 88 18:03
	, p nozone %a, %d %b %y %H:%M		Wed, 29 May 91 17:14
	, p nozone 1 %b %d %T %Y		1 Jan 08 20:59:08 1991

	-- this has to come near the end, as it matches greedily
	, g nozone %a %b %d %T %Y (		Wed Oct 27 17:02:46 1982 (Tuesday)
	, g nozone %a, %d %b %y %T +		Tue, 21 May 91 16:46:01 +22323328

	-- extract date from message-id headers
	-- (used for messages with no Date field)
	, g 

Re: [Haskell-cafe] Fwd: Data.Time

2011-07-02 Thread Yitzchak Gale
Ashley Yakeley wrote:
 This was fixed in time-1.2.0.5. From the haddock for parseTime:

 Supports the same %-codes as formatTime, including %-, %_ and %0
 modifiers.

Great, glad to hear this was recently fixed. I installed
the latest version of the time package, and it works.

It is important to note that this works differently than the usual
strptime behavior, though. For example, %m in Data.Time is
an alias for %0m, whereas %m in strptime means the
same as %-m in Data.Time (optional leading zero).

Thanks,
Yitz

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Fwd: Data.Time

2011-07-02 Thread Ashley Yakeley

On 2011-07-02 16:35, Yitzchak Gale wrote:


It is important to note that this works differently than the usual
strptime behavior, though. For example, %m in Data.Time is
an alias for %0m, whereas %m in strptime means the
same as %-m in Data.Time (optional leading zero).


I made some changes from the C lib behaviour for consistency. In C, %m 
means %0m in strftime and %-m in strptime. I decided to make it 
%0m consistently. Also, at least in glibc, the %# modifier does not 
consistently convert to lower case. In Data.Time it does.


--
Ashley Yakeley

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Fwd: Data.Time

2011-06-27 Thread Daniel Patterson
sent from wrong account - message follows:

 I've found most of the time library to be quite useful, but the parsing to be 
 worthless (I've tried to get someone to prove me wrong already, and would be 
 happy if someone could on this thread!). 
 
 Specifically, the formatTime function, if it ever strips out padding (by 
 zeros or spaces), results in a time that is unparseable. The fact that 
 formatTime and parseTime are not capable of being inverses of each other 
 seems like a major flaw, when you think that this is not a parseable date:
 
 2011/1/30 (because the month must be padded by zeros).
 
 Even though it is very easy to print, and occurs commonly in the world.
 
 Because of this, I use formatTime to write my times, and then have a custom 
 parser to parse them back out. Which makes me think that this is a broken 
 library
 
 On Jun 27, 2011, at 10:37 AM, bri...@aracnet.com bri...@aracnet.com wrote:
 
 On Mon, 27 Jun 2011 11:15:28 +0300
 Yitzchak Gale g...@sefer.org wrote:
 
 
 The biggest shortcoming, in my opinion, is that the documentation
 assumes that the reader is very familiar with the Haskell type
 system, and with viewing type signatures and instance lists as an
 integral and central part of the documentation.
 
 In particular, Haskell's standard numeric type classes and the
 conversion functions between them play a central role in the API
 of Data.Time. But you wouldn't realize that unless you have read
 the type signatures and instance lists in the Haddocks very
 carefully, and have thought about it for a while.
 
 This is exactly right.
 
 
 Another problem, as Malcolm pointed out, is that because of the
 sheer size of the library, a quick-start guide for the common
 cases would be extremely helpful for newcomers.
 
 That would be very, very helpful.  I had a few working examples things were 
 much better.  Finding a starting place, any starting place, proved to be 
 quite elusive.  Also the fact that asking for the current time traps you in 
 IO hell, doesn't help, although it's clear that it should be that way.
 
 Brian
 
 ___
 Haskell-Cafe mailing list
 Haskell-Cafe@haskell.org
 http://www.haskell.org/mailman/listinfo/haskell-cafe
 

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe