Ok, thanks!
I could change my parser combinator to look like this:
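import java.text.ParsePosition
import scala.util.parsing.combinator.RegexParsers
// LocalDate and DateTimeFormatters are ThreeTen classes; import them
// from whichever ThreeTen build is in use.

trait DateParsers extends RegexParsers {
  def dateTime(pattern: String) = new Parser[LocalDate] {
    val dateFormat = DateTimeFormatters.pattern(pattern)

    // Parse from the given offset. The date is returned as a thunk so that
    // it is only evaluated once we know the parse succeeded.
    def jodaParse(text: CharSequence, offset: Int) = {
      val parsePosition = new ParsePosition(offset)
      val result = dateFormat.parse(text, parsePosition)
      val date = () => result.toCalendricalMerger().getDate(false)
      (date, parsePosition)
    }

    def apply(in: Input) = {
      val source = in.source
      val offset = in.offset
      // Skip leading whitespace the same way RegexParsers does
      val start = handleWhiteSpace(source, offset)
      val (date, parsePosition) = jodaParse(source, start)
      if (parsePosition.getErrorIndex >= 0)
        Failure("Failed to parse date", in.drop(start - offset))
      else
        Success(date(), in.drop(parsePosition.getIndex - offset))
    }
  }
}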
to get a LocalDate. I'll look into ThreeTen a bit more to see what
I'll use exactly, but anyway it seems to work - my enormous bulk of
tests from that old project pass :) :
class DateParserSpec extends FlatSpec with ShouldMatchers with DateParsers {
  "A DateParser" should "fail on weird output" in {
    val result = parseAll(dateTime("EEE MMM d HH:mm:ss"), " xxx")
    result.successful should be (false)
  }

  it should "run fine on a matching date" in {
    val result = parseAll(dateTime("EEE MMM d HH:mm:ss yyyy Z"),
      "Wed Aug 12 18:49:56 2009 +0200")
    result.successful should be (true)
  }

  it should "run fine on another matching date" in {
    val result = parseAll(dateTime("HH:ss:mm yyyy MMM"), " 22:56:23 2010 Jan ")
    result.successful should be (true)
  }
}
I still really think that the call to toString() is a problem. Sure, a
CharSequence is a random-access data structure, but it is still very
usable from a streaming point of view as long as you don't look ahead
too much. This is exactly how the Scala combinator parsers work. Look
at PagedSeq, for instance:
http://www.scala-lang.org/archives/downloads/distrib/files/nightly/docs/library/scala/collection/immutable/PagedSeq$.html
This is a lazily evaluated sequence that stores its elements in
pages of fixed-length arrays. And I can do:
val input = new PagedSeqReader(PagedSeq.fromFile(new File("my/path/file.txt")))
and use this lazy character stream straight away in my combinator parser.
The toString method makes it eager:
/** Convert sequence to string */
override def toString = {
  val buf = new StringBuilder
  for (ch <- PagedSeq.this.iterator) buf append ch
  buf.toString
}
which would make my date-parsing wrapper unusable on, say, a stream
over a network or a really large file.
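To make the eager-versus-lazy point concrete, here is a minimal Java
sketch. It is not ThreeTen or Scala library code: CountingSequence and
LazyReadDemo are hypothetical names made up for illustration, and the
wrapper simply counts charAt calls on whatever CharSequence it wraps.
It shows that a consumer reading a short prefix touches only those
characters, while a toString() written like the one above walks the
entire input.

```java
/** Hypothetical wrapper that counts how many characters are actually read. */
final class CountingSequence implements CharSequence {
    private final CharSequence delegate;
    private int reads = 0;

    CountingSequence(CharSequence delegate) {
        this.delegate = delegate;
    }

    /** Number of charAt calls made so far. */
    int reads() {
        return reads;
    }

    @Override
    public int length() {
        return delegate.length();
    }

    @Override
    public char charAt(int index) {
        reads++;
        return delegate.charAt(index);
    }

    @Override
    public CharSequence subSequence(int start, int end) {
        return delegate.subSequence(start, end);
    }

    /** Eager conversion in the same spirit as the toString above: visits every character. */
    @Override
    public String toString() {
        StringBuilder buf = new StringBuilder();
        for (int i = 0; i < length(); i++) {
            buf.append(charAt(i));
        }
        return buf.toString();
    }
}

/** Hypothetical demo: a consumer that only needs the first few characters. */
final class LazyReadDemo {
    private LazyReadDemo() {}

    /** Reads exactly count characters from the start, one charAt call each. */
    static String readPrefix(CharSequence text, int count) {
        StringBuilder buf = new StringBuilder(count);
        for (int i = 0; i < count; i++) {
            buf.append(text.charAt(i));
        }
        return buf.toString();
    }

    public static void main(String[] args) {
        CountingSequence text = new CountingSequence("2011-07-11 followed by a long tail");
        readPrefix(text, 10);
        System.out.println("reads after prefix:   " + text.reads());
        text.toString();
        System.out.println("reads after toString: " + text.reads());
    }
}
```

After readPrefix the counter stands at 10; after toString() it has grown
by the full length of the input, which is the cost I would like to avoid
on a lazy source.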
I've looked at it again and to me it doesn't seem too hard to fix:
https://github.com/hedefalk/threeten/commit/517c1bcc6d7c4982f90a41781506d2616e9772f4
- tests pass.
But then again I might be biased since I think this one is very important ;)
The biggest issue is regionMatches, I guess, and I had to introduce a
Util class again. If I were to issue a pull request, what would be
your preferred way of handling that one?
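
For reference, a regionMatches that works directly on CharSequence can
be quite small. The following is a minimal sketch modelled on
String.regionMatches, not the code from the commit linked above; the
class name CharSequenceUtil and the method signature are assumptions
made for illustration.

```java
/** Hypothetical helper; the name and signature are illustrative only. */
final class CharSequenceUtil {
    private CharSequenceUtil() {}

    /**
     * Tests whether len characters of text starting at textOffset equal
     * len characters of other starting at otherOffset (case-sensitive).
     * Returns false instead of throwing when a region is out of bounds.
     */
    static boolean regionMatches(CharSequence text, int textOffset,
                                 CharSequence other, int otherOffset, int len) {
        // long arithmetic avoids int overflow in the bounds checks
        if (textOffset < 0 || otherOffset < 0
                || textOffset > (long) text.length() - len
                || otherOffset > (long) other.length() - len) {
            return false;
        }
        for (int i = 0; i < len; i++) {
            if (text.charAt(textOffset + i) != other.charAt(otherOffset + i)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        CharSequence input = new StringBuilder("Wed Aug 12 18:49:56 2009");
        System.out.println(regionMatches(input, 4, "Aug", 0, 3)); // true
        System.out.println(regionMatches(input, 4, "Sep", 0, 3)); // false
    }
}
```

Note that this sketch still calls length() for the bounds check; whether
that is cheap depends on the CharSequence implementation behind it.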
Thanks,
Viktor
On Mon, Jul 11, 2011 at 8:53 PM, Stephen Colebourne
<[email protected]> wrote:
> On 11 July 2011 19:21, Viktor Hedefalk <[email protected]> wrote:
>> I guess that the method that could be possible to use in ThreeTen
>> would be this one?
>> public DateTimeParseContext parse(CharSequence text, ParsePosition
>> position) {
>
> If you need the ParsePosition, then that is the one.
>
>> This line hurts
>>
>> // parse a String as its a better API for parser writers
>> String str = text.toString();
>>
>> since it will be the entire input I'm parsing but I guess it probably
>> works in practice in my case, I'll have to try it out and get back.
>
> Thats the current choice I'm making, CharSequence outside, String inside.
>
>> Just of curiosity, what is it in String that makes it easier for parser
>> writers?
>
> Its just a bigger API, with startsWith, contains, indexOf,
> regionMatches ... I tried to convert it to CharSeq internally, but it
> seemed like more hassle than it was worth. If you can convince me its
> really a major hassle or performance issue, then I might accept a pull
> request, but I'd prefer not to if possible.
>
> Stephen
>
> ------------------------------------------------------------------------------
> All of the data generated in your IT infrastructure is seriously valuable.
> Why? It contains a definitive record of application performance, security
> threats, fraudulent activity, and more. Splunk takes this data and makes
> sense of it. IT sense. And common sense.
> http://p.sf.net/sfu/splunk-d2d-c2
> _______________________________________________
> Joda-interest mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/joda-interest
>