On Thu, May 27, 2010 at 12:15:56 +0000, Petr Ročkai wrote:
> Thu May 27 14:10:19 CEST 2010  Petr Rockai <[email protected]>
>   * Resolve issue1763: use correct filename encoding in conflictors.
OK, we've had two people (Reinier, Eric) look at this and OK it, so I guess it
makes sense for me to push it now with some thoughts about future work.
Resolve issue1763: use correct filename encoding in conflictors.
----------------------------------------------------------------
> hunk ./src/Darcs/Patch/Real.hs 716
>          blueText "conflictor" <+> showNons i <+> blueText "[]" $$ showNon p
>      showPatch (Conflictor i cs p) =
>          blueText "conflictor" <+> showNons i <+> blueText "[" $$
> -        showPatch cs $$
> +        showPrimFL NewFormat cs $$
>          blueText "]" $$
>          showNon p
>      showPatch (InvConflictor i NilFL p) =
>
I'm still concerned that we're not being systematic enough about really
fixing this (e.g. should we worry about rotcifnoc? showNon? etc.)
[The mental image I have is those old cartoons where you have the
character on a boat and a leak forms, so he plugs it with a finger,
and then another leak, and another finger, and another leak...]
I also notice this:
  instance ReadPatch Prim where
    readPatch' _ = readPrim OldFormat

  -- this and other darcs-2 format patches use readPrim NewFormat
  readNons :: (ReadPatch p, ParserM m) => m [Non p C(x)]
  readNons = peekfor "{{" rns (return [])
    where rns = peekfor "}}" (return []) $
                do Just (Sealed ps) <- readPatch' False
                   lexChar ':'
                   Just (Sealed p) <- readPrim NewFormat
                   (Non ps p :) `liftM` rns
and in the read code for Non and RealPatch (I think these are darcs-2 style
patches), readPatch eventually uses readPrim NewFormat. So that makes sense:
the double-encoding comes from reading UTF-8 bytes [this is where Petr's
assertion that "the filepath *is never decoded*" makes sense] as code-points,
and then trying to encode those code-points into UTF-8 bytes.
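
To make the double-encoding concrete, here is a small standalone sketch. It
is not darcs code: encodeChar, encodeUtf8, bytesAsCodePoints and doubleEncode
are names made up for illustration, and the UTF-8 encoder is hand-rolled so
that the example only needs base.

```haskell
import Data.Bits (shiftR, (.&.), (.|.))
import Data.Char (chr, ord)
import Data.Word (Word8)

-- Encode a single code point as UTF-8 bytes (illustrative, not from darcs).
encodeChar :: Char -> [Word8]
encodeChar c
  | n < 0x80    = [fromIntegral n]
  | n < 0x800   = [0xC0 .|. hi 6, cont 0]
  | n < 0x10000 = [0xE0 .|. hi 12, cont 6, cont 0]
  | otherwise   = [0xF0 .|. hi 18, cont 12, cont 6, cont 0]
  where
    n      = ord c
    -- leading-byte payload: the top bits of the code point
    hi s   = fromIntegral (n `shiftR` s)
    -- continuation byte: 10xxxxxx carrying six bits of the code point
    cont s = 0x80 .|. (fromIntegral (n `shiftR` s) .&. 0x3F)

encodeUtf8 :: String -> [Word8]
encodeUtf8 = concatMap encodeChar

-- The mistake described above: treat each raw byte as if it were a code point.
bytesAsCodePoints :: [Word8] -> String
bytesAsCodePoints = map (chr . fromIntegral)

-- Encode, misread the bytes as code points, then encode again.
doubleEncode :: String -> [Word8]
doubleEncode = encodeUtf8 . bytesAsCodePoints . encodeUtf8
```

For a filename containing U+00E9, encodeUtf8 gives the two bytes C3 A9, while
doubleEncode gives the four bytes C3 83 C2 A9, since each of the two original
bytes is re-encoded as a code point in its own right.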
Plan for future work? (Prim FileNameFormat)
-------------------------------------------
How does this plan sound: introduce two new wrapper types OldFormatPrim and
NewFormatPrim whose read/show instances use OldFormat/NewFormat
respectively, thus ensuring that readPatch and showPatch automagically
do the right thing?
(or even one parametrisable type (although I imagine that involves turning
on some extension for instances))
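
A rough sketch of what the two-wrapper version might look like, using toy
stand-ins rather than the real darcs definitions. Everything here is
hypothetical: this Prim has a single constructor, showPrim just tags its
output with the format, and ShowPatchToy stands in for the real show class.

```haskell
-- Toy stand-ins; none of these are the real darcs definitions.
data FileNameFormat = OldFormat | NewFormat

-- A drastically simplified primitive patch: just "add this file".
newtype Prim = AddFile FilePath

-- Stand-in for showPrim: the format argument selects how the name is written.
showPrim :: FileNameFormat -> Prim -> String
showPrim OldFormat (AddFile f) = "addfile[old] " ++ f
showPrim NewFormat (AddFile f) = "addfile[new] " ++ f

-- Toy version of a showPatch-style class.
class ShowPatchToy p where
  showPatchToy :: p -> String

-- The wrappers fix the format in the type, so callers cannot pick the wrong one.
newtype OldFormatPrim = OldFormatPrim Prim
newtype NewFormatPrim = NewFormatPrim Prim

instance ShowPatchToy OldFormatPrim where
  showPatchToy (OldFormatPrim p) = showPrim OldFormat p

instance ShowPatchToy NewFormatPrim where
  showPatchToy (NewFormatPrim p) = showPrim NewFormat p
```

The idea is that code like the Conflictor case would hold NewFormatPrim values
and call the class method, instead of passing NewFormat by hand at each call
site. A read class could presumably be handled the same way.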
Plan for future work? (two kinds of read/show)
----------------------------------------------
Complementary plan: we should distinguish between decoding/encoding
filepaths from the operating system, and decoding/encoding filepaths
to patch files and patch bundles.
Basically the picture looks like this:
OS <--> darcs <---> patch files
The reason why I initially thought that NewFormat was a step backwards
was that I was thinking about the darcs <--> patch files part. IMHO,
what you want is for darcs <--> patch files to always use UTF-8. On the
other hand, the OS <--> darcs part needs some more thought.
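
One way to picture the split, as a sketch only: give each side of the diagram
its own type, so that the two conversions cannot be mixed up. OsPath,
DarcsPath, PatchPath, fromOs, latin1 and toPatchFile are all hypothetical
names, only one direction is shown, and the Latin-1 decoder is just an example
of an OS-side choice, not a claim about what darcs should use.

```haskell
import Data.Bits (shiftR, (.&.), (.|.))
import Data.Char (chr, ord)
import Data.Word (Word8)

-- Hypothetical wrappers: one type per side of the picture.
newtype OsPath    = OsPath [Word8]     -- bytes as received from the OS
newtype DarcsPath = DarcsPath String   -- code points, used inside darcs
newtype PatchPath = PatchPath [Word8]  -- bytes as written to patch files
  deriving (Eq, Show)

-- OS --> darcs: the decoder is a parameter, since this side is undecided.
fromOs :: ([Word8] -> String) -> OsPath -> DarcsPath
fromOs decode (OsPath bs) = DarcsPath (decode bs)

-- Example OS-side decoder: Latin-1, where each byte is one code point.
latin1 :: [Word8] -> String
latin1 = map (chr . fromIntegral)

-- darcs --> patch files: always UTF-8, so there is nothing to configure.
toPatchFile :: DarcsPath -> PatchPath
toPatchFile (DarcsPath s) = PatchPath (concatMap utf8 s)

-- Hand-rolled UTF-8 encoder for one code point, so the sketch only needs base.
utf8 :: Char -> [Word8]
utf8 c
  | n < 0x80    = [fromIntegral n]
  | n < 0x800   = [0xC0 .|. hi 6, cont 0]
  | n < 0x10000 = [0xE0 .|. hi 12, cont 6, cont 0]
  | otherwise   = [0xF0 .|. hi 18, cont 12, cont 6, cont 0]
  where
    n      = ord c
    hi s   = fromIntegral (n `shiftR` s)
    cont s = 0x80 .|. (fromIntegral (n `shiftR` s) .&. 0x3F)
```

With this shape, an OS byte E9 decoded as Latin-1 ends up in the patch file as
the UTF-8 bytes C3 A9, whatever the OS-side encoding was.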
This is a little half-baked right now, but maybe somebody else can run with
the idea?
--
Eric Kow <http://www.nltg.brighton.ac.uk/home/Eric.Kow>
PGP Key ID: 08AC04F9
_______________________________________________
darcs-users mailing list
[email protected]
http://lists.osuosl.org/mailman/listinfo/darcs-users
