Hi Adolfo,

OK! Only more more iteration with me.  In my opinion, what we should
after your next revision (sorry! it's mostly nitpicks, and feel free to
ignore me if I'm saying things which are evidently stupid or just
subjective), we should send your bundle to Somebody Else to review for
correctness.  I'm bound to have missed the forest for the trees
somewhere.

We've seen most of this code before, so I'll only comment on what I
think are the new bits (or any old bits that may need modifying still)

Resolve issue 1599: automatically expire unused caches
------------------------------------------------------
> -       sfuc cache stickItHere
> +       badSource <- isBadSource
> +       let cacheFiltered = filter (\ cacheEntry -> not . badSource $ 
> cacheSource cacheEntry ) cache
> +       sfuc cacheFiltered stickItHere

FIXME: This is one place that point-free style could produce nicer code

> -          sfuc (c:cs) out | not $ writable c =
>
> -                          | otherwise = sfuc cs out

> +          sfuc (c:cs) out
> +            | not (writable c) =
...

> +            | otherwise = sfuc cs out

I won't really grumble too much here about the irrelevant change
since it seems to be partly a consequence of your other work.
Improving code style is a good thing; I just try to encourage it
in separate patches when I can.

> -   then speculateFileOrUrl (fn c) out
> -   else copyFileOrUrl [] (fn c) out Cachable
> + then speculateFileOrUrl (fn c) out `catchNonSignal` (\e -> 
> checkCacheReachability (show e) c)
> + else copyFileOrUrl [] (fn c) out Cachable `catchNonSignal` (\e -> 
> checkCacheReachability (show e) c)

[note, whitespace chopped for review]

FIXME: Please amend out the conflict with Petr's DefaultDarcsRepo
change.

Old news on catching the copy/speculate failure

> +-- | Checks if a given cache entry is reachable or not.
> +-- It receives an error caught during execution and the cache entry.

> +-- For a local cache, if the given source doesn't exist anymore, it is added.

> +-- For HTTP sources if the error is timeout, it is added, if not we check 
> for the
> +-- _darcs/hashed_inventory file, if doesn't exist it means we are pointing 
> to a repository
> +-- which used to exist there, but had been moved.
>
> +-- For SSH if we get an error we try to get the file _darcs/hashed_inventory 
> again, if it fails
> +-- we add the entry to the list of sources which are no reachables.
> +-- The entries which get added to the cache are no longer tried for the rest 
> of the command.

FIXME: The haddock above is a bit ambiguous.  Added where?  You could
probably say blacklisted and/or whitelisted.  Also, making each of these
into bullet points could make it a bit easier to read.

> +checkCacheReachability :: String -> CacheLoc -> IO ()
> +checkCacheReachability e cache


> + | isUrl source = do
> +     reachable <- isReachableSource
> +     unless (reachable source) $ do
> +            let string = case dropWhile (/='(') e of
> +                          (_:xs) -> fst (break (==')') xs)
> +                          _      -> e
> +            let cerror = case reads string ::[(HTTP.ConnectionError,String)] 
> of
> +                           [(ce,_)] -> Just ce
> +                           _        -> Nothing
> +            if isJust cerror
> +             then addBadSource source
> +              else do
> +               checkHashedInventoryReachability cache
> +               addReachableSource source
> +              `catchNonSignal` (\_ -> addBadSource source)

...

> +-- | Checks if the  _darcs/hashed_inventory exist and is reachable
> +checkHashedInventoryReachability :: CacheLoc -> IO ()
> +checkHashedInventoryReachability cache =
> +     withTemp $ \tempout -> do
> +       let f = cacheSource cache ++ "/" ++darcsdir ++ "/" ++ 
> "hashed_inventory"
> +       copyFileOrUrl [] f tempout Cachable


FIXME: I see you've done a bit of refactoring.  FIXME I think you could
make it a bit better.  How about returning IO Bool instead?  The idea is
that you'd move the exception handling into here and the code above
could just test a conditional.

Code clarity is the number one thing I tend to go after (unfortunately
this does not necessarily mean I'm any good at it).  It's why I seem to
be quite insistent on minor details like that; I think they pay off in
the long run especially when working in a team.

> -       ffuc cache
> -    `catchall` debugFail ("Couldn't fetch `"++f++"'\nin subdir "++(hashedDir 
> subdir)++
> +       badSource <- isBadSource
> +       let cacheFiltered = filter (\ cacheEntry -> not . badSource $ 
> cacheSource cacheEntry ) cache
> +       ffuc cacheFiltered
> +    `catchall` debugFail ("Couldn't fetch `"++f++"'\nin subdir "++ hashedDir 
> subdir ++
>                            " from sources:\n\n"++show (Ca cache))

FIXME: sounds like you could do another pointfree refactor here

By the way, notice how much a little higher-order function like filter
lets us do without changing very much code?  Just compose some stuff
together, feed it to filter and you're done.

> -              `catchall` ffuc cs
> +              `catchNonSignal` (\e -> do
> +                                      checkCacheReachability (show e) c
> +                                      badSource <- isBadSource
> +                                      let cacheFiltered = filter (\ 
> cacheEntry -> not . badSource $ cacheSource cacheEntry ) cs
> +                                      ffuc cacheFiltered)


When I complained about catchall and catchNonSignal you pointed out that
  x `catchall` y
is actually just
  x `catchNonSignal` const y
As an aside, this kinda makes me think that maybe catchall isn't
/really/ needed, but I could be wrong.

FIXME: you're using filter (not . badSource . cacheSource) quite a lot.

Maybe what you want is something like this

  filterOutBadSources :: [Cache] -> IO [Cache]
  filterOutBadSources cs = do
    badSource <- isBadSource
    return $ filter (not . badSource . cacheSource) cs

HTTP module
~~~~~~~~~~~
> -waitNextUrl' = do
> -  e <- curl_wait_next_url >>= peekCString
> -  u <- curl_last_url >>= peekCString
> -  return (u, e)
> +waitNextUrl' =
> +  alloca $ \ errorPointer -> do
> +    e <- curl_wait_next_url errorPointer >>= peekCString
> +    ce <- if not (null e)
> +          then do
> +           errorNum <- peek errorPointer
> +           case errorNum of
> +             6  -> return $ Just CouldNotResolveHost
> +             7  -> return $ Just CouldNotConnectToServer
> +             29 -> return $ Just OperationTimeout
> +             _  -> return Nothing
> +          else
> +           return Nothing
> +    u <- curl_last_url >>= peekCString
> +    return (u, e, ce)

FIXME: You should have a comment pointing us to some documentation
on the error codes.

As a chatty aside, I like bracket functions like alloca which
prevent forgetful people like me from introducing random memory
leaks for want of a free and also avoid bugs like me accidentally
freeing something I shouldn't.  I think we have similar idioms in
Darcs.Lock.  Sort of an inane comment on my part :-)

>  foreign import ccall "hscurl.h curl_wait_next_url"
> -  curl_wait_next_url :: IO CString
> +  curl_wait_next_url :: Ptr CInt -> IO CString

Thanks for teaching me a little more FFI

>  main = withAtexit $ withSignalsHandled $
>    flip catch execExceptionHandler $
>    handle (\(AssertionFailed e) -> bug e) $ do
> +  atexit reportBadSources

I think this is much cleaner than modifying Darcs.Global.

> +    reportBadSources = do
> +        sources <- getBadSourcesList
> +        when ( not $ null sources ) $ do

FIXME: Clearer as unless (null sources), I think

> +           hPutStderr $ "\nI could not reach the following " ++
> +             englishNum (length sources)  (Noun "repository") ":"
> +           hPutStderr $ (unlines sources) ++  "If you're not using " ++
> +             englishNum (length sources) It ", you should probably 
> delete\nthe corresponding " ++
> +             englishNum (length sources) (Noun "entry") " from 
> _darcs/prefs/sources."

Not sure if this really belongs in main, maybe in one of the Cache
modules? I'm not the best person to judge.

FIXME: Maybe worth rewriting this using only one hPutStrLn stderr
plus some string concatenation.  Concat may also be helpful here.
Basically, just try to make the code as easy to read as possible.
It's a bit subjective though, but I bet there are things you can
do which are objectively nicer.

> +    hPutStderr = hPutStrLn stderr

FIXME: Ahah, for all my carrying on about refactoring, I actually think
this is one of those nuanced cases where the code duplication of just
saying hPutStrLn stderr is clearer than asking people to keep track of
an extra name, particularly people who have to maintain the code.
Subtle, huh?! You duplicate code and Eric complains.  You refactor code
and then he complains some more! Can't win...

On the other hand, maybe I'd refactor the length sources if I were
you.


>        int error = curl_easy_getinfo(easy, CURLINFO_PRIVATE, (char 
> **)&url_data);
> -      if (error != CURLE_OK)
> +      if (error != CURLE_OK){

The if (foo) bar; style always seems to me like a clarity error
wrt if (foo) { bar; }... Good thing your change necessitates getting
rid of it.

> +        *errorCode = error;
> +      }

-- 
Eric Kow <http://www.nltg.brighton.ac.uk/home/Eric.Kow>
For a faster response, please try +44 (0)1273 64 2905.

Attachment: pgpF5eskzHqma.pgp
Description: PGP signature

_______________________________________________
darcs-users mailing list
[email protected]
http://lists.osuosl.org/mailman/listinfo/darcs-users

Reply via email to