Hello community, here is the log from the commit of package ghc-citeproc for openSUSE:Factory checked in at 2021-06-01 10:38:48 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ Comparing /work/SRC/openSUSE:Factory/ghc-citeproc (Old) and /work/SRC/openSUSE:Factory/.ghc-citeproc.new.1898 (New) ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Package is "ghc-citeproc" Tue Jun 1 10:38:48 2021 rev:11 rq:896187 version:0.4 Changes: -------- --- /work/SRC/openSUSE:Factory/ghc-citeproc/ghc-citeproc.changes 2021-03-24 16:11:54.379873160 +0100 +++ /work/SRC/openSUSE:Factory/.ghc-citeproc.new.1898/ghc-citeproc.changes 2021-06-01 10:40:26.309117123 +0200 @@ -1,0 +2,25 @@ +Thu May 13 08:26:54 UTC 2021 - [email protected] + +- Update citeproc to version 0.4. + ## 0.4 + + * We now use Lang from unicode-collation rather than defining our own. + The type constructor has changed, as has the signature of + parseLang. + * Use unicode-collation by default for more accurate sorting. + - text-icu will still be used if the icu flag is set. This may + give better performance, at the cost of depending on a large + C library. + - Change type of SortKeyValue so it doesn't embed Lang. [API change] + Instead, we now store a language-specific collator in the Eval Context. + - Move compSortKeyValues from Types to Eval. + * Add curly open quote to word splitters in normalizeSortKey. + * Improve date sorting: use the format YYYY0000 if no month, day, + and YYYYMM00 if no day when generating sort keys. + * Special treatment of literal "others" as last name in a list (#61). + When we convert bibtex/biblatex bibliographies, the form "and others" + yields a last name with nameLiteral = "others". We detect this and + generate a localized "and others" (et al). + * Make abbreviations case-insensitive (#45). 
+ +------------------------------------------------------------------- Old: ---- citeproc-0.3.0.9.tar.gz New: ---- citeproc-0.4.tar.gz ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ Other differences: ------------------ ++++++ ghc-citeproc.spec ++++++ --- /var/tmp/diff_new_pack.a2Qi1e/_old 2021-06-01 10:40:26.809117975 +0200 +++ /var/tmp/diff_new_pack.a2Qi1e/_new 2021-06-01 10:40:26.813117981 +0200 @@ -19,7 +19,7 @@ %global pkg_name citeproc %bcond_with tests Name: ghc-%{pkg_name} -Version: 0.3.0.9 +Version: 0.4 Release: 0 Summary: Generates citations and bibliography from CSL styles License: BSD-2-Clause @@ -35,12 +35,12 @@ BuildRequires: ghc-file-embed-devel BuildRequires: ghc-filepath-devel BuildRequires: ghc-pandoc-types-devel -BuildRequires: ghc-rfc5051-devel BuildRequires: ghc-rpm-macros BuildRequires: ghc-safe-devel BuildRequires: ghc-scientific-devel BuildRequires: ghc-text-devel BuildRequires: ghc-transformers-devel +BuildRequires: ghc-unicode-collation-devel BuildRequires: ghc-uniplate-devel BuildRequires: ghc-vector-devel BuildRequires: ghc-xml-conduit-devel ++++++ citeproc-0.3.0.9.tar.gz -> citeproc-0.4.tar.gz ++++++ diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/citeproc-0.3.0.9/CHANGELOG.md new/citeproc-0.4/CHANGELOG.md --- old/citeproc-0.3.0.9/CHANGELOG.md 2021-03-13 19:43:47.000000000 +0100 +++ new/citeproc-0.4/CHANGELOG.md 2021-04-19 07:17:04.000000000 +0200 @@ -1,5 +1,26 @@ # citeproc changelog +## 0.4 + + * We now use Lang from unicode-collation rather than defining our own. + The type constructor has changed, as has the signature of + parseLang. + * Use unicode-collation by default for more accurate sorting. + - text-icu will still be used if the icu flag is set. This may + give better performance, at the cost of depending on a large + C library. + - Change type of SortKeyValue so it doesn't embed Lang. 
[API change] + Instead, we now store a language-specific collator in the Eval Context. + - Move compSortKeyValues from Types to Eval. + * Add curly open quote to word splitters in normalizeSortKey. + * Improve date sorting: use the format YYYY0000 if no month, day, + and YYYYMM00 if no day when generating sort keys. + * Special treatment of literal "others" as last name in a list (#61). + When we convert bibtex/biblatex bibliographies, the form "and others" + yields a last name with nameLiteral = "others". We detect this and + generate a localized "and others" (et al). + * Make abbreviations case-insensitive (#45). + ## 0.3.0.9 * Implement `et-al-subsequent-min` and `et-al-subsequent-use-first` (#60). diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/citeproc-0.3.0.9/app/Main.hs new/citeproc-0.4/app/Main.hs --- old/citeproc-0.3.0.9/app/Main.hs 2020-12-28 19:27:47.000000000 +0100 +++ new/citeproc-0.4/app/Main.hs 2021-04-18 00:51:34.000000000 +0200 @@ -4,7 +4,7 @@ module Main where import Citeproc import Citeproc.CslJson -import Control.Monad (when, unless) +import Control.Monad (when, unless, foldM) import Control.Applicative ((<|>)) import Data.Bifunctor (second) import Data.Maybe (fromMaybe) @@ -27,7 +27,7 @@ unless (null errs) $ do mapM_ err errs exitWith $ ExitFailure 1 - let opt = foldr ($) defaultOpt opts + opt <- foldM (flip ($)) defaultOpt opts when (optHelp opt) $ do putStr $ usageInfo "citeproc [OPTIONS] [FILE]" options exitSuccess @@ -128,29 +128,32 @@ , optVersion = False } -options :: [OptDescr (Opt -> Opt)] +options :: [OptDescr (Opt -> IO Opt)] options = [ Option ['s'] ["style"] - (ReqArg (\fp opt -> opt{ optStyle = Just fp }) "FILE") + (ReqArg (\fp opt -> return opt{ optStyle = Just fp }) "FILE") "CSL style file" , Option ['r'] ["references"] - (ReqArg (\fp opt -> opt{ optReferences = Just fp }) "FILE") + (ReqArg (\fp opt -> return opt{ optReferences = Just fp }) "FILE") "CSL JSON bibliography" , Option 
['a'] ["abbreviations"] - (ReqArg (\fp opt -> opt{ optAbbreviations = Just fp }) "FILE") + (ReqArg (\fp opt -> return opt{ optAbbreviations = Just fp }) "FILE") "CSL abbreviations table" , Option ['l'] ["lang"] - (ReqArg (\lang opt -> opt{ optLang = Just $ parseLang $ T.pack lang }) - "IETF language") + (ReqArg (\lang opt -> + case parseLang (T.pack lang) of + Right l -> return opt{ optLang = Just l } + Left msg -> err $ "Could not parse language tag:\n" ++ msg) + "BCP 47 language tag") "Override locale" , Option ['f'] ["format"] - (ReqArg (\format opt -> opt{ optFormat = Just format }) "html|json") + (ReqArg (\format opt -> return opt{ optFormat = Just format }) "html|json") "Controls formatting of entries in result" , Option ['h'] ["help"] - (NoArg (\opt -> opt{ optHelp = True })) + (NoArg (\opt -> return opt{ optHelp = True })) "Print usage information" , Option ['V'] ["version"] - (NoArg (\opt -> opt{ optVersion = True })) + (NoArg (\opt -> return opt{ optVersion = True })) "Print version number" ] diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/citeproc-0.3.0.9/cabal.project new/citeproc-0.4/cabal.project --- old/citeproc-0.3.0.9/cabal.project 2020-11-09 19:07:54.000000000 +0100 +++ new/citeproc-0.4/cabal.project 2021-04-18 00:24:57.000000000 +0200 @@ -2,3 +2,4 @@ package citeproc flags: -icu +executable + diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/citeproc-0.3.0.9/citeproc.cabal new/citeproc-0.4/citeproc.cabal --- old/citeproc-0.3.0.9/citeproc.cabal 2021-03-13 19:43:09.000000000 +0100 +++ new/citeproc-0.4/citeproc.cabal 2021-04-26 06:44:08.000000000 +0200 @@ -1,6 +1,6 @@ cabal-version: 2.2 name: citeproc -version: 0.3.0.9 +version: 0.4 synopsis: Generates citations and bibliography from CSL styles. 
description: citeproc parses CSL style files and uses them to generate a list of formatted citations and bibliography @@ -79,9 +79,9 @@ , pandoc-types >= 1.22 && < 1.23 -- , pretty-show if flag(icu) - build-depends: text-icu + build-depends: text-icu else - build-depends: rfc5051 >= 0.2 && < 0.3 + build-depends: unicode-collation >= 0.1.3 && < 0.2 ghc-options: -Wall -Wincomplete-record-updates diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/citeproc-0.3.0.9/man/citeproc.1 new/citeproc-0.4/man/citeproc.1 --- old/citeproc-0.3.0.9/man/citeproc.1 2021-03-13 19:44:15.000000000 +0100 +++ new/citeproc-0.4/man/citeproc.1 2021-04-19 07:20:15.000000000 +0200 @@ -1,6 +1,6 @@ -.\" Automatically generated by Pandoc 2.12 +.\" Automatically generated by Pandoc 2.13 .\" -.TH "citeproc" "1" "" "citeproc 0.3.0.9" "" +.TH "citeproc" "1" "" "citeproc 0.4" "" .hy .SH NAME .PP @@ -27,8 +27,25 @@ Specify a CSL abbreviations file. .TP \f[B]\f[CB]l\f[B]\f[R] \f[I]LANG\f[R], \f[B]\f[CB]--lang=\f[B]\f[R]\f[I]LANG\f[R] -Specify a locale to override the style\[cq]s default (IETF language -code). +Specify a locale to override the style\[cq]s default. +A BCP 47 language tag is expected: for example, \f[C]en\f[R], +\f[C]de\f[R], \f[C]en-US\f[R], \f[C]fr-CA\f[R], \f[C]ug-Cyrl\f[R]. +The unicode extension syntax (after \f[C]-u-\f[R]) may be used to +specify options for collation. +Here are some examples: +.RS +.IP \[bu] 2 +\f[C]zh-u-co-pinyin\f[R] \[en] Chinese with the Pinyin collation. +.IP \[bu] 2 +\f[C]es-u-co-trad\f[R] \[en] Spanish with the traditional collation +(with \f[C]Ch\f[R] sorting after \f[C]C\f[R]). +.IP \[bu] 2 +\f[C]fr-u-kb\f[R] \[en] French with \[lq]backwards\[rq] accent sorting +(with \f[C]cot\['e]\f[R] sorting after \f[C]c\[^o]te\f[R]). +.IP \[bu] 2 +\f[C]en-US-u-kf-upper\f[R] \[en] English with uppercase letters sorting +before lower (default is lower before upper). 
+.RE .TP \f[B]\f[CB]f\f[B]\f[R] \f[I]html|json\f[R], \f[B]\f[CB]--format=\f[B]\f[R]\f[I]html|json\f[R] Specify the format to be used for the entries. diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/citeproc-0.3.0.9/man/citeproc.1.md new/citeproc-0.4/man/citeproc.1.md --- old/citeproc-0.3.0.9/man/citeproc.1.md 2020-10-04 17:43:11.000000000 +0200 +++ new/citeproc-0.4/man/citeproc.1.md 2021-04-19 07:20:06.000000000 +0200 @@ -31,8 +31,19 @@ : Specify a CSL abbreviations file. `l` *LANG*, `--lang=`*LANG* -: Specify a locale to override the style's default (IETF - language code). +: Specify a locale to override the style's default. + A BCP 47 language tag is expected: for example, `en`, + `de`, `en-US`, `fr-CA`, `ug-Cyrl`. The unicode extension + syntax (after `-u-`) may be used to specify options for + collation. Here are some examples: + + - `zh-u-co-pinyin` -- Chinese with the Pinyin collation. + - `es-u-co-trad` -- Spanish with the traditional collation + (with `Ch` sorting after `C`). + - `fr-u-kb` -- French with "backwards" accent sorting + (with `coté` sorting after `côte`). + - `en-US-u-kf-upper` -- English with uppercase letters sorting + before lower (default is lower before upper). `f` *html|json*, `--format=`*html|json* : Specify the format to be used for the entries. `html` (the diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/citeproc-0.3.0.9/src/Citeproc/Element.hs new/citeproc-0.4/src/Citeproc/Element.hs --- old/citeproc-0.3.0.9/src/Citeproc/Element.hs 2020-12-09 06:12:12.000000000 +0100 +++ new/citeproc-0.4/src/Citeproc/Element.hs 2021-04-18 00:24:57.000000000 +0200 @@ -124,7 +124,9 @@ pLocale :: X.Element -> ElementParser Locale pLocale node = do let attr = getAttributes node - let lang = parseLang <$> lookupAttribute "lang" attr + lang <- case lookupAttribute "lang" attr of + Nothing -> return Nothing + Just l -> either parseFailure (return . 
Just) $ parseLang l let styleOpts = mconcat . map getAttributes $ getChildren "style-options" node let addDateElt e m = diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/citeproc-0.3.0.9/src/Citeproc/Eval.hs new/citeproc-0.4/src/Citeproc/Eval.hs --- old/citeproc-0.3.0.9/src/Citeproc/Eval.hs 2021-03-13 19:39:10.000000000 +0100 +++ new/citeproc-0.4/src/Citeproc/Eval.hs 2021-04-18 00:24:57.000000000 +0200 @@ -18,17 +18,22 @@ import qualified Data.Map as M import qualified Data.Set as Set import Data.Coerce (coerce) -import Data.List (find, intersperse, sortOn, groupBy, foldl', transpose, +import Data.List (find, intersperse, sortBy, sortOn, groupBy, foldl', transpose, sort, (\\)) import Data.Text (Text) import qualified Data.Text as T -import Data.Char (isSpace, isPunctuation, isDigit, isUpper, isLower, isLetter, +import Data.Char (isSpace, isDigit, isUpper, isLower, isLetter, ord, chr) import Text.Printf (printf) import Control.Applicative import Data.Generics.Uniplate.Operations (universe, transform) --- import Debug.Trace (traceShowId) +-- import Debug.Trace (trace) + +-- traceShowIdLabeled :: Show a => String -> a -> a +-- traceShowIdLabeled label x = +-- trace (label ++ ": " ++ show x) x + -- import Text.Show.Pretty (ppShow) -- ppTrace :: Show a => a -> a -- ppTrace x = trace (ppShow x) x @@ -36,6 +41,7 @@ data Context a = Context { contextLocale :: Locale + , contextCollate :: [SortKeyValue] -> [SortKeyValue] -> Ordering , contextAbbreviations :: Maybe Abbreviations , contextStyleOptions :: StyleOptions , contextLocator :: Maybe Text @@ -46,7 +52,6 @@ , contextInBibliography :: Bool , contextSubstituteNamesForm :: Maybe NamesFormat } - deriving (Show) -- used internally for group elements, which -- are skipped if (a) the group calls a variable @@ -95,6 +100,9 @@ ((citationOs, bibliographyOs), warnings) = evalRWS go Context { contextLocale = mergeLocales mblang style + , contextCollate = \xs ys -> + compSortKeyValues 
(Unicode.comp mblang) + xs ys , contextAbbreviations = styleAbbreviations style , contextStyleOptions = styleOptions style , contextLocator = Nothing @@ -143,6 +151,8 @@ (map referenceId refs) assignCitationNumbers sortedCiteIds -- sorting of bibliography, insertion of citation-number + collate <- asks contextCollate + (bibCitations, bibSortKeyMap) <- case styleBibliography style of Nothing -> return ([], mempty) @@ -156,7 +166,10 @@ let sortedIds = if null (layoutSortKeys biblayout) then sortedCiteIds - else sortOn (`M.lookup` bibSortKeyMap) + else sortBy + (\x y -> collate + (fromMaybe [] $ M.lookup x bibSortKeyMap) + (fromMaybe [] $ M.lookup y bibSortKeyMap)) (map referenceId refs) assignCitationNumbers $ case layoutSortKeys biblayout of @@ -193,10 +206,13 @@ let sortCitationItems citation' = citation'{ citationItems = concatMap - (sortOn - (\citeItem -> - M.lookup (citationItemId citeItem) - sortKeyMap)) + (sortBy + (\item1 item2 -> + collate + (fromMaybe [] $ M.lookup + (citationItemId item1) sortKeyMap) + (fromMaybe [] $ M.lookup + (citationItemId item2) sortKeyMap))) $ groupBy canGroup $ citationItems citation' } let citCitations = map sortCitationItems citations @@ -598,9 +614,13 @@ -> Eval a () addYearSuffixes bibSortKeyMap' as = do let allitems = concat as + collate <- asks contextCollate let companions a = - sortOn - (\it -> M.lookup (ddItem it) bibSortKeyMap') + sortBy + (\item1 item2 -> + collate + (fromMaybe [] $ M.lookup (ddItem item1) bibSortKeyMap') + (fromMaybe [] $ M.lookup (ddItem item2) bibSortKeyMap')) (concat [ x | x <- as, a `elem` x ]) let groups = Set.map companions $ Set.fromList allitems let addYearSuffix item suff = @@ -932,22 +952,20 @@ -> SortKey a -> Eval a SortKeyValue evalSortKey citeId (SortKeyMacro sortdir elts) = do - mblang <- asks (localeLanguage . 
contextLocale) refmap <- gets stateRefMap case lookupReference citeId refmap of - Nothing -> return $ SortKeyValue sortdir mblang Nothing + Nothing -> return $ SortKeyValue sortdir Nothing Just ref -> do k <- normalizeSortKey . toText . renderOutput defaultCiteprocOptions . grouped <$> withRWS newContext (mconcat <$> mapM eElement elts) - return $ SortKeyValue sortdir mblang (Just k) + return $ SortKeyValue sortdir (Just k) where newContext oldContext s = (oldContext, s{ stateReference = ref }) evalSortKey citeId (SortKeyVariable sortdir var) = do - mblang <- asks (localeLanguage . contextLocale) refmap <- gets stateRefMap - SortKeyValue sortdir mblang <$> + SortKeyValue sortdir <$> case lookupReference citeId refmap >>= lookupVariable var of Nothing -> return Nothing Just (TextVal t) -> return $ Just $ normalizeSortKey t @@ -959,29 +977,67 @@ Just (DateVal d) -> return $ Just [T.toLower $ dateToText d] normalizeSortKey :: Text -> [Text] -normalizeSortKey = - filter (not . T.null) . - T.words . - T.map (\c -> if isPunctuation c || - c == '??' || c == '??' -- ayn/hamza in transliterated arabic - then ' ' - else c) . - T.filter (/= '-') +normalizeSortKey = filter (not . T.null) . T.split isWordSep + where + isWordSep c = isSpace c || c == '\'' || c == '???' || c == ',' || + c == '??' || c == '??' -- ayn/hamza in transliterated arabic --- Note! This prints negative (BC) dates as -(999,999,999 + y) --- so they sort properly. Do not use out of context of sort keys. 
+-- absence should sort AFTER all values +-- see sort_StatusFieldAscending.txt, sort_StatusFieldDescending.txt +compSortKeyValue :: (Text -> Text -> Ordering) + -> SortKeyValue + -> SortKeyValue + -> Ordering +compSortKeyValue collate sk1 sk2 = + case (sk1, sk2) of + (SortKeyValue _ Nothing, SortKeyValue _ Nothing) -> EQ + (SortKeyValue _ Nothing, SortKeyValue _ (Just _)) -> GT + (SortKeyValue _ (Just _), SortKeyValue _ Nothing) -> LT + (SortKeyValue Ascending (Just t1), SortKeyValue Ascending (Just t2)) -> + collateKey t1 t2 + (SortKeyValue Descending (Just t1), SortKeyValue Descending (Just t2))-> + collateKey t2 t1 + _ -> EQ + where + collateKey :: [Text] -> [Text] -> Ordering + collateKey [] [] = EQ + collateKey [] (_:_) = LT + collateKey (_:_) [] = GT + collateKey (x:xs) (y:ys) = + case collate x y of + EQ -> collateKey xs ys + GT -> GT + LT -> LT + +compSortKeyValues :: (Text -> Text -> Ordering) + -> [SortKeyValue] + -> [SortKeyValue] + -> Ordering +compSortKeyValues _ [] [] = EQ +compSortKeyValues _ [] (_:_) = LT +compSortKeyValues _ (_:_) [] = GT +compSortKeyValues collate (x:xs) (y:ys) = + case compSortKeyValue collate x y of + EQ -> compSortKeyValues collate xs ys + GT -> GT + LT -> LT + +-- Note! This prints negative (BC) dates as N(999,999,999 + y) +-- and positive (AD) dates as Py so they sort properly. (Note that +-- our unicode sorting ignores punctuation, so we use a letter +-- rather than -.) Do not use out of context of sort keys. dateToText :: Date -> Text dateToText = mconcat . map (T.pack . go . coerce) . 
dateParts where go :: [Int] -> String go [] = "" - go [y] = toYear y - go [y,m] = toYear y <> printf "%02d" m + go [y] = toYear y <> "0000" + go [y,m] = toYear y <> printf "%02d" m <> "00" go (y:m:d:_) = toYear y <> printf "%02d" m <> printf "%02d" d toYear :: Int -> String toYear y - | y < 0 = printf "-%09d" (999999999 + y) - | otherwise = printf "0%09d" y + | y < 0 = printf "N%09d" (999999999 + y) + | otherwise = printf "P%09d" y evalLayout :: CiteprocOutput a @@ -1088,9 +1144,9 @@ st{ stateReference = ref , stateUsedYearSuffix = False })) $ do xs <- mconcat <$> mapM eElement (layoutElements layout) - let mblang = parseLang <$> - (lookupVariable "language" ref - >>= valToText) + let mblang = lookupVariable "language" ref + >>= valToText + >>= either (const Nothing) Just . parseLang return $ case mblang of Nothing -> xs @@ -1240,8 +1296,10 @@ lang <- asks (localeLanguage . contextLocale) ref <- gets stateReference let reflang = case M.lookup "language" (referenceVariables ref) of - Just (TextVal t) -> Just $ parseLang t - Just (FancyVal x) -> Just $ parseLang $ toText x + Just (TextVal t) -> + either (const Nothing) Just $ parseLang t + Just (FancyVal x) -> + either (const Nothing) Just $ parseLang $ toText x _ -> Nothing let mainLangIsEn Nothing = False mainLangIsEn (Just l) = langLanguage l == "en" @@ -1994,10 +2052,17 @@ | otherwise -> Formatted mempty{ formatPrefix = Just beforeEtAl } . 
(:[]) <$> lookupTerm' emptyTerm{ termName = "et-al" } + let finalNameIsOthers = (lastMay names >>= nameLiteral) == Just "others" + -- bibtex conversions often have this, and we want to render it "et al" let addNameAndDelim name idx | etAlThreshold == Just 0 = NullOutput | idx == 1 = name | idx == numnames + , finalNameIsOthers = + if inSortKey + then NullOutput + else etAl + | idx == numnames , etAlUseLast , maybe False (idx - 1 >=) etAlThreshold = name diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/citeproc-0.3.0.9/src/Citeproc/Locale.hs new/citeproc-0.4/src/Citeproc/Locale.hs --- old/citeproc-0.3.0.9/src/Citeproc/Locale.hs 2020-09-28 01:08:48.000000000 +0200 +++ new/citeproc-0.4/src/Citeproc/Locale.hs 2021-04-18 00:24:57.000000000 +0200 @@ -28,67 +28,70 @@ Left e -> Left $ CiteprocXMLError (T.pack (show e)) Right n -> runElementParser $ pLocale $ X.documentRoot n -primaryDialectMap :: M.Map Text Text +primaryDialectMap :: M.Map Text (Maybe Text) primaryDialectMap = M.fromList - [ ("af", "af-ZA"), - ("ar", "ar"), - ("bg", "bg-BG"), - ("ca", "ca-AD"), - ("cs", "cs-CZ"), - ("cy", "cy-GB"), - ("da", "da-DK"), - ("de", "de-DE"), - ("el", "el-GR"), - ("en", "en-US"), - ("es", "es-ES"), - ("et", "et-EE"), - ("eu", "eu"), - ("fa", "fa-IR"), - ("fi", "fi-FI"), - ("fr", "fr-FR"), - ("he", "he-IL"), - ("hr", "hr-HR"), - ("hu", "hu-HU"), - ("id", "id-ID"), - ("is", "is-IS"), - ("it", "it-IT"), - ("ja", "ja-JP"), - ("km", "km-KH"), - ("ko", "ko-KR"), - ("la", "la"), - ("lt", "lt-LT"), - ("lv", "lv-LV"), - ("mn", "mn-MN"), - ("nb", "nb-NO"), - ("nl", "nl-NL"), - ("nn", "nn-NO"), - ("pl", "pl-PL"), - ("pt", "pt-PT"), - ("ro", "ro-RO"), - ("ru", "ru-RU"), - ("sk", "sk-SK"), - ("sl", "sl-SI"), - ("sr", "sr-RS"), - ("sv", "sv-SE"), - ("th", "th-TH"), - ("tr", "tr-TR"), - ("uk", "uk-UA"), - ("vi", "vi-VN"), - ("zh", "zh-CN") + [ ("af", Just "ZA"), + ("ar", Nothing), + ("bg", Just "BG"), + ("ca", Just "AD"), + ("cs", Just "CZ"), + ("cy", 
Just "GB"), + ("da", Just "DK"), + ("de", Just "DE"), + ("el", Just "GR"), + ("en", Just "US"), + ("es", Just "ES"), + ("et", Just "EE"), + ("eu", Nothing), + ("fa", Just "IR"), + ("fi", Just "FI"), + ("fr", Just "FR"), + ("he", Just "IL"), + ("hr", Just "HR"), + ("hu", Just "HU"), + ("id", Just "ID"), + ("is", Just "IS"), + ("it", Just "IT"), + ("ja", Just "JP"), + ("km", Just "KH"), + ("ko", Just "KR"), + ("la", Nothing), + ("lt", Just "LT"), + ("lv", Just "LV"), + ("mn", Just "MN"), + ("nb", Just "NO"), + ("nl", Just "NL"), + ("nn", Just "NO"), + ("pl", Just "PL"), + ("pt", Just "PT"), + ("ro", Just "RO"), + ("ru", Just "RU"), + ("sk", Just "SK"), + ("sl", Just "SI"), + ("sr", Just "RS"), + ("sv", Just "SE"), + ("th", Just "TH"), + ("tr", Just "TR"), + ("uk", Just "UA"), + ("vi", Just "VN"), + ("zh", Just "CN") ] -- | Retrieves the "primary dialect" corresponding to a langage, -- e.g. "lt-LT" for "lt". getPrimaryDialect :: Lang -> Maybe Lang -getPrimaryDialect l = - parseLang <$> M.lookup (langLanguage l) primaryDialectMap +getPrimaryDialect lang = + case M.lookup (langLanguage lang) primaryDialectMap of + Nothing -> Nothing + Just mbregion -> Just $ lang{ langRegion = mbregion } -locales :: M.Map Lang (Either CiteprocError Locale) + +locales :: M.Map Text (Either CiteprocError Locale) locales = foldr go mempty localeFiles where go (fp, bs) m | takeExtension fp == ".xml" - = let lang = parseLang $ T.pack $ dropExtension fp + = let lang = T.pack $ dropExtension fp in M.insert lang (parseLocale $ decodeUtf8 bs) m | otherwise = m @@ -96,9 +99,10 @@ -- Implements the locale fallback algorithm described in the CSL 1.0.1 spec. 
getLocale :: Lang -> Either CiteprocError Locale getLocale lang = - case M.lookup lang locales - <|> - (getPrimaryDialect lang >>= \lang' -> M.lookup lang' locales) of - Just loc -> loc - Nothing -> Left $ CiteprocLocaleNotFound $ renderLang lang + let toCode l = langLanguage l <> maybe "" ("-"<>) (langRegion l) + in case M.lookup (toCode lang) locales + <|> (getPrimaryDialect lang >>= + (\l -> M.lookup (toCode l) locales)) of + Just loc -> loc + Nothing -> Left $ CiteprocLocaleNotFound $ renderLang lang diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/citeproc-0.3.0.9/src/Citeproc/Style.hs new/citeproc-0.4/src/Citeproc/Style.hs --- old/citeproc-0.3.0.9/src/Citeproc/Style.hs 2020-12-09 06:12:12.000000000 +0100 +++ new/citeproc-0.4/src/Citeproc/Style.hs 2021-04-18 00:24:57.000000000 +0200 @@ -28,10 +28,10 @@ mergeLocales mblang style = mconcat stylelocales <> deflocale -- left-biased union where - getUSLocale = case getLocale (Lang "en" (Just"US")) of + getUSLocale = case getLocale (Lang "en" Nothing (Just"US") [] [] []) of Right l -> l Left _ -> mempty - lang = fromMaybe (Lang "en" (Just "US")) $ + lang = fromMaybe (Lang "en" Nothing (Just"US") [] [] []) $ mblang <|> styleDefaultLocale (styleOptions style) deflocale = case getLocale lang of Right l -> l @@ -46,7 +46,7 @@ , localeLanguage l == primlang] ++ -- then match to the two letter language [l | l <- styleLocales style - , (langVariant <$> localeLanguage l) == Just Nothing + , (langRegion <$> localeLanguage l) == Just Nothing , (langLanguage <$> localeLanguage l) == Just (langLanguage lang)] ++ -- then locale with no lang @@ -71,7 +71,10 @@ Left e -> return $ Left $ CiteprocXMLError (T.pack (show e)) Right n -> do let attr = getAttributes $ X.documentRoot n - let defaultLocale = parseLang <$> lookupAttribute "default-locale" attr + let defaultLocale = + case lookupAttribute "default-locale" attr of + Nothing -> Nothing + Just l -> either (const Nothing) Just $ 
parseLang l let links = concatMap (getChildren "link") $ getChildren "info" (X.documentRoot n) case [getAttributes l diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/citeproc-0.3.0.9/src/Citeproc/Types.hs new/citeproc-0.4/src/Citeproc/Types.hs --- old/citeproc-0.3.0.9/src/Citeproc/Types.hs 2021-02-23 00:10:56.000000000 +0100 +++ new/citeproc-0.4/src/Citeproc/Types.hs 2021-04-18 00:57:56.000000000 +0200 @@ -127,9 +127,13 @@ import Safe (readMay) import Data.String (IsString) import Citeproc.Unicode (Lang(..), parseLang, renderLang) -import qualified Citeproc.Unicode as Unicode -- import Debug.Trace +-- +-- traceShowIdLabeled :: Show a => String -> a -> a +-- traceShowIdLabeled label x = +-- trace (label ++ ": " ++ show x) x + -- import Text.Show.Pretty (ppShow) -- -- ppTrace :: Show a => a -> a @@ -576,39 +580,9 @@ deriving (Show, Eq) data SortKeyValue = - SortKeyValue SortDirection (Maybe Lang) (Maybe [Text]) + SortKeyValue SortDirection (Maybe [Text]) deriving (Show, Eq) --- absence should sort AFTER all values --- see sort_StatusFieldAscending.txt, sort_StatusFieldDescending.txt -instance Ord SortKeyValue where - SortKeyValue Ascending _ _ <= SortKeyValue Ascending _ Nothing - = True - SortKeyValue Ascending _ Nothing <= SortKeyValue Ascending _ (Just _) - = False - SortKeyValue Ascending mblang (Just t1) <= - SortKeyValue Ascending _ (Just t2) = - keyLEQ mblang t1 t2 - SortKeyValue Descending _ _ <= SortKeyValue Descending _ Nothing - = True - SortKeyValue Descending _ Nothing <= SortKeyValue Descending _ (Just _) - = False - SortKeyValue Descending mblang (Just t1) <= - SortKeyValue Descending _ (Just t2) = - keyLEQ mblang t2 t1 - SortKeyValue{} <= SortKeyValue{} = False - --- We need special comparison operators to ensure that --- ?? sorts before b, for example. 
-keyLEQ :: Maybe Lang -> [Text] -> [Text] -> Bool -keyLEQ _ _ [] = False -keyLEQ _ [] _ = True -keyLEQ mblang (x:xs) (y:ys) = - case Unicode.comp mblang x y of - EQ -> keyLEQ mblang xs ys - GT -> False - LT -> True - data Layout a = Layout { layoutOptions :: LayoutOptions @@ -1604,8 +1578,11 @@ -- Abbreviations are substituted in the output when the variable -- and its content are matched by something in the abbreviations map. newtype Abbreviations = - Abbreviations (M.Map Variable (M.Map Text Text)) + Abbreviations (M.Map Variable (M.Map Variable Text)) deriving (Show, Eq, Ord) +-- NOTE: We use 'Variable' in the second map for the contents of the +-- variable, because we want it to be treated case-insensitively, +-- and we need a wrapper around 'CI' that has To/FromJSON instances. instance FromJSON Abbreviations where parseJSON = withObject "Abbreviations" $ \v -> @@ -1624,10 +1601,12 @@ then "number" else var) abbrevmap case val of - TextVal t -> maybe mzero (return . TextVal) $ M.lookup t abbrvs - FancyVal x -> maybe mzero (return . TextVal) $ M.lookup (toText x) abbrvs + TextVal t -> maybe mzero (return . TextVal) + $ M.lookup (toVariable t) abbrvs + FancyVal x -> maybe mzero (return . TextVal) + $ M.lookup (toVariable (toText x)) abbrvs NumVal n -> maybe mzero (return . TextVal) - $ M.lookup (T.pack (show n)) abbrvs + $ M.lookup (toVariable (T.pack (show n))) abbrvs _ -> mzero -- | Result of citation processing. @@ -1671,7 +1650,7 @@ , ("references", toJSON $ inputsReferences inp) , ("style", toJSON $ inputsStyle inp) , ("abbreviations", toJSON $ inputsAbbreviations inp) - , ("lang", toJSON $ inputsLang inp) + , ("lang", toJSON $ renderLang <$> inputsLang inp) ] instance (FromJSON a, Eq a) => FromJSON (Inputs a) where @@ -1680,5 +1659,11 @@ <*> v .:? "references" <*> v .:? "style" <*> v .:? "abbreviations" - <*> v .:? "lang" + <*> (do mbl <- v .:? 
"lang" + case mbl of + Nothing -> return Nothing + Just l -> + case parseLang l of + Left _ -> return Nothing + Right lang -> return $ Just lang) diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/citeproc-0.3.0.9/src/Citeproc/Unicode.hs new/citeproc-0.4/src/Citeproc/Unicode.hs --- old/citeproc-0.3.0.9/src/Citeproc/Unicode.hs 2020-12-09 06:12:12.000000000 +0100 +++ new/citeproc-0.4/src/Citeproc/Unicode.hs 2021-04-18 00:24:57.000000000 +0200 @@ -4,42 +4,21 @@ ( Lang(..), parseLang, renderLang, + lookupLang, toUpper, toLower, comp ) where +import Text.Collate.Lang (Lang(..), parseLang, renderLang, lookupLang) #ifdef MIN_VERSION_text_icu import qualified Data.Text.ICU as ICU #else -import qualified Data.RFC5051 as RFC5051 +import qualified Text.Collate as U #endif import Data.Text (Text) import qualified Data.Text as T -import Data.Aeson (FromJSON (..), ToJSON (..)) - --- | A parsed IETF language tag, with language and optional variant. --- For example, @Lang "en" (Just "US")@ corresponds to @en-US@. -data Lang = Lang{ langLanguage :: Text - , langVariant :: Maybe Text } - deriving (Show, Eq, Ord) - -instance ToJSON Lang where - toJSON = toJSON . renderLang - -instance FromJSON Lang where - parseJSON = fmap parseLang . parseJSON - --- | Render a 'Lang' an an IETF language tag. -renderLang :: Lang -> Text -renderLang (Lang l Nothing) = l -renderLang (Lang l (Just v)) = l <> "-" <> v - --- | Parse an IETF language tag. -parseLang :: Text -> Lang -parseLang t = Lang l (snd <$> T.uncons v) - where - (l,v) = T.break (\c -> c == '-' || c == '_') t +import Data.Maybe (fromMaybe) #ifdef MIN_VERSION_text_icu toICULocale :: Maybe Lang -> ICU.LocaleName @@ -53,12 +32,12 @@ ICU.toUpper (toICULocale mblang) #else toUpper mblang = T.toUpper . - case mblang of - Just (Lang "tr" _) -> T.map (\c -> case c of - 'i' -> '??' - '??' -> 'I' - _ -> c) - _ -> id + case langLanguage <$> mblang of + Just "tr" -> T.map (\c -> case c of + 'i' -> '??' 
+ 'ı' -> 'I' + _ -> c) + _ -> id #endif toLower :: Maybe Lang -> Text -> Text @@ -67,18 +46,24 @@ ICU.toLower (toICULocale mblang) #else toLower mblang = T.toLower . - case mblang of - Just (Lang "tr" _) -> T.map (\c -> case c of - 'İ' -> 'i' - 'I' -> 'ı' - _ -> c) - _ -> id + case langLanguage <$> mblang of + Just "tr" -> T.map (\c -> case c of + 'İ' -> 'i' + 'I' -> 'ı' + _ -> c) + _ -> id #endif comp :: Maybe Lang -> Text -> Text -> Ordering #ifdef MIN_VERSION_text_icu comp mblang = ICU.collate (ICU.collator (toICULocale mblang)) #else -comp _mblang = RFC5051.compareUnicode +comp mblang = + let lang = fromMaybe (Lang "" Nothing Nothing [] [] []) mblang + coll = case lookup "u" (langExtensions lang) >>= lookup "ka" of + -- default to Shifted variable weighting, unless a variable + -- weighting is explicitly specified with the ka keyword: + Nothing -> U.setVariableWeighting U.Shifted $ U.collatorFor lang + Just _ -> U.collatorFor lang + in U.collate coll #endif - diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/citeproc-0.3.0.9/stack.yaml new/citeproc-0.4/stack.yaml --- old/citeproc-0.3.0.9/stack.yaml 2021-03-13 20:22:21.000000000 +0100 +++ new/citeproc-0.4/stack.yaml 2021-04-26 06:41:23.000000000 +0200 @@ -4,7 +4,6 @@ icu: false resolver: lts-17.5 extra-deps: -- rfc5051-0.2 -- pandoc-types-1.22 +- unicode-collation-0.1.3 ghc-options: "$locals": -fhide-source-paths diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/citeproc-0.3.0.9/test/extra/abbreviations_Basic.txt new/citeproc-0.4/test/extra/abbreviations_Basic.txt --- old/citeproc-0.3.0.9/test/extra/abbreviations_Basic.txt 2020-09-12 21:38:34.000000000 +0200 +++ new/citeproc-0.4/test/extra/abbreviations_Basic.txt 2021-04-18 01:01:22.000000000 +0200 @@ -40,7 +40,7 @@ >>===== ABBREVIATIONS =====>> { "default": { "title": { - "Journal of Classical Studies": "JoClSt" + "Journal of classical studies": "JoClSt" } } } diff -urN 
'--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/citeproc-0.3.0.9/test/extra/issue61.txt new/citeproc-0.4/test/extra/issue61.txt --- old/citeproc-0.3.0.9/test/extra/issue61.txt 1970-01-01 01:00:00.000000000 +0100 +++ new/citeproc-0.4/test/extra/issue61.txt 2021-04-18 00:02:20.000000000 +0200 @@ -0,0 +1,66 @@ +Special treatment for literal "others" when it occurs as +the last name in the list, using localized et al. + +>>===== MODE =====>> +bibliography +<<===== MODE =====<< + + +>>===== RESULT =====>> +<div class="csl-bib-body"> + <div class="csl-entry">John Doe et al.</div> +</div> +<<===== RESULT =====<< + + +>>===== CSL =====>> +<style + xmlns="http://purl.org/net/xbiblio/csl" + class="note" + version="1.0"> + <info> + <id /> + <title /> + <updated>2009-08-10T04:49:00+09:00</updated> + </info> + <citation> + <layout> + <text value="Oops"/> + </layout> + </citation> + <bibliography> + <layout> + <group delimiter=", "> + <names variable="author"> + <name et-al-min="5" et-al-use-first="5"/> + </names> + </group> + </layout> + </bibliography> +</style> +<<===== CSL =====<< + + +>>===== INPUT =====>> +[ + { + "author": [ + { + "family": "Doe", + "given": "John" + }, + { + "literal": "others" + } + ], + "id": "ITEM-1", + "type": "book" + } +] +<<===== INPUT =====<< + + +>>===== VERSION =====>> +1.0 +<<===== VERSION =====<< +
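Note on the date sort keys: the changelog entry "use the format YYYY0000 if no month, day, and YYYYMM00 if no day" may be easier to follow with a small standalone sketch. The following is adapted from the dateToText hunk in src/Citeproc/Eval.hs above. It is not the library's API: sortKeyForDate is a hypothetical name used only here, and citeproc's Date type is replaced by a plain list of Ints ([year], [year, month] or [year, month, day]) so the example compiles on its own.

```haskell
import Text.Printf (printf)

-- Hypothetical standalone helper (not exported by citeproc): builds the
-- sort key for a single date given as [y], [y, m] or [y, m, d].
-- Missing month and day are padded with zeros so that all keys have
-- the same width and a bare year sorts before any dated entry in it.
sortKeyForDate :: [Int] -> String
sortKeyForDate []        = ""
sortKeyForDate [y]       = toYear y <> "0000"
sortKeyForDate [y, m]    = toYear y <> printf "%02d" m <> "00"
sortKeyForDate (y:m:d:_) = toYear y <> printf "%02d" m <> printf "%02d" d

-- Years get a letter prefix instead of a sign, because the Unicode
-- collation used for sorting ignores punctuation such as '-'.
-- Negative (BC) years are offset so that earlier years sort first.
toYear :: Int -> String
toYear y
  | y < 0     = printf "N%09d" (999999999 + y)
  | otherwise = printf "P%09d" y

main :: IO ()
main = mapM_ (putStrLn . sortKeyForDate)
  [ [2021]          -- year only
  , [2021, 6]       -- year and month
  , [2021, 6, 1]    -- full date
  , [-44, 3, 15]    -- BC date
  ]
```

Because every key has the same length and "N" collates before "P", comparing the keys as text should give chronological order, which is the property the sort code relies on.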
