Deriving mechanisms in GHC

2024-07-02 Thread Yao Li
Dear All,

My name is Yao Li. I am an assistant professor at Portland State
University. I am currently working with my Ph.D. student Nicholas Coltharp
on updating a tool called hs-to-coq (https://github.com/plclub/hs-to-coq)
to support GHC 8.10 and the corresponding base library.

However, we are running into some trouble involving the "deriving"
mechanisms in GHC. It seems the derived type class instances are no longer
generated by hs-to-coq after we updated GHC. I wonder what changes happened
between GHC 8.4 and 8.10 that could possibly lead to this issue? Or if we
hope to dig into this issue ourselves, what are some GHC resources that we
should look into? Any pointers would be much appreciated!

For context, hs-to-coq uses GHC's parser to parse Haskell code. Our
translation is based on the source code but on the GhcRn pass.

Thanks,
Yao Li, Ph.D., (He/Him)
Assistant Professor,
Computer Science Department, Portland State University
___
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs


Re: Deriving mechanisms in GHC

2024-07-03 Thread Ryan Scott
(Disclaimer: I have never used hs-to-coq before, so everything below is
merely an educated guess.)

Looking at the source code for hs-to-coq, I see this [1]:

addDerivedInstances :: GhcMonad m => TypecheckedModule -> m
> TypecheckedModule
> addDerivedInstances tcm = do
> let Just (hsgroup, a, b, c) = tm_renamed_source tcm
>
> (_gbl_env, inst_infos, _rn_binds) <- initTcHack tcm $ do
> let tcg_env = fst (tm_internals_ tcm)
> tcg_env_hack = tcg_env { tcg_mod = fakeDerivingMod,
> tcg_semantic_mod = fakeDerivingMod }
> -- Set the module to make it look like we are in GHCi
> -- so that GHC will allow us to re-typecheck existing
> instances
> setGblEnv tcg_env_hack $
> #if __GLASGOW_HASKELL__ >= 810
> tcInstDeclsDeriv [] (hs_derivds hsgroup)
> #else
> tcInstDeclsDeriv [] (hs_tyclds hsgroup >>= group_tyclds)
> (hs_derivds hsgroup)
> #endif


If I understand this code correctly, it is using the GHC API to produce
deriving-generated code and typecheck it, which seems reasonable. The part
that confuses me is the #if __GLASGOW_HASKELL__ >= 810 bit. Prior to GHC
8.10, this code would obtain deriving-related information from two places:

   1. The deriving clauses of data type declarations (i.e., the hs_tyclds)
   2. Standalone deriving declarations (i.e., the hs_derivds)

In GHC 8.10 and later, however, the code only obtains deriving-related
information from standalone deriving declarations. This means that you'll
completely skip over any derived instances that arise from deriving
clauses, which is likely the source of the trouble you're encountering.

In order to give advice on what you *should* be doing, let me briefly
describe why the type of tcInstDeclsDeriv changed in GHC 8.10. The commit
that changed tcInstDeclsDeriv is [2]. Prior to that commit, tcInstDeclsDeriv
would obtain information related to deriving clauses via its first two
arguments:

   1. The first argument (of type [DerivInfo]) contained all of the
   deriving clauses for data family instances. (Note that hs-to-coq uses an
   empty list, so this means that hs-to-coq will always skip over data
   family instances, before or after GHC 8.10. I'm not sure if this should be
   considered as a separate bug.)
   2. The second argument (of type [LTyClDecl GhcRn]) contained all of the
   data type definitions, which might have deriving clauses attached to them.

The linked commit changes things so that *all* deriving clause–related
information (both for ordinary data types as well as data family instances)
is passed as a [DerivInfo] list. This means that hs-to-coq needs to produce
those DerivInfo values somehow. As inspiration, you might want to look at
how GHC calls tcInstDeclsDeriv here [3]. GHC first calls the
tcTyAndClassDecls function to produce the DerivInfo values, and then it
passes the DerivInfo values to tcInstDeclsDeriv. I would hope that
something similar would work for hs-to-coq.

Best,

Ryan
-
[1]
https://github.com/plclub/hs-to-coq/blob/03e823972fc7c5f85a300e554c691563f89a3e5f/src/lib/HsToCoq/Util/GHC/Deriving.hs#L50-L64
[2]
https://gitlab.haskell.org/ghc/ghc/-/commit/679427f878e50ba5a9981bac4c2f9c76f4de3c3c#4c1af4850cb90ab2963edb06b69b57e87ccdf9c1
[3]
https://gitlab.haskell.org/ghc/ghc/-/blob/bc1d435e399d8376b4e33d5d936424ff76cb686a/compiler/GHC/Tc/Module.hs#L1704-L1723
___
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs


Re: Deriving mechanisms in GHC

2024-07-03 Thread Yao Li
Ah that looks like the reason. Thank you for catching that! Thanks for the
pointers to the specific commit in GHC as well as the example!

I am also curious if there is a recommended way to check the GHC API
changes across different versions. We use GHC APIs extensively in hs-to-coq
and there seem to be a number of changes between 8.4 and 8.10, so it's
possible that we will run into other similar issues at some point, or when
we update hs-to-coq to newer GHC versions in the future... Or are there any
community recommendations on the best way to maintain our code base, which
relies on GHC APIs?

Thanks,
Yao

On Wed, Jul 3, 2024 at 7:46 AM Ryan Scott  wrote:

> (Disclaimer: I have never used hs-to-coq before, so everything below is
> merely an educated guess.)
>
> Looking at the source code for hs-to-coq, I see this [1]:
>
> addDerivedInstances :: GhcMonad m => TypecheckedModule -> m
>> TypecheckedModule
>> addDerivedInstances tcm = do
>> let Just (hsgroup, a, b, c) = tm_renamed_source tcm
>>
>> (_gbl_env, inst_infos, _rn_binds) <- initTcHack tcm $ do
>> let tcg_env = fst (tm_internals_ tcm)
>> tcg_env_hack = tcg_env { tcg_mod = fakeDerivingMod,
>> tcg_semantic_mod = fakeDerivingMod }
>> -- Set the module to make it look like we are in GHCi
>> -- so that GHC will allow us to re-typecheck existing
>> instances
>> setGblEnv tcg_env_hack $
>> #if __GLASGOW_HASKELL__ >= 810
>> tcInstDeclsDeriv [] (hs_derivds hsgroup)
>> #else
>> tcInstDeclsDeriv [] (hs_tyclds hsgroup >>= group_tyclds)
>> (hs_derivds hsgroup)
>> #endif
>
>
> If I understand this code correctly, it is using the GHC API to produce
> deriving-generated code and typecheck it, which seems reasonable. The
> part that confuses me is the #if __GLASGOW_HASKELL__ >= 810 bit. Prior to
> GHC 8.10, this code would obtain deriving-related information from two
> places:
>
>1. The deriving clauses of data type declarations (i.e., the hs_tyclds)
>2. Standalone deriving declarations (i.e., the hs_derivds)
>
> In GHC 8.10 and later, however, the code only obtains deriving-related
> information from standalone deriving declarations. This means that you'll
> completely skip over any derived instances that arise from deriving
> clauses, which is likely the source of the trouble you're encountering.
>
> In order to give advice on what you *should* be doing, let me briefly
> describe why the type of tcInstDeclsDeriv changed in GHC 8.10. The commit
> that changed tcInstDeclsDeriv is [2]. Prior to that commit,
> tcInstDeclsDeriv would obtain information related to deriving clauses via
> its first two arguments:
>
>1. The first argument (of type [DerivInfo]) contained all of the
>deriving clauses for data family instances. (Note that hs-to-coq uses an
>empty list, so this means that hs-to-coq will always skip over data
>family instances, before or after GHC 8.10. I'm not sure if this should be
>considered as a separate bug.)
>2. The second argument (of type [LTyClDecl GhcRn]) contained all of
>the data type definitions, which might have deriving clauses attached to
>them.
>
> The linked commit changes things so that *all* deriving clause–related
> information (both for ordinary data types as well as data family instances)
> is passed as a [DerivInfo] list. This means that hs-to-coq needs to
> produce those DerivInfo values somehow. As inspiration, you might want to
> look at how GHC calls tcInstDeclsDeriv here [3]. GHC first calls the
> tcTyAndClassDecls function to produce the DerivInfo values, and then it
> passes the DerivInfo values to tcInstDeclsDeriv. I would hope that
> something similar would work for hs-to-coq.
>
> Best,
>
> Ryan
> -
> [1]
> https://github.com/plclub/hs-to-coq/blob/03e823972fc7c5f85a300e554c691563f89a3e5f/src/lib/HsToCoq/Util/GHC/Deriving.hs#L50-L64
> [2]
> https://gitlab.haskell.org/ghc/ghc/-/commit/679427f878e50ba5a9981bac4c2f9c76f4de3c3c#4c1af4850cb90ab2963edb06b69b57e87ccdf9c1
> [3]
> https://gitlab.haskell.org/ghc/ghc/-/blob/bc1d435e399d8376b4e33d5d936424ff76cb686a/compiler/GHC/Tc/Module.hs#L1704-L1723
> ___
> ghc-devs mailing list
> ghc-devs@haskell.org
> http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs
>
___
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs


Re: Deriving mechanisms in GHC

2024-07-03 Thread Simon Peyton Jones
> I am also curious if there is a recommended way to check the GHC API
changes across different versions.

The difficulty is that we don't *have* a published, well designed GHC API
for clients to use, and for GHC devs to strive to maintain stable.  You
aren't the only one to find this problem see these tickets
.
You might want to turn this thread into a new ticket, so you get included
in that list.

There is a project to design the API, but it is stalled
https://discourse.haskell.org/t/charting-a-course-toward-a-stable-api-for-ghc/7646

It would be fantastic if someone, or a small group, was willing to take the
lead here.

Simon


On Wed, 3 Jul 2024 at 20:02, Yao Li  wrote:

> Ah that looks like the reason. Thank you for catching that! Thanks for the
> pointers to the specific commit in GHC as well as the example!
>
> I am also curious if there is a recommended way to check the GHC API
> changes across different versions. We use GHC APIs extensively in hs-to-coq
> and there seem to be a number of changes between 8.4 and 8.10, so it's
> possible that we will run into other similar issues at some point, or when
> we update hs-to-coq to newer GHC versions in the future... Or are there any
> community recommendations on the best way to maintain our code base, which
> relies on GHC APIs?
>
> Thanks,
> Yao
>
> On Wed, Jul 3, 2024 at 7:46 AM Ryan Scott  wrote:
>
>> (Disclaimer: I have never used hs-to-coq before, so everything below is
>> merely an educated guess.)
>>
>> Looking at the source code for hs-to-coq, I see this [1]:
>>
>> addDerivedInstances :: GhcMonad m => TypecheckedModule -> m
>>> TypecheckedModule
>>> addDerivedInstances tcm = do
>>> let Just (hsgroup, a, b, c) = tm_renamed_source tcm
>>>
>>> (_gbl_env, inst_infos, _rn_binds) <- initTcHack tcm $ do
>>> let tcg_env = fst (tm_internals_ tcm)
>>> tcg_env_hack = tcg_env { tcg_mod = fakeDerivingMod,
>>> tcg_semantic_mod = fakeDerivingMod }
>>> -- Set the module to make it look like we are in GHCi
>>> -- so that GHC will allow us to re-typecheck existing
>>> instances
>>> setGblEnv tcg_env_hack $
>>> #if __GLASGOW_HASKELL__ >= 810
>>> tcInstDeclsDeriv [] (hs_derivds hsgroup)
>>> #else
>>> tcInstDeclsDeriv [] (hs_tyclds hsgroup >>= group_tyclds)
>>> (hs_derivds hsgroup)
>>> #endif
>>
>>
>> If I understand this code correctly, it is using the GHC API to produce
>> deriving-generated code and typecheck it, which seems reasonable. The
>> part that confuses me is the #if __GLASGOW_HASKELL__ >= 810 bit. Prior
>> to GHC 8.10, this code would obtain deriving-related information from
>> two places:
>>
>>1. The deriving clauses of data type declarations (i.e., the hs_tyclds
>>)
>>2. Standalone deriving declarations (i.e., the hs_derivds)
>>
>> In GHC 8.10 and later, however, the code only obtains deriving-related
>> information from standalone deriving declarations. This means that you'll
>> completely skip over any derived instances that arise from deriving
>> clauses, which is likely the source of the trouble you're encountering.
>>
>> In order to give advice on what you *should* be doing, let me briefly
>> describe why the type of tcInstDeclsDeriv changed in GHC 8.10. The
>> commit that changed tcInstDeclsDeriv is [2]. Prior to that commit,
>> tcInstDeclsDeriv would obtain information related to deriving clauses
>> via its first two arguments:
>>
>>1. The first argument (of type [DerivInfo]) contained all of the
>>deriving clauses for data family instances. (Note that hs-to-coq uses an
>>empty list, so this means that hs-to-coq will always skip over data
>>family instances, before or after GHC 8.10. I'm not sure if this should be
>>considered as a separate bug.)
>>2. The second argument (of type [LTyClDecl GhcRn]) contained all of
>>the data type definitions, which might have deriving clauses attached to
>>them.
>>
>> The linked commit changes things so that *all* deriving clause–related
>> information (both for ordinary data types as well as data family instances)
>> is passed as a [DerivInfo] list. This means that hs-to-coq needs to
>> produce those DerivInfo values somehow. As inspiration, you might want
>> to look at how GHC calls tcInstDeclsDeriv here [3]. GHC first calls the
>> tcTyAndClassDecls function to produce the DerivInfo values, and then it
>> passes the DerivInfo values to tcInstDeclsDeriv. I would hope that
>> something similar would work for hs-to-coq.
>>
>> Best,
>>
>> Ryan
>> -
>> [1]
>> https://github.com/plclub/hs-to-coq/blob/03e823972fc7c5f85a300e554c691563f89a3e5f/src/lib/HsToCoq/Util/GHC/Deriving.hs#L50-L64
>> [2]
>> https://gitlab.haskell.org/ghc/ghc/-/commit/679427f878e50ba5a9981bac4c2f9c76f4de3c3c#4c1af4850cb90ab2963edb06b69b57e87ccdf9c1
>> [3]
>> https://gitlab.haskell.org/

Re: Deriving mechanisms in GHC

2024-07-03 Thread Yao Li
Hi Simon,

Thanks for all these pointers!


> You aren't the only one to find this problem see these tickets.  You might
> want to turn this thread into a new ticket, so you get included in that
> list.
>
Good to know! From the list, I see that the LiquidHaskell team also ran
into the same issue and they proposed this ticket, which I should thumb up:
https://gitlab.haskell.org/ghc/ghc/-/issues/24096

It would be fantastic if someone, or a small group, was willing to take the
> lead here.

Indeed!

Best,
Yao
___
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs