While reviewing a units-related PR, I found that ChatGPT correctly understood that a foot is bigger than an inch, but it incorrectly said that a volt is bigger than a statvolt (see the quoted GPT response [here](https://github.com/sympy/sympy/pull/24325#issuecomment-1354343306)).
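The length comparison is easy to confirm in SymPy itself. A minimal sketch, assuming a SymPy installation with `sympy.physics.units`; the name `ratio` is only for illustration, and the volt/statvolt case is left out because it depends on which unit system is passed to `convert_to`:

```python
from sympy.physics.units import convert_to, foot, inch

# Express one foot in inches; convert_to returns a multiple of the target unit.
in_inches = convert_to(foot, inch)

# Dividing by the target unit leaves the bare conversion factor.
ratio = in_inches / inch

# A factor greater than 1 means a foot is the larger unit.
print(ratio, ratio > 1)
```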
/c

On Thursday, December 15, 2022 at 1:58:33 PM UTC-6 Aaron Meurer wrote:

> The trend with LLMs is much less structured. It doesn't use any formalism.
> It just guesses the next character of the input based on training on
> billions of examples.
>
> That's why I think that tools like SymPy that are more structured can be
> useful. GPT can already write SymPy code pretty well, much better than it
> can do the actual mathematics. It may be as simple as automatically
> appending "and write SymPy code to verify this" to the end of a prompt
> whenever it involves mathematics. This sort of approach has already been
> proven to be able to solve university math problems (see
> https://www.pnas.org/doi/pdf/10.1073/pnas.2123433119, where they
> literally just take the input problem and prepend "use sympy" and the
> neural network model does the rest).
>
> Aaron Meurer
>
> On Thu, Dec 15, 2022 at 2:21 AM S.Y. Lee <syle...@gmail.com> wrote:
>
>> > My hope is that tools like SymPy can be used as oracles for tools like
>> > GPT to help them verify their mathematics.
>>
>> In the most general context, "correct mathematics" can also be considered
>> some "grammar".
>> So there should be some grammar between Type-0 grammar to Type-1 grammar
>> in Chomsky hierarchy <https://en.wikipedia.org/wiki/Chomsky_hierarchy>.
>> In this context, a parser, or a parser with sympy oracle is the solution
>> for such problem,
>> such that any other ideas to solve such problem can be isomorphic to.
>>
>> However, building up such parser is off-direction for the researches of
>> deep learning itself,
>> because it would need a lot of efforts by experts, to interpret the
>> sentence generated by GPT,
>> and design a phrase structure grammar for it.
>>
>> I also thought about an idea that they can just tag arithmetics using SKI
>> combinator calculus
>> <https://en.wikipedia.org/wiki/SKI_combinator_calculus>.
>> In this way, there is no wrong arithmetics for every random sequence of
>> alphabets.
>>
>> However, I'm not sure that if this idea is already refuted by such
>> contemporary researchers
>> because it should be pretty much well-known.
>>
>> On Thursday, December 15, 2022 at 12:45:53 AM UTC+2 asme...@gmail.com
>> wrote:
>>
>>> I've seen many examples on Twitter and elsewhere of chatgpt doing
>>> terribly on actual mathematics. I instead asked it to "Find the extreme
>>> values of the function x^3 + 5x^2 + 3x - 9". It got the first part with
>>> the derivative and the quadratic equation correct, but then did faulty
>>> arithmetic when plugging the values back into the expression (the
>>> correct values are 0 and -256/27). It makes a sign error and also gets
>>> the addition wrong even with the wrong sign.
>>>
>>> My hope is that tools like SymPy can be used as oracles for tools like
>>> GPT to help them verify their mathematics. Obviously in this case you
>>> don't need SymPy to just verify arithmetic, but I've also seen it do
>>> just as poorly on symbolic algebra problems.
>>>
>>> I've also pasted the full log below:
>>>
>>> [image: Screen Shot 2022-12-14 at 3.41.32 PM.png]
>>> [image: Screen Shot 2022-12-14 at 3.41.42 PM.png]
>>>
>>> To find the extreme values of a function, we need to find the critical
>>> points of the function, which are the points at which the derivative of
>>> the function is equal to 0 or does not exist.
>>>
>>> The derivative of the function x^3 + 5x^2 + 3x - 9 is 3x^2 + 10x + 3. To
>>> find the critical points, we need to solve the equation
>>> 3x^2 + 10x + 3 = 0.
>>>
>>> To solve this equation, we can use the quadratic formula:
>>>
>>> x = (-b ± √(b^2 - 4ac)) / 2a
>>>
>>> where a, b, and c are the coefficients of the quadratic equation
>>> ax^2 + bx + c = 0.
>>> In our case, a = 3, b = 10, and c = 3, so the formula becomes:
>>>
>>> x = (-10 ± √(10^2 - 4 * 3 * 3)) / 2 * 3
>>>
>>> = (-10 ± √(100 - 36)) / 6
>>>
>>> = (-10 ± √(64)) / 6
>>>
>>> = (-10 ± 8) / 6
>>>
>>> = (-18, -2) / 6
>>>
>>> = (-3, -1/3)
>>>
>>> These are the critical points of the function x^3 + 5x^2 + 3x - 9.
>>>
>>> To find the extreme values of the function, we need to evaluate the
>>> function at these critical points and at any points where the derivative
>>> does not exist. In this case, the derivative exists for all values of x,
>>> so we only need to evaluate the function at the critical points.
>>>
>>> If we plug in the critical points into the function, we get:
>>>
>>> x^3 + 5x^2 + 3x - 9 = (-3)^3 + 5(-3)^2 + 3(-3) - 9 = -27 + 45 + 9 - 9 = 8
>>>
>>> and
>>>
>>> x^3 + 5x^2 + 3x - 9 = (-1/3)^3 + 5(-1/3)^2 + 3(-1/3) - 9 = -1/27 + 5/9 +
>>> 1 - 9 = 5/9 - 1/27 - 8/9 = -64/27
>>>
>>> The extreme values of the function x^3 + 5x^2 + 3x - 9 are 8 and -64/27.
>>>
>>> On Wed, Dec 14, 2022 at 3:35 PM gu...@uwosh.edu <gu...@uwosh.edu> wrote:
>>>
>>>> Just for some additional perspective. I have also tried this on some
>>>> general chemistry word problems. In general, I see it getting basic
>>>> single-step processes correct (e.g. a single step dilution or grams ->
>>>> moles). Things with multiple steps or requiring understanding the
>>>> physical situation it does poorly on. That said, I think it does better
>>>> than some of my weakest students. It does not seem to be able to use
>>>> significant figures in computations (also a problem for my weaker
>>>> students).
>>>>
>>>> It seems to be improving rapidly. If it can get to reliably
>>>> differentiating between correct (workable) solutions and erroneous
>>>> ones, it will be more useful to most people (including my students)
>>>> than searches of the internet or a cheating site such as Chegg.
>>>>
>>>> My two cents worth of opinion.
>>>>
>>>> Jonathan
>>>>
>>>> On Wednesday, December 14, 2022 at 4:28:05 PM UTC-6 Francesco Bonazzi
>>>> wrote:
>>>>
>>>>> [image: chatgpt.sympy.matrix_diag.png]
>>>>>
>>>>> On Wednesday, December 14, 2022 at 11:26:37 p.m. UTC+1 Francesco
>>>>> Bonazzi wrote:
>>>>>
>>>>>> Not everything is perfect... ChatGPT misses the *convert_to( ... )*
>>>>>> function in *sympy.physics.units*, furthermore, the given code does
>>>>>> not work:
>>>>>>
>>>>>> [image: chatgpt.sympy.unit_conv.png]
>>>>>>
>>>>>> On Wednesday, December 14, 2022 at 11:24:29 p.m. UTC+1 Francesco
>>>>>> Bonazzi wrote:
>>>>>>
>>>>>>> [image: chatgpt.sympy.logical_inference.png]
>>>>>>>
>>>>>>> On Wednesday, December 14, 2022 at 11:23:43 p.m. UTC+1 Francesco
>>>>>>> Bonazzi wrote:
>>>>>>>
>>>>>>>> https://en.wikipedia.org/wiki/ChatGPT
>>>>>>>>
>>>>>>>> Some tested examples attached as pictures to this post. Quite
>>>>>>>> impressive...
>>>>>>>>
>>>> --
>>>> You received this message because you are subscribed to the Google
>>>> Groups "sympy" group.
>>>> To unsubscribe from this group and stop receiving emails from it, send
>>>> an email to sympy+un...@googlegroups.com.
>>>> To view this discussion on the web visit
>>>> https://groups.google.com/d/msgid/sympy/6af62b19-1fb0-4681-9fd2-5e5fccfcb46fn%40googlegroups.com.
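For reference, the extreme-value example quoted in this thread can be checked mechanically, in the spirit of using SymPy as an oracle. This is only a minimal sketch, assuming a SymPy installation; the variable names are illustrative and not taken from any of the messages:

```python
from sympy import Rational, diff, solve, symbols

x = symbols("x")
f = x**3 + 5*x**2 + 3*x - 9

# Critical points are the roots of the derivative 3x^2 + 10x + 3.
critical_points = solve(diff(f, x), x)

# Evaluate f exactly at each critical point (no floating point involved).
values = {p: f.subs(x, p) for p in critical_points}

# Values stated as correct earlier in the thread: 0 and -256/27.
expected = [Rational(-256, 27), 0]
print(values)
print(sorted(values.values()) == expected)
```

Because the arithmetic is exact, the comparison against 0 and -256/27 does not depend on rounding, which is the step where the quoted ChatGPT answer went wrong.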