I think you are basing this idea on a couple of misunderstandings.
One is that a formula in a PDF is anything more than a picture that needs to
be put through OCR to yield anything useful. To illustrate, I searched for
"physics formula" in Google, grabbed the first PDF I found
(http://faculty.trinityvalleyschool.org/hoseltom/handouts/Formula%20Sheet-2003-05-07-8pg.pdf),
opened it in Acrobat, highlighted a formula for "Density = mass/volume"
which was written as something like:
{rho} = m
_ (unit:kg/m{exponent} 3)
v
and grabbed this formula using the Acrobat selection tool. In a text
window, this pasted as
(unit : kg /m3)
V
ñ = m
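
To make the damage concrete, here is a minimal Python sketch that inspects the
pasted text. It assumes the pasted character is U+00F1 and that the printed
symbol was a Greek rho (U+03C1); the names "pasted" and "describe" are mine,
not anything Acrobat provides.

```python
import unicodedata

# The text exactly as it landed in the text window, one line per row.
# Assumption: the first character of the last row is U+00F1.
pasted = ["(unit : kg /m3)", "V", "\u00f1 = m"]

def describe(ch):
    """Return the code point and official Unicode name of one character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"

got = describe(pasted[2][0])   # what the clipboard delivered
wanted = describe("\u03c1")    # the rho printed on the page

print(got)     # U+00F1 LATIN SMALL LETTER N WITH TILDE
print(wanted)  # U+03C1 GREEK SMALL LETTER RHO

# The fraction bar is gone too: "m" and "V" sit on separate rows,
# in reverse order, with nothing left to say that one divides the other.
rows_with_m_and_v = [row for row in pasted if "m" in row and "V" in row]
print(rows_with_m_and_v)  # []
```

So the symbol, the reading order, and the division are all lost before any
translation to J could begin.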
So that's one problem. However, the deeper problem is that _there is no
such thing as a standard mathematical notation_. Even on the single physics
formula page I looked at here, multiplication is represented both implicitly
by adjacent letters and explicitly by a big, vertically-centered dot.
Even on this one page, equality is parsed in different, inconsistent ways.
The intended meanings are clear from context and a familiarity with physics
but are ambiguous taken by themselves. I could go on and on - take a look at
http://www.jsoftware.com/jwiki/NYCJUG/MathematicalNotation for a little more
on this - but I won't. This inconsistency of notation is, in fact, part of
the reason Iverson created APL in the first place.
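
As a rough illustration of the implicit-multiplication problem, the Python
sketch below lists every way a run of adjacent letters could be read as a
product of known names. The set of names is hypothetical, chosen only to show
the ambiguity; a real formula sheet would need its own.

```python
def readings(token, names):
    """Return every way to split `token` into a sequence of known names."""
    if not token:
        return [[]]          # one way to read nothing: the empty product
    found = []
    for i in range(1, len(token) + 1):
        head = token[:i]
        if head in names:
            # Keep this name and read the remainder every possible way.
            for rest in readings(token[i:], names):
                found.append([head] + rest)
    return found

# Hypothetical symbol table: "mv" might be a single quantity on some page.
names = {"m", "v", "mv"}

print(readings("mv", names))  # [['m', 'v'], ['mv']]
```

Nothing in the characters themselves picks one reading over the other; only
context does, which is exactly what a cut-and-paste buffer does not carry.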
The upshot is that an idea like Dan's is probably more fruitful than this
notion of grabbing things off a PDF. Even then, you'll need to spend a fair
amount of time interpreting what you get.
For a look at how someone handles a lot of formulas and translates them into
J, see Tom Allen's pages starting at
http://www.jsoftware.com/jwiki/Essays/SpaceTime2D/SpaceTime2D01.
Good luck,
Devon
On Fri, Jun 19, 2009 at 8:24 PM, <[email protected]> wrote:
> I do not plan to use OCR.
>
> I am thinking more along the lines of cutting and pasting a section out of a
> Portable Document Format (pdf) file that represents in normal mathematical
> notation a formula.
>
> After doing the copy, use cut/paste buffer to generate equivalent j code.
>
> As I understand it ( probably wrong ) what is in the cut/paste buffer is a
> sequence of bytes which represents in pdf the formula. I am thinking that
> different formulas ( no matter how little or how big the difference ) have
> different bytes. So, no matter how difficult, one should be able to
> transcribe from pdf representation to j representation.
>
> I think it would be way cool (1960s euphemism) to go to a web page
> containing formula for Physics and copy a pdf version of a formula and then
> turn it into the j representation automatically.
>
> ----- Original Message Follows -----
> From: bill lam <[email protected]>
> To: [email protected]
> Subject: Re: [Jprogramming] math.pdf -> J Server -> math.ijs file
> Date: Fri, 19 Jun 2009 10:09:30 +0800
>
> >Except for the ocr part, looks similar to mathematica.
> >
> >btw the 'Quality' Web Email you used breaks every thread it
> >replies.
> >
> >--
> >regards,
> >====================================================
> >GPG key 1024D/4434BAB3 2008-08-24
> >gpg --keyserver subkeys.pgp.net --recv-keys 4434BAB3
> >----------------------------------------------------------------------
> >For information about J forums see http://www.jsoftware.com/forums.htm
--
Devon McCormick, CFA
^me^ at acm.org is my preferred e-mail