Re: [Wiki-research-l] Wikipedia mathematical search engine

2012-04-03 Thread Jozef Misutka
Hi Daniel,


On Tue, Apr 3, 2012 at 12:15 AM, Daniel Mietchen <
daniel.mietc...@googlemail.com> wrote:

> Hi Jozef,
>
> I just played around a bit and liked what I saw, though I didn't see
> much, as the site was very slow.
>

It was a hardware failure (RAID 5, I think). Anyway, it was fixed several
hours ago.



>
> How did you strip the dump of the non-mathematical articles?


Very simply: a mathematical article is any article which contains
"</math".

I do not claim to have a perfect Wikipedia tag parser, but the vast majority
of the formulae in Wikipedia are typeset using the standard Wikipedia rules
and simply sit inside the article text, which works fine.
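
As a rough illustration of the rule above (a sketch only, not the actual
EgoMath code): assuming the dump has already been parsed into
(title, wikitext) pairs, the filter amounts to a substring test. The names
is_mathematical and mathematical_titles are hypothetical.

```python
from typing import Iterable, List, Tuple


def is_mathematical(wikitext: str) -> bool:
    """Return True if the article text contains a closing math tag."""
    # The rule from the message: an article counts as mathematical
    # when the string "</math" occurs anywhere in its wikitext.
    return "</math" in wikitext


def mathematical_titles(pages: Iterable[Tuple[str, str]]) -> List[str]:
    """Keep the titles of pages whose text passes the filter."""
    # `pages` is an assumed input shape: (title, wikitext) pairs.
    return [title for title, text in pages if is_mathematical(text)]


sample_pages = [
    ("Vasicek model", r"The model is <math>dr_t = a(b-r_t)\,dt</math>."),
    ("Prague", "Prague is the capital of the Czech Republic."),
]
print(mathematical_titles(sample_pages))
```

Such a test also matches "</math" inside comments or nowiki sections, which
is the kind of imperfection meant by not having a perfect tag parser.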


> I am
> asking because one of the major uses that I have in mind for a good
> mathematical search engine would be to identify areas around topic A
> (say, theoretical biology) that use the same concepts as those in
> topic B (say, economics). Very often such distant fields are only
> weakly connected, but solutions or approaches that work in one of them
> are not infrequently transferable.


That is exactly one of the interesting applications for a mathematical
search engine.

I wanted to reply with something interesting, so I called a friend and
asked him about interesting formulae from economics. He told me about the
Vasicek model, so I tried searching for the formula
dr_t = a(b-r_t) dt + \sigma dW_t
which resulted in 2 hits at no abstraction level - no big deal. But then I
tried to abstract it, and another hit came up which is IMHO interesting
(different variables are used, but it is the same formula).

Vasicek model
http://egomath.projekty.ms.mff.cuni.cz/index.php?q=&math=dr_t+%3D+a%28b-r_t%29+dt+%2B+%5Csigma+dW_t&hide_snippets=0
Vasicek model and similar
http://egomath.projekty.ms.mff.cuni.cz/index.php?q=&math=dr_t+%3D+a%28b-r_t%29+dt+%2B+%5Csigma+dW_t&level=9&hide_snippets=0
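
To make "different variables used but the same formula" concrete, here is a
minimal sketch of one possible kind of abstraction: renaming every variable
to a placeholder in order of first appearance. This is an assumption made
for illustration only; the thread does not describe how the engine's
abstraction levels work. The names abstract_formula, TOKEN and GREEK are
hypothetical, and the second formula is the same equation written with
other symbols.

```python
import re

# Tokens: a LaTeX command, a single letter, a number, or any other symbol.
TOKEN = re.compile(r"\\[A-Za-z]+|[A-Za-z]|\d+|\S")

# Commands treated as variable names (a deliberately small, assumed set).
GREEK = {"\\alpha", "\\beta", "\\gamma", "\\theta",
         "\\kappa", "\\lambda", "\\mu", "\\sigma"}


def abstract_formula(latex: str) -> str:
    """Rename variables to v0, v1, ... in order of first appearance."""
    names = {}
    out = []
    for tok in TOKEN.findall(latex):
        if tok.isalpha() or tok in GREEK:
            # The same variable always maps to the same placeholder.
            tok = names.setdefault(tok, f"v{len(names)}")
        out.append(tok)
    return " ".join(out)


vasicek = r"dr_t = a(b-r_t) dt + \sigma dW_t"
renamed = r"dx_t = \theta(\mu-x_t) dt + \sigma dW_t"
print(abstract_formula(vasicek) == abstract_formula(renamed))
```

Two formulae then count as the same at this level when their abstracted
strings are equal. A real engine would need to handle structure (fractions,
operator order, and so on), which this sketch ignores.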



> In order to be useful for such
> purposes, your corpus would still have to contain the economics/
> theoretical biology articles (at least those that use equations), but
> I couldn't find evidence for that.
>

See the number of documents (and categories) returned when you search for
simple text, e.g.,
economy
http://egomath.projekty.ms.mff.cuni.cz/index.php?math=&q=economy
biology
http://egomath.projekty.ms.mff.cuni.cz/index.php?math=&q=biology
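
The search links in this message can also be built programmatically. The
sketch below assumes only what is visible in those URLs: the parameters q
(text), math (formula), level and hide_snippets. The meaning of level=9 is
not documented in this thread, so it is passed through as an opaque value.
The helper name search_url is hypothetical, and it is assumed that the
server does not care about parameter order.

```python
from urllib.parse import urlencode

BASE = "http://egomath.projekty.ms.mff.cuni.cz/index.php"


def search_url(text="", math="", level=None, hide_snippets=None):
    """Build a search URL from a text query and/or a formula."""
    params = {"q": text, "math": math}
    # Optional parameters are added only when given, as in the links above.
    if level is not None:
        params["level"] = level
    if hide_snippets is not None:
        params["hide_snippets"] = hide_snippets
    # urlencode percent-encodes the formula and turns spaces into "+".
    return BASE + "?" + urlencode(params)


formula = r"dr_t = a(b-r_t) dt + \sigma dW_t"
print(search_url(math=formula, hide_snippets=0))
print(search_url(math=formula, level=9, hide_snippets=0))
print(search_url(text="economy"))
```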

Jozef


>
> Daniel
>
> On Mon, Apr 2, 2012 at 2:21 PM, Jozef Misutka wrote:
> > Hi,
> >
> > I want to introduce a *mathematical* search engine working over an English
> > Wikipedia dump. The key advantage is simple - *it works* ;).
> > Better than a nice speech is a real demo which can be found here:
> > http://egomath.projekty.ms.mff.cuni.cz
> >
> > If you are somehow interested or just want to share your thoughts do not
> > hesitate to contact me.
> >
> > Best regards,
> > Jozef Misutka
> > __
> > Charles University in Prague,
> > Department of Software Engineering,
> > www: http://www.ksi.mff.cuni.cz/cs/~misutka
> >
> >
> > ___
> > Wiki-research-l mailing list
> > Wiki-research-l@lists.wikimedia.org
> > https://lists.wikimedia.org/mailman/listinfo/wiki-research-l
> >
>


[Wiki-research-l] Wikipedia mathematical search engine

2012-04-02 Thread Jozef Misutka
Hi,

I want to introduce a *mathematical* search engine working over an English
Wikipedia dump. The key advantage is simple - *it works* ;).
Better than a nice speech is a real demo which can be found here:
http://egomath.projekty.ms.mff.cuni.cz

If you are somehow interested or just want to share your thoughts do not
hesitate to contact me.

Best regards,
Jozef Misutka
__
Charles University in Prague,
Department of Software Engineering,
www: http://www.ksi.mff.cuni.cz/cs/~misutka