Re: Idea for personal Clojure project

2010-07-31 Thread Martin DeMello
On Sat, Jul 31, 2010 at 5:41 AM, Gregg Williams greg...@innerpaths.net wrote: I've begun work on a visual front-end to display such infocards, using Clojure and the Piccolo graphics library (http://piccolo2d.org/). If you (or anybody else reading this) find this larger project interesting,

Re: Idea for personal Clojure project

2010-07-30 Thread Gregg Williams
Daniel (and anyone else reading this), I would like to correspond with you because I'm working on a project of which your word graphing is a subset. I invented a standardized electronic notecard (see http://infoml.org), with the idea that writers and others could dump chunks of information (with

Re: Idea for personal Clojure project

2010-07-29 Thread Jonah Benton
As others have said, there isn't an algorithm that does this. Useful results depend on precise definitions of context and similarity. The waters get deep quickly. As a Clojure exercise, though, there are lots of good starting points. For instance: get a set of words, create all pairs from the
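A minimal sketch of the "create all pairs" step mentioned above, written in Python rather than Clojure purely for illustration. The word list, the `score_pairs` helper, and the `shared_letters` scorer are all hypothetical; the scorer is only a placeholder, since the thread does not settle on a definition of similarity.

```python
from itertools import combinations

def score_pairs(words, similarity):
    """Return {(a, b): similarity(a, b)} for every unordered pair of distinct words."""
    unique = sorted(set(words))
    return {(a, b): similarity(a, b) for a, b in combinations(unique, 2)}

def shared_letters(a, b):
    """Placeholder scorer (an assumption): fraction of letters the two words share."""
    sa, sb = set(a), set(b)
    return len(sa & sb) / len(sa | sb)

pairs = score_pairs(["fish", "net", "river"], shared_letters)
```

Keeping the scorer as a parameter means a meaning-based measure could be swapped in later without changing the pairing step.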

Re: Idea for personal Clojure project

2010-07-29 Thread Savanni D'Gerinel
What you describe is not Clojure-specific, so... Check out the NLTK project. It is all in Python, and the whole book is written for learning to use the tools in Python. However, it also contains a lot of talk about Natural Language Processing in general. http://www.nltk.org/book I,

Re: Idea for personal Clojure project

2010-07-29 Thread Lee Hinman
On Wed, Jul 28, 2010 at 2:58 PM, Daniel doubleagen...@gmail.com wrote: I want to write a Clojure program that searches for similarities of words in the English language and places them in a graph, where the distance between nodes indicates their similarity. I don't mean syntactical

Re: Idea for personal Clojure project

2010-07-29 Thread lance bradley
I've done quite a lot of work in this area, although not in Clojure. As Mark mentioned, WordNet is definitely a good place to start, but it's short on proper nouns, which reduces its utility when analyzing natural language. I ended up extending WordNet by data mining Wikipedia dumps. The

Re: Idea for personal Clojure project

2010-07-29 Thread bOR_
I think there were some talks about this at the conference I went to recently. Keywords might be natural language processing. Linked are the abstracts of the conference, which you might find useful. http://www.insna.org/PDF/Sunbelt/4_ProgramPDF.pdf One alternative I briefly considered is to

Re: Idea for personal Clojure project

2010-07-29 Thread Michael Harrison (goodmike)
As others have said, this is a difficult problem, but a fascinating one too. I'm currently nibbling on building some grouping-by-similarity algorithms for Clojure, although I'm sticking to numerical criteria for similarity or distance. New developments in text analysis and the Learning by Reading

Re: Idea for personal Clojure project

2010-07-29 Thread rob levy
I think that a big part of the problem is that most approaches to word similarity (especially thesaurus-based approaches like Wordnet, but also the significantly better distributional approaches) use very impoverished representations of knowledge. As such, they are unable to make useful

Re: Idea for personal Clojure project

2010-07-29 Thread Savanni D'Gerinel
On Thu, 2010-07-29 at 10:11 -0400, rob levy wrote: Also, most of NLTK works in Jython*, and by extension in Jython running in Clojure (which is why I started writing a convenience wrapper to make it easier to use Python libraries: http://code.google.com/p/clojure-python/ ). *Actually

Idea for personal Clojure project

2010-07-28 Thread Daniel
I want to write a Clojure program that searches for similarities of words in the English language and places them in a graph, where the distance between nodes indicates their similarity. I don't mean syntactical similarity. Related contextual meaning is closer to the mark. For instance: fish

Re: Idea for personal Clojure project

2010-07-28 Thread Mark Engelberg
Wordnet is the main existing thing that comes to mind as related to your idea.

Re: Idea for personal Clojure project

2010-07-28 Thread Luke VanderHart
This is a hard problem. If you go by degrees and shades of synonymity, it can be (and has been) done manually - see Visual Thesaurus (http://www.visualthesaurus.com/). But for grouping based on the same semantic topics - that's pretty difficult. You could do it based on co-location in a corpus,
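One way to read "co-location in a corpus" is to count how often two words appear in the same sentence. The sketch below does that in Python, as an illustration only: the `cooccurrence_counts` helper and the three-sentence corpus are made up, and sentence-level co-occurrence is an assumed window size, not something the message specifies.

```python
from collections import Counter
from itertools import combinations

def cooccurrence_counts(sentences):
    """Count how often each unordered word pair appears in the same sentence."""
    counts = Counter()
    for sentence in sentences:
        # De-duplicate and sort so each pair is counted once per sentence
        # and always appears in the same (alphabetical) order.
        words = sorted(set(sentence.lower().split()))
        counts.update(combinations(words, 2))
    return counts

# Toy corpus (hypothetical data, for illustration only).
corpus = [
    "the fish swam into the net",
    "the fisherman cast his net",
    "a fish jumped in the river",
]
counts = cooccurrence_counts(corpus)
```

Higher counts could then be mapped to shorter distances in the graph. A real corpus would also need stop-word filtering, since words like "the" co-occur with nearly everything.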

Re: Idea for personal Clojure project

2010-07-28 Thread Daniel E. Renfer
On 7/28/10 5:34 PM, Mark Engelberg wrote: Wordnet is the main existing thing that comes to mind as related to your idea. You might also want to look into Freebase. Here's a Clojure client you can use to query their data. http://github.com/rnewman/clj-mql

Re: Idea for personal Clojure project

2010-07-28 Thread Cameron Pulsford
A very good starting point for reading about edit distances between words and related topics is Peter Norvig's site: http://norvig.com/spell-correct.html Also, try to find some Wikipedia articles about the BM25 ranking algorithm; I used Clojure for an assignment at school that
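For reference, a minimal sketch of edit distance between two words, in Python for illustration. This is the plain Levenshtein distance (insertions, deletions, substitutions), which is an assumption here and not necessarily the exact formulation used in the linked article; the `edit_distance` name is hypothetical.

```python
def edit_distance(a, b):
    """Levenshtein distance: fewest single-character insertions,
    deletions, or substitutions needed to turn a into b."""
    # prev[j] holds the distance between the processed prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]  # turning a[:i] into the empty string takes i deletions
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]
```

Note that this measures spelling similarity only, so it would complement rather than replace a meaning-based measure for the graph described in the original post.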