RE: analyzer for Code

2013-06-17 Thread Gian Maria Ricci
Subject: Re: analyzer for Code Gian, Lucene in Action has a case study from Krugle about their analysis for a code search engine, if you want to look there. Otis -- Solr & ElasticSearch Support http://sematext.com/ On Thu, Jun 13, 2013 at 4:19 AM, Gian Maria Ricci wrote: > I did a little

RE: analyzer for Code

2013-06-17 Thread Gian Maria Ricci
I'll have a look at it, thanks to everyone. -- Gian Maria Ricci Mobile: +39 320 0136949 -Original Message- From: Steve Rowe [mailto:sar...@gmail.com] Sent: Thursday, June 13, 2013 9:03 PM To: solr-user@lucene.apache.org Subject: Re: analyzer for Code Hi Gian Maria, Ope

Re: analyzer for Code

2013-06-13 Thread Otis Gospodnetic
Gian, Lucene in Action has a case study from Krugle about their analysis for a code search engine, if you want to look there. Otis -- Solr & ElasticSearch Support http://sematext.com/ On Thu, Jun 13, 2013 at 4:19 AM, Gian Maria Ricci wrote: > I did a little search around and did not find an

Re: analyzer for Code

2013-06-13 Thread Steve Rowe
e. :) > > -- > Gian Maria Ricci > Mobile: +39 320 0136949 > > > > From: Erick Erickson [mailto:erickerick...@gmail.com] > Sent: Thursday, June 13, 2013 1:24 PM > To: solr-user@lucene.apache.org; Gian Maria Ricci > Subject: Re: analyzer for Code > > We

RE: analyzer for Code

2013-06-13 Thread Gian Maria Ricci
Gian Maria Ricci Subject: Re: analyzer for Code Well, WordDelimiterFilterFactory would split on the punctuation, so you could add it to the analyzer chain along with StandardAnalyzer. You could use one of the regex filters to break up tokens that make it through the analyzer as you see fit.

Re: analyzer for Code

2013-06-13 Thread Walter Underwood
It could be pretty complicated to do well. I'm pretty sure that Krugle is based on Solr: http://opensearch.krugle.org/ You might also look at the UI for Ohloh (used to be Koders): http://code.ohloh.net/ wunder On Jun 13, 2013, at 1:19 AM, Gian Maria Ricci wrote: > I did a little search around

Re: analyzer for Code

2013-06-13 Thread Erick Erickson
Well, WordDelimiterFilterFactory would split on the punctuation, so you could add it to the analyzer chain along with StandardAnalyzer. You could use one of the regex filters to break up tokens that make it through the analyzer as you see fit. But in general, this will be a bunch of compromises s
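To make the suggestion above concrete, here is a rough sketch in plain Python (not Solr configuration) of the kind of splitting it describes: breaking a code token on punctuation and on case changes, then lowercasing the pieces. The names split_code_token and index_terms are hypothetical, and the regular expressions are assumptions chosen for illustration; this is not how WordDelimiterFilterFactory is implemented and it does not reproduce all of its options.

```python
import re

# Assumption: anything that is not a letter or digit counts as a delimiter.
_PUNCT = re.compile(r"[^A-Za-z0-9]+")

# Assumption: sub-words are acronym runs (XML), capitalized or lowercase
# words (Http, get), or digit runs (2).
_SUBWORD = re.compile(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|[0-9]+")


def split_code_token(token):
    """Split one source-code token into lowercase sub-word terms (hypothetical helper)."""
    parts = []
    for chunk in _PUNCT.split(token):
        # Empty chunks (e.g. after a trailing ')') simply yield no sub-words.
        parts.extend(_SUBWORD.findall(chunk))
    return [part.lower() for part in parts]


def index_terms(line):
    """Split a line of code on whitespace, then split each token into terms."""
    terms = []
    for token in line.split():
        terms.extend(split_code_token(token))
    return terms


if __name__ == "__main__":
    print(index_terms("var text = System.IO.File.ReadAllText(path);"))
```

As the reply notes, any such scheme is a set of compromises: splitting this aggressively helps a query for "ReadAllText" or "read all text" match, but it discards the punctuation and casing that sometimes carry meaning in source code.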

analyzer for Code

2013-06-13 Thread Gian Maria Ricci
I did a little search around and did not find anything interesting. Does anyone know if any analyzers exist to better index source code (e.g. C#, C++, Java, etc.)? The standard analyzer is quite good, but I wish to know if there are more specific analyzers that can do better indexing. E.g. I did a li