Re: phrase segmentation plugin in component, analyzer, filter or parser?

2010-03-24 Thread Erik Hatcher
On Mar 24, 2010, at 1:35 AM, Tommy Chheng wrote: I'm writing an experimental phrase segmentation plugin for Solr. My current plan is to write it as a SearchComponent by overriding the queryString with the new grouped query. ex. (university of california irvine 2009) will be re-written to

Re: Issue w/ highlighting a String field

2010-03-24 Thread Saïd Radhouani
There's a match between the query and the content of field I want to highlight on. Solr is giving me the id of the document matching my query, but it's not displaying the field I want to highlight on. Here's the definition of the field I want to highlight on: <field name="title" type="string"

Re: Cannot fetch urls with target=_blank

2010-03-24 Thread Stefano Cherchi
Right. Sorry for the OT. S

Re: [ANN] Zoie Solr Plugin - Zoie Solr Plugin enables real-time update functionality for Apache Solr 1.4+

2010-03-24 Thread Grant Ingersoll
On Mar 23, 2010, at 7:29 PM, brad anderson wrote: I see, so when you do a commit it adds it to Zoie's ramdirectory. So, could you just commit after every document without having a performance impact and have real time search? Not likely, maybe on really, really small indexes. Zoie also

Re: Issue w/ highlighting a String field

2010-03-24 Thread Ahmet Arslan
There's a match between the query and the content of field I want to highlight on. Solr is giving me the id of the document matching my query, but it's not displaying the field I want to highlight on. Here's the definition of the field I want to highlight on: <field name="title"

Re: Issue w/ highlighting a String field

2010-03-24 Thread Saïd Radhouani
2010/3/24 Ahmet Arslan iori...@yahoo.com There's a match between the query and the content of field I want to highlight on. Solr is giving me the id of the document matching my query, but it's not displaying the field I want to highlight on. Here's the definition of the field I want

Re: SOLR-1316 How To Implement this autosuggest component ???

2010-03-24 Thread stocki
okay, thx i installed ant and want to build with ant. but java cannot compile, because all the lucene files missed ... !? package org.apache.lucene.search does not exist and more... did i checkout the wrong trunk ? .../lucee/dev/solr/trunk Lance Norskog-2 wrote: You need 'ant' to do

Re: Issue w/ highlighting a String field

2010-03-24 Thread Ahmet Arslan
I don't have defaultSearchField, instead, I have the following qf clause, where title_tokenized is a tokenized version of title <str name="qf">title_tokenized^3 text_description_tokenized phonetic_text^0.5</str> I didn't know that you are using dismax. In your query fields list there

Re: Configuring multiple SOLR apps to play nice with MBeans / JMX

2010-03-24 Thread Constantijn Visinescu
Don't know about other servlet containers, but i can confirm Resin 3 breaks if you try to load 2 completely independent webapps into it that both use solr with jmx enabled. I also had a similar issue with Blaze DS (library for flash remoting that I'm using to power the UI for my webapp), but

Re: Issue w/ highlighting a String field

2010-03-24 Thread Saïd Radhouani
I didn't know that you are using dismax. In your query fields list there is no title field. Probably the match is coming from title_tokenized, and when you request highlighting from title (hl.fl=title) it returns empty snippets. If that's the case, it is to be expected, because string typed fields

Re: Configuring multiple SOLR apps to play nice with MBeans / JMX

2010-03-24 Thread Constantijn Visinescu
it would probably be pretty trivial to add if you want to take a stab at a patch for it. -Hoss *stab* https://issues.apache.org/jira/browse/SOLR-1843 :)

wikipedia and teaching kids search engines

2010-03-24 Thread Erik Hatcher
I've got a couple of questions for the community... * what's the simplest way to get Solr up and running with a relatively richly schema'd index of a Wikipedia dump? What I'm looking for is something as easy as something along these lines: java -Dsolr.solr.home=./wikipedia_solr_home

Re: wikipedia and teaching kids search engines

2010-03-24 Thread Mattmann, Chris A (388J)
Hey Erik, One thing to think about (and I'm no expert at middle school kids) would be to relate search somehow to a topic they are interested in. My 12 year old nephew loves the NBA, so if I were to talk to him about search, I would try and relate it to e.g., NBA.com, or understanding the

multiple binary documents into a single solr document - Vignette/OpenText integration

2010-03-24 Thread Fábio Aragão da Silva
hello there, I'm working on the development of a piece of code that integrates Solr with Vignette/OpenText Content Management, meaning Vignette content instances will be indexed in solr when published and deleted from solr when unpublished. I'm using solr 1.4, solrj and solr cell. I've

Re: wikipedia and teaching kids search engines

2010-03-24 Thread Christopher Laux
Hi Erik, I'm working on Wikipedia search and use Solr. Afaik it can't easily be done. The Wikipedia XML dump only provided the page title and author in terms of data one would search for. The rest requires parsing the Mediawiki markup for which there is no good one freely available (still writing

Re: wikipedia and teaching kids search engines

2010-03-24 Thread Markus Jelsma
A bit off-topic but how about Nutch grabbing some content and have it indexed in Solr? On Wednesday 24 March 2010 16:08:43 Christopher Laux wrote: Hi Erik, I'm working on Wikipedia search and use Solr. Afaik it can't easily be done. The Wikipedia XML dump only provided the page title and

Re: Issue w/ highlighting a String field

2010-03-24 Thread Saïd Radhouani
2010/3/24 Ahmet Arslan iori...@yahoo.com With this configuration, the title field is highlighted only when there's a perfect match, i.e., the quoted query equals the title content (f.i., q=Terrain sehloul allows highlighting the entire title containing Terrain sehloul, Exactly.

How do I create a solr core with the data from an existing one?

2010-03-24 Thread Steve Dupree
*Solr 1.4 Enterprise Search Server* recommends doing large updates on a copy of the core, and then swapping it in for the main core. I tried following these steps: 1. Create prep core: http://localhost:8983/solr/admin/cores?action=CREATE&name=prep&instanceDir=main 2. Perform index update,
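
A minimal sketch of how the CoreAdmin requests described in this message could be built so that the parameters are joined with `&` and escaped. It only constructs the URLs and does not contact a server. The host and port come from the message; the `core_admin_url` helper is hypothetical, and the SWAP call (with its `core` and `other` parameters) is an assumption about the final step, which the truncated message does not show.

```python
from urllib.parse import urlencode

# From the message: a local Solr with the CoreAdmin handler at /solr/admin/cores.
CORE_ADMIN = "http://localhost:8983/solr/admin/cores"


def core_admin_url(action, **params):
    """Hypothetical helper: build a CoreAdmin URL with '&'-separated, escaped parameters."""
    query = urlencode({"action": action, **params})
    return f"{CORE_ADMIN}?{query}"


# Step 1 from the message: create the 'prep' core on the 'main' instance directory.
create_url = core_admin_url("CREATE", name="prep", instanceDir="main")

# Assumed final step: swap the updated 'prep' core in for 'main'.
swap_url = core_admin_url("SWAP", core="main", other="prep")

print(create_url)
print(swap_url)
```

Building the query string with `urlencode` avoids hand-concatenating parameters, which is where a missing `&` is easy to introduce.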

Re: How do I create a solr core with the data from an existing one?

2010-03-24 Thread gwk
Hi, I'm not sure if it's the best option but you could use replication to copy the index (http://wiki.apache.org/solr/SolrReplication). As long as your core is configured as a master you can use the fetchindex command to do a one-time replication from the new core (see the HTTP API section in
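
A minimal sketch of the one-time replication request suggested here, assuming the replication handler is mounted at `/replication` under each core and that the source core is configured as a master. The core names `prep` and `main`, the host, and the `fetchindex_url` helper are illustrative assumptions, not taken from this message. The code only builds the URL.

```python
from urllib.parse import urlencode


def fetchindex_url(target_core_base, master_core_base):
    """Hypothetical helper: URL asking the target core to pull the index once from the master."""
    query = urlencode({
        "command": "fetchindex",
        # masterUrl points at the master core's replication handler.
        "masterUrl": f"{master_core_base}/replication",
    })
    return f"{target_core_base}/replication?{query}"


# Assumed core layout: 'prep' pulls a copy of the index from 'main'.
url = fetchindex_url(
    "http://localhost:8983/solr/prep",
    "http://localhost:8983/solr/main",
)
print(url)
```

The `masterUrl` value is percent-encoded because it is itself a URL carried inside a query string.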

Re: Impossible Boost Query?

2010-03-24 Thread blargy
This sounds a little closer to what I want but I don't want fully randomized results. How exactly does this field work? Is it more than just a simple random sort (order by rand())? What would be nice is if I could randomize documents within a certain score percentage of each other. Is this

Re: wikipedia and teaching kids search engines

2010-03-24 Thread Erick Erickson
Erik: In a former incarnation, I thought I was going to teach 6th graders. Until I found out I can't deal with 25 kids for 6 hours at a stretch for years on end. My thoughts, presented in a "feel free to ignore, but this is what I'd do" spirit. There are some random thoughts below, but here's

Re: SOLR-1316 How To Implement this autosuggest component ???

2010-03-24 Thread stocki
hey. i got it =) i checked out with lucene and the build from solr. with ant -verbose example. now, when i put this line into solrconfig: <str name="classname">org.apache.solr.spelling.suggest.Suggester</str> no exception occurs =) juhu but how does this component work ?? sorry for a new stupid

Re: wikipedia and teaching kids search engines

2010-03-24 Thread Walter Underwood
This is brilliant. I love it! Is a computer game a document? How about each level, each room, each player? If you want some fancy linguistics besides stemming, try compounding or what I call one word or two? English loves to glom words together. schoolroom or school room? babysitter,

Re: wikipedia and teaching kids search engines

2010-03-24 Thread Andrzej Bialecki
On 2010-03-24 16:15, Markus Jelsma wrote: A bit off-topic but how about Nutch grabbing some content and have it indexed in Solr? The problem is not with collecting and submitting the documents, the problem is with parsing the Wikimedia markup embedded in XML. WikipediaTokenizer from Lucene

Re: Issue w/ highlighting a String field

2010-03-24 Thread Ahmet Arslan
Thanks a lot Ahmet. In addition, I want to highlight phrases containing stop words. I guess that the best way is to use a tokenized type without stopwordFilter. Do you agree with me defining a new type for this purpose ? I am not sure about that. Maybe solr.CommonGramsFilterFactory can do

Re: multiple binary documents into a single solr document - Vignette/OpenText integration

2010-03-24 Thread Andrzej Bialecki
On 2010-03-24 15:58, Fábio Aragão da Silva wrote: hello there, I'm working on the development of a piece of code that integrates Solr with Vignette/OpenText Content Management, meaning Vignette content instances will be indexed in solr when published and deleted from solr when unpublished. I'm

Re: Issue w/ highlighting a String field

2010-03-24 Thread Saïd Radhouani
2010/3/24 Ahmet Arslan iori...@yahoo.com Thanks a lot Ahmet. In addition, I want to highlight phrases containing stop words. I guess that the best way is to use a tokenized type without stopwordFilter. Do you agree with me defining a new type for this purpose ? I am not sure about

Re: Issue w/ highlighting a String field

2010-03-24 Thread Ahmet Arslan
Yes, that's what I was expecting. Actually, I'd like to highlight phrases containing stopwords, like <em>Terrain à sehloul</em> Lucene's FastVectorHighlighter[1] can do that kind of phrase highlighting. It seems that solr integration [2] has finished. You need to apply SOLR-1268 patch.

Re: Issue w/ highlighting a String field

2010-03-24 Thread Saïd Radhouani
Thanks a lot Ahmet. Now I'm gonna learn a new thing: how to apply a new patch :) Cheers. 2010/3/24 Ahmet Arslan iori...@yahoo.com Yes, that's what I was expecting. Actually, I'd like to highlight phrases containing stopwords, like <em>Terrain à sehloul</em> Lucene's FastVectorHighlighter[1]

Re: If you could have one feature in Solr...

2010-03-24 Thread Teruhiko Kurosaka
(Sorry for the very late response on this topic.) On Feb 28, 2010, at 5:47 AM, Adrien Specq wrote: - language attribute for each field I was thinking about it and it was one of my wishes. Currently, Solr practically requires that we have a field for each natural language that an application

Re: wikipedia and teaching kids search engines

2010-03-24 Thread Chris Hostetter
: My goal is to index wikipedia in order to demonstrate search to a class of : middle school kids that I've volunteered to teach for a couple of hours. : Which brings me to my next question... twitter data is a little easier to ingest than the wikipedia markup (the json based streaming

Re: If you could have one feature in Solr...

2010-03-24 Thread Dennis Gearon
Most databases only RECENTLY have set up languages per column. Languages per ENTRY in a column? I don't think any support that yet. How would you get that information from a database with the corresponding language attribute? Dennis Gearon

Field Collapsing SOLR-236

2010-03-24 Thread blargy
Has anyone had any luck with the field collapsing patch (SOLR-236) with Solr 1.4? I tried patching my version of 1.4 with no such luck. Thanks

Re: wikipedia and teaching kids search engines

2010-03-24 Thread Grant Ingersoll
On Mar 24, 2010, at 1:53 PM, Andrzej Bialecki wrote: On 2010-03-24 16:15, Markus Jelsma wrote: A bit off-topic but how about Nutch grabbing some content and have it indexed in Solr? The problem is not with collecting and submitting the documents, the problem is with parsing the Wikimedia

Re: update some index documents after indexing process is done with DIH

2010-03-24 Thread ANKITBHATNAGAR
: If you make your EventListener implements SolrCoreAware you can get : hold of the core on inform. use that to get hold of the : SolrIndexWriter Implementing SolrCoreAware I can get hold of the core and easy get hold of A SolrIndexSearcher and so a reader. But I can't see the way to get hold

Re: If you could have one feature in Solr...

2010-03-24 Thread Teruhiko Kurosaka
First of all, I am not really concerned with per field (or per-column in DB term) portion of the original request. Most documents are monolingual. How languages are identified depends on your application, and database support of language tagging is not necessary. The database schema designer may

Re: SOLR-1316 How To Implement this autosuggest component ???

2010-03-24 Thread Alexey Serba
You should add this component (suggest or spellcheck, depending on how you name it) to a request handler, i.e. add <requestHandler name="/suggest" class="org.apache.solr.handler.component.SearchHandler"> <lst name="defaults"> </lst> <arr name="components"> <str>suggest</str> </arr>

RE: lowercasing for sorting

2010-03-24 Thread Binkley, Peter
Thanks - wouldn't want to get you into trouble! It's handy when selling the idea of using Solr in the Canadian academic world to be able to drop names like the Globe and Mail, though. If I do I'll keep my source confidential. Peter

Re: Encoding problem with ExtractRequestHandler for HTML indexing

2010-03-24 Thread Teruhiko Kurosaka
I suppose you mean Extract_ing_RequestHandler. Out of curiosity, I sent in a Japanese HTML file of EUC-JP encoding, and it converted to Unicode properly and the index has correct Japanese words. Do your HTML files have a META tag for Content-type with the value having charset= ? For example,

multicore embedded swap / reload etc.

2010-03-24 Thread Nagelberg, Kallin
Hi, I've got a situation where I need to reindex a core once a day. To do this I was thinking of having two cores, one 'live' and one 'staging'. The app is always serving 'live', but when the daily index happens it goes into 'staging', then staging is swapped into 'live'. I can see how to do

Seattle Hadoop/Scalability/NoSQL Meetup Wednesday, March 31st. w/ LinkedIn's Jake Mannix

2010-03-24 Thread Bradford Stephens
Greetings, Don't forget that the Hadoop/Scalability/NoSQL meetup is next Wednesday, March 31st at 6:45pm! We're going to have a very exciting guest: Jake Mannix from LinkedIn will talk about machine learning on Hadoop. He's a well-decorated engineer across many disciplines, and even knows quite a

Apache Lucene EuroCon Call For Participation: Prague, Czech Republic May 20 & 21, 2010

2010-03-24 Thread Grant Ingersoll
Apache Lucene EuroCon Call For Participation - Prague, Czech Republic May 20 & 21, 2010 All submissions must be received by Tuesday, April 13, 2010, 12 Midnight CET/6 PM US EDT The first European conference dedicated to Lucene and Solr is coming to Prague from May 18-21, 2010. Apache Lucene

Fwd: Apache Lucene EuroCon Call For Participation: Prague, Czech Republic May 20 & 21, 2010

2010-03-24 Thread Yonik Seeley
Forwarding to solr only - the big cross-post caused my gmail filters to file it. -Yonik -- Forwarded message -- From: Grant Ingersoll gsing...@apache.org Date: Wed, Mar 24, 2010 at 8:03 PM Subject: Apache Lucene EuroCon Call For Participation: Prague, Czech Republic May 20 & 21,

Re: Field Collapsing SOLR-236

2010-03-24 Thread Dennis Gearon
Boy, I hope that field collapsing works! I'm planning on using it heavily. Dennis Gearon