Re: update some index documents after indexing process is done with DIH

2009-07-29 Thread Chris Hostetter
This thread all sounds really kludgy ... among other things the newSearcher listener is going to need to somehow keep track of when it was called as a result of a "real" commit, vs when it was called as the result of a commit it itself triggered to make changes. wouldn't an easier place to im

Re: solr indexing on same set of records with different value of unique field...not working...

2009-07-29 Thread Chris Hostetter
I'm not really understanding how you could get the situation you describe ... which suggests that one (or both) of us don't understand exactly what happened. if you can post the actual schema.xml file you used and an example of the input you indexed perhaps we can spot the discrepancy. FWIW:

Re: DocList Pagination

2009-07-29 Thread Chris Hostetter
: Hi, I am trying to get the next DocList "page" in my custom search component. : Could I get a code example of this? you just increase the "offset" value you pass to SolrIndexSearcher.getDocList by whatever your page size is. (if you use the newer QueryCommand versions you just call setOffset wi
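The paging arithmetic in this reply can be sketched in a few lines. This is only an illustration of how the offset advances, written in Python rather than against the Solr API; `page_offset` and `next_offset` are hypothetical helper names, and in a real search component the resulting value would be the offset handed to SolrIndexSearcher.getDocList.

```python
def page_offset(page_number, page_size):
    """Return the zero-based offset of the first document on a page.

    page_number is zero-based, so page 0 starts at offset 0.
    """
    if page_number < 0 or page_size <= 0:
        raise ValueError("page_number must be >= 0 and page_size must be > 0")
    return page_number * page_size


def next_offset(current_offset, page_size):
    """Advance the current offset by one page, as the reply suggests."""
    return current_offset + page_size
```

With a page size of 10, the third page (page_number 2) starts at offset 20, and the following page at offset 30.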

Re: issue inquiry: unterminated index lock after optimize update command

2009-07-29 Thread Chris Hostetter
: I'm using solr build 2009-06-16_08-06-14, in multicore configuration. : When I issue the update command "optimize" to a core, the index files : are locked and never released. Calling the coreAdmin unload method on : the core unload the core but does not unlock the underlying index files. : The

Re: update some index documents after indexing process is done with DIH

2009-07-29 Thread Noble Paul നോബിള്‍ नोब्ळ्
If you make your EventListener implement SolrCoreAware you can get hold of the core on inform. use that to get hold of the SolrIndexWriter On Wed, Jul 29, 2009 at 9:20 PM, Marc Sturlese wrote: > > From the newSearcher(..) of a CustomEventListener which extends of > AbstractSolrEventListener  can

Re: Is there a multi-shard optimize message?

2009-07-29 Thread Chris Hostetter
: > Normally to optimize an index you POST to /solr/update. Is : > there any way to POST an optimize message to one instance and have it : > propagate to all shards sort of like the select? : > : > /solr-shard-1/select?q=dog... shards=shard-1,shard2 : No, you'll need to send optimize to each ho
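The reply is cut off, but it starts by saying that optimize has to be sent separately rather than propagated. A minimal sketch of that loop, assuming the standard `<optimize/>` XML update message posted to each instance's /update URL: the host names are hypothetical, and the function only builds the requests, it does not send anything.

```python
def optimize_requests(shard_base_urls):
    """Build one (url, body) pair per shard for an optimize message.

    Each base URL is assumed to be the root of a Solr instance,
    for example "http://host1:8983/solr" (hypothetical).
    """
    body = "<optimize/>"
    return [(base.rstrip("/") + "/update", body) for base in shard_base_urls]


# Hypothetical shard list; replace with the real instances.
SHARDS = ["http://host1:8983/solr", "http://host2:8983/solr/"]
REQUESTS = optimize_requests(SHARDS)
```

Each pair in REQUESTS can then be POSTed with whatever HTTP client is already in use, with a text/xml content type.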

RE: Boosting ('bq') on multi-valued fields

2009-07-29 Thread KaktuChakarabati
Hey Ken, Thanks for your reply. When I wrote '5|6' I meant that this is a multiValued field with two values '5' and '6', rather than the literal string '5|6' (and any Tokenizer). Does your reply still hold? That is, are multiValued fields dependent on the notion of tokenization to such a degree so

Re: deleteById always returning OK

2009-07-29 Thread Koji Sekiguchi
Reuben Firmin wrote: Is it expected behaviour that "deleteById" will always return OK as a status, regardless of whether the id was matched? It is expected behaviour as Solr always returns 0 unless an error occurs during processing a request (query, update, ...), so you don't need to check t

Re: Wildcard and boosting

2009-07-29 Thread Jón Helgi Jónsson
I just updated to nightly build (I was using 1.2) and this does not seem to be an issue anymore. 2009/7/29 Jón Helgi Jónsson : > Hey now! > > I do index time boosting for my fields and just discovered that when > searching with a trailing wild card the boosting is ignored. > > Will my boosting wor

Re: THIS WEEK: PNW Hadoop, HBase / Apache Cloud Stack Users' Meeting, Wed Jul 29th, Seattle

2009-07-29 Thread Bradford Stephens
Don't forget this is tonight! Excited to see everyone there. On Tue, Jul 28, 2009 at 11:25 AM, Bradford Stephens wrote: > Hey everyone, > > SLIGHT change of plans. > > A few people have asked me to move to a place with Air Conditioning, > since the temperature's in the 90's this week. So, here we

Re: search suggest

2009-07-29 Thread Jason Rutherglen
I created an issue and have added some notes https://issues.apache.org/jira/browse/SOLR-1316 On Wed, Jul 29, 2009 at 3:15 PM, Jason Rutherglen wrote: > Here's a good article on Ternary Trees: http://www.ddj.com/windows/184410528 > > I looked at the one in Lucene, I don't understand why the find me

deleteById always returning OK

2009-07-29 Thread Reuben Firmin
Is it expected behaviour that "deleteById" will always return OK as a status, regardless of whether the id was matched? I have a unit test:
// set up the test data
engine.index(12345, s1, d1);
engine.index(54321, s2, d2);
engine.index(23453, s3, d3);
// ...
@Test
public void t

Re: Indexing TIKA extracted text. Are there some issues?

2009-07-29 Thread ashokc
Could very well be... I will rectify it and try again. Thanks - ashok Robert Muir wrote: > > it appears there is an encoding problem, in the screenshot I can see > the title is mangled, and if i open up the URL in IE or firefox, both > browsers think it is iso-8859-1. > > I think this is why

Re: Indexing TIKA extracted text. Are there some issues?

2009-07-29 Thread Robert Muir
it appears there is an encoding problem, in the screenshot I can see the title is mangled, and if i open up the URL in IE or firefox, both browsers think it is iso-8859-1. I think this is why (from w3c validator): Character Encoding mismatch! The character encoding specified in the HTTP header (

Re: search suggest

2009-07-29 Thread Jason Rutherglen
Here's a good article on Ternary Trees: http://www.ddj.com/windows/184410528 I looked at the one in Lucene, I don't understand why the find method only returns a char/int? On Wed, Jul 29, 2009 at 2:33 PM, Robert Petersen wrote: > Simple minded autosuggest can just not tokenize the phrases at all

Re: Indexing TIKA extracted text. Are there some issues?

2009-07-29 Thread ashokc
Sure. The java command I use with TIKA to extract text from a URL is: java -jar tika-0.3-standalone.jar -t $url I have also attached the screenshots of the web page, post documents produced in the two different ways (Perl & Tika) for that web page, and the screenshots of the search result for a

RE: search suggest

2009-07-29 Thread Robert Petersen
Simple minded autosuggest can just not tokenize the phrases at all and so the wildcards just complete whatever the user has typed so far including spaces. Upon encountering a space though, autosuggest should wait to make more suggestions until the user has typed at least a couple of letters of the

Re: search suggest

2009-07-29 Thread manuel aldana
also watch out that you have a good stopwords list otherwise the suggestions won't be helpful for the user. Jack Bates wrote: how can i use solr to make search suggestions? i'm thinking google-style suggestions, which suggests more refined queries - vs. freebase-style suggestions, which suggest

RE: query and analyzers

2009-07-29 Thread Harsch, Timothy J. (ARC-SC)[PEROT SYSTEMS]
That did it, thanks! I thought that was how it should work, but I guess somehow I got out of sync or something at one point which led me to dive deeper into it than I needed to. -Original Message- From: AHMET ARSLAN [mailto:iori...@yahoo.com] Sent: Wednesday, July 29, 2009 12:52 PM To:

RE: query and analyzers

2009-07-29 Thread AHMET ARSLAN
In order to match (query) XYZ1* to (document) XYZ123 you do not need WordDelimiterFilterFactory. You need a tokenizer that recognizes XYZ123 as one token. And WhitespaceTokenizer is one of them. As I see from the fieldType named text_ws, you want to use WhitespaceTokenizerFactory and there is

RE: query and analyzers

2009-07-29 Thread Harsch, Timothy J. (ARC-SC)[PEROT SYSTEMS]
This was the definition I was last working with (I've been playing with setting the various parameters). -Original Message- From: AHMET ARSLAN [mailto:iori...@yahoo.com] Sent: Wednesday,

Visualizing Semantic Journal Space (large scale) using full-text

2009-07-29 Thread Glen Newton
I thought the Lucene and Solr communities would find this interesting: My collaborators and I have used LuSql, Lucene and Semantic Vectors to visualize a large scale semantic journal space (kind of like 'Maps of Science') of a large scale (5.7 million articles) journal article collection using only

Re: search suggest

2009-07-29 Thread Jason Rutherglen
http://lucene.apache.org/java/2_4_0/api/org/apache/lucene/analysis/compound/hyphenation/TernaryTree.html On Wed, Jul 29, 2009 at 12:08 PM, Jason Rutherglen wrote: > Autosuggest is something that would be very useful to build into > Solr as many search projects require it. > > I'd recommend indexin

Re: search suggest

2009-07-29 Thread Jason Rutherglen
Autosuggest is something that would be very useful to build into Solr as many search projects require it. I'd recommend indexing relevant terms/phrases into a Ternary Search Tree which is compact and performant. Using a wildcard query will likely not be as fast as a Ternary Tree, and I'm not sure
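To make the Ternary Search Tree suggestion concrete, here is a small self-contained sketch in Python. It is not the Lucene TernaryTree class and not anything from SOLR-1316; the class and method names are made up for illustration. It stores whole phrases character by character and returns the stored phrases that start with what the user has typed so far.

```python
class _Node:
    """One character in the tree with three children: lower, equal, higher."""
    __slots__ = ("ch", "lo", "eq", "hi", "is_end")

    def __init__(self, ch):
        self.ch = ch
        self.lo = self.eq = self.hi = None
        self.is_end = False  # True if a stored phrase ends at this node


class TernarySearchTree:
    def __init__(self):
        self._root = None

    def insert(self, phrase):
        """Store a non-empty phrase; empty strings are ignored."""
        if phrase:
            self._root = self._insert(self._root, phrase, 0)

    def _insert(self, node, phrase, i):
        ch = phrase[i]
        if node is None:
            node = _Node(ch)
        if ch < node.ch:
            node.lo = self._insert(node.lo, phrase, i)
        elif ch > node.ch:
            node.hi = self._insert(node.hi, phrase, i)
        elif i + 1 < len(phrase):
            node.eq = self._insert(node.eq, phrase, i + 1)
        else:
            node.is_end = True
        return node

    def suggest(self, prefix, limit=10):
        """Return up to `limit` stored phrases starting with `prefix`, sorted."""
        if not prefix:
            return []
        node, i = self._root, 0
        while node is not None:
            ch = prefix[i]
            if ch < node.ch:
                node = node.lo
            elif ch > node.ch:
                node = node.hi
            else:
                i += 1
                if i == len(prefix):
                    break  # node now holds the last character of the prefix
                node = node.eq
        if node is None:
            return []
        results = [prefix] if node.is_end else []
        self._collect(node.eq, prefix, results, limit)
        return results[:limit]

    def _collect(self, node, path, results, limit):
        """In-order walk, which yields phrases in lexicographic order."""
        if node is None or len(results) >= limit:
            return
        self._collect(node.lo, path, results, limit)
        if len(results) < limit:
            if node.is_end:
                results.append(path + node.ch)
            self._collect(node.eq, path + node.ch, results, limit)
        self._collect(node.hi, path, results, limit)
```

A typical use would be to insert the known terms or past queries once and call suggest on each keystroke. Spaces are ordinary characters here, so untokenized phrases complete naturally.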

Re: query in solr lucene

2009-07-29 Thread Avlesh Singh
You may index your data using a delimiter, like $my-field-content$. While searching, perform a phrase query with the leading and trailing "$" appended to the query string. Cheers Avlesh On Wed, Jul 29, 2009 at 12:04 PM, Sushan Rungta wrote: > I tried using AND, but it even provided me doc 3 whi
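The delimiter idea can be sketched as two small helpers, one used at index time and one at query time. This is an assumption-laden illustration: the field name is hypothetical, and whether the "$" characters survive as tokens depends entirely on the analyzer configured for the field, which the message does not show.

```python
DELIM = "$"


def wrap_for_index(value):
    """Surround the field content with the delimiter before indexing."""
    return f"{DELIM}{value}{DELIM}"


def exact_phrase_query(field, user_input):
    """Build a phrase query with the leading and trailing delimiter.

    Backslashes and double quotes in the input are escaped so the
    phrase stays well formed.
    """
    escaped = user_input.replace("\\", "\\\\").replace('"', '\\"')
    return f'{field}:"{DELIM}{escaped}{DELIM}"'
```

Because the phrase includes both delimiters, a document whose field merely contains the words somewhere in the middle should not match, provided the delimiters are kept by the analyzer.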

Re: query and analyzers

2009-07-29 Thread AHMET ARSLAN
> What analyzer, tokenizer, filter factory would I need to > use to get wildcard matching to match where: > Value: > XYZ123 > Query: > XYZ1* StandardAnalyzer, WhitespaceAnalyzer. > I have been messing with solr.WordDelimiterFilterFactory > splitOnNumerics and preserveOriginal in both the analyz

query and analyzers

2009-07-29 Thread Harsch, Timothy J. (ARC-SC)[PEROT SYSTEMS]
Hi, What analyzer, tokenizer, filter factory would I need to use to get wildcard matching to match where: Value: XYZ123 Query: XYZ1* I have been messing with solr.WordDelimiterFilterFactory splitOnNumerics and preserveOriginal in both the analyzer and the query. I also noticed it is different

Multi select faceting

2009-07-29 Thread Mike
Hi, We're using Lucid Imagination's LucidWorks Solr 1.3 and we have a requirement to implement multiple-select faceting where the facet cells show up as checkboxes and despite checked options, all of the options continue to persist with counts. The best example I found is the search on Lucid Im

Re: Getting Tika to work in Solr 1.4 nightly

2009-07-29 Thread Yonik Seeley
Hi Kevin, The parameter names have changed in the latest Solr 1.4 builds... please see http://wiki.apache.org/solr/ExtractingRequestHandler -Yonik http://www.lucidimagination.com On Wed, Jul 29, 2009 at 10:17 AM, Kevin Miller wrote: > I am working with Solr 1.4 nightly and am running it on a Wi

RE: refering/alias other Solr documents

2009-07-29 Thread Steven A Rowe
Hi Ravi, This may help: http://wiki.apache.org/solr/HierarchicalFaceting Steve > -Original Message- > From: ravi.gidwani [mailto:ravi.gidw...@gmail.com] > Sent: Wednesday, July 29, 2009 3:24 AM > To: solr-user@lucene.apache.org > Subject: refering/alias other Solr documents > > > H

Wildcard and boosting

2009-07-29 Thread Jón Helgi Jónsson
Hey now! I do index time boosting for my fields and just discovered that when searching with a trailing wild card the boosting is ignored. Will my boosting work with a wild card if I do it at query time? And if so is there a lot of performance difference? Some other method I can use to preserve

RE: search suggest

2009-07-29 Thread Robert Petersen
To do a proper search suggest feature you have to index all the queries your system gets and search it with wildcards for matches on what the user has typed so far for each user keystroke in the search box... Usually with some timer logic to wait for a small hesitation in their typing. -O

Re: update some index documents after indexing process is done with DIH

2009-07-29 Thread Marc Sturlese
From the newSearcher(..) of a CustomEventListener which extends AbstractSolrEventListener I can access the SolrIndexSearcher and all core properties, but I can't get a SolrIndexWriter. Do you know how I can get a SolrIndexWriter from there? This way I would be able to modify the documents (I need to

RE: Boosting ('bq') on multi-valued fields

2009-07-29 Thread Ensdorf Ken
> Hey, > I have a field defined as such: > > stored="false" > multiValued="true" /> > > with the string type defined as: > > omitNorms="true"/> > > When I try using some query-time boost parameters using the bq on > values of > this field it seems to behave > strangely in case of documents actua

Re: Relevant results with DisMaxRequestHandler

2009-07-29 Thread Erik Hatcher
On Jul 29, 2009, at 6:55 AM, Vincent Pérès wrote: Using the following query : http://localhost:8983/solr/others/select/?debugQuery=true&q=anna%20lewis&rows=20&start=0&fl=*&qt=dismax I get back around 100 results. The first two follow: Person:151 Victoria Davisson Person:37 Anna Lewis And

Re: FieldCollapsing: Two response elements returned?

2009-07-29 Thread Licinio Fernández Maurelo
My last mail is wrong. Sorry On 29 July 2009 11:10, Licinio Fernández Maurelo wrote: > I've applied latest collapse field related patch (patch-3) and it doesn't > work. > Anyone knows how can i get only the collapse response ? > > > 29-jul-2009 11:05:21 org.apache.solr.common.SolrExcept

Getting Tika to work in Solr 1.4 nightly

2009-07-29 Thread Kevin Miller
I am working with Solr 1.4 nightly and am running it on a Windows machine. Solr is running using the example folder that was installed from the zip file. The only alteration that I have made to this default installation is to add a simple Word document into the exampledocs folder. I am trying to

Question about formatting the results returned from Solr

2009-07-29 Thread ahammad
Hi all, Not sure how good my title is, but here is a (hopefully) better explanation of what I mean. I am indexing a set of articles from a DB. Each article has an author. The author is saved in the DB as an author ID, which is a number. There is another table in the DB with more relevant i

Re: facet.prefix question

2009-07-29 Thread Koji Sekiguchi
Licinio Fernández Maurelo wrote: i'm trying to do some filtering in the count list retrieved by solr when doing a faceting query , i'm wondering how can i use facet.prefix to get something like this: Query facet.field=foo&facet.prefix=A OR B Response - 12560 5440 2357 . . . How

Relevant results with DisMaxRequestHandler

2009-07-29 Thread Vincent Pérès
Hello, I did notice several strange behaviors on queries. I would like to share with you an example, so maybe you can explain to me what is going wrong. Using the following query : http://localhost:8983/solr/others/select/?debugQuery=true&q=anna%20lewis&rows=20&start=0&fl=*&qt=dismax I get back

Re: solr/home in web.xml relative to web server home

2009-07-29 Thread Shalin Shekhar Mangar
On Wed, Jul 29, 2009 at 2:42 PM, Chantal Ackermann < chantal.ackerm...@btelligent.de> wrote: > Hi all, > > the environment variable (env-entry) in web.xml to configure the solr/home > is relative to the web server's working directory. I find this unusual as > all the servlet paths are relative to

Re: HTTP Status 500 - java.lang.RuntimeException: Can't find resource 'solrconfig.xml'

2009-07-29 Thread Koji Sekiguchi
As Solr said in the log, Solr couldn't find solrconfig.xml in classpath or solr.solr.home, cwd. My guess is that relative path you set for solr.solr.home was incorrect? Why don't you try: solr.solr.home=/home/huenzhao/search/tomcat6/bin/solr instead of: solr.solr.home=home/huenzhao/search/tomc
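The difference the missing leading slash makes can be shown with a short path check. This is only a sketch: the working directory below is a hypothetical example, and because the relative value is cut off in the message, the code assumes it is the same path without the leading slash.

```python
import posixpath


def resolve_solr_home(path, cwd):
    """Return the directory a solr.solr.home value would point to.

    An absolute path is used as-is; a relative one is resolved
    against the current working directory of the server process.
    """
    if posixpath.isabs(path):
        return path
    return posixpath.normpath(posixpath.join(cwd, path))


# Hypothetical working directory of the servlet container.
CWD = "/home/huenzhao/search/tomcat6/bin"

ABSOLUTE = resolve_solr_home("/home/huenzhao/search/tomcat6/bin/solr", CWD)
# Assumed relative value: the absolute path minus its leading slash.
RELATIVE = resolve_solr_home("home/huenzhao/search/tomcat6/bin/solr", CWD)
```

ABSOLUTE stays where it was written, while RELATIVE ends up nested under the working directory, which would explain solrconfig.xml not being found.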

Re: debugQuery=true issue

2009-07-29 Thread gwk
Hi, Thanks for your response, I'm still developing so the schema is still in flux so I guess that explains it. Oh and regarding the NPE, I updated my checkout and recompiled and now it's gone so I guess somewhere between revision 787997 and 798482 it's already been fixed. Regards, gwk Robe

Re: highlighting performance

2009-07-29 Thread Koji Sekiguchi
Just an FYI, Lucene 2.9 has FastVectorHighlighter: http://hudson.zones.apache.org/hudson/job/Lucene-trunk/javadoc/all/org/apache/lucene/search/vectorhighlight/package-summary.html Features * fast for large docs * support N-gram fields * support phrase-unit highlighting with slops *

solr/home in web.xml relative to web server home

2009-07-29 Thread Chantal Ackermann
Hi all, the environment variable (env-entry) in web.xml to configure the solr/home is relative to the web server's working directory. I find this unusual as all the servlet paths are relative to the web application's directory (webapp context, that is). So, I specified solr/home relative to th

Re: FieldCollapsing: Two response elements returned?

2009-07-29 Thread Licinio Fernández Maurelo
I've applied the latest collapse field related patch (patch-3) and it doesn't work. Does anyone know how I can get only the collapse response? 29-jul-2009 11:05:21 org.apache.solr.common.SolrException log GRAVE: java.lang.ClassCastException: org.apache.solr.handler.component.CollapseComponent cannot be

Re: update some index documents after indexing process is done with DIH

2009-07-29 Thread Noble Paul നോബിള്‍ नोब्ळ्
On Tue, Jul 28, 2009 at 5:17 PM, Marc Sturlese wrote: > > That really sounds the best way to reach my goal. How could I invoque a > listener from the newSearcher?Would be something like: >     >       >         solr 0 name="rows">10 >         rocks 0 name="rows">10 >        static newSearcher w

Boosting ('bq') on multi-valued fields

2009-07-29 Thread KaktuChakarabati
Hey, I have a field defined as such: with the string type defined as: When I try using some query-time boost parameters using the bq on values of this field it seems to behave strangely in case of documents actually having multiple values: If i'd do a boost for a particular value ( "site_id

refering/alias other Solr documents

2009-07-29 Thread ravi.gidwani
Hi all: Is there a feature in Solr that will allow documents to refer to each other? In other words, if a search for "abc" matches on document 1, I should be able to return document 2 even though document 2 has no fields matching "abc". Here is the scenario with some more details: Solr version:1.3 Sce

Re: Is there a multi-shard optimize message?

2009-07-29 Thread Shalin Shekhar Mangar
On Wed, Jul 29, 2009 at 2:48 AM, Phillip Farber wrote: > > Normally to optimize an index you POST to /solr/update. Is > there any way to POST an optimize message to one instance and have it > propagate to all shards sort of like the select? > > /solr-shard-1/select?q=dog... shards=shard-1,shard