Re: LuceneRevolution - NoSQL: A comparison

2010-10-13 Thread Dennis Gearon
Ahh, LOL! I wouldn't have thought about that unless I were fixing the issues that you guys have worked on. Dennis Gearon Signature Warning It is always a good idea to learn from your own mistakes. It is usually a better idea to learn from others’ mistakes, so you do not have t

Question related to phrase search in lucene/solr?

2010-10-13 Thread Ahson Iqbal
Hi all, I have a question: is it possible to perform a phrase search with wildcards in Solr/Lucene? I have two queries that return exactly the same results: one is +Contents:"change market" and the other is +Contents:"change* market". I think the second should match "changes market" as well, but I

Re: searching while importing

2010-10-13 Thread Shawn Heisey
If I haven't deleted the index for some reason before doing the full import, then I can search the old data. On 10/13/2010 4:41 PM, Tri Nguyen wrote: Hi, As long as I can search on the current ("older") index while importing, I'm good. I've tested this and I can search the older index while

Re: Upgrade to Solr 1.4, very slow at start up when loading all cores

2010-10-13 Thread Renee Sun
Just an update on this issue: we turned off the new/first searchers (on the upgrade to Solr 1.4.1) and ran benchmark tests. There is no noticeable performance impact on the queries we perform compared with the Solr 1.3 benchmark tests WITH new/first searchers. Also the memory usage reduced by 5.5 GB after

Re: searching while importing

2010-10-13 Thread 朱炎詹
You can build 2 similar Solr cores: one for service and one for importing. When the importing is done, you can do either a MERGE or a SWAP action, depending on how you put your data on these 2 cores. - Original Message - From: "Tri Nguyen" To: Sent: Thursday, October 14, 2010 5:51 AM S
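The SWAP approach described above can be sketched as a single CoreAdmin request. This is only an illustration under assumptions: the base URL and the core names `live` and `importing` are hypothetical, and it assumes a multicore setup where the CoreAdmin handler is reachable at `/admin/cores`.

```python
from urllib.parse import urlencode


def core_swap_url(base_url, serving_core, import_core):
    """Build a CoreAdmin SWAP request URL that exchanges two cores.

    base_url and both core names are hypothetical placeholders.
    """
    params = {"action": "SWAP", "core": serving_core, "other": import_core}
    return f"{base_url}/admin/cores?{urlencode(params)}"


if __name__ == "__main__":
    # After the import into "importing" has finished and been committed,
    # requesting this URL would make the freshly built core the serving one.
    print(core_swap_url("http://localhost:8983/solr", "live", "importing"))
```

Searches keep hitting the old core until the swap request is issued, so the long import never exposes a half-built index.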

Re: What is the maximum number of documents that can be indexed ?

2010-10-13 Thread 朱炎詹
Solr is designed with a scalable architecture. So your question depends on how many resources (CPU, memory, space, etc.) you have to scale Solr: how high (within a single machine), how wide (how fast you wish to respond to user requests, using replication), and how deep (how many slices/partition (Sol

Re: What is the maximum number of documents that can be indexed ?

2010-10-13 Thread Otis Gospodnetic
Marco (use solr-u...@lucene list to follow up, please), There are no precise answers to such questions. Solr can keep indexing. The limit is, I think, the available disk space. I've never pushed Solr or Lucene to the point where Lucene index segments would become a serious pain, but even tha

Re: searching while importing

2010-10-13 Thread Ken Stanley
On Wed, Oct 13, 2010 at 6:38 PM, Shawn Heisey wrote: > If you are using the DataImportHandler, you will not be able to search new > data until the full-import or delta-import is complete and the update is > committed. When I do a full reindex, it takes about 5 hours, and until it > is finished,

Re: Prioritizing adjectives in solr search

2010-10-13 Thread Erick Erickson
Spans do care about the order of words, so that might help Erick On Tue, Oct 12, 2010 at 11:23 PM, Ron Mayer wrote: > Erick Erickson wrote: > > You can do some interesting things with payloads. You could index a > > particular value as the payload that identified the "kind" of word it > was

Re: searching while importing

2010-10-13 Thread Tri Nguyen
Hi, As long as I can search on the current ("older") index while importing, I'm good.  I've tested this and I can search the older index while data-importing the newer index. So you can search the older index in your 5 hour wait? Thanks, Tri From: Shawn He

Re: searching while importing

2010-10-13 Thread Shawn Heisey
If you are using the DataImportHandler, you will not be able to search new data until the full-import or delta-import is complete and the update is committed. When I do a full reindex, it takes about 5 hours, and until it is finished, I cannot search it. I have not tried to issue a manual co
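As a rough sketch of the import behaviour discussed in this thread, the request below asks the DataImportHandler for a full import without first clearing the index, so the old documents stay searchable until the commit. The handler path `/dataimport` and the base URL are assumptions about the local configuration; `command`, `clean` and `commit` are the standard DataImportHandler request parameters.

```python
from urllib.parse import urlencode


def dih_command_url(core_url, command, clean=False, commit=True):
    """Build a DataImportHandler request URL.

    core_url and the "/dataimport" handler path are assumptions; adjust
    them to whatever solrconfig.xml actually registers.
    """
    params = {
        "command": command,
        # clean=false keeps the existing documents instead of deleting them
        "clean": str(clean).lower(),
        # commit=true makes the new data visible once the import finishes
        "commit": str(commit).lower(),
    }
    return f"{core_url}/dataimport?{urlencode(params)}"


if __name__ == "__main__":
    print(dih_command_url("http://localhost:8983/solr", "full-import"))
```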

Re: LuceneRevolution - NoSQL: A comparison

2010-10-13 Thread Jan Høydahl / Cominvent
You don't know what documents to bring up summaries for before you have merged and sorted the docIds from all shards. And you don't want to waste resources by fetching it all. Example: Phase 1 request: q=foo bar&rows=10&sort=price asc&shards=node1:8983,node2:8983,node3:8983 Phase 1 response: The
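The two-phase idea described above can be illustrated with a small sketch. It is not Solr's actual implementation: the data shapes, function names and shard labels are hypothetical. Phase 1 gathers only (sort value, doc id) pairs from each shard and merges them; phase 2 then asks each shard only for the documents that made the final page.

```python
import heapq
from itertools import islice


def merge_phase_one(shard_results, rows):
    """Merge per-shard hits, each already sorted ascending by sort value.

    shard_results maps a shard name to a list of (sort_value, doc_id).
    Returns the global top `rows` hits as (shard, doc_id, sort_value).
    """
    streams = (
        [(value, doc_id, shard) for value, doc_id in hits]
        for shard, hits in shard_results.items()
    )
    merged = heapq.merge(*streams)
    return [(shard, doc_id, value) for value, doc_id, shard in islice(merged, rows)]


def phase_two_requests(top_hits):
    """Group the winning doc ids by shard, so that each shard is asked
    for summaries of its own winners only."""
    by_shard = {}
    for shard, doc_id, _value in top_hits:
        by_shard.setdefault(shard, []).append(doc_id)
    return by_shard


if __name__ == "__main__":
    phase_one = {
        "node1": [(1.0, "a"), (5.0, "b")],
        "node2": [(2.0, "c"), (3.0, "d")],
        "node3": [(9.0, "e")],
    }
    top = merge_phase_one(phase_one, rows=3)
    print(phase_two_requests(top))
```

In this toy run node3 contributes nothing to the first page, so no summaries are fetched from it at all, which is the resource saving the message refers to.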

searching while importing

2010-10-13 Thread Tri Nguyen
Hi, Can I perform searches against the index while it is being imported? Does importing add 1 document at a time or will solr make a temporary index and switch to that index when indexing is done? Thanks, Tri

I am amazed at what you guys have done!!!!!!!!

2010-10-13 Thread Igor Chudov
I have just implemented (well, almost) a Solr solution for a "More Like This" type of application. Let me just say that I am BLOWN AWAY by how great of a job the Solr team has done. It works! It works fast! It works VERY well! i

Re: using score to find high confidence duplicates

2010-10-13 Thread Matt Mitchell
No this isn't the MLT, just the standard query parser for now. I did try the heuristic approach and I might stick with that actually. I ran the process on known duplicates and created a collection of all scores. I was then able to see how well the query worked. The scores seemed focused to one rang

Re: DataImportHandler dynamic fields clarification

2010-10-13 Thread Alexey Serba
Harry, could you please file a jira for this and I'll address this in a patch. I fixed related issue (SOLR-2102) and I think it's pretty similar. > Interesting, I was under the impression that case does not matter. > > From http://wiki.apache.org/solr/DataImportHandler#A_shorter_data-config : > "I

Re: Deletes writing bytes len 0, corrupting the index

2010-10-13 Thread Jason Rutherglen
Thanks Robert, that Jira issue aptly describes what I'm seeing, I think. On Wed, Oct 13, 2010 at 10:22 AM, Robert Muir wrote: > if you are going to fill up your disk space all the time with solr > 1.4.1, I suggest replacing the lucene jars with lucene jars from > 2.9-branch (http://svn.apache.org

Re: Deletes writing bytes len 0, corrupting the index

2010-10-13 Thread Jason Rutherglen
There's a corrupt index exception thrown when opening the searcher. The rest of the files of the segment are OK. Meaning the problem has occurred in writing the bit vector well after the segment has been written. I'm guessing we're simply not verifying that the BV has been written fully/properly,

Re: Searching Across Multiple Cores

2010-10-13 Thread Ken Stanley
On Wed, Oct 13, 2010 at 2:11 PM, Lohrenz, Steven wrote: > Hi, > > I am trying to figure out how I can accomplish the following: > > I have a fairly static and large set of resources I need to have indexed > and searchable. Solr seems to be a perfect fit for that. In addition I need > to have th

Re: using score to find high confidence duplicates

2010-10-13 Thread Peter Karich
Hi, are you using moreLikeThis for that feature? I have no suggestion for a reliable threshold, I think this depends on the domain you are operating and is IMO only solvable with a heuristic. It also depends on fields, boosts, ... It could be that there is a 'score gap' between duplicates and none
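One way to look for the "score gap" mentioned above is to sort a sample of scores and find the widest jump between neighbours. This is a minimal heuristic sketch with made-up numbers; the function name is hypothetical, and whether a usable gap exists at all depends on the domain, fields and boosts.

```python
def largest_score_gap(scores):
    """Return (threshold, gap) for the widest gap between adjacent scores.

    The threshold is the midpoint of that gap; scores above it would be
    treated as likely duplicates. This is a heuristic, not a guarantee.
    """
    if len(scores) < 2:
        raise ValueError("need at least two scores to find a gap")
    ordered = sorted(scores, reverse=True)
    # Pair every gap with the index of its upper neighbour.
    gaps = [(ordered[i] - ordered[i + 1], i) for i in range(len(ordered) - 1)]
    gap, i = max(gaps)
    threshold = (ordered[i] + ordered[i + 1]) / 2
    return threshold, gap


if __name__ == "__main__":
    # Made-up scores: three likely duplicates, two unrelated documents.
    print(largest_score_gap([9.0, 8.5, 8.0, 3.0, 2.5]))
```

Running it over scores collected from known duplicates, as described elsewhere in this thread, would show whether the gap is stable enough to act as a cutoff.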

Re: Searching Across Multiple Cores

2010-10-13 Thread Tim AtLee
On 10/13/10, Lohrenz, Steven wrote: > Hi, > > I am trying to figure out how I can accomplish the following: > > I have a fairly static and large set of resources I need to have indexed and > searchable. Solr seems to be a perfect fit for that. In addition I need to > have the ability for my use

Searching Across Multiple Cores

2010-10-13 Thread Lohrenz, Steven
Hi, I am trying to figure out how I can accomplish the following: I have a fairly static and large set of resources I need to have indexed and searchable. Solr seems to be a perfect fit for that. In addition I need to have the ability for my users to add resources from the main data set to a

Re: How do I get the solr error response as XML instead of HTML

2010-10-13 Thread Chris Hostetter
: solr errors come back as HTML instead of XML or JSON : : Is it possible to get the response to come back as XML or JSON, or at : least something I could show to an end user? At the moment, Solr just relies on the Servlet Container to generate the error response, so you'd have to customize it at

Re: Deletes writing bytes len 0, corrupting the index

2010-10-13 Thread Robert Muir
if you are going to fill up your disk space all the time with solr 1.4.1, I suggest replacing the lucene jars with lucene jars from 2.9-branch (http://svn.apache.org/repos/asf/lucene/java/branches/lucene_2_9/). then you get the fix for https://issues.apache.org/jira/browse/LUCENE-2593 too. On Wed

Re: multicore defaultCoreName not working

2010-10-13 Thread Ron Chan
that explains it then, using 1.4.1 thanks for that Ron - Original Message - From: "Ephraim Ofir" To: solr-user@lucene.apache.org Sent: Wednesday, 13 October, 2010 2:11:49 PM Subject: RE: multicore defaultCoreName not working Which version of solr are you using? I believe this

Re: Deletes writing bytes len 0, corrupting the index

2010-10-13 Thread Michael McCandless
I'm not certain whether we test this particular case, but we do have several disk full tests. But: are you seeing a corrupt index? Ie, exception on open or on searching or on CheckIndex? Or: do you see a disk-full exception when writing the del file, during indexing, that does not in fact corrup

Re: LuceneRevolution - NoSQL: A comparison

2010-10-13 Thread Dennis Gearon
I think that's good thinking. I wonder, do the two phases have to be invoked externally by two queries, or why couldn't it be all self contained in each instance behind the load leveler? Just curious how it works. Dennis Gearon Signature Warning It is always a good idea to lea

Re: using HTTPClient sending solr ping request wont timeout as specified

2010-10-13 Thread Renee Sun
Ken, looks like we posted at same time :-) thanks very much! Renee -- View this message in context: http://lucene.472066.n3.nabble.com/using-HTTPClient-sending-solr-ping-request-wont-timeout-as-specified-tp1691292p1695584.html Sent from the Solr - User mailing list archive at Nabble.com.

RE: using HTTPClient sending solr ping request wont timeout as specified

2010-10-13 Thread Renee Sun
thanks Michael, I got it resolved last night... you are right, it is more like a HttpClient issue after I tried another link unrelated to solr. If anyone is interested, here is the working code: HttpClientParams httpClientParams = new HttpClientParams(); httpClientParams.setSoTim

Re: using HTTPClient sending solr ping request wont timeout as specified

2010-10-13 Thread Ken Krugler
Hi Renee, Mike is right, this is a question to post on the HttpClient users list (httpclient-us...@hc.apache.org). And yes, there is a separate setConnectionTimeout() that can be used. Though I'm most familiar with HttpClient 4.0, not 3.1. One possibility is that the ping response handler

Deletes writing bytes len 0, corrupting the index

2010-10-13 Thread Jason Rutherglen
We have unit tests for running out of disk space? However we have Tomcat logs that fill up quickly and starve Solr 1.4.1 of space. The main segments are probably not corrupted, however routinely now, there are deletes files of length 0. 0 2010-10-12 18:35 _cc_8.del Which is fundamental index co

Re: AW: Installation Solr 1.4 + Tika

2010-10-13 Thread Andreas Jung
Andreas Jung wrote: >ough the command line using: > > java -jar -Xms512M -Xmx1024M -Dsolr.solr.home=solr start.jar > solr.home seems to refer to a directory containing the solr.xml file. I am using basically an out-of-the-box configuration and can no

Re: Spatial search in Solr 1.5

2010-10-13 Thread PeterKerk
Arggghhh... I was working in the OLD data-config... it now works! :) Thanks, this is a GREAT addition. I would like to know when the final implementation of this feature is released (as I understood it might change in the final release). Which issue can I subscribe to, to be informed? Thanks again!!

Re: LuceneRevolution - NoSQL: A comparison

2010-10-13 Thread Shawn Heisey
On 10/13/2010 6:46 AM, Yonik Seeley wrote: A related point - the load balancing implementation that's part of SolrCloud (and looks like it will be committed to trunk soon), does keep track of what server it used for the first phase and uses that for subsequent phases. Are the cloud bits likel

RE: Error loading class 'solr.ASCIIFoldingFilterFactory'

2010-10-13 Thread Sethi, Parampreet
Thanks Hoss for the reply. Yes, the ASCIIFoldingFilterFactory does not exist in Solr 1.3. It's present in lucene-core-2.9.0-sources.jar which is part of Solr 1.4. I found out after checking the java docs for this file. -param -Original Message- From: Chris Hostetter [mailto:hoss

Re: AW: Installation Solr 1.4 + Tika

2010-10-13 Thread Andreas Jung
markus.rietz...@rzf.fin-nrw.de wrote: > in the standard solr distribution it's under ./contrib/extraction/lib > solr 1.4.1 comes with a tika included. you want to have a newer version of > tika, right? > > ok... my lib directory now contains: suxma

Re: dynamic "stop" words?

2010-10-13 Thread Matt Mitchell
Great, thanks Hoss. I'll try dismax out today and see what happens with this. Matt On Tue, Oct 12, 2010 at 7:35 PM, Chris Hostetter wrote: > > : Is it possible to have certain query terms not effect score, if that > : same query term is present in a field? For example, I have an index of > > tha

Re: Spatial search in Solr 1.5

2010-10-13 Thread Yonik Seeley
On Wed, Oct 13, 2010 at 10:06 AM, PeterKerk wrote: > > haha ;) > > But so I DO have the right solr version? > > Anyways...I have added the lines you mentioned, what else can I do? The fact that the geolocation field does not show up in the results means that it's not getting added (i.e. something

Re: Spatial search in Solr 1.5

2010-10-13 Thread PeterKerk
haha ;) But so I DO have the right solr version? Anyways...I have added the lines you mentioned, what else can I do? Thanks again!

Re: Spatial search in Solr 1.5

2010-10-13 Thread Yonik Seeley
On Wed, Oct 13, 2010 at 9:42 AM, PeterKerk wrote: > Im now thinking I downloaded the wrong solr zip, I tried this one: > https://hudson.apache.org/hudson/job/Solr-trunk/lastSuccessfulBuild/artifact/trunk/solr/dist/apache-solr-4.0-2010-10-12_08-05-48.zip > > In that example scheme > (\apache-solr-4

Re: Spatial search in Solr 1.5

2010-10-13 Thread PeterKerk
I'm now thinking I downloaded the wrong solr zip. I tried this one: https://hudson.apache.org/hudson/job/Solr-trunk/lastSuccessfulBuild/artifact/trunk/solr/dist/apache-solr-4.0-2010-10-12_08-05-48.zip In that example scheme (\apache-solr-4.0-2010-10-12_08-05-48\example\example-DIH\solr\db\conf\sch

using score to find high confidence duplicates

2010-10-13 Thread Matt Mitchell
I have a solr index full of documents that contain lots of duplicates. The duplicates are not exact duplicates though. Each may vary slightly in content. After indexing, I have a bit of code that loops through the entire index just to get what I'm calling "target" documents. For each target docume

RE: multicore defaultCoreName not working

2010-10-13 Thread Ephraim Ofir
Which version of solr are you using? I believe this is only available on trunk, not even on 1.4.1 (SOLR-1722). Also, watch out for SOLR-2127 bug, haven't gotten around to creating a patch yet... Ephraim Ofir -Original Message- From: Ron Chan [mailto:rc...@i-tao.com] Sent: Wednesday,

Re: Spatial search in Solr 1.5

2010-10-13 Thread darren
Do the spatial constraints for latlon types work for multivalued latlon fields? Is there an example of it? Using a field conjunction with > < operators didn't work, last I checked. > On Wed, Oct 13, 2010 at 7:28 AM, PeterKerk wrote: >> Hi, >> >> Thanks for the quick reply :) >> >> I downloaded t

Re: Spatial search in Solr 1.5

2010-10-13 Thread Yonik Seeley
On Wed, Oct 13, 2010 at 7:28 AM, PeterKerk wrote: > Hi, > > Thanks for the quick reply :) > > I downloaded the latest version from the trunk. Got it up and running, and > got the error below: Hopefully the QuickStart on the wiki all worked for you, but you only got the error when customizing your

Re: LuceneRevolution - NoSQL: A comparison

2010-10-13 Thread Yonik Seeley
On Tue, Oct 12, 2010 at 12:11 PM, Jan Høydahl / Cominvent wrote: > I'm pretty sure the 2nd phase to fetch doc-summaries goes directly to same > server as first phase. But what if you stick a LB in between? A related point - the load balancing implementation that's part of SolrCloud (and looks li

RE: using HTTPClient sending solr ping request wont timeout as specified

2010-10-13 Thread Michael Sokolov
This does seem more like an HTTPClient question than a solr question - you might get more traction on their lists? Still, from what I remember HTTPClient has a number of timeouts you can set. Perhaps it's the read timeout you need? -Mike > -Original Message- > From: Renee Sun [mailto:r

Re: Spatial search in Solr 1.5

2010-10-13 Thread PeterKerk
Hi, Thanks for the quick reply :) I downloaded the latest version from the trunk. Got it up and running, and got the error below: URL: http://localhost:8983/solr/db/select/?wt=xml&indent=on&facet=true&fl=id,title,lat,lng,city&facet.field=province_raw&q=*:*&fq={!geofilt%20pt=45.15,-93.85%20sfiel

Re: Spellcheck issues in 3.1

2010-10-13 Thread Markus Jelsma
Nice, that's the trick to remember. On Wednesday, October 13, 2010 12:16:07 pm Robert Muir wrote: > > SEVERE: java.lang.NoSuchMethodError: > > org.apache.lucene.analysis.standard.StandardFilter.(Lorg/apache/luc > > ene/util/Version;Lorg/apache/lucene/analysis/TokenStream;)V -- Markus Jelsma - CT

Re: Spellcheck issues in 3.1

2010-10-13 Thread Robert Muir
you need to clean and recompile > SEVERE: java.lang.NoSuchMethodError: > org.apache.lucene.analysis.standard.StandardFilter.(Lorg/apache/lucene/util/Version;Lorg/apache/lucene/analysis/TokenStream;)V

Spellcheck issues in 3.1

2010-10-13 Thread Markus Jelsma
Hi, Something funny is going on in the current 3.1 branch, which has been bothering me for at least a couple of days. My solrconfig hasn't really changed (perhaps some request handler diffs) and my schema hasn't changed for a long time. For some unknown reason, indexing or searching with spellcheck

Re: Index time boosting is not working with boosting value in document level

2010-10-13 Thread Shanmugavel SRD
Thanks Iorixxx... Boosting is working while using DIH to import XML. Thanks, SRD

Re: LuceneRevolution - NoSQL: A comparison

2010-10-13 Thread Péter Király
2010/10/12 Peter Keegan : > I listened with great interest to Grant's presentation of the NoSQL > comparisons/alternatives to Solr/Lucene. My question: will this presentation be available somewhere? I cannot find any presentation material on the conference web site. Király Péter http://eXtensible

Re: About setting solrconfig.xml

2010-10-13 Thread Peter Karich
Hi Jason, > Hi, all. > I got some question about solrconfig.xml. > I have 10 fields in a document for index. > (Suppose that field names are f1, f2, ... , f10.) > Some user will want to search in field f1 and f5. > Another user will want to search in field f2, f3 and f7. > > I am going to use dism
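Assuming the truncated question above goes on to mention the dismax parser, one possible approach is to keep a single handler and pass a different `qf` (query fields) value per request. The sketch below only builds the query string; the field names come from the question, while the function name and the sample query text are hypothetical.

```python
from urllib.parse import urlencode


def dismax_params(user_query, fields):
    """Build request parameters that restrict a dismax search to `fields`.

    `qf` takes a space-separated list of field names, which urlencode
    turns into '+' separators in the query string.
    """
    return urlencode({"defType": "dismax", "q": user_query, "qf": " ".join(fields)})


if __name__ == "__main__":
    # One user searches f1 and f5, another searches f2, f3 and f7.
    print(dismax_params("foo", ["f1", "f5"]))
    print(dismax_params("foo", ["f2", "f3", "f7"]))
```

With per-request `qf`, solrconfig.xml does not need one handler for every possible field combination.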

Re: Solr PHP PECL Extension going to Stable Release - Wishing for Any New Features?

2010-10-13 Thread Stefan Matheis
On Tue, Oct 12, 2010 at 6:29 PM, Israel Ekpo wrote: > I think this feature will take care of this. > > What do you think? sounds good!

multicore defaultCoreName not working

2010-10-13 Thread Ron Chan
Hello I have this in my solr.xml admin is working and the individual cores are working through http://localhost:8080/solr/live/select/?q=abc and http://localhost:8080/solr/staging/select/?q=abc returning the correct results from the right core however, I wanted to keep the

AW: Installation Solr 1.4 + Tika

2010-10-13 Thread Markus.Rietzler
In the standard Solr distribution it's under ./contrib/extraction/lib. Solr 1.4.1 comes with Tika included. You want to have a newer version of Tika, right? > -Original Message- > From: Andreas Jung [mailto:li...@zopyx.com] > Sent: Tuesday, 12 October 2010 19:14 > To: solr-u

Re: NPE for a MLT query on a missing doc due to null facet_counts in solrj

2010-10-13 Thread Peter Karich
> Should I create a JIRA ticket? already there: https://issues.apache.org/jira/browse/SOLR-2005 we should provide a patch though ... Regards, Peter. > With solrj doing a more like this query for a missing document: > /mlt?q=docId:SomeMissingId > always throws a null pointer exception: > Ca