how to recover when indexing with proxy & shards

2010-12-08 Thread patrick
hi, i'm considering using more than 3 solr shards and assigning a (separate) proxy to do the load balancing when indexing. using SolrJ is my way to do the indexing. the question is if i get any information about the whereabouts of the shard in which the document is stored. this information wou

Re: Search based on images

2010-12-08 Thread Maciej Lisiewski
There is imgSeek ( http://www.imgseek.net/isk-daemon ), which while being far from perfect (can't handle rotated images) is quite simple and has already been added to xapian. Paper on the method used: http://grail.cs.washington.edu/projects/query/mrquery.pdf -- Maciej Lisiewski I know some

Re: Search based on images

2010-12-08 Thread 朱炎詹
I know some people use OpenCV to include the function into search service. I also wish to see this functionality in future Solr. - Original Message - From: "sivaprasad" To: Sent: Thursday, December 09, 2010 1:57 PM Subject: Search based on images Hi, If i upload a product image

Problem with loading a class

2010-12-08 Thread Maciej Lisiewski
I am trying to use StempelPolishStemFilter: I've added to analyzer in fieldtype in schema.xml, restarted Jetty, and got: org.apache.solr.common.SolrException: Error loading class 'solr.StempelPolishStemFilterFactory' [...] Caused by: java.lang.ClassNotFoundException: solr.StempelPolishStemFi

Search based on images

2010-12-08 Thread sivaprasad
Hi, If I upload a product image, I need to find similar images based on the uploaded image. The sample is given below. http://www.gazopa.com/similar?img=eb04%3A9601%2F1543805&img_url=http%3A%2F%2Fd.yimg.com%2Fi%2Fng%2Fsp%2Fp4%2F20090208%2F20%2F350396776.jpg# Does anybody have any ideas on this? Plea

Re: Solr Image Result

2010-12-08 Thread ali fathieh
I think it was my bad to be vague, and I should be clearer about this. I have some MS-Word files in .doc format which may have some styled text, such as bold, or some images. I want to use solr to find pages that have the searched word, then highlight searched words in thos

Re: [Casting] values on update/csv

2010-12-08 Thread Chris Hostetter
: I am using curl to run the following and as soon as I convert the field type : from string to tdouble, I get the errors you see below. : : 0:0:0:0:0:0:0:1 - - [08/12/2010:23:28:27 +] "GET : /solr/update/csv?commit=true&separator=%2C&fieldnames=id,name,asciiname,lat,lng,countrycode,popula

Re: Warming searchers/Caching

2010-12-08 Thread Chris Hostetter
: What am I doing that Solr already provides? the one thing i haven't seen mentioned anywhere in this thread is what you have the "autoWarmCount" value set to on all of the various solr internal caches (as seen in your solrconfig.xml) if that's set, you don't need to manually feed solr any spe

Re: [Casting] values on update/csv

2010-12-08 Thread Adam Estrada
Hi, I am using curl to run the following and as soon as I convert the field type from string to tdouble, I get the errors you see below. 0:0:0:0:0:0:0:1 - - [08/12/2010:23:28:27 +] "GET /solr/update/csv?commit=true&separator=%2C&fieldnames=id,name,asciiname,lat,lng,countrycode,population,el

Re: Delete by query or Id very slow

2010-12-08 Thread Tom Hill
That's a pretty low number of documents for autocommit. It means that when getting to 850,000 documents, you will create 8500 segments, and that's not counting merges. How big are your documents? I just created an 850,000 document (and a 3.5 m doc index) with tiny documents (id and title), and
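
The segment count quoted above can be checked with a little arithmetic. This is a minimal sketch, not Solr code: the autocommit threshold of 100 documents is an assumption inferred from 850,000 / 8,500 (the actual setting is not shown in the thread), and `segments_created` is a hypothetical helper name.

```python
def segments_created(total_docs: int, autocommit_max_docs: int) -> int:
    """Rough count of segments flushed while indexing total_docs,
    assuming one new segment per autocommit and ignoring merges."""
    # Ceiling division: a final partial batch still triggers one more flush.
    return -(-total_docs // autocommit_max_docs)


# Assumed threshold of 100 docs per autocommit (inferred, not from the thread).
print(segments_created(850_000, 100))  # 8500
```

Raising the threshold shrinks the count proportionally; for instance, an assumed threshold of 10,000 would give 85 flushes for the same index.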

Re: Changing a solr schema from non-stored to stored on the fly

2010-12-08 Thread Yonik Seeley
On Wed, Dec 8, 2010 at 6:07 PM, Kaktu Chakarabati wrote: > Can I do this? i.e change that value in schema, and then incrementally > re-index documents to populate it? > would that work? what would be returned if at all for documents that werent > re-indexed post-schema change? Yes, this should wo

Changing a solr schema from non-stored to stored on the fly

2010-12-08 Thread Kaktu Chakarabati
Can I do this? i.e. change that value in schema, and then incrementally re-index documents to populate it? would that work? what would be returned if at all for documents that weren't re-indexed post-schema change? Thanks, Chak -- View this message in context: http://lucene.472066.n3.nabble.com/C

Re: How to handle multivalued hierarchical facets?

2010-12-08 Thread Yonik Seeley
Hoss had a great webinar on faceting that also covered how you could do hierarchical. http://www.lucidimagination.com/solutions/webcasts/faceting See "taxonomy facets", about 28 minutes in. -Yonik http://www.lucidimagination.com On Wed, Dec 8, 2010 at 5:28 PM, Andy wrote: > I have facets that ar

How to handle multivalued hierarchical facets?

2010-12-08 Thread Andy
I have facets that are hierarchical. For example, Location can be represented as this hierarchy: Country > State > City If each document can only have a single value for each of these facets, then I can just use separate fields for each facet. But if multiple values are allowed, then that appr
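
One commonly used way to keep multivalued hierarchies intact is to index depth-prefixed path tokens into a single multivalued field. The sketch below only illustrates that encoding; it is not necessarily the approach covered in the webinar mentioned in the reply, and the function names and sample locations are hypothetical.

```python
def path_tokens(path):
    """Turn ['US', 'CA', 'San Francisco'] into
    ['0/US', '1/US/CA', '2/US/CA/San Francisco']."""
    return [f"{depth}/{'/'.join(path[:depth + 1])}" for depth in range(len(path))]


def facet_tokens(paths):
    """Collect the distinct tokens for a document with several locations."""
    tokens = set()
    for path in paths:
        tokens.update(path_tokens(path))
    return sorted(tokens)


# A document tagged with two cities in different states.
doc_locations = [["US", "CA", "San Francisco"], ["US", "NY", "New York"]]
for token in facet_tokens(doc_locations):
    print(token)
```

Because each token carries its full ancestry, a city stays tied to its own state even when a document has several locations. Drilling down would then be a matter of faceting on that field with a prefix such as `1/US/` (Solr's `facet.prefix` parameter) to list the states under a chosen country.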

Re: Warming searchers/Caching

2010-12-08 Thread Erick Erickson
Yep. firstSearcher is when the instance is started, newSearcher for after replication. On Wed, Dec 8, 2010 at 1:31 PM, Mark wrote: > I actually built in the before/after hooks so we can disable/enable a node > from the cluster while its replicating. When the machine was copying over > 20gigs and

Re: [Casting] values on update/csv

2010-12-08 Thread Markus Jelsma
Should be no problem but please paste the log output etc. > All, > > I have a csv file and I want to store one of the fields as a tdouble type. > It does not like that at all...Is there a way to cast the string value to a > tdouble? > > Thanks, > Adam

[Casting] values on update/csv

2010-12-08 Thread Adam Estrada
All, I have a csv file and I want to store one of the fields as a tdouble type. It does not like that at all...Is there a way to cast the string value to a tdouble? Thanks, Adam

Re: Map size must not be negative with spatial results + php serialized

2010-12-08 Thread Chris Hostetter
: That's fine - it could be a Solr bug too. it definitely looks like a generic solr bug. JSONResponseWriter.java:398 (in the writeSolrDocument method that supports pseudo-fields) writeMapOpener(-1); // no trivial way to determine map size PHPSerializedResponseWriter.java:221 (in which PHP

Delete by query or Id very slow

2010-12-08 Thread Ravi Kiran
Hello, I am using solr 1.4.1. When I delete by query or Id from solrj it is very, very slow, almost like a hang. The core from which I am deleting has close to 850K documents in the index. In the solrconfig.xml autocommit is set as follows. Any idea how to speed up the deletion process? Pl

Re: Syncing 'delta-import' with 'select' query

2010-12-08 Thread Juan Manuel Alvarez
Hello everyone! I have been doing some tests, but it seems I can't make the synchronize flag work. I have made two tests: 1) DIH with commit=false 2) DIH with commit=false + commit via Solr XML update protocol And here are the log results: For (1) the command is "/solr/dataimport?command=delta-im

Webcast: Better Search Results Faster with Apache Solr and LucidWorks Enterprise

2010-12-08 Thread Yonik Seeley
We're holding a free webinar about relevancy enhancements in our commercial version of Solr. Details below. -Yonik http://www.lucidimagination.com - Join us for a free technical webcast "Better Search Results Faster with Apa

Re: Return Lucene DocId in Solr Results

2010-12-08 Thread Chris Hostetter
: Subject: Re: Return Lucene DocId in Solr Results : : Ahhh, you're already down in Lucene. That makes things easier... : : See TermDocs. Particularly seek(Term). That'll directly access the indexed : unique key rather than having to form a bunch of queries. you should also sort your "keys" lexi

Re: Warming searchers/Caching

2010-12-08 Thread Mark
I actually built in the before/after hooks so we can disable/enable a node from the cluster while it's replicating. When the machine was copying over 20gigs and serving requests the load spiked tremendously. It was easy enough to create a sort of rolling replication... ie, 1) node 1 removes hea

Re: How badly does NTFS file fragmentation impact search performance? 1.1X? 10X? 100X?

2010-12-08 Thread Peter Sturge
There are, as you would expect, a lot of factors that impact the amount of fragmentation that occurs: commit rate, mergeFactor, updates/deletes vs 'new' data, etc. Having run reasonably large indexes on NTFS (>25GB), we've not found fragmentation to be much of a hindrance. I don't have any definitiv

Re: How badly does NTFS file fragmentation impact search performance? 1.1X? 10X? 100X?

2010-12-08 Thread Péter Király
Hi Will, I cannot give you exact numbers, but yes, defragmentation in Windows is important, and it will speed up searches. I guess that the ratio is determined by the number of file fragments. In a Windows environment I regularly run defragmentation, and usually I use drives for Lucene/Solr index

Re: How badly does NTFS file fragmentation impact search performance? 1.1X? 10X? 100X?

2010-12-08 Thread Tom Hill
If you can benchmark before and after, please post the results when you are done! Things like your index's size, and the amount of RAM in your computer will help make it meaningful. If all of your index can be cached, I don't think fragmentation is going to matter much, once you get warmed up. Tom

Re: Warming searchers/Caching

2010-12-08 Thread Erick Erickson
Perhaps the tricky part here is that Solr makes its caches for #parts# of the query. In other words, a query that sorts on field A will populate the cache for field A. Any other query that sorts on field A will use the same cache. So you really need just enough queries to populate, in this case, t

How badly does NTFS file fragmentation impact search performance? 1.1X? 10X? 100X?

2010-12-08 Thread Will Milspec
Hi all, Pardon if this isn't the best place to post this email...maybe it belongs on the lucene-user list. Also, it's basically windows-specific, so not of use to everyone... The question: does NTFS fragmentation affect search performance "a little bit" or "a lot"? It's obvious that "fragmentat

Re: Taxonomy and Faceting

2010-12-08 Thread Tommaso Teofili
Thanks Markus for helping with that, there are some changes in the configuration that need to be done. However I've just submitted a new patch at [1] which fixes jar packaging and holds a README.txt which contains the following, it's very simple: 1. copy generated solr-uima jar and its libs (und

Re: Warming searchers/Caching

2010-12-08 Thread Mark
We only replicate twice an hour so we are far from real-time indexing. Our application never writes to master rather we just pick up all changes using updated_at timestamps when delta-importing using DIH. We don't have any warming queries in firstSearcher/newSearcher event listeners. My initia

Re: Warming searchers/Caching

2010-12-08 Thread Mark
I am using 1.4.1. What am I doing that Solr already provides? Thanks for your help On 12/8/10 5:10 AM, Erick Erickson wrote: What version of Solr are you using? Because it seems like you're doing a lot of stuff that Solr already does for you automatically. So perhaps a more complete stateme

Re: Taxonomy and Faceting

2010-12-08 Thread Markus Jelsma
Don't know if it's useful but from the old thread: http://code.google.com/p/solr-uima/wiki/5MinutesTutorial On Wednesday 08 December 2010 16:18:06 webdev1977 wrote: > Any luck with a tutorial? :-) -- Markus Jelsma - CTO - Openindex http://www.linkedin.com/in/markus17 050-8536620 / 06-50258350

Re: Solr Image Result

2010-12-08 Thread Markus Jelsma
Well, you might abuse HTML's possibility to embed binary data in tags using base64 encoding. If you manage to extract the binary data and replace it in the tags and store it in Solr, you could output valid HTML including images. But, i might be totally misunderstanding the question here ;) O
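
The base64 idea above can be sketched as follows. This is only an illustration of embedding binary image data in an `img` tag through a `data:` URI; the helper name `img_tag_from_bytes` and the default MIME type are assumptions, and extracting the images from the .doc files is outside its scope.

```python
import base64


def img_tag_from_bytes(image_bytes: bytes, mime_type: str = "image/png") -> str:
    """Return an <img> tag whose src is a base64 data URI for the given bytes."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return f'<img src="data:{mime_type};base64,{encoded}">'


# Placeholder bytes stand in for real image data extracted from a document.
html_fragment = img_tag_from_bytes(b"abc")
print(html_fragment)
```

The resulting string could be stored in a Solr field alongside the extracted text, at the cost of roughly a third more storage than the raw bytes because of the base64 encoding.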

Re: Taxonomy and Faceting

2010-12-08 Thread webdev1977
Any luck with a tutorial? :-) -- View this message in context: http://lucene.472066.n3.nabble.com/Taxonomy-and-Faceting-tp2028442p2040246.html Sent from the Solr - User mailing list archive at Nabble.com.

Re: Solr Image Result

2010-12-08 Thread Stefan Matheis
I'm pretty sure ali is talking about a result like that: http://books.google.de/books?id=xxBZZ5YS06kC&pg=PA165&dq=apache+solr&hl=de&ei=HKD_TI-vN4yWswbmtvDyDg&sa=X&oi=book_result&ct=result&resnum=1&ved=0CDMQ6AEwAA#v=onepage&q=apache%20solr&f=false

Re: Map size must not be negative with spatial results + php serialized

2010-12-08 Thread Yonik Seeley
On Wed, Dec 8, 2010 at 9:45 AM, Markus Jelsma wrote: > I know, but since it's an Apache component throwing the exception, i'd figure > someone just might know more about this. That's fine - it could be a Solr bug too. IMO, solr-user traffic just needs to be solr related and hopefully useful to ot

Re: Map size must not be negative with spatial results + php serialized

2010-12-08 Thread Markus Jelsma
I know, but since it's an Apache component throwing the exception, i'd figure someone just might know more about this. And the guys do visit the list afaik. Anyway, i'd ask there too. Thanks On Wednesday 08 December 2010 15:41:21 Grant Ingersoll wrote: > That sounds like a JTeam plugin problem,

Re: Solr Image Result

2010-12-08 Thread Grant Ingersoll
Can you clarify a bit more what you mean? Are you saying you want to generate an image (like a JPG) from the stored fields? On Dec 8, 2010, at 5:04 AM, ali fathieh wrote: > Hello guys, > To be quick, Do you know what is the best solution to get image of Solr's > result? To achieve something si

Re: Map size must not be negative with spatial results + php serialized

2010-12-08 Thread Grant Ingersoll
That sounds like a JTeam plugin problem, which is not supported here. On Dec 8, 2010, at 5:38 AM, Markus Jelsma wrote: > Hi, > > Got another issue here. This time it's the PHP serialized response writer > throwing the following exception only when spatial parameters are set using > LocalParam

Re: Severe NoClassDefFoundError Spell StringDistance Nightly 20101207

2010-12-08 Thread Grant Ingersoll
You might try cleaning out the example/work directory and also do an "ant clean example" On Dec 7, 2010, at 7:54 AM, Dan Hertz (Insight 49, LLC) wrote: > Whilst running java -jar start.jar from the latest nightly build example > directory, I get the following...any ideas how to fix this? Thank

RE: Warming searchers/Caching

2010-12-08 Thread Jonathan Rochkind
How often do you replicate? Do you know how long your warming queries take to complete? As others in this thread have mentioned, if your replications (or ordinary commits, if you weren't using replication) happen quicker than warming takes to complete, you can get overlapping indexes being wa

Re: Terms component with shards?

2010-12-08 Thread Shawn Heisey
Interesting. Everything I had read said that it didn't work. Maybe it's SolrJ that doesn't support it in 1.4. The wiki says it requires 1.5 or later. Shawn On 12/7/2010 11:02 PM, bbarani wrote: Hey Shawn, Thanks for your reply. I tried using shards and shards qt parameter, its working l

Multicore and Replication (scripts vs. java, spellchecker)

2010-12-08 Thread Martin Grotzke
Hi, we're just planning to move from our replicated single index setup to a replicated setup with multiple cores. We're going to start with 2 cores, but the number of cores may change/increase over time. Our replication is still based on scripts/rsync, and I'm wondering if it's worth moving to ja

Map size must not be negative with spatial results + php serialized

2010-12-08 Thread Markus Jelsma
Hi, Got another issue here. This time it's the PHP serialized response writer throwing the following exception only when spatial parameters are set using LocalParams in Solr 1.4.1 using JTeam's plugin: Map size must not be negative java.lang.IllegalArgumentException: Map size must not be negat

How to update solr index.

2010-12-08 Thread bachpv
Data was posted to the Solr server and indexed, but the data has since changed. How do I reflect those changes in the Solr index? Thank you! -- View this message in context: http://lucene.472066.n3.nabble.com/How-to-update-solr-index-tp2038480p2038480.html Sent from the Solr - User mailing list archive at Nabbl

Re: How to update solr index.

2010-12-08 Thread bachpv
Thank you, and I have one more question: - How can I customize the Solr source and embed data mining functions? - And what is the best crawler to use with a Solr server? -- View this message in context: http://lucene.472066.n3.nabble.com/How-to-update-solr-index-tp2038480p2038825.html Sent from the S

Re: Warming searchers/Caching

2010-12-08 Thread Erick Erickson
What version of Solr are you using? Because it seems like you're doing a lot of stuff that Solr already does for you automatically. So perhaps a more complete statement of your setup is in order, since we seem to be talking past each other. Best Erick On Tue, Dec 7, 2010 at 10:24 PM, Mark wr

Solr Image Result

2010-12-08 Thread ali fathieh
Hello guys, To be quick, do you know the best solution for getting an image of Solr's results, to achieve something similar to google books? As my documents are in the Farsi language and are stored as MS-Word Doc files, I decided to use solr as the search platform and make an image from its result. I'd be th

Re: solrj & http client 4

2010-12-08 Thread Stevo Slavić
OK, thanks. Can't promise anything, but would love to contribute. First impression on the source code - ant is used as build tool, wish it was maven. If it was maven then https://issues.apache.org/jira/browse/SOLR-1218 would be trivial or wouldn't exist in the first place. Regards, Stevo. On Wed,

Lucene UK contract developers

2010-12-08 Thread Andrew Stopford
Hi, Apologies for making this my first email to the list but I am seeking a UK Lucene contractor to join my team here in Manchester for a possible short term contract around Q1 2011. The mix of skills that I am most interested in is experience working with Solr, Nutch, Hadoop and a real bonus

Re: Query performance very slow even after autowarming

2010-12-08 Thread johnnyisrael
Alexey, 1) I am using EdgeNGramFilter only in the index analyzer. 2) Sometimes even multiple characters create problems too; what I described was just an example of a performance problem. Moreover, I am trying out autowarming and sometimes it is not working ideally [100% guaranteed performan

Re: Autosuggest terms which GOOGLE uses?

2010-12-08 Thread Anurag
Thanks a lot!! What if I want to index "query terms from log files"? Is that possible? And then I want to run autosuggest queries on all those terms using termsComponent. Till now my autosuggest options are like "q.prefix= or q.suffix= " which match the terms available in the documents. - Kumar

Re: Autosuggest terms which GOOGLE uses?

2010-12-08 Thread Tanguy Moal
Kind of : their suggestions are based on users queries with some filtering. You can have a little read there : http://www.google.com/support/websearch/bin/answer.py?hl=en&answer=106230 They perform "little" filtering to remove offending content such as "hate speech, violence and pornography" (quot

Autosuggest terms which GOOGLE uses?

2010-12-08 Thread Anurag
How does Google select the autosuggest terms? Does Google use "Users' Queries" from log files to suggest only those terms? - Kumar Anurag -- View this message in context: http://lucene.472066.n3.nabble.com/Autosuggest-terms-which-GOOGLE-uses-tp2039078p2039078.html Sent from the Solr - User

Re: solrj & http client 4

2010-12-08 Thread Chantal Ackermann
SOLR-2020 addresses upgrading to HttpComponents (from HttpClient). I have had no time to work more on it, yet, though. I also don't have that much experience with the new version, so any help is much appreciated. Cheers, Chantal On Tue, 2010-12-07 at 18:35 +0100, Yonik Seeley wrote: > On Tue, Dec

Re: How to update solr index.

2010-12-08 Thread Grijesh.singh
You have to create the Solr XML doc again from the modified data, with the same unique id, and post it to Solr. You cannot change a specific field of a document on its own. - Grijesh -- View this message in context: http://lucene.472066.n3.nabble.com/How-to-update-solr-index-tp2038480p2038801.html Sent from the
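
The re-post described above can be sketched by rebuilding the whole document as a Solr XML `<add>` message. This is a minimal sketch using only the Python standard library: `build_add_xml` is a hypothetical helper, the field names `id` and `title` are assumed to exist in the schema, and actually sending the message (for example to an update handler URL, followed by a commit) is left out.

```python
import xml.etree.ElementTree as ET


def build_add_xml(doc: dict) -> str:
    """Serialize a full document as a Solr XML <add> message.
    Re-posting it with the same unique id replaces the old version."""
    add = ET.Element("add")
    doc_el = ET.SubElement(add, "doc")
    for name, value in doc.items():
        # Multivalued fields become repeated <field> elements.
        values = value if isinstance(value, (list, tuple)) else [value]
        for v in values:
            field = ET.SubElement(doc_el, "field", name=name)
            field.text = str(v)
    return ET.tostring(add, encoding="unicode")


# Every field must be supplied again, not just the one that changed.
updated = {"id": "42", "title": "new title"}
print(build_add_xml(updated))
```

Because the whole document is replaced, any field left out of the re-posted message is lost, which is why the source data has to be read again in full.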