Re: Fast Vector Highlighter Working for some records only

2012-02-23 Thread dhaivat
Hi Koji, I am using Solr 3.5 and I want to highlight a multivalued field. When I supply a single value for the multivalued field, the highlighter works fine, but when I index multiple values for the field and try to highlight it, I get the following error with Fast

Re: How to increase Size of Document in solr

2012-02-23 Thread bing
Hi Suneel, There is a configuration in solrconfig.xml that you might need to look at. Below, I set the limit to 2GB. Best Regards, Bing

How to increase Size of Document in solr

2012-02-23 Thread Suneel
Hello friends, I am facing a problem while indexing in Solr. Indexing worked successfully when the data size was 300 MB, but now my data has grown to around 50 GB. When I cache the data it takes 8 hours, and afterwards I find that the data has not been committed. I have tried 2 times but the same issue occ

TikaLanguageIdentifierUpdateProcessorFactory(since Solr3.5.0) to be used in Solr3.3.0?

2012-02-23 Thread bing
Hi all, I am using org.apache.solr.update.processor.TikaLanguageIdentifierUpdateProcessorFactory (since Solr3.5.0) to do language detection, and it's cool. An issue: if I deploy Solr3.3.0, is it possible to import that factory from Solr3.5.0 to be used in Solr3.3.0? The reason I am sticking with Solr3.3.0 is

Re: how to ignore cases while querying with a field with type="string"?

2012-02-23 Thread Erick Erickson
I think your best bet is to NOT use string, use something like: wrote: > hi all, > > I am storing a list of tags in a field using type="string" with multiValued > setting: > > multiValued="true"/> > > It works ok, when I query wit

Re: Date search by specific month and day

2012-02-23 Thread Erick Erickson
I think your best bet is to parse out the relevant units and index them independently. But this is probably only a few ints per record, so it shouldn't be much of a resource hog Best Erick On Thu, Feb 23, 2012 at 5:24 PM, Kurt Nordstrom wrote: > Hello all! > > We have a situation involving d
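A minimal sketch of the "parse out the relevant units and index them independently" idea from this reply, assuming the date arrives as a Solr-style UTC string. The field names `date_month_i` and `date_day_i` are hypothetical and do not come from the thread.

```python
from datetime import datetime


def split_date_fields(solr_date: str) -> dict:
    """Return the original date plus month and day as separate ints."""
    parsed = datetime.strptime(solr_date, "%Y-%m-%dT%H:%M:%SZ")
    return {
        "date": solr_date,              # original date field, kept as-is
        "date_month_i": parsed.month,   # hypothetical field name
        "date_day_i": parsed.day,       # hypothetical field name
    }


doc = split_date_fields("1995-02-23T17:24:00Z")
```

With fields like these in the index, a query along the lines of `date_month_i:2 AND date_day_i:23` could match that day in any year, at the cost of two small ints per record.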

Re: Solr Performance Improvement and degradation Help

2012-02-23 Thread Erick Erickson
It's still worth looking at the GC characteristics, there's a possibility that the newer build uses memory such that you're tripping over some threshold, but that's grasping at straws. I'd at least hook up jConsole for a sanity check... But if your QTimes are fast, the next thing that comes to min

Re: undefined field on CSV db import

2012-02-23 Thread Erick Erickson
What does your schema.xml file look like? Is Product_ID defined as a field? Best Erick On Thu, Feb 23, 2012 at 1:24 PM, pmcgovern wrote: > I am trying to import a csv file of values via curl (PHP) and am receiving an > 'undefined field' error, but I am not sure why, as I am defining the field. >

how to ignore cases while querying with a field with type="string"?

2012-02-23 Thread Yuhan Zhang
hi all, I am storing a list of tags in a field using type="string" with multiValued setting: It works ok, when I query with pageKeyword:"The ones". and when I search for "ones" no record will come up as desired. However, it appears that the query is case sensitive. so the query pageKeyword:"T

Preferred file system for Solr

2012-02-23 Thread Mou
We are using a VeloDrive (SSD) to store and search our solr index. The system is running on SLES 11. Right now we are using ext3 but wondering if anyone has any experience using XFS/ext3 on SSD or FusionIO for Solr. Does Solr have any preference for the underlying file system? Our index will b

Re: result present in Solr 1.4, but missing in Solr 3.5, dismax only

2012-02-23 Thread Naomi Dushay
Ticket created: https://issues.apache.org/jira/browse/SOLR-3158 (perhaps it's a lucene problem, not a Solr one -- feel free to move it or whatever.) - Naomi On Feb 23, 2012, at 11:55 AM, Robert Muir [via Lucene] wrote: > Please make a new one if you dont mind! > > On Thu, Feb 23, 2012 at 2

Date search by specific month and day

2012-02-23 Thread Kurt Nordstrom
Hello all! We have a situation involving date searching that I could use some seasoned opinions on. What we have is a collection of records, each containing a Solr date field by which we want to search. The catch is that we want to be able to search for items that match a specific day/month.

Re: need to support bi-directional synonyms

2012-02-23 Thread Jonathan Rochkind
Honestly, I'd just map em both the same thing in the index. sprayer, washer => sprayer or sprayer, washer => sprayer_washer At both index and query time. Now if the source document includes either 'sprayer' or 'washer', it'll get indexed as 'sprayer_washer'. And if the user enters either 's
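The mapping described in this reply can be illustrated outside Solr. The sketch below is plain Python, not Solr's synonym filter: it only shows why collapsing both words to one token at index time and at query time makes the match bi-directional. The group list and helper name are assumptions for illustration.

```python
# Hypothetical synonym groups; every word in a group collapses to one token.
SYNONYM_GROUPS = [("sprayer", "washer")]

# Build a lookup such as {"sprayer": "sprayer_washer", "washer": "sprayer_washer"}.
CANONICAL = {
    word: "_".join(group)
    for group in SYNONYM_GROUPS
    for word in group
}


def normalize(tokens):
    """Lowercase each token and replace synonyms with their canonical form."""
    return [CANONICAL.get(t.lower(), t.lower()) for t in tokens]


indexed = normalize(["Pressure", "washer"])    # applied at index time
queried = normalize(["pressure", "sprayer"])   # applied at query time
```

Because both sides pass through the same normalization, a document containing "washer" and a query for "sprayer" end up with the same token.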

Backporting Wildcard fieldlist Features to 3.x versions

2012-02-23 Thread naptowndev
We are currently running tests against some of the more recent nightly builds of Solr 4, but have noticed some significant performance decreases recently. Some of the reasons we are using Solr 4 is because we needed geofiltering and highlighting which were not originally available in 3 from my und

Re: Solr Performance Improvement and degradation Help

2012-02-23 Thread naptowndev
Erick - Thanks. We've actually worked with Sematext to optimize the GC settings and saw initial (and continued) performance boosts as a result... The situation we're seeing now, has both versions of Solr running on the same box under the same JVM, but we are undeploying an instance at a time so

RE: autoGeneratePhraseQueries sort of silently set to false

2012-02-23 Thread Burton-West, Tom
Thanks Erik, The 3.1 changes document the ability to set this and the default being set to "true". However, apparently in the change between 3.4 and 3.5 the default was set to "false". Since this will change the behavior of any field where autoGeneratePhraseQueries is not explicitly set, it could e

Re: DataImportHandler running out of memory

2012-02-23 Thread Shawn Heisey
On 2/20/2012 6:49 AM, v_shan wrote: DIH still running out of memory for me, with Full Import on a database of size 1.5 GB. Solr version: 3_5_0 Note that I have already added batchSize="-1" but getting same error. A few questions: - How much memory have you given to the JVM running this Solr

Re: result present in Solr 1.4, but missing in Solr 3.5, dismax only

2012-02-23 Thread Robert Muir
Please make a new one if you don't mind! On Thu, Feb 23, 2012 at 2:45 PM, Naomi Dushay wrote: > Robert - > > Did you mean for me to attach my docs to an existing ticket (which one?) or > just want to make sure I attach the docs to the new issue? > > - Naomi > > On Feb 23, 2012, at 11:39 AM, Rober

Re: autoGeneratePhraseQueries sort of silently set to false

2012-02-23 Thread Erik Hatcher
there's this (for 3.1, but in the 3.x CHANGES.txt): * SOLR-2015: Add a boolean attribute autoGeneratePhraseQueries to TextField. autoGeneratePhraseQueries="true" (the default) causes the query parser to generate phrase queries if multiple tokens are generated from a single non-quoted analysi

RE: autoGeneratePhraseQueries sort of silently set to false

2012-02-23 Thread Burton-West, Tom
Seems like a change in default behavior like this should be included in the changes.txt for Solr 3.5. Not sure how to do that. Tom -Original Message- From: Naomi Dushay [mailto:ndus...@stanford.edu] Sent: Thursday, February 23, 2012 1:57 PM To: solr-user@lucene.apache.org Subject: autoG

Re: result present in Solr 1.4, but missing in Solr 3.5, dismax only

2012-02-23 Thread Naomi Dushay
Robert - Did you mean for me to attach my docs to an existing ticket (which one?) or just want to make sure I attach the docs to the new issue? - Naomi On Feb 23, 2012, at 11:39 AM, Robert Muir [via Lucene] wrote: > Please attach your docs if you dont mind. > > I worked up tests for this (in

Re: Solr & HBase - Re: How is Data Indexed in HBase?

2012-02-23 Thread Bing Li
Dear Mr Gupta, Your understanding about my solution is correct. Now both HBase and Solr are used in my system. I hope it could work. Thanks so much for your reply! Best regards, Bing On Fri, Feb 24, 2012 at 3:30 AM, T Vinod Gupta wrote: > regarding your question on hbase support for high perfo

Re: result present in Solr 1.4, but missing in Solr 3.5, dismax only

2012-02-23 Thread Robert Muir
Please attach your docs if you don't mind. I worked up tests for this (in general for ANY phrase query, increasing the slop should never remove results, only potentially enlarge them). It fails already... but it's good to also have your test case too... On Thu, Feb 23, 2012 at 2:20 PM, Naomi Dusha

Re: Solr & HBase - Re: How is Data Indexed in HBase?

2012-02-23 Thread T Vinod Gupta
regarding your question on hbase support for high performance and consistency - i would say hbase is highly scalable and performant. how it does what it does can be understood by reading relevant chapters around architecture and design in the hbase book. with regards to ranking, i see your problem

Re: result present in Solr 1.4, but missing in Solr 3.5, dismax only

2012-02-23 Thread Naomi Dushay
Robert, I will create a jira issue with the documentation. FYI, I tried ps values of 3, 2, 1 and 0 and none of them worked with dismax; For lucene QueryParser, only the value of 0 got results. - Naomi On Feb 23, 2012, at 11:12 AM, Robert Muir [via Lucene] wrote: > Is it possible to also p

Re: result present in Solr 1.4, but missing in Solr 3.5, dismax only

2012-02-23 Thread Robert Muir
Is it possible to also provide your document? If you could attach the document and the analysis config and queries to a JIRA issue, that would be most ideal. On Thu, Feb 23, 2012 at 2:05 PM, Naomi Dushay wrote: > Robert, > > You found it!   it is the phrase slop.  What do I do now?   I am using S

Re: result present in Solr 1.4, but missing in Solr 3.5, dismax only

2012-02-23 Thread Naomi Dushay
Robert, You found it! it is the phrase slop. What do I do now? I am using Solr from trunk from December, and all those JIRA tixes are marked fixed … - Naomi Solr 1.4: luceneQueryParser: URL: q=all_search:"The Beatles as musicians : Revolver through the Anthology"~3 final query: all_sea

Re: Multiple Property Substitution

2012-02-23 Thread entdeveloper
*bump* I'm also curious if something like this is possible. Being able to nest property substitution variables, especially when using multiple cores, would be a really slick feature. Zach Friedland wrote > > Has anyone found a way to have multiple properties (override & default)? > What > I'd

autoGeneratePhraseQueries sort of silently set to false

2012-02-23 Thread Naomi Dushay
Another thing I noticed when upgrading from Solr 1.4 to Solr 3.5 had to do with results when there were hyphenated words: aaa-bbb. Erik Hatcher pointed me to the autoGeneratePhraseQueries attribute now available on fieldtype definitions in schema.xml. This is a great feature, and everything

undefined field on CSV db import

2012-02-23 Thread pmcgovern
I am trying to import a csv file of values via curl (PHP) and am receiving an 'undefined field' error, but I am not sure why, as I am defining the field. Can someone lend some insight as to what I am missing / doing wrong? Thank you in advance. Sample of CSV File: --- "Product_ID"

Re: Probleme with unicode query

2012-02-23 Thread Em
Hi Frederic, I saw similar issues when sending such a request without proper URL-encoding. It is important to note that the URL-encoded string already has to be an UTF-8-string. What happens if you send that query via Solr's admin-panel? Have a look at this page for troubleshooting: http://wiki.a
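As a small illustration of the point about URL-encoding a UTF-8 string, the sketch below encodes the query from the original post with Python's standard library. The relative path `select/?` is taken from the question; everything else is an assumption about how a client might build the request.

```python
from urllib.parse import urlencode

# Query parameters as Python (Unicode) strings.
params = {"q": "hygiene sécurité", "debugQuery": "true"}

# Encode the values as UTF-8 bytes first, then percent-encode them.
query_string = urlencode(params, encoding="utf-8")
url = "select/?" + query_string
```

The accented characters become `%C3%A9`, the two-byte UTF-8 form of "é". A client that percent-encodes from Latin-1 would send `%E9` instead, which is one way a query like this can be misread on the server side.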

Re: Unique key constraint and optimistic locking (versioning)

2012-02-23 Thread Em
Hi Per, > I want an error to occur if a document with the same id already > exists, when my intent is to INSERT a new document. When my intent is > to UPDATE a document in solr/lucene I want the old document already > in solr/lucene deleted and the new version of this document added > (exactly as

probleme with unicode query

2012-02-23 Thread Frederic Bouchery
hello, I'm using Solr 3.5 over Tomcat 6 and I have some problems with a Unicode query. Here is my text field configuration: When I perform this request: select/?q=hygiene sécurité&debugQuery=true Here is the debug info: hygiene sécurité hygiene sécurité searchText:hygien (sea

Probleme with unicode query

2012-02-23 Thread Frederic Bouchery
hello, I'm using Solr 3.5 over Tomcat 6 and I have some problems with a Unicode query. Here is my text field configuration: When I perform this request: select/?q=hygiene sécurité&debugQuery=true Here is the debug info: hygiene sécurité hygiene sécurité searchText:hygien (sea

Re: Unique key constraint and optimistic locking (versioning)

2012-02-23 Thread Erick Erickson
Per: Yep, you've got it. You could write a custom update handler that queried (via TermDocs or something) for the ID when your intent was to INSERT, but it'll have to be custom work. I suppose you could query with a divide-and-conquer approach, that is query for id:(1 2 58 90... all your insert ID
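The reply is cut off, so the following only sketches the query-building part of checking for existing IDs before an INSERT: it groups candidate IDs into `id:(...)` queries of a bounded size. The helper name and batch size are hypothetical, and no Solr call is made here.

```python
def id_batches(ids, batch_size=100):
    """Yield one id:(...) query string per batch of candidate IDs."""
    for start in range(0, len(ids), batch_size):
        chunk = ids[start:start + batch_size]
        yield "id:(" + " ".join(str(i) for i in chunk) + ")"


queries = list(id_batches([1, 2, 58, 90, 91], batch_size=2))
```

A batch whose query returns no hits contains only new IDs; a batch with hits would need to be split further or rejected, depending on the insert semantics wanted.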

Re: How to retrieve tokens?

2012-02-23 Thread Erick Erickson
Essentially, you're talking about reconstructing the field from the tokens, and that's pretty difficult in general and lossy. For instance, if you use stemming and "running" gets stemmed to "run", you get back just "run" from the index. Is that acceptable? But otherwise, you've got to go into the

Re: Solr Performance Improvement and degradation Help

2012-02-23 Thread Erick Erickson
Ah, no, my mistake. The wildcards for the fl list won't matter re: maxBooleanClauses, I didn't read carefully enough. I assume that just returning a field or two doesn't slow down But one possible culprit, especially since you say this kicks in after a while, is garbage collection. Here's an

Re: String search in Dismax handler

2012-02-23 Thread Erick Erickson
OK, I really don't get this. The quoted bit gives: +DisjunctionMaxQuery((xid:pass by value^0.3 | id:pass by value^0.3 | x_name:"pass ? value"^0.3 | text:"pass ? value" | name:"pass ? value"^2.3)) The bare bit gives: +((DisjunctionMaxQuery((uxid:pass^0.3 | id:pass^0.3 | x_name:pass^0.3 | text:loan

RE: Trunk build errors

2012-02-23 Thread Steven A Rowe
Hi Darren, I use Ant 1.7.1. There have been some efforts to make the build work with Ant 1.8.X, but it is not (yet) the required version. So if you're not using Ant 1.7.1, I suggest you try it. Steve > -Original Message- > From: dar...@ontrenet.com [mailto:dar...@ontrenet.com] > Sent

Re: Unique key constraint and optimistic locking (versioning)

2012-02-23 Thread Per Steffensen
Em wrote: Hi Per, well, Solr has no "Update"-Method like a RDBMS. It is a re-insert of the whole document. Therefore a document with an existing UniqueKey marks the old document as deleted and inserts the new one. Yes I understand. But it is not always what I want to achieve. I want an error

How to retrieve tokens?

2012-02-23 Thread Thiago
Hi to everybody, My name is Thiago and I'm new with Apache Solr and NoSQL databases. At the moment, I'm working and using Solr for document indexing. My Question is: Is there any way to retrieve the tokens in place of the original data? For example: I have a field using the fieldtype text_general

Re: Solr Performance Improvement and degradation Help

2012-02-23 Thread naptowndev
Erick - Agreed, it is puzzling. What I've found is that it doesn't matter if I pass in wildcards for the field list or not...but that the overall response time from the newer builds of Solr that we've tested (e.g. 4.0.0.2012.02.16) is slower than the older (4.0.0.2010.12.10.08.54.56) build. If

Re: Unique key constraint and optimistic locking (versioning)

2012-02-23 Thread Em
Hi Per, well, Solr has no "Update"-Method like a RDBMS. It is a re-insert of the whole document. Therefore a document with an existing UniqueKey marks the old document as deleted and inserts the new one. However this is not the whole story, since this "constraint" only works per index/SolrCore/Sha

Re: Unique key constraint and optimistic locking (versioning)

2012-02-23 Thread Per Steffensen
Em wrote: Hi Per, Solr provides the so called "UniqueKey"-field. Refer to the Wiki to learn more: http://wiki.apache.org/solr/UniqueKey I believe the uniqueKey does not enforce a "unique key constraint", so that you are not allowed to create a document with an id when a document with the sa

Re: Trunk build errors

2012-02-23 Thread darren
I updated yesterday and did an ant clean, ant test. I will try a clean pull next. I'm on linux. Perhaps an ant version issue? > There was recently some work done to get better about checking > on licenses, when did you last get trunk? About 9 days ago was > the last go-round. > > And did you do

Re: Can this type of sorting/boosting be done by solr

2012-02-23 Thread rks_lucene
Hi Chantal, Yes, I have thought about the docfreq(field_name,'search_text') function, but somehow I will have to dereference the article id's (AID) from the result of the query to the sort. The below query does not work: q=AT:metal&sort=docfreq(AREFS,$q.AID) Is there a mistake in the query that am

Re: Trunk build errors

2012-02-23 Thread Erick Erickson
There was recently some work done to get better about checking on licenses, when did you last get trunk? About 9 days ago was the last go-round. And did you do an 'ant clean'? It works on my machine with a fresh pull this morning. Best Erick On Wed, Feb 22, 2012 at 5:27 PM, Darren Govoni wrote

Re: Same id on two shards

2012-02-23 Thread Erick Erickson
I really think you'll be in a world of hurt if you have the same ID on different shards. I just wouldn't go there. The statement "may be non-deterministic" should be taken to mean that this is just unsupported. Why is this the case? What is the use-case for putting the same ID on different shards?

Re: Solr Performance Improvement and degradation Help

2012-02-23 Thread Erick Erickson
It's pretty hard to say, even with the data you've provided. But, try adding &debugQuery=on and look particularly down near the bottom there'll be a "" section. That section lists the time taken by all the components of a search, not just the QTime. Things like highlighting etc. that can often give

Re: How is Data Indexed in HBase?

2012-02-23 Thread Erick Erickson
I suspect you'd get better answers on the HBase in terms of how data is indexed. I suspect that the answer to which you should use depends on what kinds of searching you're doing, although this seems like an apples-and-oranges question, they're intended for different problems. Best Erick On Wed,

Re: Can this type of sorting/boosting be done by solr

2012-02-23 Thread Chantal Ackermann
Sorry to have misunderstood. It seems the new Relevance Functions in Solr 4.0 might help - unless you need to use an official release. http://wiki.apache.org/solr/FunctionQuery#Relevance_Functions On Thu, 2012-02-23 at 13:04 +0100, rks_lucene wrote: > Dear Chantal, > > Thanks for your reply, b

Re: Can this type of sorting/boosting be done by solr

2012-02-23 Thread Lee Carroll
Have you looked at external fields? http://lucidworks.lucidimagination.com/display/solr/Solr+Field+Types#SolrFieldTypes-WorkingwithExternalFiles you will need a process to do the counts and note the limitation of updates only after a commit, but i think it would fit your usecase. On 23 Febru
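The "process to do the counts" mentioned in this reply could look roughly like the sketch below, which counts how many articles cite each article from the AREFS lists. Only the id 51463 appears in the thread; the other ids and the reference lists are made-up sample data, and how the counts get written into an external file is left out.

```python
from collections import Counter

# article id -> AREFS (ids this article cites); sample data for illustration.
articles = {
    "51463": ["100", "200"],
    "60001": ["51463", "100"],
    "60002": ["51463"],
}


def inbound_counts(articles):
    """Count, for each known article, how many other articles cite it."""
    counts = Counter()
    for refs in articles.values():
        counts.update(set(refs))  # a citing article counts once per target
    return {aid: counts.get(aid, 0) for aid in articles}


cited_by = inbound_counts(articles)
```

The resulting number is the "refers to" count the original poster wants to sort by, as opposed to the length of an article's own AREFS list.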

Re: Range Query with sensitive Scoring

2012-02-23 Thread Ahmet Arslan
> I have an Integer field which carries a value between 0 to > 18. > > Is there a way to query this field fuzzy? For example > search for field:5 > and also match documents near it (like documents containing > field:4 or > field:6)? > > And if this is possible, is it also possible to boost exa

Re: SnapPull failed :org.apache.solr.common.SolrException: Error opening new searcher

2012-02-23 Thread eks dev
it looks like it works, with patch, after a couple of hours of testing under same conditions didn't see it happen (without it, approx. every 15 minutes). I do not think it will happen again with this patch. Thanks again and my respect to your debugging capacity, my bug report was really thin. On

Range Query with sensitive Scoring

2012-02-23 Thread Hannes Carl Meyer
Hello, I have an Integer field which carries a value between 0 to 18. Is there a way to query this field fuzzy? For example search for field:5 and also match documents near it (like documents containing field:4 or field:6)? And if this is possible, is it also possible to boost exact matches a
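One possible way to express such a query, sketched here as a string builder and not taken from the replies in this thread: combine a boosted exact clause with a range clause around the value. The boost value, the spread, and the helper name are assumptions; whether the scores behave as wanted depends on the field type and query parser.

```python
def near_query(field, value, spread=1, exact_boost=10, low=0, high=18):
    """Build 'exact match boosted, OR anything within +/- spread' as a query string."""
    lower = max(low, value - spread)    # clamp to the field's stated minimum
    upper = min(high, value + spread)   # clamp to the field's stated maximum
    return f"{field}:{value}^{exact_boost} OR {field}:[{lower} TO {upper}]"


q = near_query("field", 5)
edge = near_query("field", 18)
```

Documents with the exact value match both clauses and so should score above neighbours that match only the range clause.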

Re: Can this type of sorting/boosting be done by solr

2012-02-23 Thread rks_lucene
Dear Chantal, Thanks for your reply, but that's not what I was asking. Let me explain. The size of the list in AREFS would give me how many records are *referred by* an article and NOT how many records *refer to* an article. Say if an article id - 51463 has been published in 2002 and refers to 10

Re: String search in Dismax handler

2012-02-23 Thread mechravi25
HI Erick, Thanks for the response. I am currently using solr 1.5 version. We are getting the following query when we give the search query as "Pass By Value" without quotes and by using qt=dismax in the request query. webapp=/solr path=/select/ params={facet=true&f.typeFacet.facet.mincount=1&

Re: Can this type of sorting/boosting be done by solr

2012-02-23 Thread Chantal Ackermann
Hi Ritesh, you could add another field that contains the size of the list in the AREFS field. This way you'd simply sort by that field in descending order. Should you update AREFS dynamically, you'd have to update the field with the size, as well, of course. Chantal On Thu, 2012-02-23 at 11:27

Can this type of sorting/boosting be done by solr

2012-02-23 Thread rks_lucene
Hi, I have a journal article citation schema like this: { AT - article_title AID - article_id (Unique id) AREFS - article_references_list (List of article id's referred/cited in this article. Multi-valued) AA - Article Abstract --- other_article_stuff ... } So for example, in o

Re: solr 3.5 and indexing performance

2012-02-23 Thread mizayah
OK, I found it. It's because of Hunspell, which is now in Solr. Somehow, when I use it myself in 3.4, it is a lot faster than the one from 3.5. I don't know about the differences, but is there any way I can use my old Google Hunspell jar?

Re: How to merge an "autofacet" with a predefined facet

2012-02-23 Thread Xavier
Thank you for this information, I'll keep that in mind. But I'm sorry, I don't get the process to do it ??? Em wrote > > Well, you could create a keyword-file out of your database and join it > with your self-maintained keywordslist. > By that you mean : - 'self-maintained keyw

Re: 'location' fieldType indexation impossible

2012-02-23 Thread Xavier
You totally get it :) I've deleted those dynamicFields (I thought it was just an example), why didn't I read the comment above the line! Thanks a lot ;) Best regards, Xavier.

Re: Fast Vector Highlighter Working for some records only

2012-02-23 Thread dhaivat
Hi Koji, Thanks for your guidance. I have looked into the analysis page of Solr and it's working fine, but it's still not working for a few documents. Here is the configuration for the highlighter I am using. I have specified this in solrconfig.xml; can you please tell me what I should change to highlighter