Re: Solr4.2 - Fuzzy Search Problems

2013-05-21 Thread Chris Hostetter
: : 2) although I set editing distance to 1 in my query (e.g. worde~1), solr : returns me results having 2 editing distance (like WORDOES, WORHEE, WORKEE, : .. etc. ) fuzzy search works on *terms* in your index -- if you use a stemmer when you index your data (your schema shows that you are)

Re: Deleting an entry from a collection when they key has : in it

2013-05-20 Thread Chris Hostetter
: Technically, core Solr does not require a unique key. A lot of features in nothing in this thread referred to the uniqueKey field, or the lack of a uniqueKey field in the user's schema, at all until you brought it up. * the user has a field named key * the user had a question about deleting

Re: Solr 4 memory usage increase

2013-05-20 Thread Chris Hostetter
: We have master/slave setup. We disabled autocommits/autosoftcommits. So the : slave only replicates from master and serve query. Master does all the : indexing and commit every 5 minutes. Slave polls master every 2.5 minutes : and does replication. Details matter... Are you using the exact

Re: Solr httpCaching for distinct handlers

2013-05-20 Thread Chris Hostetter
: Hi everybody, I would like to have distinct httpCaching configuration for : distinct handlers, i.e if a request comes for select, send a cache control : header of 1 minute ; and if receive a request for mlt then send a cache : control header of 5 minutes. : Is there a way to do that in my

Re: Wide vs Tall document in Solr 4.2.1

2013-05-18 Thread Chris Hostetter
: We recently decided to move from Solr version 3.5 to 4.2.1. The transition ... : Most of the fields are multiValued (type String) and the size of array in : those vary from 5 to 50K. So our 30% of popular documents are tall. Not all ... : Issues that we observed is high CPU and

Re: Oracle Timestamp in SOLR

2013-05-16 Thread Chris Hostetter
: SELECT ... CAST(LAST_ACTION_TIMESTAMP AS DATE) AS LAT : : This removes the time part of the timestamp in SOLR, although it is shown : in PL/SQL-Developer (Tool for Oracle). Hmmm... that makes no sense to me based on 10 seconds of googling...

Re: Facets referenced by key

2013-05-16 Thread Chris Hostetter
: I would then like to refer to these 'pseudo' field later in the request : string. I thought this would be how I'd do it: : : f.my_facet_key.facet.prefix=a_given_prefix ... that syntax was proposed in SOLR-1351 and a patch was made available, but it was never committed (it only

RE: Deleting an entry from a collection when they key has : in it

2013-05-16 Thread Chris Hostetter
: If in my schema, I have the key field set to indexed=false, then is that : maybe the issue? I'm going to try to set that to true and rebuild the : repository and see if that does it. if a field is indexed=false you can not query on it. if you can not query on a field, then you can not delete

Re: Solr Range Queries with Field value

2013-05-15 Thread Chris Hostetter
: After some research the following syntax worked : start_time_utc_epoch:[1970-01-01T00:00:00Z TO : _val_:merchant_end_of_day_in_utc_epoch]) that syntax definitely does not work ... i don't know if there is a typo in your mail, or if you are just getting strange results that happen to look

Re: Hierarchical Faceting

2013-05-15 Thread Chris Hostetter
: Subject: Hierarchical Faceting : References: : 15062_1368600769_zzi0n0aykpk6h.00_519330be.7000...@uni-bielefeld.de : In-Reply-To: : 15062_1368600769_zzi0n0aykpk6h.00_519330be.7000...@uni-bielefeld.de https://people.apache.org/~hossman/#threadhijack Thread Hijacking on Mailing Lists

Re: Can we search some mandatory words and some optional words in SOLR

2013-05-15 Thread Chris Hostetter
: +Java +mysql +php TCL Perl Selenium -ethernet -switching -routing that's missing one of the stated requirements... : 2. At least one keyword out of* TCL Perl Selenium* should be present ...should be... +Java +mysql +php +(TCL Perl Selenium) -ethernet -switching -routing -Hoss
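The corrected query above can also be assembled from its three keyword groups. This is only a minimal Python sketch: the helper name `build_query` is made up for illustration, and it does no escaping of special query characters.

```python
def build_query(required, any_of, excluded):
    """Build a Lucene/Solr query string where every `required` term must
    match, at least one `any_of` term must match, and no `excluded` term
    may match."""
    parts = ["+" + term for term in required]
    if any_of:
        # A mandatory group of optional clauses: at least one must match.
        parts.append("+(" + " ".join(any_of) + ")")
    parts.extend("-" + term for term in excluded)
    return " ".join(parts)


query = build_query(
    ["Java", "mysql", "php"],
    ["TCL", "Perl", "Selenium"],
    ["ethernet", "switching", "routing"],
)
print(query)
```

Wrapping the optional keywords in a single `+( ... )` group is what turns "nice to have" into "at least one of these is required".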

Re: replication without automated polling, just manual trigger?

2013-05-15 Thread Chris Hostetter
: In Solr 1.4, on slave, I supplied a masterUrl, but did NOT supply any : pollInterval at all on slave. I did NOT supply an enable : false in slave, because I think that would have prevented even manual : replication. that exact same config should still work with solr 4.3 : This seemed to

Re: Faceting json response - odd format

2013-05-13 Thread Chris Hostetter
: This is what i am getting which mirrors what is in the documentation: : : facet_counts:{facet_queries:{}, : facet_fields:{metadata_meta_last_author:[Nick,330,standarduser,153,Mohan,52,wwd,49,gerald,45,Riggins,36,fallon,31,blister,28,

Re: Faceting json response - odd format

2013-05-13 Thread Chris Hostetter
: What i would prefer to see as we do with all other parameters is a : normal key/value pairing. this might look like: a true key value pairing via a map type structure is what you get with json.nl=map -- but in most client languages that would lose whatever sorting you might have specified
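One way to get key/value pairs without giving up the sort order is to keep the default flat list and pair it up on the client. A minimal Python sketch, assuming the alternating name/count layout shown earlier in this thread (the helper name `pair_facet_counts` is hypothetical):

```python
def pair_facet_counts(flat):
    """Turn a flat facet list [name, count, name, count, ...] into a list
    of (name, count) tuples, keeping the order the server returned."""
    if len(flat) % 2 != 0:
        raise ValueError("expected alternating name/count entries")
    return list(zip(flat[0::2], flat[1::2]))


# First entries of the metadata_meta_last_author facet quoted in the thread.
flat = ["Nick", 330, "standarduser", 153, "Mohan", 52]
pairs = pair_facet_counts(flat)
print(pairs)
```

A list of tuples preserves the order of the response, which an unordered map type in the client language may not.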

Re: How to get/set customized Solr data source properties?

2013-05-13 Thread Chris Hostetter
: learned it should work. And this is my actual code. I create this : DataSource for testing my ideas. I am blocked at the very beginning...sucks : :( but you only showed us one line of code w/o any context. nothing in your email was reproducible for other people to try to compile/run

Re: How to deal with cache for facet search when index is always increment?

2013-05-13 Thread Chris Hostetter
: For real time search, the docs would be import to index anytime. In this : case, the cache is nearly always need to create again, which cause the facet : search is very slowly. : Do you have any idea to deal with such problem? : We're in a similar situation and have had better performance

Re: Quick question about indexing with SolrJ.

2013-05-13 Thread Chris Hostetter
: I don't want to use POJOs, that's the main problem. I know that you can : send AJAX POST HTTP Requests with JSON data to index new documents and I : would like to do that with SolrJ, that's all, but I don't find the way to : do that, :-/ . What I would like to do is simple retrieve an String

Re: Query using function query result

2013-05-08 Thread Chris Hostetter
: i want to query documents which match a certain dynamic criteria. : like, How do i get all documents, where sub(field1,field2) 0 ? : : i tried _val_: sub(field1,field2) and used fq:[_val_:[0 TO *] take a look at the frange QParser...
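For reference, a filter built with the frange parser might look like the sketch below. The comparison operator in the question was lost in the archive, so "greater than 0" is an assumption here, and the helper name `frange_fq` is made up for illustration. The `l` (lower bound) and `incl` (include lower bound) local params belong to the frange parser.

```python
from urllib.parse import urlencode


def frange_fq(function, lower, include_lower=True):
    """Return an fq value keeping documents whose function value is at or
    above `lower` (strictly above when include_lower is False)."""
    incl = "true" if include_lower else "false"
    return "{!frange l=%s incl=%s}%s" % (lower, incl, function)


# Assumed intent: sub(field1,field2) strictly greater than 0.
fq = frange_fq("sub(field1,field2)", 0, include_lower=False)
params = urlencode({"q": "*:*", "fq": fq})
print(fq)
print(params)
```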

Re: solr 4.2.1 and docValues

2013-05-08 Thread Chris Hostetter
: Questions: : - what is the advantage of having indexed=true and docvalues=true? indexed=true and docValues=true are orthogonal. it might make sense to use both if you wanted to do term queries on the field but also faceting -- because indexed terms are generally faster for queries, but

Re: Numeric fields and payload

2013-05-08 Thread Chris Hostetter
: is it possible to store (text) payload to numeric fields (class : solr.TrieDoubleField)? My goal is to store measure units to numeric : features - e.g. '1.5 cm' - and to use faceted search with these fields. : But the field type doesn't allow analyzers to add the payload data. I : want to

Re: atomic updates w/ double field

2013-05-08 Thread Chris Hostetter
: I'm using solr 4.0 and I'm using an atomic update to increment a tdouble : 3 times with the same value (99.4). The third time it is incremented the : values comes out to 298.25. Has anyone seen this error or : how to fix it? Maybe I should use the regular double instead of a :

Re: Oracle Timestamp in SOLR

2013-05-08 Thread Chris Hostetter
: I have a field with the type TIMESTAMP(6) in an oracle view. ... : What is the best way to import it? ... : This way works but I do not know if this is the best practise: ... : TO_CHAR(LAST_ACTION_TIMESTAMP, '-MM-DD HH24:MI:SS') as LAT instead of having your

Re: FieldCache insanity with field used as facet and group

2013-05-07 Thread Chris Hostetter
: I am using the Lucene FieldCache with SolrCloud and I have insane instances : with messages like: FWIW: I'm the one that named the result of these sanity checks FieldCacheInsanity and i have regretted it ever since -- a better label would have been inconsistency : VALUEMISMATCH: Multiple

Re: Search identifier fields containing blanks

2013-05-07 Thread Chris Hostetter
: I am about to index identifier fields containing blanks (shelfmarks) eg. G : 23/60 12 : The field type is set to Solr.string. To get the exact matching hit (the doc : with shelfmark mentioned above) the user must quote the search term. Is there : a way to omit the quotes? whitespace has to be

Re: Unsubscribing from JIRA

2013-05-07 Thread Chris Hostetter
: Email filters? I mean, you may have a point, but the cost of change at : this moment is probably too high. Personal email filters, on the other : hand, seems like an easy solution. The reason for having Jira notifications go to the devs list is that all of the comments discussion in jira are

Re: Prons an Cons of Startup Lazy a Handler?

2013-04-26 Thread Chris Hostetter
: In short, whether you want to keep the handler is completely independent of : the lazy startup option. I think Jack misread your question -- my interpretation is that you are asking about the pros/cons of removing 'startup=lazy' ... : requestHandler name=/update/extract :

Re: How to get/set customized Solr data source properties?

2013-04-26 Thread Chris Hostetter
: : I am working on a DataSource implementation. I want to get some customized : properties when the *DataSource.init* method is called. I tried to add the ... : dataConfig : dataSource type=com.my.company.datasource : my=value / My understanding from looking at other

Re: Solr index searcher to lucene index searcher

2013-04-26 Thread Chris Hostetter
: used to call the lucene IndexSearcher . As the documents are collected in : TopDocs in Lucene , before that is passed back to Nutch , i used to look : into the top K matching documents , consult some external repository : and further score the Top K documents and reorder them in the TopDocs

Re: Solr Cloud 4.2 - Distributed Requests failing with NPE

2013-04-25 Thread Chris Hostetter
: trace:java.lang.NullPointerException\r\n\tat : org.apache.solr.handler.component.HttpShardHandler.checkDistributed(HttpShardHandler.java:340)\r\n\tat : org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:182)\r\n\tat yea, definitely a bug. Raintung

Re: How do set compression for compression on stored fields in SOLR 4.2.1

2013-04-25 Thread Chris Hostetter
: Subject: How do set compression for compression on stored fields in SOLR 4.2.1 : : https://issues.apache.org/jira/browse/LUCENE-4226 : It mentions that we can set compression mode: : FAST, HIGH_COMPRESSION, FAST_UNCOMPRESSION. The compression details are hardcoded into the various codecs. If

Re: Noob question: why doesn't this query work?

2013-04-24 Thread Chris Hostetter
: This could be the problem. This is query is machine generated, so I don't : care how ugly it is. Does this apply even to inner queries? I.e., should : that last clause be (*:* -i_4:6142E=m) instead of (NOT I-4:6142E=m)? yes -- you can't exclude 6142E=m w/o defining what set (ie: the set of
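The reasoning can be illustrated with a toy set model: a purely negative clause has no starting set to subtract from, while `*:*` supplies the set of all documents. A rough Python sketch with made-up document ids (the helper name `anchor_negative` is hypothetical and this is not how Solr evaluates queries internally):

```python
def anchor_negative(clause):
    """Wrap an excluded clause so it subtracts from the set of all documents."""
    return "(*:* -%s)" % clause


# Toy model of the semantics, using made-up document ids.
all_docs = {1, 2, 3, 4}
matching = {2, 4}                 # docs that match the excluded clause

pure_negative = set() - matching  # nothing to subtract from, so nothing is left
anchored = all_docs - matching    # "*:*" supplies the starting set

print(anchor_negative("i_4:6142E=m"))
print(pure_negative, anchored)
```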

Re: solr.StopFilterFactory doesn't work with wildcard

2013-04-24 Thread Chris Hostetter
: In any case, technically, the stop filter is doing exactly what it is supposed : to do. Jack has kind of glossed over some key questions here... 1) why are you using StopFilterFactory in your multiterm analyzer like this? 2) what do you expect it to do if series is in your stopwords and

Re: Solr consultant recommendation

2013-04-24 Thread Chris Hostetter
: Subject: Solr consultant recommendation : In-Reply-To: e8a79384-5570-4777-b90c-c59c89cf4...@cominvent.com https://people.apache.org/~hossman/#threadhijack Thread Hijacking on Mailing Lists When starting a new discussion on a mailing list, please do not reply to an existing message, instead

Re: Is there a way to load multiple schema when using zookeeper?

2013-04-23 Thread Chris Hostetter
: Yes, you can effectively chroot all the configs for a collection (to : support multiple collections in same ensemble) - see wiki: : http://wiki.apache.org/solr/SolrCloud#Zookeeper_chroot I don't think chroot is suitable for what's being asked about here ... that would completely isolate two

Re: Too many close, count -1

2013-04-23 Thread Chris Hostetter
: Subject: Re: Too many close, count -1 Thanks for the details, nothing jumps out at me, but we're now tracking this in SOLR-4753... https://issues.apache.org/jira/browse/SOLR-4753 -Hoss

Re: Solr index searcher to lucene index searcher

2013-04-23 Thread Chris Hostetter
: . For any query it passes through the search handler and solr finally : directs it to lucene Index Searcher. As results are matched and collected : as TopDocs in lucene i want to inspect the top K Docs , reorder them by : some logic and pass the final TopDocs to solr which solr may send

Re: Solr 3.6.1: changing a field from stored to not stored

2013-04-23 Thread Chris Hostetter
: index? I noticed I am unnecessarily storing some fields in my index and : I'd like to stop storing them without having to 'reindex the world' and : let the changes just naturally percolate into my index as updates come : in the normal course of things. Do you guys think I could get away

Re: updating documents unintentionally adds extra values to certain fields

2013-04-22 Thread Chris Hostetter
: I am using solr 4.2, and have set up spatial search config as below : : http://wiki.apache.org/solr/SpatialSearch#Schema_Configuration : : But every time I make an update to a document, : http://wiki.apache.org/solr/UpdateJSON#Updating_a_Solr_Index_with_JSON : : more values of the

Re: Dynamically loading Elevation Info

2013-04-22 Thread Chris Hostetter
: In-Reply-To: 1366609851170-4057812.p...@n3.nabble.com : References: 1366383543826-4057312.p...@n3.nabble.com : b3fc0696-667a-4c41-bc33-42e290574...@gmail.com : 1366609851170-4057812.p...@n3.nabble.com : Subject: Dynamically loading Elevation Info

Re: Too many close, count -1

2013-04-22 Thread Chris Hostetter
: Can you tell what operations cause this to happen? ie: what does your configuration look like? are you using any custom plugins? what types of features of solr do you use (faceting, grouping, highlighting, clustering, dih, etc...) ? -Hoss

Re: facet.method enum vs fc

2013-04-19 Thread Chris Hostetter
: Thanks for your kind reply. The problem is solved with sharding and using : facet.method=enum. I am curious about what's the different between enum : and fc, so that enum works but fc does not. Do you know something about : this? method=fc/fcs uses the field caches (or uninverted fields

Re: Update Request Processor Chains

2013-04-19 Thread Chris Hostetter
: I am trying to understand update request processor chains. Do they run one : by one when indexing a document? Can I identify multiple update request : processor chains? Also what are that LogUpdateProcessorFactory and : RunUpdateProcessorFactory?

Re: Using multiple text files for Suggestor dictionarys

2013-04-17 Thread Chris Hostetter
: Is it possible to use multiple text files? I tried the following: ... : But the second list, the cities, are apparently undetected, after : restarting the tomcat and rebuilding the dictionary. Can this be done? If : not, how would you recommend managing different dictionaries? Skimming

Re: Max http connections in CloudSolrServer

2013-04-17 Thread Chris Hostetter
: Side issue: shouldn't that be setMaxConnectionsPerHost instead of including : the word Default? If there's no objection, I would plan on adding the renamed : method and using a typical deprecation procedure for the old one. I think the name comes from the effect it has on the underlying

Re: dataimporter.last_index_time SolrCloud

2013-04-17 Thread Chris Hostetter
: Is this a bug? I can create the ticket in Jira if it is, but it's not clear : to me what should be happening. It certainly sounds like it, but i too am not certain what is actually supposed to be happening here, or why it changed. Please open a jira with the details of your DIH requestHandler

Re: updateLog in Solr 4.2

2013-04-16 Thread Chris Hostetter
: : If i disable update log in solr 4.2 then i get the following exception : SEVERE: :java.lang.NullPointerException : at : org.apache.solr.cloud.ShardLeaderElectionContext.runLeaderProcess(ElectionContext.java:190) Hmmm.. if you don't have updateLog and you run in SolrCloud mode, solr

Re: Solr 4.2.x replication events on slaves

2013-04-16 Thread Chris Hostetter
: In Solr 3.x, I was relying on a postCommit call to a listener in the update : handler to perform data update to caches, this data was used to perform : 'realtime' filtering on the documents. I can't find it at the moment, but IIRC this was a side effect of how snapshots are now loaded on

Re: Troubles with solr replication

2013-04-16 Thread Chris Hostetter
: Also when I checked the solr log. : : [org.apache.solr.handler.SnapPuller] Master at: : http://192.168.2.204:8080/solr/replication is not available. Index fetch : failed. Exception: Connection refused : : : BTW, I was able to fetch the replication file with wget directly. Are you certain

Re: Empty Solr 4.2.1 can not create Collection

2013-04-16 Thread Chris Hostetter
: sorry for pushing, but I just replayed the steps with solr 4.0 where : everything works fine. : Then I switched to solr 4.2.1 and replayed the exact same steps and the : collection won't start and no leader will be elected. : : Any clues ? : Should I try it on the developer mailing list, maybe

Re: Is cache useful for my scenario?

2013-04-16 Thread Chris Hostetter
: There will be a lot of data that will be indexed in Solr. My question is, : does caching help in my case? As the filter queries will vary for almost all : users ( because the viewport latitude/longitude would vary), in what ways : can I use Caching to increase performance. Should I completely

Re: JavaScript transform switch statement during Data Import

2013-04-16 Thread Chris Hostetter
: Hello - I'm trying to add a switch statement into a JavaScript function that : we use during an import; it's to replace an if else block that is becoming : increasingly large. : : Bizarrely, the switch block is ignored entirely, and it doesn't have any : effect whatsoever. you haven't really

Re: Query Parser OR AND and NOT

2013-04-15 Thread Chris Hostetter
: Hallo, : I do not really understand the query language of the SOLR-Queryparser. http://www.lucidimagination.com/blog/2011/12/28/why-not-and-or-and-not/ The one comment i would add regarding your specific examples... : (!city:H*) OR zip:30*numFound: 2896 ...you can't have a boolean

Re: Downloaded Solr 4.2.1 Source: Build Failing

2013-04-12 Thread Chris Hostetter
: /Users/umeshprasad/Downloads/solr-4.2.1/solr/core/src/java/org/apache/solr/handler/c : *omponent/QueryComponent.java:765: cannot find symbol : [javac] symbol : class ShardFieldSortedHitQueue : [javac] location: class org.apache.solr.handler.component.QueryComponent : [javac]

Re: Solr 4.2.1 SSLInitializationException

2013-04-12 Thread Chris Hostetter
: Thanks for your response.  As I mentioned in my email, I would prefer : the application to not have access to the keystore. Do you know if there I'm confused ... it seems that you (or GlassFish) has created a Catch-22... You say you don't want the application to have access to the

Re: CSS appearing in Solr 4.2.1 logs

2013-04-12 Thread Chris Hostetter
: This sounds crazy, but does anyone see strange CSS/HTML in their Solr 4.2.x : logs? are you sure you're running 4.2.1 and not 4.2? https://issues.apache.org/jira/browse/SOLR-4573 -Hoss

Re: solr 4.2.1 still has problems with index version and index generation

2013-04-09 Thread Chris Hostetter
: And with replication?command=details I also see the correct commit part as : above, BUT where the hell are the wrong info below the commit array are : coming from? Please read the details in the previously mentioned Jira issue... https://issues.apache.org/jira/browse/SOLR-4661 The

Re: Solr 4.2.1 SSLInitializationException

2013-04-09 Thread Chris Hostetter
: Deploying Solr 4.2.1 to GlassFish 3.1.1 results in the error below.  I : have seen similar problems being reported with Solr 4.2 Are you trying to use server SSL with glassfish? can you please post the full stack trace so we can see where this error is coming from. My best guess is that

Re: Execution of Queries in Parallel: geotagged textual documents in Solrvvvv

2013-04-09 Thread Chris Hostetter
: I'd move to SolrCloud 4.2.1 to benefit from sharding, replication, and : the latest Lucene. How many queries you will then be able to run in : parallel will depend on their complexity, index size, query : cachability, index size, latency requirements... But move to the : latest setup first.

Re: How can I set configuration options?

2013-04-09 Thread Chris Hostetter
: Thanks for the replies. The problem I have is that setting them at the JVM : level would mean that all instances of Solr deployed in the Tomcat instance : are forced to use the same settings. I actually want to set the properties : at the application level (e.g. in solr.xml, zoo.conf or maybe an

Re: Results Order When Performing Wildcard Query

2013-04-09 Thread Chris Hostetter
: My gut says the difference in assignment of docids has to do with how the : FileListEntityProcessorhttp://wiki.apache.org/solr/DataImportHandler#FileListEntityProcessor docids just represent the order documents are added to the index. if you use DIH with FileListEntityProcessor to create

Re: solr 4.2.1 still has problems with index version and index generation

2013-04-08 Thread Chris Hostetter
: I know there was some effort to fix this but I must report : that solr 4.2.1 has still problems with index version and : index generation numbering in master/slave mode with replication. ... : RESULT: slave has different (higher) version number and is with generation 1 ahead :-( Can

Re: Score after boost before

2013-04-08 Thread Chris Hostetter
: I am using edismax and boosting certain fields using bq during query time. : : I would like to compare effect of boost side by side with original score : without boost. Is there anyway i can get original score without boosting? using functions and DocTransformers, it's possible to get the

Re: Sub field indexing

2013-04-08 Thread Chris Hostetter
: Subject: Sub field indexing : References: 1365426517091-4054473.p...@n3.nabble.com : In-Reply-To: 1365426517091-4054473.p...@n3.nabble.com https://people.apache.org/~hossman/#threadhijack Thread Hijacking on Mailing Lists When starting a new discussion on a mailing list, please do not reply

Re: Number of segments

2013-04-08 Thread Chris Hostetter
: How do I determine how many tiers it has? You may find this blog post from mccandless helpful... http://blog.mikemccandless.com/2011/02/visualizing-lucenes-segment-merges.html (don't ignore the videos! watching them is really helpful to understand what he is talking about) Once you've

Re: SolrCloud not distributing documents across shards

2013-04-03 Thread Chris Hostetter
: So we indexed a set of 33010 documents on server01 which are now in shard1. : And we kicked off a set of 85934 documents on server02 which are now in : shard2 (as tests). In my understanding of how SolrCloud works, the : documents should be distributed across the shards in the collection. Now

Re: Solr URL uses non-standard format with pound sign

2013-04-02 Thread Chris Hostetter
: The Solr URL in Solr 4.2 for my localhost installation looks like this: : http://localhost:8883/solr/#/development_shard1_replica1 : : This URL when constructed dynamically in Ruby will not validate with the : Ruby URI:HTTP class because of the # sign in the path. This is a : non-standard URL

Re: Lengthy description is converted to hash symbols

2013-04-02 Thread Chris Hostetter
: Here is an example of the field's value: : str :

Re: Making tika process mail attachments eludes me

2013-04-01 Thread Chris Hostetter
: I believe that the handling of the multipart MIME lacks some error checking, and : it is probably related to the content outside the MIME boundaries (in my : example, the text This is a multi-part message in MIME format.): : : I really hope that some SOLR developer can have a look, we cannot

Re: Use of SolrJettyTestBase

2013-04-01 Thread Chris Hostetter
: I've subclassed SolrJettyTestBase, and added a test method (annotated : with @test). However, my test method is never called. I see the You got an immediate failure from the test's setup, because you don't have assertions enabled in your JVM (the Lucene Solr test frameworks both require

Re: Suggestions for Customizing Solr Admin Page

2013-04-01 Thread Chris Hostetter
: I want to customize Solr Admin Page. I think that I will need more : complicated things to manage my cloud. I will separate my Solr cluster into : just indexing ones and just response ones. I will index my documents by : categorical and I will index them at different collections. A key design

Re: multiple SolrCloud clusters with one ZooKeeper ensemble?

2013-03-28 Thread Chris Hostetter
: Can I use a single ZooKeeper ensemble for multiple SolrCloud clusters or : would each SolrCloud cluster requires its own ZooKeeper ensemble? https://wiki.apache.org/solr/SolrCloud#Zookeeper_chroot (I'm going to FAQ this) -Hoss

Re: Batch Search Query

2013-03-28 Thread Chris Hostetter
: Now, what happens is a user will upload say a word document to us. We then : parse it and process it into segments. It very well could be 5000 segments : or even more in that word document. Each one of those ~5000 segments needs : to be searched for similar segments in solr. I’m not quite sure

Re: How to update synonyms.txt without restart?

2013-03-28 Thread Chris Hostetter
: But solr wiki says: : ``` : Starting with Solr4.0, the RELOAD command is implemented in a way that : results a live reloads of the SolrCore, reusing the existing various : objects such as the SolrIndexWriter. As a result, some configuration : options can not be changed and made active with a

Re: SOLR - Unable to execute query error - DIH

2013-03-28 Thread Chris Hostetter
: I am trying to index data from SQL Server view to the SOLR using the DIH Have you ruled out the view itself being the bottle neck? Try running whatever command line SQLServer client exists on your SOLR server to connect remotely to your existing SQL server and run select * from view and

Re: Elasticsearch with kerberos

2013-03-27 Thread Chris Hostetter
: Is there any integration of Solr with Kerberos? : I am pretty sure that the answer is no. Solr has no security features at : all - it is intended to live where regular users cannot get to it. The key question is how you define integration of Solr with Kerberos ? what is your goal? How

Re: Query question

2013-03-26 Thread Chris Hostetter
: So as I said, the search result I want is the one with the highest score, : but I was hoping to find a way to boost the score based on the number of : terms it finds (or matches well) so that I can differentiate between a close : match and nowhere near. Any suggestions? In general, this

Re: Conditional Field Search without affecting score.

2013-03-26 Thread Chris Hostetter
: document accordingly. This works good in most cases. but we had a case where : we ran into issue. : : DocA // Common title and is same for all county so no additional titles. : title.0Fightertitle.0 : : DocB : title.0The Ultimate Street Fightertitle.0 // Default : title.1Ultimate

Re: To get Term Offsets of a term per document

2013-03-26 Thread Chris Hostetter
: Is there a way to get Term Offsets of a given term per document without : enabling the termVectors ? : : Is it that Lucene index stores the positions but not the offsets by default : - is it correct ? correct -- unless you specifically enable termVectors, the offset information isn't

Re: Custom ValueSource, filtering with frange, and caching woes

2013-03-26 Thread Chris Hostetter
: I have a custom ValueSource that I'd like to use as a filter, something : like: ... : My question is whether I should expect Solr to cache this filter in the : filterCache? In other words, is there any reason to expect frange filters Did you remember to implement consistent and

Re: [ScriptUpdateProcessor] Params aren't being picked up from solrconfig

2013-03-26 Thread Chris Hostetter
: none of the params I specify in solrconfig.xml are being picked up. The : error I'm getting is: NameError: global name 'params' is not defined. ... : updateRequestProcessorChain name=script : processor class=solr.StatelessScriptUpdateProcessorFactory : str

Re: Loadtesting solr/tomcat7 and tomcat stops responding entirely

2013-03-26 Thread Chris Hostetter
: * When I set solrmeter to run 4000 queries/min, it will handle a few : hundred queries and then tomcat will stop responding completely to requests : (even though according to lsof -i it is still listening and the java : process is still running). have you tried using jstack to generate

Re: lucene 42 codec

2013-03-25 Thread Chris Hostetter
: I noticed that apache solr 4.2 uses the lucene codec 4.1. How can I : switch to 4.2? Unless you've configured something oddly, Solr is already using the 4.2 codec. What you are probably seeing is that the fileformat for several types of files hasn't changed from the 4.1 (or even 4.0)

Re: DocValues and field requirements

2013-03-22 Thread Chris Hostetter
: Thank you for your response. Yes, that's strange. By enabling DocValues the : information about missing fields is lost, which changes the way of sorting : as well. Adding default value to the fields can change a logic of : application dramatically (I can't set default value to 0 for all :

Re: Can we manipulate termfreq to count as 1 for multiple matches?

2013-03-22 Thread Chris Hostetter
: parameter *omitTermFreqAndPositions* the key thing to remember being: if you use this, then by omitting positions you can no longer do phrase queries. : or you can use a custom similarity class that overrides the term freq and : return one for only that field. :

Re: how to get term vector information of sepcific word/position in field

2013-03-22 Thread Chris Hostetter
: is there any way, if i can get term vector information of specific word : only, like i can pass the word, and it will just return term position and : frequency for that word only? : : and also if i can pass the position e.g. startPosition=5 and endPosition=10; : then it will return terms,

Re: Don't cache filter queries

2013-03-21 Thread Chris Hostetter
: Just add {!cache=false} to the filter in your query : (http://wiki.apache.org/solr/SolrCaching#filterCache). ... : I need to use the filter query feature to filter my results, but I : don't want the results cached as documents are added to the index : several times per second and the
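As a rough illustration of the suggestion above, the local param is simply prefixed to the filter value. This is a Python sketch in which the field names and values are hypothetical and the helper `uncached_fq` is made up for the example:

```python
from urllib.parse import urlencode


def uncached_fq(filter_query):
    """Prefix a filter query with the local param that asks Solr not to
    store its result in the filterCache."""
    return "{!cache=false}" + filter_query


# Hypothetical filters; doseq=True emits one fq parameter per list entry.
filters = [uncached_fq("category:books"), uncached_fq("in_stock:true")]
params = urlencode({"q": "*:*", "fq": filters}, doseq=True)
print(filters)
print(params)
```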

Re: custom similary on a field not working

2013-03-21 Thread Chris Hostetter
: public class NoTfSimilarity extends DefaultSimilarity { : public float tf(float freq) { : return freq > 0 ? 1.0f : 0.0f; : } : } ... : But I still see tf=14 in my query?? ... : 1.0 = tf(freq=14.0), with freq of: :

Re: Spatial Search with document score as distance between two points

2013-03-21 Thread Chris Hostetter
: q={!func}geodist()&sfield=latlng&pt=28.635308,77.22496&sort=score+asc ... : Problem : For those documents which don't have latlng field, value is coming exceptionally large '8763.191'. I'm pretty sure you're seeing the function assume a default lat,lon of 0,0 for docs that you haven't
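The 0,0 guess is easy to sanity-check: the great-circle distance from the query point to latitude 0, longitude 0 comes out very close to the reported value. A minimal Python sketch using the haversine formula; the Earth radius constant is an assumption (a commonly used mean radius in kilometres), not something stated in the thread.

```python
import math

EARTH_RADIUS_KM = 6371.0087714  # assumed mean Earth radius in km


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two lat/lon points (degrees)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


# Query point from the thread versus an assumed default of (0, 0).
d = haversine_km(28.635308, 77.22496, 0.0, 0.0)
print(round(d, 3))
```

The result lands within a kilometre of 8763.191, which is consistent with documents lacking the field being treated as if they were at 0,0.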

Re: change default solr url /solr to /prodsolr

2013-03-20 Thread Chris Hostetter
: currently when we deploy, default url will be http://host:port/solr : : how can i change it to http://host:port/prodsolr? : : i am using jboss server. what you're asking about is the name, or sometimes referred to as the context path, of the servlet context for the solr application. how

Re: Facets with 5000 facet fields

2013-03-20 Thread Chris Hostetter
: I seem to recall that facet cache is not per segment so every time the : index is updated the facet cache will need to be re-computed. : : That is correct. We haven't experimented with segment based faceting Not true ... per segment FieldCache support is available in solr faceting, you

Re: Help getting a document by unique ID

2013-03-19 Thread Chris Hostetter
: Which is the problem- you might think that 60ms unique key accesses : (what I'm seeing) is more than good enough- and for most use cases, : you'd be right. But it's not unusual for a single web-page hit to : generate many dozens, if not low hundreds, of calls to get document by : id. At which

Re: Facets with 5000 facet fields

2013-03-19 Thread Chris Hostetter
: In order to support faceting, Solr maintains a cache of the faceted : field. You need one cache for each field you are faceting on, meaning : your memory requirements will be substantial, unless, I guess, your 1) you can consider trading ram for time by using facet.method=enum (and disabling
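A minimal sketch of what a request using `facet.method=enum` might look like when assembled client-side. The field names `color` and `brand` are hypothetical, and the sketch does not attempt to fill in the rest of the truncated advice.

```python
from urllib.parse import urlencode


def facet_params(fields, method="enum"):
    """Return query parameters that facet on each field with the given method."""
    params = [
        ("q", "*:*"),
        ("rows", "0"),            # only facet counts are wanted, no documents
        ("facet", "true"),
        ("facet.method", method),
    ]
    # One facet.field parameter per field to facet on.
    params += [("facet.field", name) for name in fields]
    return params


# Hypothetical field names, for illustration only.
query_string = urlencode(facet_params(["color", "brand"]))
print(query_string)
```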

Re: Solr 4.1 monitoring with /solr/replication?command=details - indexVersion?

2013-03-19 Thread Chris Hostetter
Deja-vu? http://mail-archives.apache.org/mod_mbox/lucene-general/201303.mbox/%3CCAHd9_iR-HtNDu-3a9A5ekTFdb+5mo1eWVcu4Shp8AD=qtpq...@mail.gmail.com%3E http://mail-archives.apache.org/mod_mbox/lucene-general/201303.mbox/%3Calpine.DEB.2.02.1303081644040.5502@frisbee%3E

Re: Solr3.5 Vs Solr4.1 - Help please

2013-03-15 Thread Chris Hostetter
: Just from this observation, it seems like the code for SOLR 4.1 takes a : wrong turn somewhere for large responses if it comes across the same query : with a different fl list again. If the spinning query is pre-cached via There definitely seems to be a problem with lazy field loading +

Re: Version conflict during data import from another Solr instance into clean Solr

2013-03-14 Thread Chris Hostetter
: It looks strange to me that if there is no document yet (foundVersion < 0) : then the only case when document will be imported is when input version is : negative. Guess I need to test specific cases using SolrJ or something to be sure. you're assuming that if foundVersion < 0 that means no document

Re: How can I limit my Solr search to an arbitrary set of 100,000 documents?

2013-03-12 Thread Chris Hostetter
: q=title:dogs AND : (flrid:(123 125 139 34823) OR : flrid:(34837 ... 59091) OR : ... OR : flrid:(101294813 ... 103049934)) : The problem with this approach (besides that it's clunky) is that it : seems to perform O(N^2) or so. With 1,000 FLRIDs,

Re: JoinQuery and scores

2013-03-12 Thread Chris Hostetter
: Every doc returned has a score of 1.0 with the join. : Without join I get scores between 0.40337953 and 0.40530312. Saying you get scores w/o joining is an apples/oranges comparison -- w/o joining you have a completely different query matching different documents. what the JoinQParser does

Re: Dynamic schema design: feedback requested

2013-03-11 Thread Chris Hostetter
: 2) If you wish to use the /schema REST API for read and write operations, : then schema information will be persisted under the covers in a data store : whose format is an implementation detail just like the index file format. : : This really needs to be driven by costs and benefits... :

Re: Dynamic schema design: feedback requested

2013-03-11 Thread Chris Hostetter
To revisit sarowe's comment about how/when to decide if we are using the config file version of schema info (and the API is read only) vs internal managed state data version of schema info (and the API is read/write)... On Wed, 6 Mar 2013, Steve Rowe wrote: : Two possible approaches: : : a.

Re: Dynamic schema design: feedback requested

2013-03-11 Thread Chris Hostetter
: we needed to, we could just assert that the schema file is the : persistence mechanism, as opposed to the system of record, hence if : you hand edit it and then use the API to change it, your hand edit may : be lost. Or we may decide to do away with local FS mode altogether. presuming that
