Re: Overlapping onDeckSearchers=2

2013-05-27 Thread Yonik Seeley
On Mon, May 27, 2013 at 7:11 AM, Jack Krupansky j...@basetechnology.com wrote: The intent is that optimize is obsolete and should no longer be used That's incorrect. People need to understand the cost of optimize, and that its use is optional. It's up to the developer to figure out of the

Re: Replica shards not updating their index when update is sent to them

2013-05-20 Thread Yonik Seeley
On Mon, May 20, 2013 at 4:21 PM, Sebastián Ramírez sebastian.rami...@senseta.com wrote: When I send an update to a non-leader (replica) shard (B), the updated results are reflected in the leader shard (A) and in the other replica shard (C), but not in the shard that received the update (B).

Re: Transaction Logs Leaking FileDescriptors

2013-05-16 Thread Yonik Seeley
On Wed, May 15, 2013 at 6:04 PM, Steven Bower sbo...@alcyon.net wrote: They are visible to ls... On Wed, May 15, 2013 at 5:49 PM, Yonik Seeley yo...@lucidworks.com wrote: On Wed, May 15, 2013 at 5:20 PM, Steven Bower sbo...@alcyon.net wrote: when the TransactionLog objects

Re: Function queries

2013-05-15 Thread Yonik Seeley
On Wed, May 15, 2013 at 7:25 AM, sathish_ix skandhasw...@inautix.co.in wrote: Hi , i would like to get all documents when searching for a keyword. http://localhost:8080/solr/select?q=caramrows=_val_:docfreq(SEARCH_TERM,'caram') Searching for 'caram', there are 200 documents, but iam getting

Re: Transaction Logs Leaking FileDescriptors

2013-05-15 Thread Yonik Seeley
Hmmm, we keep open a number of tlog files based on the number of records in each file (so we always have a certain amount of history), but IIRC, the number of tlog files is also capped. Perhaps there is a bug when the limit to tlog files is reached (as opposed to the number of documents in the

Re: Transaction Logs Leaking FileDescriptors

2013-05-15 Thread Yonik Seeley
On Wed, May 15, 2013 at 5:20 PM, Steven Bower sbo...@alcyon.net wrote: I'm hunting through the UpdateHandler code to try and find where this happens now.. UpdateLog.addOldLog() -Yonik http://lucidworks.com

Re: Transaction Logs Leaking FileDescriptors

2013-05-15 Thread Yonik Seeley
On Wed, May 15, 2013 at 5:20 PM, Steven Bower sbo...@alcyon.net wrote: when the TransactionLog objects are dereferenced their RandomAccessFile object is not closed.. Have the files been deleted (unlinked from the directory), or are they still visible via ls? -Yonik http://lucidworks.com

Re: Transaction Logs Leaking FileDescriptors

2013-05-15 Thread Yonik Seeley
On Wed, May 15, 2013 at 5:06 PM, Steven Bower sbo...@alcyon.net wrote: This leads me to believe that the TransactionLog is not properly closing all of its files before getting rid of the object... I tried some ad hoc tests, and I can't reproduce this behavior yet. There must be some other

Re: Solr group.order not work

2013-05-15 Thread Yonik Seeley
group.order is not a valid parameter. You're probably looking for group.sort -Yonik http://lucidworks.com On Wed, May 15, 2013 at 9:30 PM, alexzhang zhangming1...@gmail.com wrote: I use the Solr 4.0.0 +, when I try to sort the results which within one group, it does not work? The wiki I
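A minimal sketch of a grouped request that uses group.sort rather than the non-existent group.order. The field names `category` and `price` are assumptions for illustration and do not come from the thread; only `group`, `group.field` and `group.sort` are actual Solr grouping parameters.

```python
from urllib.parse import urlencode

# Hypothetical field names; group.sort orders the documents inside each group.
params = [
    ("q", "*:*"),
    ("group", "true"),
    ("group.field", "category"),
    ("group.sort", "price asc"),
]
query_string = urlencode(params)
print(query_string)
```

The top-level `sort` parameter still decides the order of the groups themselves; group.sort only affects ordering within a group.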

Re: How to improve performance of geodist()

2013-05-13 Thread Yonik Seeley
On Mon, May 13, 2013 at 1:12 PM, Nicholas Ding nicholas...@gmail.com wrote: I'm using geodist() in a recip boost function. I noticed a performance impact to the response time. I did a profiling session, the geodist() calculation took 30% of CPU time. Are you also using an fq with geofilt to

Re: stats cache

2013-05-07 Thread Yonik Seeley
On Tue, May 7, 2013 at 12:48 PM, J Mohamed Zahoor zah...@indix.com wrote: Hi I am computing lots of stats as part of a query… looks like the solr caching is not helping here… Does solr cache stats of a query? No. Neither the facet counts nor the stats part of a request is cached. The query cache

Re: [solr 3.4] anomaly during distributed facet query with 102 shards

2013-04-25 Thread Yonik Seeley
On Thu, Apr 25, 2013 at 8:32 AM, Dmitry Kan solrexp...@gmail.com wrote: Are there any distrib facet gurus on the list? I would be ready to try sensible ideas, including on the source code level, if someone of you could give me a hand. The Lucene/Solr Revolution conference is coming up next

Re: Reordered DBQ.

2013-04-23 Thread Yonik Seeley
On Tue, Apr 23, 2013 at 3:51 PM, Marcin Rzewucki mrzewu...@gmail.com wrote: Recently I noticed a lot of Reordered DBQs detected messages in logs. As far as I checked in logs it could be related with deleting documents, but not sure. Do you know what is the reason of those messages ? For high

Re: Bug? JSON output changes when switching to solr cloud

2013-04-22 Thread Yonik Seeley
Thanks David, I've confirmed this is still a problem in trunk and opened https://issues.apache.org/jira/browse/SOLR-4746 -Yonik http://lucidworks.com On Sun, Apr 21, 2013 at 11:16 PM, David Parks davidpark...@yahoo.com wrote: We just took an installation of 4.1 which was working fine and

Re: Too many close, count -1

2013-04-22 Thread Yonik Seeley
Can you tell what operations cause this to happen? I've added a comment to https://issues.apache.org/jira/browse/SOLR-4749 where we're looking at some related issues around CoreContainer, but perhaps it should get its own issue. -Yonik http://lucidworks.com On Mon, Apr 22, 2013 at 7:57 PM,

Re: Solr cloud and batched updates

2013-04-21 Thread Yonik Seeley
On Sun, Apr 21, 2013 at 11:57 AM, Timothy Potter thelabd...@gmail.com wrote: There's no problem here, but I'm curious about how batches of updates are handled on the Solr server side in Solr cloud? Going over the code for DistributedUpdateProcessor and SolrCmdDistributor, it appears that the

Re: TooManyClauses: maxClauseCount is set to 1024

2013-04-18 Thread Yonik Seeley
Can you provide a full stack trace of the exception? There's a maxClauseCount in solrconfig.xml that you can increase to work around the issue. -Yonik http://lucidworks.com On Thu, Apr 18, 2013 at 7:31 AM, sawanverma sawan.ve...@glassbeam.com wrote: Its quite confusing about this error. I

Re: Solr 4.2 fl issue

2013-04-18 Thread Yonik Seeley
When using a field name that doesn't follow conventions (basically like Java identifiers), try this: fl=field(098765-765-788558-7654_userid) Or enclose it in quotes if it's really a whacky field name: fl=field('098765-765-788558-7654_userid') -Yonik http://lucidworks.com On Thu, Apr 18, 2013 at

Re: Why filter query doesn't use the same query parser as the main query?

2013-04-17 Thread Yonik Seeley
On Tue, Apr 16, 2013 at 9:44 PM, Roman Chyla roman.ch...@gmail.com wrote: Is there some profound reason why the defType is not passed onto the filter query? defType is a convenience so that the main query parameter q can directly be the user query (without specifying its type like edismax).

Re: Function Query performance in combination with filters

2013-04-16 Thread Yonik Seeley
On Tue, Apr 16, 2013 at 7:51 AM, Rogalon nico.beche...@me.com wrote: Hi, I am using pretty complex function queries to completely customize (not only boost) the score of my result documents that are retrieved from an index of approx 10e7 documents. To get to an acceptable level of performance

Re: Combining join queries

2013-04-11 Thread Yonik Seeley
On Wed, Apr 10, 2013 at 7:33 AM, Upayavira u...@odoko.co.uk wrote: On Wed, Apr 10, 2013, at 12:22 PM, Upayavira wrote: I'm sure the best way for me to solve this issue myself is to ask it publicly, so... If I have two {!join} queries that select a collection of documents each, how do I

Re: Boost parameter with query function - how to pass in complex params?

2013-04-07 Thread Yonik Seeley
On Sun, Apr 7, 2013 at 8:39 AM, dc tech dctech1...@gmail.com wrote: Yonik, Many thanks. The OR is still not working... here is the full URL 1. Honda or Toyota individually work http://localhost:8983/solr/cars/select?fl=text,score&defType=edismax&q=suv&boost=query($boostq,1)&boostq=honda

Re: Boost parameter with query function - how to pass in complex params?

2013-04-07 Thread Yonik Seeley
On Sun, Apr 7, 2013 at 10:11 AM, dc tech dctech1...@gmail.com wrote: Yonik: Pasted the wrong URL as I was trying various things. It did not work with OR http://localhost:8983/solr/cars/select?fl=text,score&defType=edismax&q=suv&boost=query($boostq,1)&boostq=toyota%20OR%20honda&debug=true See
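The request in this thread passes the boost query through a separate parameter and dereferences it with `$boostq`. A small sketch of building that URL with proper encoding, so that the space-separated `OR` survives transport; the host and the `cars` core are taken from the URL quoted above, and nothing here claims to explain why the OR boost failed for the poster.

```python
from urllib.parse import urlencode

# boost refers to the separate boostq parameter via $boostq,
# so the boolean query needs no escaping inside query(...).
params = {
    "fl": "text,score",
    "defType": "edismax",
    "q": "suv",
    "boost": "query($boostq,1)",
    "boostq": "toyota OR honda",
    "debug": "true",
}
url = "http://localhost:8983/solr/cars/select?" + urlencode(params)
print(url)
```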

Re: Boost parameter with query function - how to pass in complex params?

2013-04-06 Thread Yonik Seeley
On Sat, Apr 6, 2013 at 9:42 AM, dc tech dctech1...@gmail.com wrote: See example below 1. Search for SUVs and boost Honda models q=suv&boost=query({! v='honda'},1) 2. Search for SUVs and boost Honda OR toyota model a) Using OR in the query does NOT work q=suv&boost=query({! v='honda

Re: Compressed Fields in 4.2.1

2013-04-04 Thread Yonik Seeley
On Thu, Apr 4, 2013 at 7:41 PM, Jamie Johnson jej2...@gmail.com wrote: I had read somewhere that text fields by default were compressed in 4.2.1, is this the case? If not how do I enable compression of stored text fields? Compressed stored fields are the default since 4.1 -Yonik

Re: Nested queries with proximity/slop

2013-03-21 Thread Yonik Seeley
https://issues.apache.org/jira/browse/SOLR-4625 -Yonik http://lucidworks.com On Tue, Mar 19, 2013 at 11:12 PM, Yonik Seeley yo...@lucidworks.com wrote: On Tue, Mar 19, 2013 at 8:52 PM, Michael Ryan mr...@moreover.com wrote: I was wondering if anyone is aware of an existing Jira for this bug

Re: Nested queries with proximity/slop

2013-03-19 Thread Yonik Seeley
On Tue, Mar 19, 2013 at 8:52 PM, Michael Ryan mr...@moreover.com wrote: I was wondering if anyone is aware of an existing Jira for this bug... _query_:\a b\~2 ...is parsed as... PhraseQuery(someField:a b) ...instead of the expected... PhraseQuery(someField:a b~2) _query_:\a b\~2 ...is

Re: NPE when adding docs in 4.2

2013-03-16 Thread Yonik Seeley
On Sat, Mar 16, 2013 at 11:36 AM, J Mohamed Zahoor jmo...@gmail.com wrote: aahha… i used a replication factor of 0. I thought 0 means no replication of original.. Should that be 1 if i want no replication? Think of it as the number of copies of a book at a library. replicationFactor is the

Re: Is Lucene's DrillSideways something suitable for Solr?

2013-03-12 Thread Yonik Seeley
On Tue, Mar 12, 2013 at 10:27 PM, Alexandre Rafalovitch arafa...@gmail.com wrote: Lucene seems to get a new DrillSideways functionality on top of its own facet implementation. I would love to have something like that in Solr Solr has had multi-select faceting for 4 years now. My understanding

Re: Dynamic schema design: feedback requested

2013-03-11 Thread Yonik Seeley
On Wed, Mar 6, 2013 at 7:50 PM, Chris Hostetter hossman_luc...@fucit.org wrote: 2) If you wish to use the /schema REST API for read and write operations, then schema information will be persisted under the covers in a data store whose format is an implementation detail just like the index file

Re: Dynamic schema design: feedback requested

2013-03-11 Thread Yonik Seeley
On Mon, Mar 11, 2013 at 2:50 PM, Chris Hostetter hossman_luc...@fucit.org wrote: : 2) If you wish to use the /schema REST API for read and write operations, : then schema information will be persisted under the covers in a data store : whose format is an implementation detail just like the

Re: Dynamic schema design: feedback requested

2013-03-11 Thread Yonik Seeley
On Mon, Mar 11, 2013 at 5:51 PM, Chris Hostetter hossman_luc...@fucit.org wrote: : I guess my main point is, we shouldn't decide a priori that using the : API means you can no longer hand edit. and my point is we should build a feature where solr has the ability to read/write some piece of

Re: Distributed Search and the Stale Check

2013-02-25 Thread Yonik Seeley
On my particular benchmark rig, each stale check call accounted for an additional ~10ms. That's insane! It's still not even clear to me how the stale check works (reliably). Couldn't the server still close the connection between the stale check and the send of data by the client? -Yonik

Re: Field collapsing bad performances, schema redesign

2013-02-04 Thread Yonik Seeley
On Mon, Feb 4, 2013 at 10:34 AM, Mickael Magniez mickaelmagn...@gmail.com wrote: group.ngroups=true This is currently very inefficient - if you can live without retrieving the total number of groups, performance should be much better. -Yonik http://lucidworks.com

Re: Solr4.1 changing result order FIFO to LIFO

2013-02-03 Thread Yonik Seeley
On Sun, Feb 3, 2013 at 7:46 AM, Erick Erickson erickerick...@gmail.com wrote: Nope. Problem is that the tie breaker is the internal Lucene Doc id. Which a long time ago was invariant, that is a document indexed later always had a larger internal doc id. But the various merge policies can

Re: Join across cores on same shard.

2013-02-02 Thread Yonik Seeley
On Sat, Feb 2, 2013 at 5:49 AM, Marcin Rzewucki mrzewu...@gmail.com wrote: I meant I get fields from parent core only. Is it possible to get fields from both cores using join query? Not yet. Joins are currently only for filtering. -Yonik http://lucidworks.com

Re: Join across cores on same shard.

2013-02-01 Thread Yonik Seeley
You're missing the query to do the join on: fq={!join from=parent_id to=child_id fromIndex=core2}*:* We should have a better error message rather than a NPE of course... -Yonik http://lucidworks.com On Fri, Feb 1, 2013 at 3:45 PM, Marcin Rzewucki mrzewu...@gmail.com wrote: Check below if
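A minimal sketch of sending the corrected join filter from the message above. The field and core names (`parent_id`, `child_id`, `core2`) are the ones in the thread; the host, port and the target core name `core1` are assumptions for illustration.

```python
from urllib.parse import urlencode

def join_filter(from_field, to_field, from_index, inner_query="*:*"):
    """Build a {!join} filter; inner_query selects documents in from_index."""
    return "{!join from=%s to=%s fromIndex=%s}%s" % (
        from_field, to_field, from_index, inner_query)

fq = join_filter("parent_id", "child_id", "core2")
params = urlencode({"q": "*:*", "fq": fq})
# Hypothetical host and core name for the core the request is sent to.
url = "http://localhost:8983/solr/core1/select?" + params
print(fq)
```

The trailing `*:*` is the part the original request was missing: the join needs an inner query to pick the documents in the other core.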

Re: Solr 4.1.0 index leaving write.lock file

2013-02-01 Thread Yonik Seeley
On Fri, Feb 1, 2013 at 5:41 PM, dm_tim dm_...@yahoo.com wrote: I've been using Solr 4.1.0 for a little while now and I just noticed that when I index any core I have the write.lock file doesn't go away until I stop the server where solr is running. Sounds like it's working as it should. The

Re: expert question about SolrReplication

2013-02-01 Thread Yonik Seeley
On Fri, Feb 1, 2013 at 4:13 AM, Bernd Fehling bernd.fehl...@uni-bielefeld.de wrote: A question to the experts, why is the replicated index copied from its temporary location (index.x) to the real index directory and NOT moved? The intent is certainly to move and not copy (provided

Re: queryResultCache *very* low hit ratio

2013-01-29 Thread Yonik Seeley
One other thing that some auto-warming of the query result cache can achieve is loading FieldCache entries for sorting / function queries so real user queries don't experience increased latency. If you remove all auto-warming of the query result cache, you may want to add static warming entries

Re: Solr 4.1 Custom Hashing DIH

2013-01-25 Thread Yonik Seeley
On Fri, Jan 25, 2013 at 1:56 PM, davers dboych...@improvementdirect.com wrote: When I used 4.0 I could use my DIH on any shard and the documents would be distributed based on the internal hashing algorithm and end up distributed evenly across my three shards. I have just upgraded to Solr 4.1

Re: Solr 4.1 Custom Hashing DIH

2013-01-25 Thread Yonik Seeley
On Fri, Jan 25, 2013 at 3:59 PM, davers dboych...@improvementdirect.com wrote: I want to shard on groupid instead of id but it doesn't seem to be working. That's not yet implemented. Currently you need to put the group in the ID. From the release notes: * Simple multi-tenancy through enhanced

Re: Solr 4.1 Custom Hashing DIH

2013-01-25 Thread Yonik Seeley
On Fri, Jan 25, 2013 at 4:09 PM, davers dboych...@improvementdirect.com wrote: I'm not sure I understand. I thought ID had to be unique. Right - the group becomes part of the ID (the prefix), not the whole ID. for example I have the following [ { id : 1, groupid : 1 }, { id : 2, groupid :
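A sketch of what "the group becomes part of the ID" can look like. It assumes the `!` separator used by the compositeId router in Solr 4.1 (the separator is not shown in the truncated snippet above), and the sample documents are hypothetical rather than the poster's data.

```python
def composite_id(group, doc_id):
    """Prefix the document id with its group so both hash to the same shard."""
    return "%s!%s" % (group, doc_id)

# Hypothetical documents; only the id is rewritten, groupid stays as a field.
docs = [
    {"id": 1, "groupid": 7},
    {"id": 2, "groupid": 7},
    {"id": 3, "groupid": 9},
]
routed = [dict(d, id=composite_id(d["groupid"], d["id"])) for d in docs]
for d in routed:
    print(d["id"])
```

The resulting IDs stay unique because the original id is kept as the suffix; only the prefix is shared within a group.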

JSON query syntax

2013-01-24 Thread Yonik Seeley
Although lucene syntax tends to be quite concise, nice looking, and easy to build by hand (the web browser is a major debugging tool for me), some people prefer to use a more structured query language that's easier to build up programmatically. XML fits the bill, but people tend to prefer JSON

Re: JSON query syntax

2013-01-24 Thread Yonik Seeley
On Thu, Jan 24, 2013 at 8:55 PM, Otis Gospodnetic otis.gospodne...@gmail.com wrote: Yes, this is JSON, so right there it may be better, but for instance I see v here which to a regular human may not be as nice as value if that is what v stands for. One goal was to reuse the parsers/parameter

Re: JSON with order-preserving update commands?

2013-01-23 Thread Yonik Seeley
On Wed, Jan 23, 2013 at 9:50 AM, Craig Ching craigch...@gmail.com wrote: The problem I have is that JSON is not specified to preserve order of keys. JSON is a serialization format, and readers/writers can preserve order if they wish to. If you send JSON to solr in a specific order, that order

Re: Issues with docFreq/docCount on SolrCloud

2013-01-23 Thread Yonik Seeley
On Wed, Jan 23, 2013 at 6:15 PM, Markus Jelsma markus.jel...@openindex.io wrote: We need, and i think many SolrCloud users are going to need this as well, to make sure replicas don't deviate too much from each other, because if they do, documents are certainly going to jump positions. The

Re: SolrCloud index recovery

2013-01-22 Thread Yonik Seeley
On Tue, Jan 22, 2013 at 4:37 PM, Marcin Rzewucki mrzewu...@gmail.com wrote: Sorry, my mistake. I did 2 tests: in the 1st I removed just index directory and in 2nd test I removed both index and tlog directory. Log lines I've sent are related to the first case. So Solr could read tlog directory

Re: SolrCloud :: Distributed query processing

2013-01-18 Thread Yonik Seeley
Hopefully the explanation here will shed some light on this: https://issues.apache.org/jira/browse/SOLR-3912 -Yonik http://lucidworks.com On Fri, Jan 18, 2013 at 2:59 PM, Mishkin, Ernest ernest_mish...@mcgraw-hill.com wrote: Hello, I'm trying to reconcile my understanding of how distributed

Re: how to get abortOnConfigurationError=false working

2013-01-17 Thread Yonik Seeley
On Thu, Jan 17, 2013 at 3:40 PM, snake r...@michaels.me.uk wrote: Ok so is there any other way to stop this problem I am having where any site can break solr by deleting their collection? Seems odd everyone would vote to remove a feature that would make solr more stable. I agree.

Re: Solr exception when parsing XML

2013-01-16 Thread Yonik Seeley
On Tue, Jan 15, 2013 at 3:55 PM, Alexandre Rafalovitch arafa...@gmail.com wrote: Basically, the recommendation is to avoid CDATA and automatically encode characters such as yours, as well as less/more and ampersand. Unfortunately that doesn't even work. Just as a raw control character like a

Re: 400 error with boost and exists()

2013-01-16 Thread Yonik Seeley
On Wed, Jan 16, 2013 at 6:11 PM, Walter Underwood wun...@wunderwood.org wrote: I got the syntax from: http://lucidworks.lucidimagination.com/display/solr/Function+Queries Oops, I've alerted our tech writers! It should be fixed now. exists(field|function) returns true if a value exists for a

Re: 400 error with boost and exists()

2013-01-16 Thread Yonik Seeley
On Wed, Jan 16, 2013 at 6:35 PM, Walter Underwood wun...@wunderwood.org wrote: None of the variants worked. I started with that syntax for both exists() and if(). All gave the same stack trace. --wunder These boolean functions are new for 4.0, but it looks like you're using 3.3? -Yonik

Re: 400 error with boost and exists()

2013-01-16 Thread Yonik Seeley
On Wed, Jan 16, 2013 at 6:42 PM, Walter Underwood wun...@wunderwood.org wrote: Ah, that would be it. Does 4.0 also give a stack trace if you call a function that doesn't exist? Stack trace still appears in the logs, but the error message returned seems OK:

Re: SolrJ | Atomic Updates | How works exactly?

2013-01-13 Thread Yonik Seeley
On Sun, Jan 13, 2013 at 1:51 PM, Uwe Clement uwe.clem...@exxcellent.de wrote: What is the best the most performant way to update a large document? That *is* the best way to update a large document that we currently have. Although it re-indexes under the covers, it ensures that it's atomic, and

Re: Difference between IntField and TrieIntField in Lucene 4.0

2013-01-12 Thread Yonik Seeley
On Sat, Jan 12, 2013 at 4:56 PM, jefferyyuan yuanyun...@gmail.com wrote: Looked at Lucene Javadoc, seems we can run range query, filter, sorting on IntField. Also seems IntField is also indexed as trie structure. Javadoc for IntField: You're reading the javadoc for *lucene* IntField.

Re: parsing debug output for readability

2013-01-10 Thread Yonik Seeley
On Thu, Jan 10, 2013 at 6:16 PM, Petersen, Robert rober...@buy.com wrote: Thanks, debug.explain.structured=true helps a lot! Could you also tell me what these `#8;#0;#0;#0;#1; strings represent in the debug output? That's internally how a number is encoded into a string (5 bytes, the first

Re: Terminology question: Core vs. Collection vs...

2013-01-04 Thread Yonik Seeley
On Fri, Jan 4, 2013 at 2:26 AM, Per Steffensen st...@designware.dk wrote: Our biggest problem is that we really haven't decided once and for all and made sure to reflect the decision consistently across code and documentation. As long as we haven't I believe it is still ok to change our minds.

Re: Terminology question: Core vs. Collection vs...

2013-01-04 Thread Yonik Seeley
On Fri, Jan 4, 2013 at 1:35 PM, Alexandre Rafalovitch arafa...@gmail.com wrote: Hmm. Doesn't that make (logical) index=collection? And (physical) index=core? Which creates duplication of terminology and at the same time can cause confusion between highest logical and lowest physical level.

Re: Solr Collection API doesn't seem to be working

2013-01-03 Thread Yonik Seeley
maxShardsPerNode = msgStrToInt(message, MAX_SHARDS_PER_NODE, 1); Remember that replicationFactor decides how many instances of your shard you will get, so a value of 1 does not provide you any replication. On 1/3/13 3:46 AM, Yonik Seeley wrote: On Wed, Jan 2, 2013 at 9:21 PM, davers dboych
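The arithmetic behind that remark can be sketched as follows. This assumes the Collections API creates numShards times replicationFactor cores and places at most maxShardsPerNode of them on each node (default 1, as in the code line quoted above); the function names are illustrative and are not Solr APIs.

```python
import math

def cores_created(num_shards, replication_factor):
    """Total shard instances: replicationFactor copies of every shard."""
    return num_shards * replication_factor

def min_nodes_needed(num_shards, replication_factor, max_shards_per_node=1):
    """Fewest live nodes that can hold all cores under the per-node cap."""
    total = cores_created(num_shards, replication_factor)
    return math.ceil(total / max_shards_per_node)

print(cores_created(3, 1))        # one instance per shard, no redundancy
print(min_nodes_needed(3, 2))     # two copies of each shard, one core per node
print(min_nodes_needed(3, 2, 2))  # same collection, two cores allowed per node
```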

Re: What is group.query?

2013-01-03 Thread Yonik Seeley
From http://wiki.apache.org/solr/FieldCollapsing Return a single group of documents that also match the given query. ''' We can find the top documents that also match arbitrary queries with the group.query command (much like facet.query). For example, we could use this to find the top 3

Re: Solr Collection API doesn't seem to be working

2013-01-02 Thread Yonik Seeley
On Wed, Jan 2, 2013 at 9:21 PM, davers dboych...@improvementdirect.com wrote: So by providing the correct replicationFactor parameter for the number of servers has fixed my issue. So can you not provide a higher replicationFactor than you have live_nodes? What if you want to add more

Re: order question on solr multi value field

2012-12-19 Thread Yonik Seeley
On Tue, Dec 18, 2012 at 8:24 PM, Robert Muir rcm...@gmail.com wrote: I agree with James. Actually lucene tests will fail if a codec violates this. Actually it goes much deeper than this. From the lucene apis, when you call IndexReader.document() with your storedfieldVisitor, it must visit

Re: Order SOLR 4 output

2012-12-18 Thread Yonik Seeley
On Tue, Dec 18, 2012 at 4:58 AM, roySolr royrutten1...@gmail.com wrote: Hello, I have a really simple question i think: What is the order of the fields that are in the SOLR response? In SOLR 3.1 it was alphabetic but in SOLR 4 it isn't anymore. Is it configurable? I want to know this

Re: Will SolrCloud always slice by ID hash?

2012-12-18 Thread Yonik Seeley
On Tue, Dec 18, 2012 at 2:20 PM, Scott Stults sstu...@opensourceconnections.com wrote: I'm going to be building a Solr cluster and I want to have a rolling set of slices so that I can keep a fixed number of days in my collection. If I send an update to a particular slice leader, will it always

Re: small QTime but slow results to user

2012-12-15 Thread Yonik Seeley
On Sat, Dec 15, 2012 at 12:04 PM, S L sol.leder...@gmail.com wrote: Thanks everyone for the responses. I did some more queries and watched disk activity with iostat. Sure enough, during some of the slow queries the disk was pegged at 100% (or more.) The requirement for the app I'm building

Re: small QTime but slow results to user

2012-12-15 Thread Yonik Seeley
On Sat, Dec 15, 2012 at 1:11 PM, S L sol.leder...@gmail.com wrote: My virtual machine has 6GB of RAM. Tomcat is currently configured to use 4GB of it. The size of the index is 5.4GB for 3 million records which averages out to 1.8KB per record. I can look at trimming the data, having fewer

Re: small QTime but slow results to user

2012-12-14 Thread Yonik Seeley
On Fri, Dec 14, 2012 at 3:43 PM, S L sol.leder...@gmail.com wrote: Does anyone have an idea why a query that takes solr just half a second (500 ms) to execute would take 3 seconds to transfer the data? Normally this is due to slow reading of the stored fields (i.e. slow disk IO). For

Re: The shard called `properties`

2012-12-13 Thread Yonik Seeley
it right. {collection1: { config : myconf router : compositeId, shards : { shard1 : {... -Yonik http://lucidworks.com - mark On Dec 6, 2012, at 8:16 AM, Yonik Seeley yo...@lucidworks.com wrote: On Wed, Dec 5, 2012 at 5:17 PM, Mark Miller markrmil...@gmail.com wrote: See

Re: SolrCloud - Query performance degrades with multiple servers

2012-12-12 Thread Yonik Seeley
On Wed, Dec 12, 2012 at 5:03 PM, sausarkar sausar...@ebay.com wrote: We still could replicate the issue in 4.1 branch i.e. queries going to one server (numShards=1) is being distributed among all the servers which is creating CPU spikes in all the servers in the cloud. Do you think this

Re: Sort speed asc vs desc - is desc slower?

2012-12-12 Thread Yonik Seeley
On Wed, Dec 12, 2012 at 5:49 PM, Michael Ryan mr...@moreover.com wrote: When sorting a TrieLongField, should there be any expected difference in query speed when sorting ascending vs sorting descending? I'm seeing desc queries sometimes take 10x longer than asc queries. I can provide more

Re: SolrCloud - Query performance degrades with multiple servers

2012-12-11 Thread Yonik Seeley
On Thu, Dec 6, 2012 at 8:08 PM, sausarkar sausar...@ebay.com wrote: Ok we think we found out the issue here. When solrcloud is started without specifying numShards argument solrcloud starts with a single shard but still thinks that there are multiple shards, so it forwards every single query to

Re: SolrCloud - Query performance degrades with multiple servers

2012-12-11 Thread Yonik Seeley
OK, I tried to reproduce it on trunk, and I can't (i.e. everything is looking fine). rm -rf example/solr/zoo_data cp -rp example example2 cp -rp example example3 cd example java -Dbootstrap_confdir=./solr/collection1/conf -Dcollection.configName=myconf -DzkRun -DnumShards=1 -jar start.jar cd

Re: SOLR4 (sharded) and join query

2012-12-09 Thread Yonik Seeley
On Thu, Dec 6, 2012 at 6:47 PM, Erick Erickson erickerick...@gmail.com wrote: see: http://wiki.apache.org/solr/DistributedSearch joins aren't supported in distributed search. Any time you have more than one shard in SolrCloud, you are, by definition, doing distributed search. It is supported,

Re: Minimum HA Setup with SolrCloud

2012-12-06 Thread Yonik Seeley
On Thu, Dec 6, 2012 at 9:56 AM, Markus Jelsma markus.jel...@openindex.io wrote: The quorum is the minimum, so it depends on how many you have running in the ensemble. If it's three or four, then two is the quorum I think that for 4 ZK servers, then 3 would be the quorum? -Yonik
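The quorum question above comes down to a strict majority of the configured ensemble. A small sketch of that arithmetic (the function names are illustrative, not ZooKeeper APIs):

```python
def zk_quorum(ensemble_size):
    """Smallest strict majority of the configured ensemble."""
    return ensemble_size // 2 + 1

def tolerated_failures(ensemble_size):
    """How many servers can be lost while a majority remains."""
    return ensemble_size - zk_quorum(ensemble_size)

for n in (1, 2, 3, 4, 5):
    print(n, zk_quorum(n), tolerated_failures(n))
```

With 4 servers the quorum is 3, so a 4-node ensemble tolerates only one failure, the same as a 3-node ensemble; this is why odd ensemble sizes are the usual choice.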

Re: The shard called `properties`

2012-12-06 Thread Yonik Seeley
On Wed, Dec 5, 2012 at 5:17 PM, Mark Miller markrmil...@gmail.com wrote: See the custom hashing issue - the UI has to be updated to ignore this. Unfortunately, it seems that clients have to be hard coded to realize properties is not a shard unless we add another nested layer. Yeah, I talked

Re: Minimum HA Setup with SolrCloud

2012-12-06 Thread Yonik Seeley
On Thu, Dec 6, 2012 at 5:21 PM, Jack Krupansky j...@basetechnology.com wrote: If 1 is the minimum, what is the 3 minimum all about? The minimum for running an ensemble (a cluster) and having any sort of fault tolerance? The zk web page does say Three ZooKeeper servers is the minimum

Re: Minimum HA Setup with SolrCloud

2012-12-06 Thread Yonik Seeley
On Thu, Dec 6, 2012 at 5:55 PM, Jack Krupansky j...@basetechnology.com wrote: I trust that you have the right answer, Mark, but maybe I'm just struggling to parse this statement: the remaining two machines do not constitute a majority. If you start with 3 zk and lose one, you have an ensemble

Re: Solr 4 : Optimize very slow

2012-12-06 Thread Yonik Seeley
On Thu, Dec 6, 2012 at 12:17 PM, Sandeep Mestry sanmes...@gmail.com wrote: I followed the advice Michael and the timings reduced to couple of hours now from 6-8 hours :-) Just changing from mmap to NIO, eh? What does your system look like? operating system, JVM, drive, memory, etc? -Yonik

Re: Minimum HA Setup with SolrCloud

2012-12-06 Thread Yonik Seeley
On Thu, Dec 6, 2012 at 8:42 PM, Jack Krupansky j...@basetechnology.com wrote: And this is precisely why the mystery remains - because you're only describing half the picture! Describe the rest of the picture - including what exactly those two zks can and can't do, including resolution of ties

Re: Solr Query Parameter : ids - What is this used for?

2012-12-03 Thread Yonik Seeley
On Mon, Dec 3, 2012 at 10:55 PM, deniz denizdurmu...@gmail.com wrote: Hello, as it is clear in the title too, i wanna know for what solr uses this parameter... i see it on a sharding env on cloud, so i guess it is related with cloud but still there is no explanation about it in any of wiki

Re: Solr 4, optimizing while doing other updates?

2012-11-27 Thread Yonik Seeley
On Tue, Nov 27, 2012 at 3:21 PM, Shawn Heisey s...@elyograg.org wrote: but even way back then, rumblings on the mailing list said don't optimize for performance reasons. Count me amongst the dissenters. Optimize can make a lot of sense, and that's why it still exists. People should be

Re: SynonymFilterFactory breaking WordDelimiterFilterFactory output

2012-11-23 Thread Yonik Seeley
Sounds like perhaps the SynonymFilter is losing the positionIncrement of 0 (which make the first two tokens overlap)? You could perhaps verify with the analysis debugging on the admin page. -Yonik http://lucidworks.com On Tue, Nov 20, 2012 at 10:55 PM, Chris Book chrisb...@gmail.com wrote:

Re: SolrCloud and exernal file fields

2012-11-22 Thread Yonik Seeley
On Tue, Nov 20, 2012 at 4:16 AM, Martin Koch m...@issuu.com wrote: around 7M documents in the index; each document has a 45 character ID. 7M documents isn't that large. Is there a reason why you need so many shards (16 in your case) on a single box? -Yonik http://lucidworks.com

Re: sort by function error

2012-11-12 Thread Yonik Seeley
On Mon, Nov 12, 2012 at 5:24 AM, Kuai, Ben ben.k...@sensis.com.au wrote: more information, problem only happens when I have both sort by function and grouping in query. I haven't been able to duplicate this with a few ad-hoc queries. Could you give your complete request (or at least all of

Re: How to speed up Facet count (Big index) ??!!!!

2012-11-12 Thread Yonik Seeley
On Mon, Nov 12, 2012 at 8:39 PM, Aeroox Aeroox aero...@gmail.com wrote: Hi folks, I have a solr index with up to 50M documents. A document contain 62 fields (docid, name, location). The facet count took 1 to 2 minutes with this params : http://.../select/?q=solr;

Re: Is leading wildcard search turned on by default in Solr 3.6.1?

2012-11-12 Thread Yonik Seeley
On Tue, Nov 13, 2012 at 2:27 AM, johnmu...@aol.com wrote: I'm surprised that this has not been logged as a defect. The fact that this is ON by default, means someone can bring down a server; this is bad enough to categorize this as a security issue. It's all relative. There are tons of

Re: sort by function error

2012-11-12 Thread Yonik Seeley
. Ben From: ysee...@gmail.com [ysee...@gmail.com] on behalf of Yonik Seeley [yo...@lucidworks.com] Sent: Tuesday, November 13, 2012 6:46 AM To: solr-user@lucene.apache.org Subject: Re: sort by function error On Mon, Nov 12, 2012 at 5:24 AM, Kuai, Ben

Re: zkcli issues

2012-11-11 Thread Yonik Seeley
On Sun, Nov 11, 2012 at 10:39 PM, Nick Chase nch...@earthlink.net wrote: So I'm trying to use ZkCLI without success. I DID start and stop Solr in non-cloud mode, so everything is extracted and it IS finding zookeeper*.jar. However, now it's NOT finding SolrJ. Not sure about your specific

Re: indexing CSV using Solr 3.6.1

2012-11-10 Thread Yonik Seeley
My guess is that this might have to do with the fact that you are on Windows, and shell escaping is different (i.e. curl isn't getting all of the parameters and hence isn't sending everything to Solr). My first recommendation would be to install cygwin to get a UNIX command line environment like

Re: SolrCloud and distributed search

2012-10-26 Thread Yonik Seeley
On Fri, Oct 26, 2012 at 10:14 AM, Bill Au bill.w...@gmail.com wrote: I am currently using one master with multiple slaves so I do have high availability for searching now. My index does fit on a single machine and a single query does not take too long to execute. But I do want to take

Re: Occasional Solr performance issues

2012-10-22 Thread Yonik Seeley
On Mon, Oct 22, 2012 at 4:39 PM, Michael Della Bitta michael.della.bi...@appinions.com wrote: Has the Solr team considered renaming the optimize function to avoid leading people down the path of this antipattern? If it were never the right thing to do, it could simply be removed. The problem is

Re: Why does SolrIndexSearcher.java enforce mutual exclusion of filter and filterList?

2012-10-21 Thread Yonik Seeley
On Sun, Oct 21, 2012 at 3:57 PM, Aaron Daubman daub...@gmail.com wrote: Greetings, I'm wondering if somebody would please explain why SolrIndexSearcher.java enforces mutual exclusion of filter and filterList (e.g. see:

Re: differences of LockFactory between solr 3.6.1 and 4.0.0?

2012-10-17 Thread Yonik Seeley
On Wed, Oct 17, 2012 at 9:33 AM, Bernd Fehling bernd.fehl...@uni-bielefeld.de wrote: Hi list, while checking the runtime behavior of solr 4.0.0 I noticed that the handling of write.lock seems to be different. With solr 3.6.1 after calling optimize the index is optimized and write.lock

Re: Why is SolrDispatchFilter using 90% of the Time?

2012-10-10 Thread Yonik Seeley
When I look at the distribution of the Response-time I notice 'SolrDispatchFilter.doFilter()' is taking up 90% of the time. That's pretty much the top-level entry point to Solr (from the servlet container), so it's normal. -Yonik http://lucidworks.com

Re: solr facet !tag on multiple columns

2012-10-03 Thread Yonik Seeley
On Wed, Oct 3, 2012 at 11:04 AM, lavesh lavesh.ra...@gmail.com wrote: I know it is possible in Solr to do the grouping irrespective of one value, i.e. the line below does the grouping on column1 considering all filters except the one on column1: facet.field={!ex=column1}column1 now i
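
For readers unfamiliar with the {!ex=...} syntax quoted above: the exclusion only has an effect if the filter query to be ignored carries a matching {!tag=...}. The sketch below just builds the request parameters; the filter value "red", the match-all query, and the helper name `tagged_facet_params` are assumptions for illustration, and the tag is named after the field only because the quoted message does so.

```python
from urllib.parse import urlencode


def tagged_facet_params(field, value, query="*:*"):
    """Build params that filter on `field` but facet on it as if unfiltered."""
    return [
        ("q", query),
        # Tag the filter so that it can be referred to by name.
        ("fq", "{!tag=%s}%s:%s" % (field, field, value)),
        ("facet", "true"),
        # Exclude the tagged filter when counting this facet field.
        ("facet.field", "{!ex=%s}%s" % (field, field)),
    ]


params = tagged_facet_params("column1", "red")  # "red" is a made-up value
query_string = urlencode(params)
print(query_string)
```

Building the parameters as a list of pairs keeps repeated keys possible, which matters once several fq or facet.field entries are needed.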

Re: Items disappearing from Solr index

2012-09-26 Thread Yonik Seeley
On Wed, Sep 26, 2012 at 10:45 AM, Shawn Heisey s...@elyograg.org wrote: On 9/26/2012 5:47 AM, Kissue Kissue wrote: getSolrServer().deleteByQuery(catalogueId + : + Emory Labs) [Notice that there are no quotes surrounding the catalogueId value - Emory Labs] How did you even get this Java
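
The bracketed note about missing quotes matters because, with the default query parser, an unquoted value containing a space is split: only the first word is matched against the named field and the rest goes to the default field. A minimal sketch of building a quoted phrase query follows; it is not the poster's code, and it assumes the field is literally named catalogueId (in the quoted Java, catalogueId is a variable). The helper name `phrase_query` is hypothetical.

```python
def phrase_query(field, value):
    """Return field:"value" with backslashes and double quotes escaped."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return '%s:"%s"' % (field, escaped)


# Field name and value taken from the quoted message; the field name is assumed.
query = phrase_query("catalogueId", "Emory Labs")
print(query)
```

The resulting string is what would be passed to a delete-by-query call, so that the whole value is treated as one phrase.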

Re: At a high level how does faceting in SolrCloud work?

2012-09-26 Thread Yonik Seeley
On Wed, Sep 26, 2012 at 6:21 PM, Chris Hostetter hossman_luc...@fucit.org wrote: 2) the coordinator node sums up the counts for any constraint returned by multiple nodes, and then picks the top (facet.limit) constraints based on the counts it knows about. It's actually more sophisticated than
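
The simplified step 2 quoted above can be sketched as follows. This covers only the naive sum-and-truncate merge from the quoted description; as the reply says, what Solr actually does is more sophisticated, and none of that is modeled here. The shard data and the helper name `merge_facet_counts` are made up for illustration.

```python
from collections import Counter


def merge_facet_counts(shard_counts, limit):
    """Sum per-shard facet counts and keep the `limit` highest constraints."""
    totals = Counter()
    for counts in shard_counts:
        totals.update(counts)  # adds counts for constraints seen on several shards
    # Highest count first; ties broken by constraint name for a stable order.
    ranked = sorted(totals.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:limit]


# Hypothetical per-shard responses for one facet field.
shards = [
    {"red": 5, "blue": 3},
    {"red": 2, "green": 4},
    {"blue": 2, "green": 1},
]
top = merge_facet_counts(shards, limit=2)
print(top)
```

The weakness of this naive merge is that a shard only reports its own top constraints, so a constraint can be undercounted when some shard left it out of its response.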

Re: solrcloud and csv import hangs

2012-09-24 Thread Yonik Seeley
On Mon, Sep 24, 2012 at 11:03 AM, dan sutton danbsut...@gmail.com wrote: Hi, This appears to happen in trunk too. It appears that the add command request parameters get sent to the nodes. If I comment these out like so for add and commit:

Re: solrcloud and csv import hangs

2012-09-24 Thread Yonik Seeley
https://issues.apache.org/jira/browse/SOLR-3883 -Yonik http://lucidworks.com On Mon, Sep 24, 2012 at 11:42 AM, Yonik Seeley yo...@lucidworks.com wrote: On Mon, Sep 24, 2012 at 11:03 AM, dan sutton danbsut...@gmail.com wrote: Hi, This appears to happen in trunk too. It appears that the add
