solrcloud and csv import hangs

2012-09-24 Thread dan sutton
file if exists on a replicant. This might not be the right thing to do? ... what should be sent here for a streaming CSV import? Dan On Thu, Sep 20, 2012 at 4:32 PM, dan sutton danbsut...@gmail.com wrote: Hi, I'm using Solr 4.0-BETA and trying to import a CSV file as follows: curl http

solrcloud and csv import hangs

2012-09-20 Thread dan sutton
Hi, I'm using Solr 4.0-BETA and trying to import a CSV file as follows: curl http://localhost:8080/solr/core/update -d overwrite=false -d commit=true -d stream.contentType='text/csv;charset=utf-8' -d stream.url=file:///dir/file.csv I have 2 tomcat servers running on different machines and a

Re: SOLR 4.0 / Jetty Security Set Up

2012-09-07 Thread dan sutton
Hi, If like most people you have application server(s) in front of solr, the simplest and most secure option is to bind solr to a local address (192.168.* or 10.0.0.*). The app server talks to solr via the local (a.k.a blackhole) ip address that no-one from outside can ever access as it's not

Solr Cloud partitioning

2012-09-05 Thread dan sutton
Hi, At the moment, partitioning with solrcloud is hash based on uniqueid. What I'd like to do is have custom partitioning, e.g. based on date (shard_MMYY). I'm aware of https://issues.apache.org/jira/browse/SOLR-2592, but after a cursory look it seems that with the latest patch, one might end up

flashcache and solr/lucene

2012-03-01 Thread dan sutton
Hi, Just wondering if anyone had any experience with solr and flashcache [https://wiki.archlinux.org/index.php/Flashcache]. My guess is it might be particularly useful for indices not changing that often, and for large indices where an SSD of that size is prohibitive. Cheers, Dan

Solr Warm-up performance issues

2012-01-27 Thread dan sutton
Hi List, We use Solr 4.0.2011.12.01.09.59.41 and have a dataset of roughly 40 GB. Every day we produce a new dataset of 40 GB and have to switch one for the other. Once the index switch over has taken place, it takes roughly 30 min for Solr to reach maximum performance. Are there any hardware or

Re: How to return exact set of multivalue field

2011-10-20 Thread dan sutton
-field_name:[* TO 384] +field_name:[385 TO 386] -field_name:[387 TO *] On Thu, Oct 20, 2011 at 10:51 AM, Ellery Leung elleryle...@be-o.com wrote: Hi all I am using Solr 3.4 on Windows 7. Here is the example of a multivalue field: <doc> <arr name="field_name"> <str>387</str> <str>386</str>
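
The range trick in the reply above can be generated for any contiguous block of integer values. What follows is a minimal sketch, not something from the thread: `exact_range_query` is a hypothetical helper (not part of Solr or SolrJ), it assumes integer field values, and it only builds the query string.

```python
def exact_range_query(field: str, low: int, high: int) -> str:
    """Build a query string with one required range clause and two
    prohibited clauses covering everything below and above that range."""
    if low > high:
        raise ValueError("low must not exceed high")
    below = f"-{field}:[* TO {low - 1}]"      # exclude docs with any smaller value
    inside = f"+{field}:[{low} TO {high}]"    # require at least one value in range
    above = f"-{field}:[{high + 1} TO *]"     # exclude docs with any larger value
    return f"{below} {inside} {above}"


print(exact_range_query("field_name", 385, 386))
```

As with the query in the reply, the required clause is satisfied by a document holding any value in the range, so a document containing only 385 would also match.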

Distributed Search question/feedback

2011-09-22 Thread dan sutton
Hi, Does SolrCloud use Distributed search as described at http://wiki.apache.org/solr/DistributedSearch or is it different entirely? Does SolrCloud suffer from the same limitation as Distributed search (inefficient to use a high start parameter, and presumably high CPU highlighting all those docs

logging client ip address

2011-09-07 Thread dan sutton
Hi, We're using log4j with solr which is working fine and I'm wondering how I might be able to log the client ip address? Has anyone else been able to do this? Cheers, Dan

Re: logging client ip address

2011-09-07 Thread dan sutton
Does anyone know how I would be able to include the client ip address for tomcat 6 with log4j? Cheers, Dan On Wed, Sep 7, 2011 at 11:03 AM, Shalin Shekhar Mangar shalinman...@gmail.com wrote: On Wed, Sep 7, 2011 at 2:56 PM, dan sutton danbsut...@gmail.com wrote: Hi, We're using log4j

replication/search on separate LANs

2011-06-10 Thread dan sutton
Hi All, I'm wondering if anyone has experience of replicating and searching over separate LANs? Currently we do both over the same one. So each slave would have 2 Ethernet cards, 1/LAN, and the master just one. We're currently building and replicating a daily index, this is quite large, about 15M

custom highlighting

2011-05-24 Thread dan sutton
Hi, I'd like to make the highlighting work as follows: length(all snippets) approx. 200 chars, hl.snippets = 2 (2 snippets) e.g. if there is only 1 snippet available, length = 200 chars e.g. if there is > 1 snippet, length of each snippet == 100 chars, so I take the first 2 and get 200 chars Is this
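
The arithmetic behind that request can be written out in a few lines. This is only a sketch of the budget calculation described in the post: `snippet_plan` is a hypothetical helper, not a Solr highlighter or fragmenter API, and the 200-character budget and limit of 2 snippets are taken from the message.

```python
def snippet_plan(available: int, total_chars: int = 200, max_snippets: int = 2):
    """Return (snippets_to_show, chars_per_snippet) so that the combined
    length of the shown snippets is roughly `total_chars`."""
    count = min(available, max_snippets)
    if count <= 0:
        return (0, 0)
    # Split the character budget evenly over the snippets that will be shown.
    return (count, total_chars // count)


print(snippet_plan(1))  # one snippet gets the whole budget
print(snippet_plan(3))  # only the first two are used, sharing the budget
```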

Suggester and query/index analysis

2011-05-17 Thread dan sutton
Hi All, I understand that I can use a custom queryConverter for the input to the suggester http://wiki.apache.org/solr/Suggester component, however there doesn't seem to be anything on the indexing side, TST appears to take the input verbatim, and Jaspell seems to lowercase everything. The

Enable/disable mainIndex component

2011-05-11 Thread dan sutton
Hi, Does anyone know if I can do the following: <mainIndex enable="${enable.master:false}"> <mergeFactor>10</mergeFactor> ... </mainIndex> <mainIndex enable="${enable.slave:true}"> <mergeFactor>2</mergeFactor> ... </mainIndex> Cheers, Dan

Highlighting and custom fragmenting

2011-04-07 Thread dan sutton
Hi All, I'd like to make the highlighting work as follows: length(all snippets) approx. 200 chars, hl.snippets = 2 (2 snippets). Is this possible with the regex fragmenter? Or does anyone know of any contrib fragmenter that might do this? Many thanks Dan

Re: Math-generated fields during query

2011-03-10 Thread dan sutton
As a workaround can you not have a search component run after the querycomponent, and have the qty_ordered,unit_price as stored fields and returned with the fl parameter and have your custom component do the calc, unless you need to sort by this value too? Dan On Wed, Mar 9, 2011 at 10:06 PM,

Split analysis

2011-03-02 Thread dan sutton
Hi All, I have a requirement to analyze a field with a series of filters, calculate a 'signature' then concatenate with the original input e.g. input = 'this is the input' tokenized and filtered, input becomes say 'this input' = 12ef5e (signature) so the final output indexed is:

Re: Replication and newSearcher registered poll interval

2011-02-17 Thread dan sutton
Hi, Keeping the thread alive, any thought on only doing replication if there is no warming currently going on? Cheers, Dan On Thu, Feb 10, 2011 at 11:09 AM, dan sutton danbsut...@gmail.com wrote: Hi, If the replication window is too small to allow a new searcher to warm and close

Replication and newSearcher registered poll interval

2011-02-10 Thread dan sutton
Hi, If the replication window is too small to allow a new searcher to warm and close the current searcher before the new one needs to be in place, then the slaves continuously have a high load, and potentially an OOM error. We've noticed this in our environment where we have several facets on

Re: facet.mincount

2011-02-03 Thread dan sutton
I don't think facet.mincount works with date faceting, see here: http://wiki.apache.org/solr/SimpleFacetParameters Dan On Thu, Feb 3, 2011 at 10:11 AM, Isan Fulia isan.fu...@germinait.com wrote: Any query followed by

Re: facet.mincount

2011-02-03 Thread dan sutton
)        at org.mortbay.thread.BoundedThreadPool$PoolThread.run(BoundedThreadPool.java:442) On 3 February 2011 16:17, dan sutton danbsut...@gmail.com wrote: I don't think facet.mincount works with date faceting, see here: http://wiki.apache.org/solr/SimpleFacetParameters Dan On Thu, Feb 3

EmbeddedSolrServer and junit

2011-01-31 Thread dan sutton
Hi, I have 2 cores CoreA and CoreB, when updating content on CoreB, I use solrj and EmbeddedSolrServer to query CoreA for information, however when I do this with my junit tests (which also use EmbeddedSolrServer to query) I get this error SEVERE: Previous SolrRequestInfo was not closed!

Re: EmbeddedSolrServer and junit

2011-01-31 Thread dan sutton
to call queryAndResponse(String handler, SolrQueryRequest req) instead which does not set/clear SolrRequestInfo Regards, Dan On Mon, Jan 31, 2011 at 2:32 PM, dan sutton danbsut...@gmail.com wrote: Hi, I have 2 cores CoreA and CoreB, when updating content on CoreB, I use solrj

solr equiv of : SELECT count(distinct(field)) FROM index WHERE length(field) > 0 AND other_criteria

2010-12-22 Thread dan sutton
Hi, Is there a way with faceting or field collapsing to do the SQL equivalent of SELECT count(distinct(field)) FROM index WHERE length(field) > 0 AND other_criteria i.e. I'm only interested in the total count not the individual records and counts. Cheers, Dan

JMX Cache values are wrong

2010-11-18 Thread dan sutton
Hi, I've used three different JMX clients to query solr/core:id=org.apache.solr.search.FastLRUCache,type=queryResultCache and solr/core:id=org.apache.solr.search.FastLRUCache,type=documentCache beans and they appear to return old cache information. As new searchers come online, the newer

Re: spatial sorting

2010-09-30 Thread dan sutton
calculation) Does anyone know if it's possible to return the distance and score separately? I know there has been a patch to sort by value function, but how can one return the values from this? Cheers, Dan On Fri, Sep 17, 2010 at 2:45 PM, dan sutton danbsut...@gmail.com wrote: Hi, I'm

multiple spatial values

2010-09-21 Thread dan sutton
Hi, I was looking at the LatLonType and how it might represent multiple lon/lat values ... it looks to me like the lat would go in {latlongfield}_0_LatLon and the long in {latlongfield}_1_LatLon ... how then if we have multiple lat/long points for a doc when filtering for example we choose the

Re: how to normalize a query

2010-09-09 Thread dan sutton
September 2010 15:08:41 dan sutton wrote: Hi, Does anyone know how I might normalize a query so that e.g. q=one two equals q=two one Cheers, Dan Markus Jelsma - Technisch Architect - Buyways BV http://www.linkedin.com/in/markus17 050-8536620 / 06-50258350
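
One way to get the effect asked about is to normalize the query string on the client before sending it. The sketch below is an assumption, not the answer given in the thread: `normalize_query` is a hypothetical helper, and it only suits plain whitespace-separated terms, not phrases, boolean operators or field prefixes.

```python
def normalize_query(q: str) -> str:
    """Sort whitespace-separated terms so that term order no longer
    distinguishes otherwise identical queries."""
    return " ".join(sorted(q.split()))


print(normalize_query("two one"))
```

With this, "one two" and "two one" produce the same string and would therefore share, for example, a cache entry keyed on the query text.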

Re: Auto Suggest

2010-09-03 Thread dan sutton
I set this up a few years ago with something like the following: <fieldType name="autocomplete" class="solr.TextField"> <analyzer type="index"> <tokenizer class="solr.KeywordTokenizerFactory"/> <filter class="solr.LowerCaseFilterFactory" />

Re: Spellchecking and frequency

2010-07-28 Thread dan sutton
it useful (although it's a bit crude at the moment and will need a decent tidy up). Would it be appropriate to open up a Jira issue for this? Cheers, ~mark On 27 July 2010 09:33, dan sutton danbsut...@gmail.com wrote: Hi, I've recently been looking into Spellchecking in solr, and was struck

Spellchecking and frequency

2010-07-27 Thread dan sutton
Hi, I've recently been looking into Spellchecking in solr, and was struck by how limited the usefulness of the tool was. Like most corpora, ours contains lots of different spelling mistakes for the same word, so the 'spellcheck.onlyMorePopular' is not really that useful unless you click on it

Re: why spellcheck and elevate search components can't work together?

2010-07-19 Thread dan sutton
It needs to be: <arr name="last-components"> <str>spellcheck</str> <str>elevateListings</str> </arr> or <arr name="last-components"> <str>elevateListings</str> <str>spellcheck</str> </arr> Dan On Mon, Jul 19, 2010 at 11:14 AM, Chamnap Chhorn chamnapchh...@gmail.com wrote: In my

Re: Custom comparator

2010-07-16 Thread dan sutton
On Thu, Jul 15, 2010 at 10:02 AM, dan sutton danbsut...@gmail.com wrote: Hi, I have a requirement to have a custom comparator that keeps the top N documents (chosen by some criteria) but only if their score is more than e.g. 1% of the maxScore. Looking at SolrIndexSearcher.java, I

Custom comparator

2010-07-15 Thread dan sutton
Hi, I have a requirement to have a custom comparator that keeps the top N documents (chosen by some criteria) but only if their score is more than e.g. 1% of the maxScore. Looking at SolrIndexSearcher.java, I was hoping to have a custom TopFieldCollector.java to return these via
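
The selection rule in that requirement can be illustrated outside Solr. This is a minimal sketch under stated assumptions: `top_n_above_threshold` is a hypothetical function, hits are plain `(doc_id, score)` pairs, and score itself stands in for the unspecified "some criteria". It is not a Lucene collector or comparator.

```python
from typing import List, Tuple

Hit = Tuple[str, float]  # (doc_id, score)


def top_n_above_threshold(hits: List[Hit], n: int, min_fraction: float = 0.01) -> List[Hit]:
    """Keep at most `n` best-scoring hits, dropping any whose score is not
    more than `min_fraction` of the highest score in the result set."""
    if not hits:
        return []
    max_score = max(score for _, score in hits)
    cutoff = max_score * min_fraction
    ranked = sorted(hits, key=lambda h: h[1], reverse=True)
    return [h for h in ranked[:n] if h[1] > cutoff]


hits = [("a", 10.0), ("b", 0.05), ("c", 5.0), ("d", 0.2)]
print(top_n_above_threshold(hits, 4))
```

Here the cutoff is 1% of 10.0, so "b" is dropped even though four results were requested.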

Re: Help with highlighting

2010-06-23 Thread dan sutton
It looks to me like a tokenisation issue, all_text content and the query text will match, but the string fieldtype fields 'might not' and therefore will not be highlighted. On Wed, Jun 23, 2010 at 4:40 PM, n...@frameweld.com wrote: Here's my request:

fl and nulls

2010-05-26 Thread dan sutton
Hi, In Solr 1.3 it looks like null fields were returned if requested with the fl param, whereas with solr 1.4, nulls are omitted entirely. Is there a way to have the nulls returned with Solr 1.4 e.g. ... <doc> <field1/> <field2/> </doc> Cheers, Dan

Dynamic analyzers

2010-05-24 Thread dan sutton
Hi, I have a requirement to dynamically choose a fieldType to analyze text in multiple languages. I will know the language (in a separate field) at index and query time. I've tried implementing this with a custom UpdateRequestProcessorFactory and custom DocumentBuilder.toDocument to change the

Custom sorting

2010-05-19 Thread dan sutton
Hi, I have a requirement to do the following: For up to the first 10 results (i.e. only on the first page) show sponsored category ads, in order of bid, but no more than 2 / category, and only if all sponsored cat' ads are more than min% of the highest score. e.g. If I had the following: min%