Re: Query with exact number of tokens

2018-09-21 Thread Michael Kuhlmann
Hi Sergio, alas that's not possible that way. If you search for CENTURY BANCORP, INC., then Solr will be totally happy to find all these terms in "NEW CENTURY BANCORP, INC." and return it with a high score. But you can prepare your data at index time. Make it a multivalued field of type string

Re: How to split index more than 2GB in size

2018-06-21 Thread Michael Kuhlmann
Hi Sushant, while this is true in general, it won't hold here. If you split your index, searching on each split shard might be a bit faster, but you'll increase search time much more because Solr needs to send your search queries to all shards and then combine the results. So instead of having

Re: SolrException undefined field *

2018-01-09 Thread Michael Kuhlmann
help you better when you pass the full query string (if you're able to fetch it). -Michael On 09.01.2018 at 16:38, Michael Kuhlmann wrote: > First, you might want to index, but what Solr is executing here is a > search request. > > Second, you're querying for a dynamic field

Re: SolrException undefined field *

2018-01-09 Thread Michael Kuhlmann
First, you might want to index, but what Solr is executing here is a search request. Second, you're querying for a dynamic field "*" which is not defined in your schema. This is quite obvious; the exception says exactly this. So whatever is sending the query (some client, it seems) is doing the

Re: Edismax leading wildcard search

2017-12-22 Thread Michael Kuhlmann
On 22.12.2017 at 11:57, Selvam Raman wrote: > 1) how can i disable leading wildcard search Do it on the client side. Just don't allow leading asterisks or question marks in your query term. > 2) why leading wildcard search takes so much of time to give the response. > Because Lucene can't

Re: How to sort on dates?

2017-12-18 Thread Michael Kuhlmann
On 16.12.2017 at 19:39, Georgios Petasis wrote: > Even if the DateRangeField field can store a range of dates, doesn't > Solr understand that I have used single timestamps? No. It could theoretically, but sorting just isn't implemented in DateRangeField. > I have even stored the dates. > My

Re: Wildcard searches with special character gives zero result

2017-12-15 Thread Michael Kuhlmann
Solr does not analyze queries with wildcards in it. So, with ch*p-seq, it will search for terms that start with ch and end with p-seq. Since your indexer has analyzed all tokens before, only chip and seq are in the index. See
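
The effect described above can be modelled in a few lines. This is only a toy sketch: `analyze` is a hypothetical stand-in for an index-time analyzer (lowercase, split on non-alphanumerics), and the sample text "ChIP-seq" is assumed for illustration. It is not Solr's actual analysis chain.

```python
import re
from fnmatch import fnmatchcase


def analyze(text):
    """Toy index-time analyzer: lowercase, then split on non-alphanumeric characters."""
    return [token for token in re.split(r"[^a-z0-9]+", text.lower()) if token]


# What ends up in the index after analysis.
indexed_terms = analyze("ChIP-seq")

# The wildcard term is matched as typed, without analysis, so the hyphen stays in.
raw_pattern = "ch*p-seq"
raw_matches = [t for t in indexed_terms if fnmatchcase(t, raw_pattern)]

# A pattern that only covers one indexed token does match.
token_pattern = "ch*p"
token_matches = [t for t in indexed_terms if fnmatchcase(t, token_pattern)]
```

With these assumptions, the raw pattern finds nothing because no single indexed term contains "p-seq", while the shorter pattern matches the token "chip".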

Re: How to sort on dates?

2017-12-15 Thread Michael Kuhlmann
Hi Georgios, DateRangeField is a kind of SpatialField which is not sortable at all. For sorting, use a DatePointField instead. It's not deprecated; the deprecated class is TrieDateField. Best, Michael On 15.12.2017 at 10:53, Georgios Petasis wrote: > Hi all, > > I have a field of type

Re: Newbie question about why represent timestamps as "float" values

2017-10-10 Thread Michael Kuhlmann
While you're generally right, in this case it might make sense to stick to a primitive type. I see "unixtime" as technical information, probably from System.currentTimeMillis(). As long as it's not used as a "real world" date but only for sorting based on latest updates, or choosing which

Re: Where the uploaded configset from SOLR into zookeeper ensemble resides?

2017-09-28 Thread Michael Kuhlmann
Do you find your configs in the Solr admin panel, in the Cloud --> Tree folder? -Michael On 28.09.2017 at 04:50, Gunalan V wrote: > Hello, > > Could you please let me know where can I find the uploaded configset from > SOLR into zookeeper ensemble ? > > In docs it says they will "/configs/"

Re: Modifing create_core's instanceDir attribute

2017-09-28 Thread Michael Kuhlmann
I'd rather say you didn't quote the URL when sending it using curl. Bash accepts the ampersand as a request to execute curl including the URL up to CREATE in background - that's why the error is included within the next output, followed by "Exit" - and then tries to execute the following part of
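
A small sketch of why the quoting matters. The URL below is a made-up example (host, core name and parameters are assumptions, not taken from the original message); `shlex.quote` is used here only to show what a safely quoted argument looks like.

```python
import shlex

# Hypothetical CoreAdmin URL with several parameters joined by '&'.
url = "http://localhost:8983/solr/admin/cores?action=CREATE&name=demo&instanceDir=demo"

# Unquoted, the shell treats each '&' as "run what came before in the background",
# so curl would only ever see the part up to the first '&'.
seen_by_curl_unquoted = ("curl " + url).split("&")[0]

# Quoted, the whole URL reaches curl as a single argument.
quoted_command = "curl " + shlex.quote(url)
argv = shlex.split(quoted_command)
```

Here `seen_by_curl_unquoted` ends at `action=CREATE`, while `argv` contains the full URL as its second element.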

Re: Moving to Point, trouble with IntPoint.newRangeQuery()

2017-09-26 Thread Michael Kuhlmann
Arrgh, forget my question. I just see that newExactQuery() simply triggers newRangeQuery() like you already do. -Michael On 26.09.2017 at 13:29, Michael Kuhlmann wrote: > Hi Markus, > > I don't know why there aren't any results. But just out of curiosity, > why don't you use the b

Re: Moving to Point, trouble with IntPoint.newRangeQuery()

2017-09-26 Thread Michael Kuhlmann
Hi Markus, I don't know why there aren't any results. But just out of curiosity, why don't you use the better choice IntPoint.newExactQuery(String,int)? What happens if you use that? -Michael On 26.09.2017 at 13:22, Markus Jelsma wrote: > Hello, > > I have a QParser impl. that transforms

Re: Solr nodes crashing (OOM) after 6.6 upgrade

2017-09-22 Thread Michael Kuhlmann
Hi Shamik, funnily enough, we had a similar issue with our old legacy application that still used plain Lucene code in a JBoss container. Same, there were no specific queries or updates causing this, the performance just broke completely without unusual usage. GC was rising up to 99% or so.

Re: solr Facet.contains

2017-09-15 Thread Michael Kuhlmann
What is the field type? Which Analyzers are configured? How do you split at "~"? (You have to do it by yourself, or configure some tokenizer for that.) What do you get when you don't filter your facets? What do you mean by "it is not working"? What is your result now? -Michael On 15.09.2017

Re: ways to check if document is in a huge search result set

2017-09-13 Thread Michael Kuhlmann
On 13.09.2017 at 04:04, Derek Poh wrote: > Hi Michael > > "Then continue using binary search depending on the returned score > values." > > May I know what do you mean by using binary search? An example algorithm is in Java method java.util.Arrays::binarySearch. Or more detailed:
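
One way to picture the binary search over score values is the sketch below. It assumes results are sorted by descending score and that `fetch_score(position)` is a hypothetical helper returning the score at a given result position (for example via one small request per probe). It is an illustration of the idea, not the exact procedure from the thread.

```python
def find_rank(target_score, total_hits, fetch_score):
    """Return the first 0-based position whose score is <= target_score.

    Assumes scores are sorted in descending order. fetch_score(position)
    is a hypothetical callback that returns the score at that position.
    """
    low, high = 0, total_hits
    while low < high:
        mid = (low + high) // 2
        if fetch_score(mid) > target_score:
            low = mid + 1   # target lies further down the result list
        else:
            high = mid      # target is at mid or above it
    return low


# Stand-in for a real result list, sorted by descending score.
scores = [9.5, 7.2, 7.2, 3.1, 0.4]
rank = find_rank(3.1, len(scores), scores.__getitem__)
```

Each probe halves the remaining range, so even a very large result set needs only a few dozen lookups.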

Re: ways to check if document is in a huge search result set

2017-09-12 Thread Michael Kuhlmann
So you're looking for a solution to validate the result output. You have two ways: 1. Assuming you're sorting by the default "score" sort option: Find the result you're looking for by setting the fq filter clause accordingly, and add "score" to the fl field list. Then do the normal unfiltered

Re: ways to check if document is in a huge search result set

2017-09-11 Thread Michael Kuhlmann
Maybe I don't understand your problem, but why don't you just filter by "supplier information"? -Michael On 11.09.2017 at 04:12, Derek Poh wrote: > Hi > > I have a collection of productdocument. > Each productdocument has supplier information in it. > > I need to check if a supplier's products

Re: Solr Issue

2017-09-07 Thread Michael Kuhlmann
Hi Patrick, can you attach the query you're sending to Solr and one example result? Or more specifically, what are your hl.* parameters? -Michael On 07.09.2017 at 09:36, Patrick Fallert wrote: > > Hey Guys, > i´ve got a problem with my Solr Highlighter.. > When I search for a word, i get some

Re: Solr6.6 Issue/Bug

2017-09-06 Thread Michael Kuhlmann
Why would you need to start Solr as root? You should definitely not do this, there's no reason for that. And even if you *really* want this: What's so bad about the -force option? -Michael On 06.09.2017 at 07:26, Kasim Jinwala wrote: > Dear team, > I am using solr 5.0 last 1 year,

Re: Error after moving index

2017-06-22 Thread Michael Kuhlmann
Hi Moritz, did you stop your local Solr server before? Copying data from a running instance may cause headaches. If yes, what happens if you copy everything again? It seems that your copy operation wasn't successful. Best, Michael On 22.06.2017 at 14:37, Moritz Munte wrote: > Hello, > > > >

Re: Solr NLS custom query parser

2017-06-15 Thread Michael Kuhlmann
Hi Arun, your question is too generic. What do you mean by nlp search? What do you expect to happen? The short answer is: No, there is no such parser because the individual requirements will vary a lot. -Michael On 14.06.2017 at 16:32, aruninfo100 wrote: > Hi, > > I am trying to configure

Re: Mixing AND OR conditions with query parameters

2017-04-24 Thread Michael Kuhlmann
Make sure to have whitespace around the OR operator. The parentheses should be around the OR query, not including the "fq:" -- this should be outside the parentheses (which are not necessary at all). What exactly are you expecting? -Michael On 24.04.2017 at 12:59, VJ wrote: > Hi All, > > I

Re: fq performance

2017-03-17 Thread Michael Kuhlmann
Hi Ganesh, you might want to use something like this: fq=access_control:(g1 g2 g5 g99 ...) Then it's only one fq filter per request. Internally it's like an OR condition, but in a more condensed form. I already have used this with up to 500 values without larger performance degradation (but
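
A minimal sketch of building such a single filter from a user's group list. The function name `access_filter` is made up for illustration, and it assumes group names contain no whitespace or query-syntax characters (otherwise they would need escaping).

```python
def access_filter(groups, field="access_control"):
    """Build one fq value of the form field:(g1 g2 ...) from a list of group names.

    Assumes group names are plain tokens without spaces or special characters.
    """
    if not groups:
        raise ValueError("at least one group is required")
    return f"{field}:({' '.join(groups)})"


fq = access_filter(["g1", "g2", "g5", "g99"])
```

The resulting string is sent as one `fq` parameter, so the request carries a single filter no matter how many groups the user belongs to.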

Re: fq performance

2017-03-16 Thread Michael Kuhlmann
First of all, from what I can see, this won't do what you're expecting. Multiple fq conditions are always combined using AND, so if a user is a member of 100 groups, but the document is accessible to only 99 of them, then the user won't find it. Or in other words, if you add a user to some

Re: Sorl 6 with jetty issues

2017-02-20 Thread Michael Kuhlmann
This may be related to SOLR-10130. On 20.02.2017 at 14:06, ~$alpha` wrote: > Issues with solr settings while migrating from solr 4.0 to solr6.0. > > Issue Faced: My CPU consumption goes to unacceptable levels. ie. load on > solr4.0 is between 6 to 10 while the load on solr 6 reaches 100 and

Re: Select TOP 10 items from Solr Query

2017-02-17 Thread Michael Kuhlmann
It's not possible to do such a thing in one request with faceting only. The problem is that you need a fixed filter on every item when the facet algorithm is iterating over it; you can't look into future elements to find out which ones the top 10 will be. So either you stick with two queries (which

Re: Select TOP 10 items from Solr Query

2017-02-17 Thread Michael Kuhlmann
set > of the itemNo from the 1st query. > > There's definitely more than the 10, but we just need the top 10 in this > case. As the top 10 itemNo may change, so we have to get the returned > result set of the itemNo each time we want to do the JSON Facet. > > Regards, > Edwin

Re: Select TOP 10 items from Solr Query

2017-02-16 Thread Michael Kuhlmann
So basically you want faceting only on the returned result set? I doubt that this is possible without additional queries. The issue is that faceting and result collecting are done within one iteration, so when some document (actually the document's internal id) is fetched as a possible result

Re: Continual garbage collection loop

2017-02-15 Thread Michael Kuhlmann
y 2017 at 14:44 Michael Kuhlmann <k...@solr.info> wrote: >> >> >> Wow, running 36 cores with only half a gigabyte of heap memory is >> *really* optimistic! >> >> I'd raise the heap size to some gigabytes at least and see how it's >> workin

Re: Continual garbage collection loop

2017-02-14 Thread Michael Kuhlmann
Wow, running 36 cores with only half a gigabyte of heap memory is *really* optimistic! I'd raise the heap size to some gigabytes at least and see how it's working then. -Michael On 14.02.2017 at 15:23, Leon STRINGER wrote: > Further background on the environment: > > There are 36 cores, with a

Re: FacetField-Result on String-Field contains value with count 0?

2017-01-13 Thread Michael Kuhlmann
Then I don't understand your problem. Solr already does exactly what you want. Maybe the problem is different: I assume that there never was a value of "1" in the index, leading to your confusion. Solr returns all fields as facet result where there was some value at some time as long as the

Re: Solr Suggester

2016-12-22 Thread Michael Kuhlmann
For the suggester, the field must be indexed. It's not necessary to have it stored. Best, Michael On 22.12.2016 at 11:24, Furkan KAMACI wrote: > Hi Emir, > > As far as I know, it should be enough to be stored=true for a suggestion > field? Should it be both indexed and stored? > > Kind Regards,

Re: File system choices?

2016-12-15 Thread Michael Kuhlmann
Yes, and we're doing such things at my company. However we most often do things you shouldn't do; this is one of these. Solr needs to load data quite fast, otherwise you'll be having a performance killer. It's often recommended to use an SSD instead of a normal hard disk; a network share would be

Re: Again : Query formulation help

2016-11-24 Thread Michael Kuhlmann
Hi Prasanna, there's no such filter out-of-the-box. It's similar to the mm parameter in (e)dismax parser, but this only works for full text searches on the same fields. So you have to build the query on your own using all possible permutations: fq=(code1: AND code2:) OR (code1: AND
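
Building such a query by hand gets tedious quickly, so it can be generated. The sketch below assumes the goal is "at least k of the given clauses must match"; the field values `A`, `B`, `C` are placeholders, since the actual values are not shown in the message above.

```python
from itertools import combinations


def at_least_k(clauses, k):
    """Return a query string that matches when at least k of the clauses match.

    Every k-sized combination of clauses is AND-ed, and the groups are OR-ed.
    """
    groups = ("(" + " AND ".join(combo) + ")" for combo in combinations(clauses, k))
    return " OR ".join(groups)


# Placeholder clauses; real field values would go here.
fq = at_least_k(["code1:A", "code2:B", "code3:C"], 2)
```

The number of groups grows as "n choose k", so this approach is only practical for a handful of fields.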

Re: Multi word synonyms

2016-11-15 Thread Michael Kuhlmann
s > been closed and is available in 6.2. > > On Tue, Nov 15, 2016 at 12:32 PM, Michael Kuhlmann <k...@solr.info> wrote: > >> This is a nice reading though, but that solution depends on the >> precondition that you'll already know your synonyms at index time. >&

Re: Multi word synonyms

2016-11-15 Thread Michael Kuhlmann
Hi Midas, > > I suggest this interesting reading: > > https://lucidworks.com/blog/2014/07/12/solution-for-multi-term-synonyms-in-lucenesolr-using-the-auto-phrasing-tokenfilter/ > > > > On Tue, Nov 15, 2016 at 11:00 AM, Michael Kuhlmann <k...@solr.info> wrote: >

Re: Multi word synonyms

2016-11-15 Thread Michael Kuhlmann
It's not working out of the box, sorry. We're using this plugin: https://github.com/healthonnet/hon-lucene-synonyms#getting-started It's working nicely, but can lead to OOME when you add many synonyms with multiple terms. And I'm not sure whether it's still working with Solr 6.0. -Michael On

Re: how to sort search results by count matches

2016-08-02 Thread Michael Kuhlmann
Hi syegorius, are you sure that there's no synonym "planet,world" defined? -Michael On 02.08.2016 at 15:57, syegorius wrote: > I have 4 records index by Solr: > > 1 hello planet dear friends > 2 hello world dear friends > 3 nothing > 4 just friends > > I'm searching with this query: > >

Re: Does Solr support 'Value Search'?

2012-08-09 Thread Michael Kuhlmann
On 08.08.2012 20:56, Bing Hua wrote: Not quite understand but I'd explain the problem I had. The response would contain only fields and a list of field values that match the query. Essentially it's querying for field values rather than documents. The underlying use case would be, when typing in

Re: Connect to SOLR over socket file

2012-08-08 Thread Michael Kuhlmann
On 07.08.2012 21:43, Jason Axelson wrote: Hi, Is it possible to connect to SOLR over a socket file as is possible with mysql? I've looked around and I get the feeling that I may be misunderstanding part of SOLR's architecture. Any pointers are welcome. Thanks, Jason Hi Jason, not that I

Re: SOLR 3.4 GeoSpatial Query Returning distance

2012-08-02 Thread Michael Kuhlmann
On 02.08.2012 01:52, Anand Henry wrote: Hi, In SOLR 3.4, while doing a geo-spatial search, is there a way to retrieve the distance of each document from the specified location? Not that I know of. What we did was to read and parse the location field on client side and calculate the distance

Re: Urgent: Facetable but not Searchable Field

2012-08-01 Thread Michael Kuhlmann
On 01.08.2012 13:58, jayakeerthi s wrote: We have a requirement, where we need to implement 2 fields as Facetable, but the values of the fields should not be Searchable. Simply don't search for it, then it's not searchable. Or do I simply not understand your question? As long as Dismax

Re: Urgent: Facetable but not Searchable Field

2012-08-01 Thread Michael Kuhlmann
On 01.08.2012 15:40, Jack Krupansky wrote: The indexed and stored field attributes are independent, so you can define a facet field as stored but not indexed (stored=true indexed=false), so that the field can be faceted but not indexed. ? A field must be indexed to be used for faceting.

Re: Starts with Query

2012-06-15 Thread Michael Kuhlmann
It's not necessary to do this. You can simply be happy about the fact that all digits are ordered strictly in unicode, so you can use a range query: (f)q={!frange l=0 u=\: incl=true incu=false}title This finds all documents where any token from the title field starts with a digit, so if you
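
The upper bound `\:` works because the colon is the character directly after `9` in ASCII/Unicode. The following sketch only models the two bounds (lower inclusive, upper exclusive) with plain string comparison; it says nothing about how Solr evaluates the range internally.

```python
# The colon follows '9' directly, which is why it can serve as the exclusive upper bound.
next_after_nine = chr(ord("9") + 1)


def starts_with_digit(token):
    """True if token falls in the range ["0", ":"), i.e. its first character is 0-9."""
    return "0" <= token < ":"


samples = ["2001", "9 lives", "abc", "/path", ""]
digit_tokens = [s for s in samples if starts_with_digit(s)]
```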

Re: what's better for in memory searching?

2012-06-11 Thread Michael Kuhlmann
Set the swappiness to 0 to avoid memory pages being swapped to disk too early. http://en.wikipedia.org/wiki/Swappiness -Kuli On 11.06.2012 at 10:38, Li Li wrote: I have roughly read the codes of RAMDirectory. it use a list of 1024 byte arrays and many overheads. But as far as I know, using

Re: what's better for in memory searching?

2012-06-11 Thread Michael Kuhlmann
You cannot guarantee this when you're running out of RAM. You'd have a problem then anyway. Why do you care that much? Have you had performance issues yet? 1GB should load really fast, and both auto warming and OS cache should help a lot as well. With such an index, you usually don't need

Re: timeAllowed flag in the response

2012-06-08 Thread Michael Kuhlmann
Hi Laurent, alas there is currently no such option. The time limit is handled by an internal TimeLimitingCollector, which is used inside SolrIndexSearcher. Since the using method only returns the DocList and doesn't have access to the QueryResult, it won't be easy to return this information

Re: timeAllowed flag in the response

2012-06-08 Thread Michael Kuhlmann
On 08.06.2012 at 11:55, Laurent Vaills wrote: Hi Michael, Thanks for the details that helped me to take a deeper look in the source code. I noticed that each time a TimeExceededException is caught the method setPartialResults(true) is called...which seems to be what I'm looking for. I have to

Re: ERROR 400 undefined field

2012-06-07 Thread Michael Kuhlmann
On 07.06.2012 at 09:55, sheethal shreedhar wrote: http://localhost:8983/solr/select/?q=fruit&version=2.2&start=0&rows=10&indent=on I get HTTP ERROR 400 Problem accessing /solr/select/. Reason: undefined field text Look at your schema.xml. You'll find a line like this:

Re: Query elevation / boosting or something else to guarantee document position

2012-05-31 Thread Michael Kuhlmann
Hi Wenca, I'm a bit late, but maybe you're still interested. There's no such functionality in standard Solr. With sorting, this is not possible, because sort functions only rank each single document, they know nothing about the position of the others. And query elevation is similar, you'll

Re: How many doc/doc in the XML source file before indexing?

2012-05-24 Thread Michael Kuhlmann
There is no hard limit for the maximum number of documents per update. It's only memory dependent. The smaller each document, and the more memory Solr can acquire, the more documents you can send in one update. However, I wouldn't pish it too jard anyway. If you can send, say, 100 documents
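
A sketch of splitting a large document stream into moderately sized updates. The batch size of 100 simply mirrors the figure mentioned above and is an assumption, not a recommendation; sending each batch to Solr is left out.

```python
from itertools import islice


def batches(docs, size=100):
    """Yield lists of at most `size` documents from any iterable."""
    iterator = iter(docs)
    while True:
        chunk = list(islice(iterator, size))
        if not chunk:
            return
        yield chunk


# 250 dummy documents end up in three updates.
dummy_docs = ({"id": str(i)} for i in range(250))
batch_sizes = [len(batch) for batch in batches(dummy_docs, size=100)]
```

Because the input is consumed lazily, the full document set never has to be held in memory on the client side.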

Re: How many doc/doc in the XML source file before indexing?

2012-05-24 Thread Michael Kuhlmann
pish it too jard - sounds funny. :) I meant push it too hard. On 24.05.2012 at 11:46, Michael Kuhlmann wrote: There is no hard limit for the maximum number of documents per update. It's only memory dependent. The smaller each document, and the more memory Solr can acquire, the more documents

Re: How many doc/doc in the XML source file before indexing?

2012-05-24 Thread Michael Kuhlmann
, no problem, I will check it and re-generate it. Is it bad to create a file with 5M doc ? On 24/05/2012 at 11:46, Michael Kuhlmann wrote: There is no hard limit for the maximum number of documents per update. It's only memory dependent. The smaller each document, and the more memory Solr can

Re: org.apache.solr.common.SolrException: ERROR: [doc=null] missing required field: id

2012-05-21 Thread Michael Kuhlmann
On 21.05.2012 at 12:07, Tolga wrote: Hi, I am getting this error: [doc=null] missing required field: id [...] I've got this entry in schema.xml: <field name="id" type="string" stored="true" indexed="true"/> What to do? Simply make sure that every document you're sending to Solr contains this id
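
A client-side check along these lines can catch the problem before anything is sent. The helper name `missing_id` and the sample documents are made up for illustration; documents are assumed to be plain dictionaries.

```python
def missing_id(docs, key="id"):
    """Return the positions of documents that lack a non-empty value for `key`."""
    return [index for index, doc in enumerate(docs) if not doc.get(key)]


sample_docs = [
    {"id": "page-1", "title": "Home"},
    {"title": "No id at all"},
    {"id": "", "title": "Empty id"},
]
bad_positions = missing_id(sample_docs)
```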

Re: org.apache.solr.common.SolrException: ERROR: [doc=null] missing required field: id

2012-05-21 Thread Michael Kuhlmann
On 21.05.2012 at 12:40, Tolga wrote: How do I verify it exists? I've been crawling the same site and it wasn't giving an error on Thursday. It depends on what you're doing. Are you using nutch? -Kuli

Re: Solr Shards multi core slower then single big core

2012-05-14 Thread Michael Kuhlmann
On 14.05.2012 at 05:56, arjit wrote: Thanks Erick for the reply. I have 6 cores which doesn't contain duplicated data. every core has some unique data. What I thought was when I read it would read parallel 6 cores and join the result and return the query. And this would be efficient than reading

Re: Solr Shards multi core slower then single big core

2012-05-14 Thread Michael Kuhlmann
On 14.05.2012 at 13:22, Sami Siren wrote: Sharding is (nearly) always slower than using one big index with sufficient hardware resources. Only use sharding when your index is too huge to fit into one single machine. If you're not constrained by CPU or IO, in other words have plenty of CPU cores

Re: Solr Shards multi core slower then single big core

2012-05-14 Thread Michael Kuhlmann
On 14.05.2012 at 16:18, Otis Gospodnetic wrote: Hi Kuli, In a client engagement, I did see this (N shards on 1 beefy box with lots of RAM and CPU cores) be faster than 1 big index. I want to believe you, but I also want to understand. Can you explain why? And did this only happen for single

Re: Identify indexed terms of document

2012-05-11 Thread Michael Kuhlmann
On 10.05.2012 at 22:27, Ahmet Arslan wrote: It's possible to see what terms are indexed for a field of document that stored=false? One way is to use http://wiki.apache.org/solr/LukeRequestHandler Another approach is this: - Query for exactly this document, e.g. by using the unique field -

Re: Question about cache

2012-05-11 Thread Michael Kuhlmann
On 11.05.2012 at 15:48, Anderson vasconcelos wrote: Hi Analysing the solr server in glassfish with Jconsole, the Heap Memory Usage don't use more than 4 GB. But, when was executed the TOP command, the free memory in Operating system is only 200 MB. The physical memory is only 10GB. Why machine

Re: Field with attribut in the schema.xml ?

2012-05-10 Thread Michael Kuhlmann
On 10.05.2012 at 14:33, Bruno Mannina wrote: like that: <field name="inventor-country">CH</field> <field name="inventor-country">FR</field> but in this case I lose the link between inventor and its country? Of course, you need to index the two inventors into two distinct documents. Did you mark those

Re: Field with attribut in the schema.xml ?

2012-05-10 Thread Michael Kuhlmann
I don't know the details of your schema, but I would create fields like name, country, street etc., and a field named role, which contains values like inventor, applicant, etc. How would you do it otherwise? Create only four documents, each field containing 80 mio. values? Greetings, Kuli

Re: Partition Question

2012-05-09 Thread Michael Kuhlmann
On 08.05.2012 at 23:23, Lance Norskog wrote: Lucene does not support more than 2^32 unique documents, so you need to partition. Just a small note: I doubt that Solr supports more than 2^31 unique documents, as most other Java applications that use int values. Greetings, Kuli
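
The arithmetic behind the 2^31 remark, as a quick check: a signed 32-bit Java `int` tops out at 2^31 - 1, which is half of the 2^32 figure quoted above. The constant name below is just a label for this sketch.

```python
# Largest value of a signed 32-bit integer (Java's int).
JAVA_INT_MAX = 2**31 - 1

# The quoted 2^32 would only be reachable with an unsigned 32-bit counter.
UNSIGNED_32_RANGE = 2**32

ratio = UNSIGNED_32_RANGE // (JAVA_INT_MAX + 1)
```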

Re: Bridge between Solr and NoSQL

2012-05-08 Thread Michael Kuhlmann
On 08.05.2012 at 04:13, Jeff Schmidt wrote: Francois: Check out DataStax Enterprise 2.0, Solr integrated with Cassandra: http://www.datastax.com/docs/datastax_enterprise2.0/search/index And, Solbase, Solr integrated with HBase: https://github.com/Photobucket/Solbase I'm sure there are others,

Re: Boosting fields in SOLR using Solrj

2012-04-26 Thread Michael Kuhlmann
On 26.04.2012 at 00:57, Joe wrote: Hi, I'm using the solrj API to query my SOLR 3.6 index. I have multiple text fields, which I would like to weight differently. From what I've read, I should be able to do this using the dismax or edismax query types. I've tried the following: SolrQuery query =

Re: Dynamic creation of cores for this use case.

2012-04-26 Thread Michael Kuhlmann
On 26.04.2012 at 16:17, pprabhcisco123 wrote: The use case is to create a core for each customer as well as partner . Since its very difficult to create cores statically in solr.xml file for all 4500 customers , is there any way to create the cores dynamically or on the fly. Yes there is.

Re: DIH NoClassFoundError.

2012-04-25 Thread Michael Kuhlmann
On 25.04.2012 at 15:57, stockii wrote: is it not fucking possible to import DIH !?!?!? WTF! It is fucking possible, you just need to either point your goddamn classpath to the data import handler jar in the contrib folders, or you have to add the appropriate contrib folder into the lib dir

Re: RequestHandler versus SearchComponent

2012-03-23 Thread Michael Kuhlmann
On 23.03.2012 at 10:29, Ahmet Arslan wrote: I'm looking at the following. I want to (1) map some query fields to some other query fields and add some things to FL, and then (2) rescore. I can see how to do it as a RequestHandler that makes a parser to get the fields, or I could see making a

Re: RequestHandler versus SearchComponent

2012-03-23 Thread Michael Kuhlmann
On 23.03.2012 at 11:17, Michael Kuhlmann wrote: Adding an own SearchComponent after the regular QueryComponent (or better as a last-element) is goof ... Of course, I meant good, not goof! ;) Greetings, Kuli

Re: is the SolrJ call to add collection of documents a blocking function call ?

2012-03-20 Thread Michael Kuhlmann
Hi Ramdev, add() is a blocking call. Otherwise it would have to start its own background thread, which is not what a library like Solrj should do (how many threads at most? At which priority? Which thread group? How long to keep them pooled?) And, additionally, you might want to know whether the

Re: Master/Slave switch on teh fly. Replication

2012-03-16 Thread Michael Kuhlmann
On 16.03.2012 at 15:05, stockii wrote: i have 8 cores ;-) i thought that replication is defined in solrconfig.xml and this file is only load on startup and i cannot change master to slave and slave to master without restarting the servlet-container ?!?!?! No, you can reload the whole core at

Re: Maybe switching to Solr Cores

2012-03-16 Thread Michael Kuhlmann
On 16.03.2012 at 16:42, Mike Austin wrote: It seems that the biggest real-world advantage is the ability to control core creation and replacement with no downtime. The negative would be the isolation however they are still somewhat isolated. What other benefits and common real-world situations

Re: Too many open files - lots of sockets

2012-03-14 Thread Michael Kuhlmann
I had the same problem, without auto-commit. I never really found out what exactly the reason was, but I think it was because commits were triggered before a previous commit had the chance to finish. We now commit after every minute or 1000 (quite large) documents, whatever comes first. And
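
The "every minute or 1000 documents, whichever comes first" rule can be expressed as a small helper. This is a sketch with made-up names (`CommitPolicy`, `record`); the actual commit call to Solr is left to the caller, and the thresholds simply mirror the numbers mentioned above.

```python
import time


class CommitPolicy:
    """Signals a commit after max_docs documents or max_seconds, whichever comes first."""

    def __init__(self, max_docs=1000, max_seconds=60.0, clock=time.monotonic):
        self.max_docs = max_docs
        self.max_seconds = max_seconds
        self.clock = clock              # injectable so the policy can be tested
        self.pending = 0
        self.last_commit = clock()

    def record(self, count=1):
        """Register `count` added documents; return True if a commit is due now."""
        self.pending += count
        due = (self.pending >= self.max_docs
               or self.clock() - self.last_commit >= self.max_seconds)
        if due:
            # The caller is expected to issue the commit; start a new window.
            self.pending = 0
            self.last_commit = self.clock()
        return due
```

In an indexing loop, the caller would add a batch, call `record(len(batch))`, and commit only when it returns `True`, which keeps commits from piling up on top of each other.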

Re: Sorting on non-stored field

2012-03-14 Thread Michael Kuhlmann
On 14.03.2012 at 11:43, Finotti Simone wrote: I was wondering: is it possible to sort a Solr result-set on a non-stored value? Yes, it is. It must be indexed, indeed. -Kuli

Re: Too many open files - lots of sockets

2012-03-14 Thread Michael Kuhlmann
Ah, good to know! Thank you! I already had Jetty under suspicion, but we had this failure quite often in October and November, when the bug was not yet reported. -Kuli On 14.03.2012 at 12:08, Colin Howe wrote: After some more digging around I discovered that there was a bug reported in jetty

Re: sort my results alphabetically on facetnames

2012-02-14 Thread Michael Kuhlmann
Hi! On 14.02.2012 13:09, PeterKerk wrote: I want to sort my results on the facetnames (not by their number of results). From the example you gave, I'd assume you don't want to sort by facet names but by facet values. Simply add facet.sort=index to your request; see
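
To illustrate the difference between the two orderings, here is a toy model with made-up facet counts. It only mimics the ordering (count descending versus lexicographic by value) and does not call Solr.

```python
# Hypothetical facet counts for one field.
counts = {"red": 12, "blue": 40, "green": 7}

# Ordering by count: most frequent value first.
by_count = sorted(counts.items(), key=lambda item: -item[1])

# Ordering by index, as with facet.sort=index: by the value itself.
by_index = sorted(counts.items())
```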

Re: Help:Solr can't put all pdf files into index

2012-02-09 Thread Michael Kuhlmann
I'd suggest that you check which documents *exactly* are missing in Solr index. Or find at least one that's missing, and try to figure out how this document differs from the other ones that can be found in Solr. Maybe we can then find out what exact problem there is. Greetings, -Kuli On

Re: Help:Solr can't put all pdf files into index

2012-02-09 Thread Michael Kuhlmann
I don't know much about Tika, but this seems to be a bug in PDFBox. See: https://issues.apache.org/jira/browse/PDFBOX-797 You might also have a look at this: http://stackoverflow.com/questions/7489206/error-while-parsing-binary-files-mostly-pdf At least that's what I found when I googled the

Re: Bad Request (Solr + Weblogic + Oracle DB)

2012-02-02 Thread Michael Kuhlmann
Hi rzao! I think this is the problem: On 02.02.2012 13:59, rzoao wrote: UpdateRequest req = new UpdateRequest(); req.setAction(AbstractUpdateRequest.ACTION.COMMIT, false, false);

Re: java.net.SocketException: Too many open files

2012-01-24 Thread Michael Kuhlmann
Hi Jonty, no, not really. When we first had such problems, we really thought that the number of open files is the problem, so we implemented an algorithm that performed an optimize from time to time to force a segment merge. Due to some misconfiguration, this ran too often. With the result

Re: Relevancy and random sorting

2012-01-12 Thread Michael Kuhlmann
Does the random sort function help you here? http://lucene.apache.org/solr/api/org/apache/solr/schema/RandomSortField.html However, you will get some very old listings then, if it's okay for you. -Kuli On 12.01.2012 at 14:38, Alexandre Rocco wrote: Erick, This document already has a field

Re: Solr response writer

2011-12-07 Thread Michael Kuhlmann
On 07.12.2011 at 14:26, Finotti Simone wrote: That's the scenario: I have an XML that maps words W to URLs; when a search request is issued by my web client, a query will be issued to my Solr application. If, after stemming, the query matches any in W, the client must be redirected to the

Re: R: Solr response writer

2011-12-07 Thread Michael Kuhlmann
On 07.12.2011 at 15:09, Finotti Simone wrote: I got your and Michael's point. Indeed, I'm not very skilled in web development so there may be something that I'm missing. Anyway, Endeca does something like this: 1. accept a query 2. does the stemming; 3. check if the result of the step 2.

Re: SolR for time-series data

2011-12-05 Thread Michael Kuhlmann
Hi Alan, Solr can do this fast and easily, but I wonder if a simple key-value store wouldn't suit you better. Do you really only need to query by chart_id, or do you also need to query by time range? In either case, as long as your data fits into an in-memory database, I would

Re: Replication not done for real on commit?

2011-12-05 Thread Michael Kuhlmann
On 05.12.2011 at 14:28, Per Steffensen wrote: Hi Reading http://wiki.apache.org/solr/SolrReplication I notice the pollInterval (guess it should have been pullInterval) on the slaves. That indicate to me that indexed information is not really pushed from master to slave(s) on events defined by

Re: Best practise to automatically change a field value for a specific period of time

2011-12-02 Thread Michael Kuhlmann
Hi Mark, I'm sure you can manage this using function queries somehow, but this is rather complicated, esp. if you both want to return the price and sort on it. I'd rather update the index as soon as a campaign starts or ends. At least that's how we did it when I worked for online shops.

Re: PatternTokenizer failure

2011-11-29 Thread Michael Kuhlmann
On 29.11.2011 at 15:20, Erick Erickson wrote: Hmmm, I tried this in straight Java, no Solr/Lucene involved and the behavior I'm seeing is that no example works if it has more than one whitespace character after the hyphen, including your failure example. I haven't lived inside regexes for long

Re: Aggregated indexing of updating RSS feeds

2011-11-17 Thread Michael Kuhlmann
Am 17.11.2011 11:53, schrieb sbarriba: The 'params' logging pointer was what I needed. So for reference it's not a good idea to use a 'wget' command directly in a crontab. I was using: wget http://localhost/solr/myfeed?command=full-import&rows=5000&clean=false :)) I think the shell handled the
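
An unquoted `&` in a shell command line is a control operator, so only the part of the URL before it reaches wget. A minimal Python sketch of building the URL and quoting it for the shell, assuming the parameters `command`, `rows` and `clean` from the quoted crontab line; the helper name `build_wget_command` and the wget flags are illustrative choices, not something from the thread:

```python
import shlex
from urllib.parse import urlencode


def build_wget_command(base_url, params):
    """Return a wget command line whose URL is safe to paste into a crontab."""
    # urlencode joins the parameters with '&' and escapes special characters.
    url = f"{base_url}?{urlencode(params)}"
    # shlex.quote wraps the URL in single quotes so the shell does not
    # interpret '&' or '?' itself.
    return f"wget -q -O /dev/null {shlex.quote(url)}"


command = build_wget_command(
    "http://localhost/solr/myfeed",
    {"command": "full-import", "rows": 5000, "clean": "false"},
)
print(command)
```

Quoting the whole URL is the simplest fix; putting the call into a small script that cron invokes would work just as well.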

Re: Problems installing Solr PHP extension

2011-11-16 Thread Michael Kuhlmann
Am 16.11.2011 17:11, schrieb Travis Low: If I can't solve this problem then we'll basically have to write our own PHP Solr client, which would royally suck. Oh, if you really can't get the library to work, no problem - there are several PHP clients out there that don't need a PECL installation.

Re: Add copyTo Field without re-indexing?

2011-11-16 Thread Michael Kuhlmann
Am 17.11.2011 08:46, schrieb Kashif Khan: Please advise how we can reindex SOLR with having fields stored=false. we can not reindex data from the beginning just want to read and write indexes from the SOLRJ only. Please advise a solution. I know we can do it using lucene classes using

Re: two word phrase search using dismax

2011-11-15 Thread Michael Kuhlmann
Am 14.11.2011 21:50, schrieb alx...@aim.com: Hello, I use solr3.4 and nutch 1.3. In request handler we have <str name="mm">2&lt;-1 5&lt;-2 6&lt;90%</str> As far as I know this means that for two word phrase search match must be 100%. However, I noticed that in most cases documents with both words are
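
To check what the spec `2<-1 5<-2 6<90%` demands for a given query length, here is a small Python sketch of the dismax minimum-should-match rules as I understand them from the Solr documentation: each `n<expr` part applies only when there are more than `n` optional clauses, negative values mean "all but that many", and percentages are rounded down. It is an illustration, not Solr's own code, and the function names are made up.

```python
def _apply(clause_count, expr):
    """Evaluate one simple mm expression such as '-1', '3' or '90%'."""
    if expr.endswith("%"):
        # int() truncates toward zero, i.e. percentages are rounded down.
        value = int(clause_count * int(expr[:-1]) / 100)
    else:
        value = int(expr)
    if value < 0:
        # Negative values mean "all clauses except this many".
        value = clause_count + value
    return max(value, 0)


def min_should_match(clause_count, spec):
    """Return how many of `clause_count` optional clauses must match."""
    if "<" not in spec:
        return _apply(clause_count, spec.strip())
    result = clause_count  # below the first bound, every clause is required
    for part in spec.split():
        bound, expr = part.split("<", 1)
        if clause_count <= int(bound):
            break
        result = _apply(clause_count, expr)
    return result


spec = "2<-1 5<-2 6<90%"
for n in (1, 2, 3, 5, 6, 10):
    print(n, min_should_match(n, spec))
```

Under these rules a two-word query requires both words, which agrees with the reading in the quoted question.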

Re: creating solr index from nutch segments, no errors, no results

2011-11-15 Thread Michael Kuhlmann
I don't know much about nutch, but it looks like there's simply a commit missing at the end. Try to send a commit, e.g. by executing curl http://host:port/solr/core/update -H "Content-Type: text/xml" --data-binary '<commit />' -Kuli Am 15.11.2011 09:11, schrieb Armin Schleicher: hi there,
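
The same commit can be sent from a script. The Python sketch below only builds the HTTP request and does not send it; the host, port and core name in the URL are placeholders, just like `host:port` and `core` in the curl command, and `build_commit_request` is a hypothetical helper name.

```python
from urllib.request import Request


def build_commit_request(update_url):
    """Build a POST request that asks the update handler to commit."""
    return Request(
        update_url,
        data=b"<commit/>",                      # XML update message
        headers={"Content-Type": "text/xml"},   # tell Solr the body is XML
        method="POST",
    )


# Placeholder URL: adjust host, port and core to your installation.
request = build_commit_request("http://localhost:8983/solr/core/update")
print(request.get_method(), request.full_url)
```

Passing `request` to `urllib.request.urlopen` would actually send it to a running Solr instance.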

Re: Solr 3.3 Sorting is not working for long fields

2011-11-15 Thread Michael Kuhlmann
Hi, Am 15.11.2011 10:25, schrieb rajini maski: <fieldType name="long" class="solr.TrieLongField" precisionStep="0" omitNorms="true" positionIncrementGap="0"/> [...] <fieldType name="tlong" class="solr.TrieLongField" precisionStep="8" omitNorms="true" positionIncrementGap="0"/> [...] <field

Re: Solr 3.3 Sorting is not working for long fields

2011-11-14 Thread Michael Kuhlmann
Am 14.11.2011 09:33, schrieb rajini maski: query : http://localhost:8091/Group/select?/indent=on&q=studyid:120&sort=studyidasc,groupid asc,subjectid asc&start=0&rows=10 Is it a copy-and-paste error, or did you really sort on studyidasc? I don't think you have a field studyidasc, and Solr
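
Each clause of the `sort` parameter needs a space between the field name and the direction. A small Python sketch that assembles the parameter from pairs, so a clause like `studyidasc` cannot slip through; the field names come from the quoted query, the helper name `build_sort` is made up:

```python
from urllib.parse import urlencode


def build_sort(clauses):
    """Join (field, direction) pairs into a Solr sort parameter value."""
    parts = []
    for field, direction in clauses:
        if direction not in ("asc", "desc"):
            raise ValueError(f"unknown sort direction: {direction!r}")
        parts.append(f"{field} {direction}")
    return ",".join(parts)


sort = build_sort([("studyid", "asc"), ("groupid", "asc"), ("subjectid", "asc")])
# urlencode escapes the spaces and commas for use in a URL.
query = urlencode({"indent": "on", "q": "studyid:120", "sort": sort,
                   "start": 0, "rows": 10})
print(sort)
print(query)
```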

Re: representing latlontype in pojo

2011-11-09 Thread Michael Kuhlmann
Am 08.11.2011 23:38, schrieb Cam Bazz: How can I store a 2d point and index it to a field type that is latlontype, if I am using solrj? Simply use a String field. The format is $latitude,$longitude. -Kuli
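
The value format can be sketched in a few lines of Python. The range checks are an extra assumption on my side rather than something the reply requires, and `format_latlon` is a hypothetical helper; with SolrJ the resulting string would simply be stored in the bean's String field.

```python
def format_latlon(latitude, longitude):
    """Format a 2D point as 'latitude,longitude' for a LatLonType field."""
    if not -90 <= latitude <= 90:
        raise ValueError("latitude must be between -90 and 90")
    if not -180 <= longitude <= 180:
        raise ValueError("longitude must be between -180 and 180")
    # No space after the comma: the field expects "lat,lon".
    return f"{latitude},{longitude}"


point = format_latlon(52.52, 13.405)
print(point)
```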

Re: Is SQL Like operator feature available in Apache Solr query

2011-11-01 Thread Michael Kuhlmann
Hi, this is not exactly true. In Solr, you can't have the wildcard operator on both sides of the term. However, you can tokenize your fields and simply query for Solr. This is what Solr is made for. :) -Kuli Am 01.11.2011 13:24, schrieb François Schiettecatte: Arshad Actually it is

Re: Is SQL Like operator feature available in Apache Solr query

2011-11-01 Thread Michael Kuhlmann
Am 01.11.2011 16:06, schrieb Erick Erickson: NGrams are often used in Solr for this case, but they will also add to your index size. It might be worthwhile to look closely at your user requirements before going ahead and supporting this functionality. Best Erick My opinion. Wildcards are
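
To see why n-grams make infix matches possible and why they enlarge the index, here is a rough Python sketch of character n-gram generation. It only illustrates the idea; it is not Solr's n-gram filter, and the gram sizes are arbitrary.

```python
def char_ngrams(token, min_size, max_size):
    """Return all character n-grams of `token` with the given sizes."""
    grams = []
    for size in range(min_size, max_size + 1):
        # Slide a window of `size` characters over the token.
        for start in range(len(token) - size + 1):
            grams.append(token[start:start + size])
    return grams


grams = char_ngrams("solr", 2, 3)
print(grams)
```

A query for `ol` then matches the indexed gram directly without any wildcard, at the cost of storing five terms instead of one for this single token.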

Re: Always return total number of documents

2011-10-28 Thread Michael Kuhlmann
Am 28.10.2011 11:16, schrieb Robert Brown: Is there no way to return the total number of docs as part of a search? No, there isn't. Usually this information is of absolutely no value to the end user. A workaround would be to add some field to the schema that has the same value for every document,

Re: Query/Delete performance difference between straight HTTP and SolrJ

2011-10-27 Thread Michael Kuhlmann
Am 26.10.2011 18:29, schrieb Shawn Heisey: For inserting, I do use a Collection of SolrInputDocuments. The delete process grabs values from idx_delete, does a query like the above (the part that's slow in Java), then if any documents are found, issues a deleteByQuery with the same string.
