Re: Boosting documents by categorical preferences

2013-11-20 Thread Amit Nithian
I thought about that but my concern/question was how. If I used the pow function then I'm still boosting the bad categories by a small amount. Alternatively I could multiply by a negative number, but does that work as expected? I haven't done much with negative boosting except for the sledgehammer

DataImportHandler on multi core - limiting concurrent runs on more than N cores

2013-11-20 Thread Patrice Monroe Pustavrh
Hi, I am currently running Solr with 10 cores. It works fine for me until I try to run updates on too many cores (each core uses more than enough CPU and memory, so the machine becomes really slow). I've googled around and tried to find whether there is an option in Solr to prevent too many simultaneous

DocValues uasge and senarios?

2013-11-20 Thread Floyd Wu
Hi there, I don't fully understand what kind of usage DocValues is meant for. When I set docValues=true on a field, do I need to change anything in the XML that I send to Solr for indexing? Please point me in the right direction. Thanks, Floyd. PS: I've googled and read lots of DocValues discussions but am still confused.

Re: DocValues uasge and senarios?

2013-11-20 Thread Yago Riveiro
Hi Floyd, DocValues are useful for sorting and faceting, for example. You don't need to change anything in your XML; the only thing you need to do is set docValues=true in your field definition in the schema. If you don't want to use the default implementation (all loaded in the heap),

Re: DocValues uasge and senarios?

2013-11-20 Thread Floyd Wu
Hi Yago, thanks for your reply. I once thought that the DocValues feature was a way for me to store some extra values. May I summarize that DocValues is a feature that speeds up sorting and faceting? Floyd 2013/11/20 Yago Riveiro yago.rive...@gmail.com Hi Floyd, DocValues are useful for sorting

Re: DocValues uasge and senarios?

2013-11-20 Thread Yago Riveiro
You should understand DocValues as a feature that allows you to do sorting and faceting without blowing the heap. They are not necessarily faster than the traditional method; they are more memory efficient, and in huge indexes this is the main limitation. This post summarizes the docvalues feature

Re: How to index X™ as &#8482; (HTML decimal entity)

2013-11-20 Thread Uwe Reh
What about having a simple charFilter in the analyzer chain for indexing *and* searching? E.g. <charFilter class="solr.PatternReplaceCharFilterFactory" pattern="™" replacement="&#8482;" /> or <charFilter class="solr.MappingCharFilterFactory" mapping="mapping-specials.txt" /> Uwe Am 19.11.2013 23:46, schrieb

Re: DocValues uasge and senarios?

2013-11-20 Thread Floyd Wu
Thanks Yago, I've read this article http://searchhub.org/2013/04/02/fun-with-docvalues-in-solr-4-2/ but I don't understand it well. I'll try to figure out the missing part. Thanks for helping. Floyd 2013/11/20 Yago Riveiro yago.rive...@gmail.com You should understand DocValues as a feature

Re: Error with Solr 4.4.0, Glassfish, and CentOS 6.2

2013-11-20 Thread Ericvb
Hi, we had the same issue as mentioned. We added -Djavax.net.ssl.keyStorePassword=changeit, but we also had to add -Djavax.net.ssl.trustStorePassword=changeit. That did it for us.

Re: Split shard and stream sub-shards to remote nodes?

2013-11-20 Thread Shalin Shekhar Mangar
No, it is not supported yet. We can't split to a remote node directly. The best bet is to trigger a new leader election by unloading the leader node once all replicas are active. On Wed, Nov 20, 2013 at 1:32 AM, Otis Gospodnetic otis.gospodne...@gmail.com wrote: Hi, Is it possible to perform a

Re: DataImportHandler on multi core - limiting concurrent runs on more than N cores

2013-11-20 Thread Shalin Shekhar Mangar
No, there is no synchronisation between data import handlers on different cores. You will have to implement this sort of queuing logic on your application's side. On Wed, Nov 20, 2013 at 2:23 PM, Patrice Monroe Pustavrh patrice.mon...@bisnode.si wrote: Hi, I am currently run Solr with 10 cores.
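
One way the application-side queuing suggested above could look is a small worker pool whose size is the concurrency limit. This is only a sketch: `run_imports` and the `trigger_import` callback are hypothetical names, and `trigger_import` is assumed to start a DataImportHandler run for one core and block until that run has finished.

```python
from concurrent.futures import ThreadPoolExecutor


def run_imports(cores, trigger_import, max_concurrent=2):
    """Run trigger_import(core) for every core, at most max_concurrent at a time.

    trigger_import is a caller-supplied (hypothetical) function that starts an
    import for one core and returns only when that import has completed.
    """
    # The pool size is the concurrency limit: remaining cores wait in the queue.
    with ThreadPoolExecutor(max_workers=max_concurrent) as pool:
        # map() preserves input order and re-raises any worker exception.
        return list(pool.map(trigger_import, cores))
```

With ten cores and `max_concurrent=2`, the third import starts only when one of the first two has returned, so the machine never runs more than two imports at once.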

Re: What is the difference between attorney:(Roger Miller) and attorney:"Roger Miller"

2013-11-20 Thread Erick Erickson
debug=query is your friend! On Tue, Nov 19, 2013 at 4:17 PM, Rafał Kuć r@solr.pl wrote: Hello! Terms surrounded by " characters will be treated as a phrase query. So, if your default query operator is OR, the attorney:(Roger Miller) will result in documents with first or second (or both)
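
To follow the debug=query advice, it may help to build both request URLs and compare the parsed query that Solr reports for each. The sketch below only constructs the URLs; the host, port, and core name are assumptions, and the second query is written in the quoted phrase form, attorney:"Roger Miller".

```python
from urllib.parse import urlencode


def debug_query_url(base_url, query):
    """Build a select URL that asks Solr to report how the query was parsed."""
    # rows=0 keeps the response small; only the debug section is of interest.
    params = {"q": query, "debug": "query", "rows": 0, "wt": "json"}
    return base_url + "/select?" + urlencode(params)


# Assumed local Solr URL and core name; adjust to the actual installation.
BASE = "http://localhost:8983/solr/collection1"
grouped = debug_query_url(BASE, 'attorney:(Roger Miller)')
phrase = debug_query_url(BASE, 'attorney:"Roger Miller"')
```

Fetching each URL and reading the parsed query in the debug section should show whether the field received two separate term clauses or a single phrase query.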

Re: {!cache=false} for regular queries?

2013-11-20 Thread Erick Erickson
But I don't know whether it's worth worrying about. queryResultCache is pretty small. I think of it as a map where the key is the text of the query and the value is an int[queryWindowSize]. So bypassing the cache is probably not going to make much difference. YMMV of course. Best Erick On
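
The mental model above (query text mapped to an int[queryWindowSize]) allows a quick back-of-the-envelope size estimate. The function name and the figures of 512 entries and a window of 20 are illustrative assumptions, not values taken from the thread.

```python
def query_result_cache_bytes(entries, query_window_size, bytes_per_doc_id=4):
    """Rough lower bound for the doc-id payload of a queryResultCache.

    Models each entry as an int[queryWindowSize]; ignores the query-text keys
    and per-object overhead, so real usage is somewhat higher.
    """
    return entries * query_window_size * bytes_per_doc_id


# Hypothetical settings: 512 cached queries, window of 20 documents.
estimate = query_result_cache_bytes(512, 20)  # 40960 bytes, about 40 KiB
```

At that scale the cache is tiny compared with a typical heap, which is consistent with the point that bypassing it is unlikely to matter much.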

Solr is deleting newly created index from index folder

2013-11-20 Thread vishalgupta084
I am running a cron job for indexing and committing documents for Solr search. Earlier everything was fine, but for some time now it has been deleting indexes from the index folder. Whenever I update or create a document, it gets indexed and committed and appears in search, but after some hour

Auto optimized of Solr indexing results

2013-11-20 Thread Bayu Widyasanyata
Hi, after successfully configuring my re-crawling script, I sometimes check and find in the Solr Admin that my collection is shown as not optimized (slash icon). So I ran the optimize step manually. How can I make my crawl optimize automatically? Should we restart Solr (I use Tomcat)

Re: Solr is deleting newly created index from index folder

2013-11-20 Thread Erick Erickson
I cannot imagine that Solr suddenly starts deleting indexes without you changing anything, although all things are possible. Sanity check: do you have at least as much free space on your disk as the total size of your index on disk? Your query will delete everything from your index with an

Re: Auto optimized of Solr indexing results

2013-11-20 Thread Erick Erickson
You probably shouldn't optimize at all. The default TieredMergePolicy will eventually purge the deleted files' data, which is really what optimize does. So despite its name, most of the time it's not really worth the effort. Take a look at your Solr admin page, the overview link for a core. If

Re: Solr is deleting newly created index from index folder

2013-11-20 Thread Jack Krupansky
You may be hitting a query parser bug/nuance, that a purely negative sub-query needs to have a *:* added so that it is not purely negative. So, replace: AND -endtime:1970-01-01T01:00:00Z with AND (*:* -endtime:1970-01-01T01:00:00Z) Or, as Erick mentioned in his reply, you don't really not
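
The rewrite described above can be applied mechanically when queries are assembled in client code. This is a deliberately naive sketch: `guard_pure_negative` is a hypothetical helper that only looks for a leading `-` and does not handle `NOT` or nested clauses.

```python
def guard_pure_negative(clause):
    """Wrap a purely negative clause so it has a document set to subtract from."""
    clause = clause.strip()
    if clause.startswith("-"):
        # *:* matches every document; the negative term then removes its matches.
        return "(*:* " + clause + ")"
    return clause


fixed = guard_pure_negative("-endtime:1970-01-01T01:00:00Z")
```

Here `fixed` is the `(*:* -endtime:1970-01-01T01:00:00Z)` form from the message, ready to be joined to the rest of the query with AND.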

Re: How to index X™ as &#8482; (HTML decimal entity)

2013-11-20 Thread Jack Krupansky
Any analysis filtering affects the indexed value only, but the stored value would be unchanged from the original input value. An update processor lets you modify the original input value that will be stored. -- Jack Krupansky -Original Message- From: Uwe Reh Sent: Wednesday,

Re: {!cache=false} for regular queries?

2013-11-20 Thread Mikhail Khludnev
Erick, it's worth mentioning that queries which sort by field can potentially blow the filter cache; the hint at http://wiki.apache.org/solr/SolrCaching#useFilterForSortedQuery might help. On Wed, Nov 20, 2013 at 4:30 PM, Erick Erickson erickerick...@gmail.com wrote: But I don't know

Re: Swapping Cores

2013-11-20 Thread Tirthankar Chatterjee
Hi Shawn, It just slipped my mind to mention the details of my Solr version. Good point, and thanks for following up on my emails. I am currently using Solr 4.3 but not SolrCloud. We have a technical documentation site which keeps changing with some new files and some

Issues faced after docValues migration

2013-11-20 Thread vicky desai
Hi, I am using Solr 4.3. I am planning to use the docValues feature introduced in Solr 4.2. Although I see a significant improvement in facet and group queries, there is a degradation in group.facet and group.ngroups queries. Has anybody faced a similar issue? Any workarounds?

Re: {!cache=false} for regular queries?

2013-11-20 Thread Erick Erickson
Mikhail: Interesting point! Thanks! I haven't looked at the implementation enough to know, but it looks like these are unrelated? You'd have to happen to have a fq clause you'd already submitted (and cached) for this to happen? But like I said, I don't know the code. I didn't realize this even

Re: Auto optimized of Solr indexing results

2013-11-20 Thread Bayu Widyasanyata
Thanks Erick. I will check that on next round. --- wassalam, [bayu] /sent from Android phone/ On Nov 20, 2013 7:45 PM, Erick Erickson erickerick...@gmail.com wrote: You probably shouldn't optimize at all. The default TieredMergePolicy will eventually purge the deleted files' data, which is

Re: Split shard and stream sub-shards to remote nodes?

2013-11-20 Thread Otis Gospodnetic
Do you think this is something that is actually implementable? If so, I'll open an issue. One use-case where this may come in handy is when the disk space is tight. If a shard is using 50% of the disk space on some node X, you can't really split that shard because the 2 new sub-shards will not

Support for Numeric DocValues Updates in Solr?

2013-11-20 Thread Otis Gospodnetic
Hi, Numeric DocValues Updates functionality that came via https://issues.apache.org/jira/browse/LUCENE-5189 sounds very valuable, while we wait for full/arbitrary field updates (https://issues.apache.org/jira/browse/LUCENE-4258). Would it make sense to add support for Numeric DocValues Updates

How to Configure Highlighting for Solr

2013-11-20 Thread Furkan KAMACI
I have set up my highlighting as follows: <bool name="hl">true</bool> <str name="hl.fl">name age address</str> However, I don't want *name* to be highlighted, *but* still included in the response: "highlighting": { "Something_myid": { "name": "<em>Something</em> bla bla", "age": "<em>Something</em> age bla bla",

Multiple data/index.YYYYMMDD.... dirs == bug?

2013-11-20 Thread Otis Gospodnetic
Hi, When full index replication is happening via SnapPuller, a temporary timestamped index dir is created. Questions: 1) Under normal circumstances could more than 1 timestamped index directory ever be present? 2) Should there always be a .../data/index directory present? I'm asking

Solr Highlighting Response Type

2013-11-20 Thread Furkan KAMACI
Here is an example from wiki: Iterator<SolrDocument> iter = queryResponse.getResults().iterator(); while (iter.hasNext()) { SolrDocument resultDoc = iter.next(); String content = (String) resultDoc.getFieldValue("content"); String id = (String) resultDoc.getFieldValue("id");

Re: Support for Numeric DocValues Updates in Solr?

2013-11-20 Thread Gopal Patwa
+1 to add this support in Solr On Wed, Nov 20, 2013 at 7:16 AM, Otis Gospodnetic otis.gospodne...@gmail.com wrote: Hi, Numeric DocValues Updates functionality that came via https://issues.apache.org/jira/browse/LUCENE-5189 sounds very valuable, while we wait for full/arbitrary field

Re: Multiple data/index.YYYYMMDD.... dirs == bug?

2013-11-20 Thread michael.boom
I encountered this problem often when I restarted a Solr instance before replication was finished more than once. I would then have multiple timestamped directories and the index directory. However, the index.properties points to the active index directory. The moment when the replication
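
Since index.properties names the active directory, a small script can tell which of several timestamped directories is live before anything is cleaned up by hand. The sketch assumes the file is a plain Java properties file whose `index` key holds the directory name, and that the plain `index` directory is used when the file is absent; `active_index_dir` is a hypothetical helper.

```python
import os


def active_index_dir(data_dir):
    """Return the index directory Solr is expected to be using under data_dir."""
    props_path = os.path.join(data_dir, "index.properties")
    if not os.path.exists(props_path):
        # Assumption: without index.properties the plain "index" directory is used.
        return os.path.join(data_dir, "index")
    with open(props_path, encoding="latin-1") as handle:
        for line in handle:
            line = line.strip()
            # Skip blanks and the comment lines a Java Properties writer adds.
            if not line or line.startswith(("#", "!")):
                continue
            key, _, value = line.partition("=")
            if key.strip() == "index":  # assumed key name
                return os.path.join(data_dir, value.strip())
    return os.path.join(data_dir, "index")
```

Any other `index.*` directory under the same data directory would then be a candidate leftover, though it seems safer to confirm that no searcher still holds it open before deleting it.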

Re: Option to enforce a majority quorum approach to accepting updates in SolrCloud?

2013-11-20 Thread Timothy Potter
Hi Otis, I think these are related problems but giving the ability to enforce a majority quorum among the total replica set for a shard is not the same as hinted handoff in the Cassandra sense. Cassandra's hinted handoff allows you to say it's ok to send the write somewhere and somehow it'll make its

RE: facet method=enum and uninvertedfield limitations

2013-11-20 Thread Lemke, Michael SZ/HZA-ZSW
On Wednesday, November 20, 2013 7:37 AM, Dmitry Kan wrote: Thanks for your reply. Since you are faceting on a text field (is this correct?) you deal with a lot of unique values in it. Yes, this is a text field and we experimented with reducing the index. As I said in my original question the

Re: How to index X™ as &#8482; (HTML decimal entity)

2013-11-20 Thread Walter Underwood
Again, I'd like to know why this is wanted. It sounds like an X-Y problem. Storing Unicode characters as XML/HTML encoded character references is an extremely bad idea. wunder On Nov 20, 2013, at 5:01 AM, Jack Krupansky j...@basetechnology.com wrote: Any analysis filtering affects the

Re: Multiple data/index.YYYYMMDD.... dirs == bug?

2013-11-20 Thread Daniel Collins
In our experience (with SolrCloud), if you trigger a full replication (e.g. new replica), you get the timestamp directory, it never renames back to just index. Since index.properties gives you the name of the real directory, we had never considered that a problem/bug. Why bother with the rename

Solr Docvalues grouping

2013-11-20 Thread GOYAL, ANKUR
Hi, I am using Solr 4.5.1 and I am planning to use the docValues attribute for a string type. The values in that field change only once a day. I would like to group only on that field. At the following link, http://searchhub.org/2013/04/02/fun-with-docvalues-in-solr-4-2/, it is mentioned that

Re: Multiple data/index.YYYYMMDD.... dirs == bug?

2013-11-20 Thread Mark Miller
There might be a JIRA issue out there about replication not cleaning up on all fails - e.g. on startup or something - kind of rings a bell…if so, it will be addressed eventually. Otherwise, you might have two for a bit just due to multiple searchers being around at once for a while or

Re: How to index X™ as &#8482; (HTML decimal entity)

2013-11-20 Thread Jack Krupansky
AFAICT, it's not an extremely bad idea - using SGML/HTML as a format for storing text to be rendered. If you disagree - try explaining yourself. But maybe TM should be encoded as &trade;. Ditto for other named SGML entities. -- Jack Krupansky -Original Message- From: Walter

Suggester - how to return exact match?

2013-11-20 Thread Mirko
Hi, we implemented a Solr suggester (http://wiki.apache.org/solr/Suggester) that uses a file based dictionary. We use the results of the suggester to populate a dropdown field of a search field on a webpage. Our dictionary (autosuggest.txt) contains: foo bar Our suggester has the following

Re: Solr spatial search within the polygon

2013-11-20 Thread Smiley, David W.
Dhanesh, I'm pretty sure that the coordinates are in the right position. 9.445890,76.540970 is in India, precisely in Kerala state :) My suspicion was right; you have all of your latitudes and longitudes in the wrong position. Your example that I quote you on above is correct (lat,lon),
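
A common source of this kind of mix-up is that points are usually written as lat,lon while WKT shapes list x (longitude) before y (latitude). The snippet here does not show the polygon itself, so the following is a sketch under that assumption; `polygon_wkt` is a hypothetical helper and the sample coordinates are made up.

```python
def polygon_wkt(points_lat_lon):
    """Render (lat, lon) pairs as a WKT POLYGON, which lists x (lon) before y (lat)."""
    ring = list(points_lat_lon)
    if len(ring) < 3:
        raise ValueError("a polygon needs at least three points")
    if ring[0] != ring[-1]:
        ring.append(ring[0])  # WKT rings must end on their starting point
    coords = ", ".join("{} {}".format(lon, lat) for lat, lon in ring)
    return "POLYGON((" + coords + "))"


# Made-up triangle, given as (lat, lon) pairs.
wkt = polygon_wkt([(9.0, 76.0), (9.0, 77.0), (10.0, 77.0)])
```

Keeping the swap in one helper means the rest of the application can stay in lat,lon order throughout.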

Re: Indexing different customer customized field values

2013-11-20 Thread kchellappa
Thanks Otis. We also thought about having multiple fields, but thought that having too many fields would be an issue. I see threads saying that too many fields is an issue for sorting (we don't expect to sort on these), but I will look through the archives.

Re: Split shard and stream sub-shards to remote nodes?

2013-11-20 Thread Shalin Shekhar Mangar
At the Lucene level, I think it would require a directory implementation which writes to a remote node directly. Otherwise, on the solr side, we must move the leader itself to another node which has enough disk space and then split the index. On Wed, Nov 20, 2013 at 8:37 PM, Otis Gospodnetic

Re: DocValues uasge and senarios?

2013-11-20 Thread Chris Hostetter
Perhaps this can help you make sense of the advantages... https://cwiki.apache.org/confluence/display/solr/DocValues : Date: Wed, 20 Nov 2013 18:45:04 +0800 : From: Floyd Wu floyd...@gmail.com : Reply-To: solr-user@lucene.apache.org : To: solr-user@lucene.apache.org : Subject: Re: DocValues

Re: How to Configure Highlighting for Solr

2013-11-20 Thread Stefan Matheis
Solr is using the UniqueKey you defined for your documents. That shouldn't be a problem, since you can look up the document from the list of documents in the main response? And there is actually a ticket, which would allow it to inline the highlight response with DocTransformers:
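
The lookup described above can be done client-side by joining the two sections of the response on the unique key. The sketch assumes a JSON response with the usual `response.docs` and `highlighting` sections and that the unique key field is returned in `fl`; `attach_highlights` and the `_highlights` key are hypothetical names.

```python
def attach_highlights(solr_response, unique_key="id"):
    """Return the docs from a Solr JSON response, each with its highlights attached."""
    highlighting = solr_response.get("highlighting", {})
    merged = []
    for doc in solr_response["response"]["docs"]:
        combined = dict(doc)  # copy so the original response is left untouched
        # The highlighting section is keyed by each document's uniqueKey value.
        combined["_highlights"] = highlighting.get(str(doc[unique_key]), {})
        merged.append(combined)
    return merged
```

A client that does not want to expose the key can drop that field from each merged document after the join.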

SolrJ - HttpSolrServer - allow setting custom HTTP headers

2013-11-20 Thread Eugen Paraschiv
Hi - a quick question about a low(ish)-level usecase of SolrJ. I am trying to set a custom HTTP Header on the request that SolrJ is sending out to the Solr Server - and as far as I can tell - there isn't a clear way of doing that. HttpSolrServer.request creates the HttpGet request and sends it -

Extensibility of HttpSolrServer

2013-11-20 Thread Eugen Paraschiv
Hi, Quick question about the HttpSolrServer implementation - I would like to extend some of the functionality of this class - but when I extend it - I'm having issues with how extensible it is. For example - some of the details are not visible externally - setters exist for maxRetries and

Re: Suggester - how to return exact match?

2013-11-20 Thread Developer
Maybe there is a way to do this, but it doesn't make sense to return the same search query as a suggestion (the search query is not a suggestion, as it might or might not be present in the index). AFAIK you can use various lookup algorithms to get the suggestion list and they look up the terms based on

Re: Extensibility of HttpSolrServer

2013-11-20 Thread Mark Miller
Feel free to file a JIRA issue with the changes you think make sense. - Mark On Nov 20, 2013, at 4:21 PM, Eugen Paraschiv eug...@odesk.com wrote: Hi, Quick question about the HttpSolrServer implementation - I would like to extend some of the functionality of this class - but when I extend it

Re: Extensibility of HttpSolrServer

2013-11-20 Thread Eugen Paraschiv
Will do - thanks for the quick feedback. Eugen. On Thu, Nov 21, 2013 at 12:06 AM, Mark Miller markrmil...@gmail.com wrote: Feel free to file a JIRA issue with the changes you think make sense. - Mark On Nov 20, 2013, at 4:21 PM, Eugen Paraschiv eug...@odesk.com wrote: Hi, Quick

Re: Extensibility of HttpSolrServer

2013-11-20 Thread Shawn Heisey
On 11/20/2013 2:21 PM, Eugen Paraschiv wrote: Quick question about the HttpSolrServer implementation - I would like to extend some of the functionality of this class - but when I extend it - I'm having issues with how extensible it is. For example - some of the details are not visible externally

Re: Extensibility of HttpSolrServer

2013-11-20 Thread Eugen Paraschiv
The reason I needed access to internal details of the class - and it's not just these 2 fields (I used these just as a quick example) - was that I was trying to extend the class and overload the request method. As soon as I tried to do that, I noticed that I really couldn't easily do so - because

Re: Extensibility of HttpSolrServer

2013-11-20 Thread Chris Hostetter
: The reason I needed access to internal details of the class - and it's not : just these 2 fields (I used these just as a quick example) - was that I was : trying to extend the class and overload the request method. As soon as I : tried to do that, I noticed that I really couldn't easily do so -

Re: Extensibility of HttpSolrServer

2013-11-20 Thread Shawn Heisey
On 11/20/2013 3:28 PM, Eugen Paraschiv wrote: The reason I needed access to internal details of the class - and it's not just these 2 fields (I used these just as a quick example) - was that I was trying to extend the class and overload the request method. As soon as I tried to do that, I

Re: Extensibility of HttpSolrServer

2013-11-20 Thread Eugen Paraschiv
First - I completely agree with keeping the moving parts to a minimum - but I do think that's a case by case decision, and in this particular case - it may just be worth opening up a little. Then - adding in a custom HttpClient may work - but HttpHeaders are set on a request (and may differ from

Re: How to Configure Highlighting for Solr

2013-11-20 Thread Furkan KAMACI
I have implemented a search API that interacts with Solr. I don't retrieve id field. Id field is a transformed version of name field and it helps to make a quicker search on index. It would be nice to declare to Solr that I have another field that is unique too and it would be nice to group

csv does not return custom fields (distance)

2013-11-20 Thread GaneshSe
I am using the spatial search feature in Solr 4.0. When I try to extract CSV (using the wt=csv option) with the edismax parser, I don't get all the fields in the CSV output as specified in the fl parameter. Only the schema fields are coming out in CSV and the score, the custom fields like

How to retain the original format of input document in search results in SOLR - Tomcat

2013-11-20 Thread ramesh py
Hi All, I am new to Apache Solr. Recently I was able to configure Solr with Tomcat successfully, and it is working fine except for the format of the search results, i.e., the search results are not displayed in the same format as the input document. I am doing the following: 1.

How to configure new path for velocity for SolrCloud?

2013-11-20 Thread John W.Lee
I deployed a SolrCloud with three instances of Solr, which use conf/velocity by default. Now I want to add another velocity configuration folder conf/new_vel, but I couldn't make it work when I changed the <str name="v.base_dir"> element of the request handler in solrconfig.xml. I have tried conf/new_vel,