Re: Query ReRanking question

2015-08-05 Thread Aman Tandon
Hi, very nice mail thread. I think many people might be facing the problem of maintaining both relevance and recency at the same time. boost=max(recip(ms(NOW/HOUR,publish_date),7.889e-10,1,1),scale(query($q),0,1)) Currently in our search we are using the recency without any condition.
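For reference, Solr's recip(x,m,a,b) function evaluates a/(m*x+b), so with m=7.889e-10, a=1 and b=1 the recency term starts at 1 for a brand-new document and falls to 0.5 at an age of 1/m milliseconds (roughly two weeks). The following is a minimal Java sketch of that arithmetic only; the class and method names are illustrative and not part of Solr.

```java
import java.util.concurrent.TimeUnit;

/** Illustrative only: mirrors the arithmetic of Solr's recip(x,m,a,b) = a / (m*x + b). */
final class RecipBoost {

    /** Reciprocal decay: x is the document age in milliseconds. */
    static double recip(double x, double m, double a, double b) {
        return a / (m * x + b);
    }

    public static void main(String[] args) {
        double m = 7.889e-10; // constant taken from the boost expression in the thread
        for (long days : new long[] {0, 1, 15, 365}) {
            double ageMs = TimeUnit.DAYS.toMillis(days);
            System.out.printf("age %4d days -> recency term %.4f%n", days, recip(ageMs, m, 1, 1));
        }
    }
}
```

Because the boost in the thread takes the max of this term and the scaled query score, an old but highly relevant document is not pushed below the relevance component.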

Re: Solr 5.2.1 highlighting results are not available

2015-08-05 Thread Ahmet Arslan
Hi, Your response says wt=json, but your solrconfig excerpt says wt=velocity. Maybe you are hitting a different request handler? What happens when you submit your query as q=Warszawa&df=text_index? On Wednesday, August 5, 2015 8:28 AM, Michał Oleś michal.o...@gmail.com wrote: I installed solr

Re: TrieIntField not working in Solr 4.7 ?

2015-08-05 Thread Upayavira
How did you trigger that exception, and can you give the full exception? Upayavira On Tue, Aug 4, 2015, at 09:14 PM, wwang525 wrote: Hi Upayavira, I have physically cleaned up the files under index directory, and re-index did not fix the problem. The following is an example of the

Re: TrieIntField not working in Solr 4.7 ?

2015-08-05 Thread Upayavira
Fwiw, it wouldn't surprise me if you can't facet or sort on a trie field with a precision step above 0. That feature indexes at multiple precisions to make range queries efficient. You may need to index the value twice, once with a precision step for ranges, and once without (or zero rather).

Re: Solr SolrEntityProcessor - can it take customer parameters?

2015-08-05 Thread Mikhail Khludnev
It should work with placeholder syntax like ${fromDate}. Have you tried? On Wed, Aug 5, 2015 at 1:57 AM, sergeyk z...@hotmail.com wrote: I'd like to use SolrEntityProcessor to import some documents from one solr cloud to another solr cloud. The date range is dynamic and can change. Is there

High CPU DistributedQueue and OverseerAutoReplicaFailoverThread

2015-08-05 Thread Markus Jelsma
Hello - we have a single Solr 5.2.1 node that (for now) contains four single shard collections. Only two collections actually contain data and are queried. The machine has some unusual latency that led me to sample the CPU time with VisualVM. On that node we see that

Solr spell check not showing any suggestions for other language

2015-08-05 Thread talha
Solr spell check is not showing any suggestions for other languages. I have indexed multiple languages (English and Bangla) in the same core. It shows suggestions for a wrongly spelt English word, but for a wrongly spelt Bangla word it shows correctlySpelled = false but not showing any suggestions

Re: multiple but identical suggestions in autocomplete

2015-08-05 Thread Nutch Solr User
You will need to call this service from the UI, just as you are calling the suggester component currently (maybe on every key-press event in the text box). You will pass the required parameters too. The service will internally form a Solr suggester query and query Solr. From the returned response it will keep only

how to extend JavaBinCodec and make it available in solrj api

2015-08-05 Thread Dmitry Kan
Hello, Solr: 5.2.1 class: org.apache.solr.common.util.JavaBinCodec I'm working on a custom data structure for the highlighter. The data structure is ready in JSON and XML formats. I also need the JavaBin format. The data structure is already made serializable by extending the WritableValue class

Initializing core takes very long at times

2015-08-05 Thread Robert Krüger
Hi, for months/years, I have been experiencing occasional very long (30s+) hangs when programmatically initializing a solr container in Java. The application has worked for years in production with this setup without any problems apart from this. The code I have is this here: public void

Re: TrieIntField not working in Solr 4.7 ?

2015-08-05 Thread wwang525
Hi Upayavira, I edited the definition of tint to have precisionStep=0 for DateDep (i.e., departure date). This field is used as a filter query and also in faceted search. The following are the definitions: <fieldType name="tint" class="solr.TrieIntField" precisionStep="0"

RE: Duplicate Documents

2015-08-05 Thread Tarala, Magesh
I deleted the index and re-indexed. Duplicates went away. Have not identified root cause, but looks like updating documents is causing it sporadically. Going to try deleting the document and then update. -Original Message- From: Tarala, Magesh Sent: Monday, August 03, 2015 8:27 AM

Re: TrieIntField not working in Solr 4.7 ?

2015-08-05 Thread wwang525
Hi Upayavira, A bit more explanation on DateDep. This value in the database is expressed as a varchar(8), and has the format of 20150803. I mapped it to a SortableIntField before, and it worked with the filter query and faceted search. After I changed it to be TrieIntField, tried re-indexing

Embedded Solr now deprecated?

2015-08-05 Thread Robert Krüger
Hi, I tried to upgrade my application from Solr 4 to 5 and just now realized that embedded use of Solr seems to be on the way out. Is that correct, or is there just a new API to use for that? Thanks in advance, Robert

Re: Initializing core takes very long at times

2015-08-05 Thread Robert Krüger
OK, now that I had a reproducible setup I could debug where it hangs: public SystemInfoHandler(CoreContainer cc) { super(); this.cc = cc; init(); } private void init() { try { InetAddress addr = InetAddress.getLocalHost(); hostname = addr.getCanonicalHostName();

Re: Embedded Solr now deprecated?

2015-08-05 Thread Erick Erickson
Where did you see that? Maybe I missed something yet again. This is unrelated to whether we ship a WAR if that's being conflated here. I rather doubt that embedded is on its way out, although my memory isn't what it used to be. For starters, MapReduceIndexerTool uses it, so it gets regular

Re: Solr SolrEntityProcessor - can it take customer parameters?

2015-08-05 Thread Shawn Heisey
On 8/4/2015 4:57 PM, sergeyk wrote: I'd like to use SolrEntityProcessor to import some documents from one solr cloud to another solr cloud. The date range is dynamic and can change. Is there a way to pass, say solr/core/data-import?fromDate=some date&toDate=some date You can use syntax

RE: Solr spell check not showing any suggestions for other language

2015-08-05 Thread Dyer, James
Talha, Possibly this English-specific analysis in your text_suggest field is interfering: solr.EnglishPossessiveFilterFactory? Another guess is you're receiving more than 5 results and maxResultsForSuggest is set to 5. But I'm not sure. Maybe someone can help with more information from

Re: Embedded Solr now deprecated?

2015-08-05 Thread Shawn Heisey
On 8/5/2015 7:09 AM, Robert Krüger wrote: I tried to upgrade my application from Solr 4 to 5 and just now realized that embedded use of Solr seems to be on the way out. Is that correct, or is there just a new API to use for that? Building on Erick's reply: I doubt that the embedded server is

Re: Embedded Solr now deprecated?

2015-08-05 Thread Alexandre Rafalovitch
I thought the Embedded server was good for a scenario where you wanted to quickly build a core with lots of documents locally, and then move the core into production and swap it in. That way you minimize the network traffic. Regards, Alex. Solr Analyzers, Tokenizers, Filters, URPs and even a

Re: how to extend JavaBinCodec and make it available in solrj api

2015-08-05 Thread Shawn Heisey
On 8/5/2015 5:38 AM, Dmitry Kan wrote: Solr: 5.2.1 class: org.apache.solr.common.util.JavaBinCodec I'm working on a custom data structure for the highlighter. The data structure is ready in JSON and XML formats. I also need the JavaBin format. The data structure is already made serializable by

SolrCloud on 5.2.1 cluster state

2015-08-05 Thread Suma Shivaprasad
Hello, Is /clusterstate.json in ZooKeeper updated with collection state if a collection is created with the server running in SolrCloud mode without creating a core through coreAdmin or providing a core.properties? I find that there is a state.json present under /collections/collection_name which

RE: Solr spell check not showing any suggestions for other language

2015-08-05 Thread talha
Dear James, thank you for your reply. I tested the analyser without “solr.EnglishPossessiveFilterFactory” but still no luck. I also updated the analyser; please find it below. <fieldType name="text_suggest" class="solr.TextField" positionIncrementGap="100"> <analyzer

Re: Initializing core takes very long at times

2015-08-05 Thread Shawn Heisey
On 8/5/2015 7:56 AM, Robert Krüger wrote: OK, now that I had a reproducible setup I could debug where it hangs: public SystemInfoHandler(CoreContainer cc) { super(); this.cc = cc; init(); } private void init() { try { InetAddress addr =

Re: Can Apache Solr Handle TeraByte Large Data

2015-08-05 Thread Mugeesh Husain
@Upayavira Thanks, these things are most useful for my understanding. I am thinking I will create an XML or CSV file from my requirements using Java, then index it via HTTP POST or bin/post. I am not using DIH because I didn't find any link or idea on how to split data and add it to Solr one by

Re: Can Apache Solr Handle TeraByte Large Data

2015-08-05 Thread Upayavira
If you are using Java, you will likely find SolrJ the best way - it uses serialised Java objects to communicate with Solr - you don't need to worry about that. Just use code similar to that earlier in the thread. No XML, no CSV, just simple java code. Upayavira On Wed, Aug 5, 2015, at 04:50 PM,

Re: Can Apache Solr Handle TeraByte Large Data

2015-08-05 Thread Mugeesh Husain
The filesystem has about 40 million documents, so it will iterate 40m times. Might SolrJ be unable to handle a loop of 40m iterations? (Before indexing I have to split values from the filename and perform some operations, then index to Solr.) Will it index continuously for all 40m iterations, or do I have to sleep in between some

Re: Solr 5.2.1 highlighting results are not available

2015-08-05 Thread Michał Oleś
Thank you for the answer. When I execute the query using q=Warszawa&df=text_index instead of q=text_index:Warszawa, nothing changes. If I remove wt=json from the query, I get the response in XML, but also without highlighting results.

Re: Can Apache Solr Handle TeraByte Large Data

2015-08-05 Thread Upayavira
Post your docs in sets of 1000. Create a: List<SolrInputDocument> docs Then add 1000 docs to it, then client.add(docs); Repeat until your 40m are indexed. Upayavira On Wed, Aug 5, 2015, at 05:07 PM, Mugeesh Husain wrote: The filesystem has about 40 million documents, so it will iterate 40m times
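The batching advice in this reply can be sketched in plain Java. This is a sketch under stated assumptions, not code from the thread: BatchIndexer, sendInBatches and the sink callback are hypothetical names, and a generic element type stands in for SolrInputDocument so the example runs without SolrJ on the classpath. In a real SolrJ program the sink would wrap client.add(docs) and handle its checked exceptions.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.Consumer;

/** Hypothetical helper: groups a stream of items into fixed-size lists. */
final class BatchIndexer {

    /** Feeds items to sink in lists of at most batchSize; returns the number of batches sent. */
    static <T> int sendInBatches(Iterator<T> items, int batchSize, Consumer<List<T>> sink) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<T> batch = new ArrayList<>(batchSize);
        int batches = 0;
        while (items.hasNext()) {
            batch.add(items.next());
            if (batch.size() == batchSize) {
                sink.accept(batch);                 // with SolrJ: wrap client.add(batch) here
                batch = new ArrayList<>(batchSize); // fresh list so the sink may keep the old one
                batches++;
            }
        }
        if (!batch.isEmpty()) {                     // send the final partial batch
            sink.accept(batch);
            batches++;
        }
        return batches;
    }

    public static void main(String[] args) {
        List<Integer> ids = new ArrayList<>();
        for (int i = 0; i < 2500; i++) {
            ids.add(i);
        }
        int sent = sendInBatches(ids.iterator(), 1000,
                b -> System.out.println("would add " + b.size() + " docs"));
        System.out.println(sent + " batches sent");
    }
}
```

Taking an Iterator rather than a List means the 40m documents never need to be held in memory at once; only one batch exists at a time.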

Re: Can Apache Solr Handle TeraByte Large Data

2015-08-05 Thread Mugeesh Husain
@Mikhail With the data import handler, if I define my baseDir as D:/work/folder, will it also work for sub-folders, sub-folders of sub-folders, etc.?

Re: Initializing core takes very long at times

2015-08-05 Thread Robert Krüger
I am shipping Solr as a local search engine with our software, so I have no way of controlling that environment. Many other software packages (RDBMSs, NoSQL engines, etc.) work well in such a setup (as does Solr, except for this problem). The problem is that in this case (AFAICS) the host cannot be

Re: Can Apache Solr Handle TeraByte Large Data

2015-08-05 Thread Mugeesh Husain
Thank you Upayavira, I think I have all these things covered using SolrJ, which was useful to know before starting development of the project. I hope I will not run into any issues using SolrJ and will get a lot out of it. Thanks, Mugeesh Husain

Re: Initializing core takes very long at times

2015-08-05 Thread Erick Erickson
All patches welcome! On Wed, Aug 5, 2015 at 12:40 PM, Robert Krüger krue...@lesspain.de wrote: I am shipping Solr as a local search engine with our software, so I have no way of controlling that environment. Many other software packages (RDBMSs, NoSQL engines, etc.) work well in such a setup

Re: SolrCloud on 5.2.1 cluster state

2015-08-05 Thread Erick Erickson
Yes. The older-style ZK entity was all-in-one in /clusterstate.json. Recently we've moved to a per-collection state.json instead, to avoid the thundering herd problem. In that state, /clusterstate.json is completely ignored and, as you see, not updated. Hmmm, might be worth raising a JIRA to

Re: Embedded Solr now deprecated?

2015-08-05 Thread Robert Krüger
I just saw lots of deprecation warnings in my current code and a method that was removed, which is why I asked. Regarding the use case, I am embedding it with a desktop application just as others use java-based no-sql or rdbms engines and that makes sense architecturally in my case and is just

Re: SolrCloud on 5.2.1 cluster state

2015-08-05 Thread Suma Shivaprasad
Thanks for your response. What version of solr is this change effective from? Will raise a jira. Thanks Suma On Wed, Aug 5, 2015 at 10:37 PM, Erick Erickson erickerick...@gmail.com wrote: Yes. The older-style ZK entity was all-in-one in /clusterstate.json. Recently we've moved to a

Re: Initializing core takes very long at times

2015-08-05 Thread Alexandre Rafalovitch
I wonder if that's also something that could be resolved by having a custom Network level handler, on a pure Java level. I seem to vaguely recall it was possible. Regards, Alex. Solr Analyzers, Tokenizers, Filters, URPs and even a newsletter: http://www.solr-start.com/ On 5 August 2015

Re: SolrCloud on 5.2.1 cluster state

2015-08-05 Thread Shawn Heisey
On 8/5/2015 9:22 AM, Suma Shivaprasad wrote: Hello, Is /clusterstate.json in ZooKeeper updated with collection state if a collection is created with the server running in SolrCloud mode without creating a core through coreAdmin or providing a core.properties? I find that there is a state.json

Re: SolrCloud on 5.2.1 cluster state

2015-08-05 Thread Suma Shivaprasad
Thanks Shawn. Does this mean that in client code, wherever we are using the API ZkStateReader.getClusterState.getCollections to get status, it should be changed to CollectionsAdminRequest.ClusterStatus for each collection, or will that API continue to work? Thanks Suma On Wed, Aug 5, 2015 at 11:00 PM,

Re: Solr 5.2.1 highlighting results are not available

2015-08-05 Thread Ahmet Arslan
Hi, bq: I don't even see highlighting section in results I mean, it is possible that you are hitting a request/search handler that does not have the highlighting component registered. This is possible when you explicitly register components (query, facet, highlighting etc). Let's first make sure

RE: Solr spell check not showing any suggestions for other language

2015-08-05 Thread Dyer, James
Talha, Can you try putting your queried keyword in spellcheck.q? James Dyer Ingram Content Group -Original Message- From: talha [mailto:talh...@gmail.com] Sent: Wednesday, August 05, 2015 10:13 AM To: solr-user@lucene.apache.org Subject: RE: Solr spell check not showing any

Re: Initializing core takes very long at times

2015-08-05 Thread Robert Krüger
I just posted on lucene-dev. I think just replacing getCanonicalHostName with getHostName might do the job. At least that's exactly what Logback does for this purpose: http://logback.qos.ch/xref/ch/qos/logback/core/util/ContextUtil.html On Wed, Aug 5, 2015 at 6:57 PM, Erick Erickson
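The suggested change can be illustrated with a small standalone sketch. This is not Solr's or Logback's code: the HostNames class and the "localhost" fallback are assumptions made for illustration. InetAddress.getCanonicalHostName() asks the resolver for a fully qualified name, which can involve a reverse lookup and is where a misconfigured network may stall, whereas getHostName() on the address returned by getLocalHost() typically returns the name that address was created with.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

/** Illustrative only: resolves a display name for the local machine. */
final class HostNames {

    /** Returns the local host name without requesting the canonical (fully qualified) name. */
    static String localHostName() {
        try {
            // getHostName() avoids the extra reverse lookup that getCanonicalHostName() may trigger.
            return InetAddress.getLocalHost().getHostName();
        } catch (UnknownHostException e) {
            return "localhost"; // assumed fallback, not taken from the thread
        }
    }

    public static void main(String[] args) {
        System.out.println("host name: " + localHostName());
    }
}
```

Note that getLocalHost() itself still performs a forward lookup of the machine's name, so this sketch reduces the resolver work rather than eliminating it.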

Re: TrieIntField not working in Solr 4.7 ?

2015-08-05 Thread wwang525
Hi All, It looks like a numeric field cannot be used for faceting if docValues=true. The following issue seems to indicate a problem in this scenario: https://issues.apache.org/jira/browse/SOLR-7495 Unexpected docvalues type NUMERIC when grouping by a int facet

Re: Solr 5.2.1 highlighting results are not available

2015-08-05 Thread Michał Oleś
Hi, I checked and to me the config looks alright, but it would be great if you could take a look. Here is the whole solrconfig.xml: http://pastebin.com/7YfVZA90 and here is the full schema.xml: http://pastebin.com/LgeAvtFf and the query result with debug enabled: http://pastebin.com/i74Wyep3

Re: Embedded Solr now deprecated?

2015-08-05 Thread Erick Erickson
Hmmm, you may want to investigate the new no-war solution. Solr runs as a service with start/stop scripts. Currently it uses an underlying Jetty container, but that will (probably) eventually change and is pretty much considered an implementation detail. Not quite sure whether it'd be easier or

Re: SolrCloud on 5.2.1 cluster state

2015-08-05 Thread Erick Erickson
The API shouldn't be changing, did you run into any errors? Don't bother to raise a JIRA, I checked through IM and the /clusterstate.json is still being used as a watch point for things like creating collections, and there are other JIRAs afoot that will take care of this at the right

Re: Solr 5.2.1 highlighting results are not available

2015-08-05 Thread Ahmet Arslan
Hi, I couldn't find anything suspicious. It was allowed to highlight on an indexed=false field as long as a tokenizer is defined on it: https://cwiki.apache.org/confluence/display/solr/Field+Properties+by+Use+Case Maybe that has changed. Can you try to highlight on a both indexed and stored

RE: Embedded Solr now deprecated?

2015-08-05 Thread Ken Krugler
Hi Shawn, We have a different use case than the ones you covered in your response to Robert (below), which I wanted to call out. We currently use the embedded server when building indexes as part of a Hadoop workflow. The results get copied to a production analytics server and swapped in on a