Re: EdgeNGramFilterFactory for Chinese characters

2015-10-26 Thread Zheng Lin Edwin Yeo
Hi Tomoko, Thank you for your advice. I will look into the Java source code of the token filters. Regards, Edwin On 26 October 2015 at 13:16, Tomoko Uchida wrote: > > Will try to see if there is any way to manage it with only a single field? > > Of course you can

copy data between collection

2015-10-26 Thread Chaushu, Shani
Hi, Is there a simple API to copy all the documents from one collection to another collection on the same Solr server? I'm using SolrCloud 4.10. Thanks, Shani - Intel Electronics Ltd.

Re: copy data between collection

2015-10-26 Thread Upayavira
Hi Shani, There isn't a SolrCloud way to do it. A proper 'clone this collection' feature would be a very useful thing. However, I have managed to do it, in a way that involves some caveats: * you should only do this on a collection that has no replicas. Add replicas *after* cloning the index

Solr.cmd cannot create collection in Solr 5.2.1

2015-10-26 Thread Adrian Liew
Hi all, I have setup a 3 server Zookeeper cluster by following the instructions provided from Zookeeper site: I am having experiences trying to zkCli.bat into the Zookeeper services on 3 EC2 instances once after I have started the ZK services on all 3 servers. For example, I have setup my

Re: getting cached terms inside UpdateRequestProcessor...

2015-10-26 Thread Roxana Danger
Sorry for the delay in my reply. I have tried to use the Documents screen for executing my update request processor (with JSON), but it needs the document Id to be specified. So, this is not a good solution as I need to update all indexed documents... I will try the importEnd event. Any other ideas?

Re: Query differently or change fieldtype

2015-10-26 Thread Upayavira
Use the analysis tab on the admin UI to see what analysis is doing to your terms. Then bear in mind that a query parser will split on space. So, you might want to do clientName:"st ju me" to make the tokenisation happen within the analysis chain rather than the query parser. Upayavira On Mon,
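The phrase-query suggestion above can be sketched in Python. This is only an illustration and not part of the original thread: the helper name `phrase_query_params` is hypothetical, and only the field name `clientName` and the text `st ju me` come from the message.

```python
from urllib.parse import urlencode


def phrase_query_params(field, text):
    """Build Solr query parameters that search `text` as one quoted phrase.

    Quoting keeps the query parser from splitting on spaces, so the whole
    string reaches the field's analysis chain together.
    """
    # Escape backslashes and double quotes so the phrase stays well-formed.
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return {"q": f'{field}:"{escaped}"'}


params = phrase_query_params("clientName", "st ju me")
# URL-encoded form, ready to append to a /select request.
query_string = urlencode(params)
```

Whether the phrase then matches still depends on the field's analysis chain, which the analysis tab shows.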

Re: Query differently or change fieldtype

2015-10-26 Thread Ray Niu
I think this is how StandardTokenizerFactory works. If you want different behavior, you should try a different tokenizer. Also, like Upayavira said, use the analysis tab on the admin UI to see what analysis is doing to your terms. 2015-10-26 12:33 GMT-07:00 Brian Narsi :

Solr hard commit

2015-10-26 Thread Rallavagu
All, Are memory mapped files (mmap) flushed to disk during "hard commit"? If yes, should we disable OS level (Linux for example) memory mapped flush? I am referring to following for mmap files for Lucene/Solr http://blog.thetaphi.de/2012/07/use-lucenes-mmapdirectory-on-64bit.html Linux

Re: Two seperate intance of Solr on the same machine

2015-10-26 Thread Pushkar Raste
It depends on your case. If you don't mind logs from 3 different instances intermingled with each other, you should be fine. You can add "-Dsolr.log=" to make the logs go to different directories. If you want logs to go to the same directory but different files, try updating log4j.properties. On 26 October

Re: Solr.cmd cannot create collection in Solr 5.2.1

2015-10-26 Thread Shawn Heisey
On 10/26/2015 12:08 PM, Upayavira wrote: > In the end, this needs to be replaced by HTTP APIs. In the meantime, a > switch to disable the solrconfig check sounds reasonable. > > Wanna create the ticket? Created: https://issues.apache.org/jira/browse/SOLR-8214 Thanks, Shawn

Re: Query differently or change fieldtype

2015-10-26 Thread Brian Narsi
That is right Ray, that is exactly what I found out and that is why I am asking the question. On Mon, Oct 26, 2015 at 2:19 PM, Ray Niu wrote: > I found the conf minGramSize="2", which will only create index with at least > 2 chars, j will not match > also

Re: Query differently or change fieldtype

2015-10-26 Thread Ray Niu
I found the conf minGramSize="2", which will only create index entries with at least 2 chars, so j will not match. Also, StandardTokenizerFactory will tokenize st j into st and j. On Monday, 26 October 2015, Brian Narsi wrote: > I have the following field type on a field ClientName: > >
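The effect of minGramSize="2" described above can be illustrated with a small Python sketch. It is a simplified model, not Solr's implementation: `str.split()` stands in for StandardTokenizerFactory, the function names are hypothetical, and the `max_gram` value of 20 is an assumption because the field type definition is not shown in the thread.

```python
def edge_ngrams(token, min_gram, max_gram):
    """Return the leading-edge n-grams of `token`, lengths min_gram..max_gram."""
    upper = min(len(token), max_gram)
    return [token[:n] for n in range(min_gram, upper + 1)]


def index_terms(text, min_gram=2, max_gram=20):
    """Tokenize on whitespace (a stand-in tokenizer) and emit edge n-grams."""
    terms = []
    for token in text.lower().split():
        terms.extend(edge_ngrams(token, min_gram, max_gram))
    return terms


# Grams produced for the sample value from the thread.
indexed = index_terms("st jude medical inc")
# The same analysis applied to the query text "st j".
query_terms = index_terms("st j")
```

In this model the token `j` is shorter than the minimum gram size and yields no gram at all, which is consistent with the observation that `st j` finds nothing while `st` does.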

Re: Google didn't help on this one!

2015-10-26 Thread Mikhail Khludnev
Mark, wiki and default configs are formally correct and just might not be specific enough for older spellcheckers. I closed SOLR-8063. On Wed, Sep 16, 2015 at 5:30 PM, Mikhail Khludnev < mkhlud...@griddynamics.com> wrote: > Raised

RE: Querying Dynamic Fields

2015-10-26 Thread Matt Kuiper (Springblox)
Give the following a try - http://localhost:8983/solr/core_name/admin/luke?numTerms=0 Matt Matt Kuiper -Original Message- From: Patrick Hoeffel [mailto:patrick.hoef...@issinc.com] Sent: Monday, October 26, 2015 4:56 PM To: solr-user@lucene.apache.org Subject: Querying Dynamic Fields
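A rough Python sketch of using the Luke request handler URL suggested above to list field names. The helper names are hypothetical, and the response shape (a top-level `fields` object keyed by field name, requested as JSON via `wt=json`) is an assumption to verify against your Solr version. The sample response is made up for illustration.

```python
from urllib.parse import urlencode


def luke_url(base_url, core, num_terms=0):
    """Build the Luke request handler URL for a core."""
    query = urlencode({"numTerms": num_terms, "wt": "json"})
    return f"{base_url}/{core}/admin/luke?{query}"


def field_names(luke_response):
    """Return the sorted field names from a parsed Luke JSON response.

    Assumes the response has a top-level "fields" object keyed by field name.
    """
    return sorted(luke_response.get("fields", {}))


url = luke_url("http://localhost:8983/solr", "core_name")

# Made-up sample response; a real one would come from fetching `url`.
sample = {"fields": {"id": {"type": "string"}, "title_s": {"type": "string"}}}
names = field_names(sample)
```

Setting `numTerms=0` keeps the request cheap when only the field names are needed.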

Best strategy for indexing multiple tables with multiple fields

2015-10-26 Thread Daniel Valdivia
Hi, I’m new to the solr world, I’m in need of some experienced advice as I see I can do a lot of cool stuff with Solr, but I’m not sure which path to take so I don’t shoot myself in the foot with all this power :P I have several tables (225) in my application, which I’d like to add into a

Querying Dynamic Fields

2015-10-26 Thread Patrick Hoeffel
I have a simple Solr schema that uses dynamic fields to create most of my fields. This works great. Unfortunately, I now need to ask Solr to give me the names of the fields in the schema. I'm using: http://localhost:8983/solr/core/schema/fields This returns the statically defined fields, but

CloudSolrClient query /admin/info/system

2015-10-26 Thread Kevin Risden
I am trying to use CloudSolrClient to query information about the Solr server including version information. I found /admin/info/system and it seems to provide the information I am looking for. However, it looks like CloudSolrClient cannot query /admin/info since INFO_HANDLER_PATH [1] is not part

Solr Code structure Documentation

2015-10-26 Thread G.Sarwar
Hi all, I am new to Solr and I would like to understand how its search mechanism works. I know it uses Apache Lucene as a backend, but how exactly do the calls work from the Query page, and which code/algorithm is called, and how? I have successfully configured it and am running it on my

Re: Solr Code structure Documentation

2015-10-26 Thread Alexandre Rafalovitch
Well, the source code is all there, if you need to know _exactly_. Run it under Debug. Run it under paid IntelliJ with Chronos if you will be doing it a lot. Same with Admin to Solr, just open a developer console in the browser and you have every web call documented just when you want them.

Re: Does docValues impact termfreq ?

2015-10-26 Thread Erick Erickson
Do be aware that docValues can only be used for non-text types, i.e. numerics, strings and the like. Specifically, docValues are _not_ possible for solr.TextField and docValues don't support analysis chains because the underlying primitive types don't. You'll get an error if you try to specify

Re: Solr hard commit

2015-10-26 Thread Rallavagu
Erick, Thanks for the clarification. I was under the impression that MMapDirectory is being used for both read/write operations. Now, I see how it is being used. Essentially, it only reads from MMapDirectory and writes directly to disk. So, the updated file(s) on the disk automatically read into

Re: Two seperate intance of Solr on the same machine

2015-10-26 Thread Jack Krupansky
Each instance should be installed in a separate directory. IOW, don't try running multiple Solr processes for the same data. -- Jack Krupansky On Mon, Oct 26, 2015 at 1:33 PM, Steven White wrote: > Hi, > > For reasons I have no control over, I'm required to run 2 (maybe

Re: Highlighting content field problem when using JiebaTokenizerFactory

2015-10-26 Thread Scott Chu
Take a look at Michael's 2 articles, they might help you clarify the idea of highlighting in Solr: Changing Bits: Lucene's TokenStreams are actually graphs! http://blog.mikemccandless.com/2012/04/lucenes-tokenstreams-are-actually.html Also take a look at the 4th paragraph in another of his articles:

Re: Solr hard commit

2015-10-26 Thread Erick Erickson
You're really looking at this backwards. The MMapDirectory stuff is for Solr (Lucene, really) _reading_ data from closed segment files. When indexing, there are internal memory structures that are flushed to disk on commit, but these have nothing to do with MMapDirectory. So the question is

Re: Best strategy for indexing multiple tables with multiple fields

2015-10-26 Thread Erick Erickson
Well, all I can say is "if you find yourself trying to do lots of joins in Solr, go back to the drawing board" ;). Solr is a great search engine, but its ability to be used like a RDBMS is...er...limited. RDBMSs are great at what they do, but make pretty rotten search engines. Rather than think

Re: Highlighting content field problem when using JiebaTokenizerFactory

2015-10-26 Thread Scott Chu
Hi Edward, I spent a lot of time looking for anything that could help you pin down the cause of your problem. Maybe this might help you a bit: [SOLR-4722] Highlighter which generates a list of query term position(s) for each item in a list of documents, or returns null if highlighting is

Re: Best strategy for indexing multiple tables with multiple fields

2015-10-26 Thread Walter Underwood
Most of the time, the best approach is to denormalize everything into one big virtual table. Think about making a view, where each row is one document in Solr. That row needs everything that will be searched and everything that will be displayed, but nothing else. I’ve heard of installations
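A minimal Python sketch of the denormalization idea above, under stated assumptions: the `orders` and `customers` tables, their columns, and the function name are all hypothetical and do not come from the thread. Each flattened row becomes one Solr document carrying only what is searched or displayed.

```python
def denormalize(orders, customers_by_id):
    """Flatten each order and its customer into one Solr-style document."""
    docs = []
    for order in orders:
        customer = customers_by_id[order["customer_id"]]
        docs.append({
            "id": f"order-{order['id']}",        # unique document key
            "order_total": order["total"],       # field to display
            "customer_name": customer["name"],   # field to search and display
            "customer_city": customer["city"],   # field to search
        })
    return docs


# Hypothetical rows standing in for two of the application's tables.
customers = {7: {"name": "Ada", "city": "London"}}
orders = [{"id": 1, "customer_id": 7, "total": 42.5}]

docs = denormalize(orders, customers)
```

The join happens once at indexing time, so queries never need a join in Solr.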

Re: Does docValues impact termfreq ?

2015-10-26 Thread Emir Arnautovic
If I got it right, you are using term query, use function to get TF as score, iterate all documents in results and sum up total number of occurrences of specific term in index? Is this the only way you use the index, or is this side functionality? Thanks, Emir On 24.10.2015 22:28, Aki Balogh wrote:

Re: Does docValues impact termfreq ?

2015-10-26 Thread Emir Arnautovic
Hi Aki, IMO this is underuse of Solr (not to mention SolrCloud). I would recommend doing in-memory document parsing (if you need something from Lucene/Solr analysis classes, use it) and using some other cache-like solution to store term/total frequency pairs (you can try Redis). That way you

Re: using a custom update for all documents

2015-10-26 Thread Upayavira
On Mon, Oct 26, 2015, at 02:58 PM, Roxana Danger wrote: > Hello everyone, > Is there a way to update all the documents in the solr index using a > custom > update processor? You want to re-index all documents? If so, that's not really how update processors work. They trigger when a new

Re: Solr Pagination

2015-10-26 Thread Upayavira
On Sun, Oct 25, 2015, at 05:43 PM, Salman Ansari wrote: > Thanks guys for your responses. > > That's a very very large cache size. It is likely to use a VERY large > amount of heap, and autowarming up to 4096 entries at commit time might > take many *minutes*. Each filterCache entry is

solr-user-subscribe

2015-10-26 Thread Margherita Di Leo
-- Margherita Di Leo

using a custom update for all documents

2015-10-26 Thread Roxana Danger
Hello everyone, Is there a way to update all the documents in the solr index using a custom update processor? Thank you, Roxana

Re: Does docValues impact termfreq ?

2015-10-26 Thread Aki Balogh
Hi Emir, This is correct. This is the only way we use the index. Thanks, Aki On Mon, Oct 26, 2015 at 9:31 AM, Emir Arnautovic < emir.arnauto...@sematext.com> wrote: > If I got it right, you are using term query, use function to get TF as > score, iterate all documents in results and sum up

Re: Does docValues impact termfreq ?

2015-10-26 Thread Scott Stults
Aki, does the sumtotaltermfreq function do what you need? On Mon, Oct 26, 2015 at 9:43 AM, Aki Balogh wrote: > Hi Emir, > > This is correct. This is the only way we use the index. > > Thanks, > Aki > > On Mon, Oct 26, 2015 at 9:31 AM, Emir Arnautovic < >

Re: Solr.cmd cannot create collection in Solr 5.2.1

2015-10-26 Thread Shawn Heisey
On 10/26/2015 2:23 AM, Adrian Liew wrote: > { > "responseHeader":{ > "status":0, > "QTime":1735}, > > "failure":{"":"org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException:Error from server at http://172.18.111.112:8983/solr: Error CREATEing > Solr > Core

Re: using a custom update for all documents

2015-10-26 Thread Roxana Danger
Thank you very much, Upayavira. I am indexing my documents with a DIH, but I need to use my custom update processor after the commit, not in the update.chain. Do update processors only work while adding new documents? In this case, I have tried two alternatives: 1) using the onImportEnd event.

Re: using a custom update for all documents

2015-10-26 Thread Alexandre Rafalovitch
Roxana, You've been asked a couple of times by several people to explain your business needs (level higher than Solr itself). As it is, you are slowly getting deeper and deeper into Solr's internals, where there might be an easier question if we know what you are trying to achieve. It is your

Re: Solr.cmd cannot create collection in Solr 5.2.1

2015-10-26 Thread Shalin Shekhar Mangar
+1 let's have upconfig complain loudly if the directory being uploaded doesn't have a solrconfig.xml On Mon, Oct 26, 2015 at 9:56 PM, Upayavira wrote: > On Mon, Oct 26, 2015, at 04:10 PM, Shawn Heisey wrote: >> On 10/26/2015 2:23 AM, Adrian Liew wrote: >> > { >> >

Two seperate intance of Solr on the same machine

2015-10-26 Thread Steven White
Hi, For reasons I have no control over, I'm required to run 2 (maybe more) instances of Solr on the same server (Windows and Linux). To be more specific, I will need to start each instance like so: > solr\bin start -p 8983 -s ..\instance_one > solr\bin start -p 8984 -s ..\instance_two >

Re: copy data between collection

2015-10-26 Thread Jeff Wartes
The “copy” command in this tool automatically does what Upayavira describes, including bringing the replicas up to date. (if any) https://github.com/whitepages/solrcloud_manager I’ve been using it as a mechanism for copying a collection into a new cluster (different ZK), but it should work

Re: Solr.cmd cannot create collection in Solr 5.2.1

2015-10-26 Thread Upayavira
On Mon, Oct 26, 2015, at 04:10 PM, Shawn Heisey wrote: > On 10/26/2015 2:23 AM, Adrian Liew wrote: > > { > > "responseHeader":{ > > "status":0, > > "QTime":1735}, > > > > "failure":{"":"org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException:Error from server at

Zookeeper issue causing all nodes to fail

2015-10-26 Thread philippa griggs
Hello all, We have been experiencing some major solr issues. Solr 5.2.1 10 Shards each with a replica (20 nodes in total). Three external zookeepers 3.4.6 Node 19 went down, a short while after this occurred all our nodes were wiped out. The cloud diagram, live_nodes and clusterstate.json

Payload doesn't apply to WordDelimiterFilterFactory-generated tokens

2015-10-26 Thread Jamie Johnson
I came across this post ( http://lucene.472066.n3.nabble.com/Payload-doesn-t-apply-to-WordDelimiterFilterFactory-generated-tokens-td3136748.html) and tried to find a JIRA for this task. Was one ever created? If not I'd be happy to create it if this is still something that makes sense or if

Query differently or change fieldtype

2015-10-26 Thread Brian Narsi
I have the following field type on a field ClientName: For data where ClientName = st jude medical inc When querying I get the following: 1) st --> result = st jude medical inc (works correctly) 2) st j --> No results are returned (NOT correct) - Expect to find st jude medical

Re: Solr.cmd cannot create collection in Solr 5.2.1

2015-10-26 Thread Shawn Heisey
On 10/26/2015 11:27 AM, Shalin Shekhar Mangar wrote: > +1 let's have upconfig complain loudly if the directory being uploaded > doesn't have a solrconfig.xml +1 I like this idea, as long as there's some kind of "force" option to bypass that check. I can imagine situations where somebody might

Re: Solr.cmd cannot create collection in Solr 5.2.1

2015-10-26 Thread Upayavira
On Mon, Oct 26, 2015, at 06:04 PM, Shawn Heisey wrote: > On 10/26/2015 11:27 AM, Shalin Shekhar Mangar wrote: > > +1 let's have upconfig complain loudly if the directory being uploaded > > doesn't have a solrconfig.xml > > +1 > > I like this idea, as long as there's some kind of "force" option