Re: How To: Debuging the whole indexing process

2015-05-29 Thread Alexandre Rafalovitch
In production or in test? I assume in test. This level of detail usually implies some sort of Java debugger and Java instrumentation enabled. E.g. Chronon, which is commercial but can be tried as a plugin with IntelliJ IDEA full version trial. Regards, Alex On 29 May 2015 4:38 pm, Aman

Re: Ability to load solrcore.properties from zookeeper

2015-05-29 Thread Alan Woodward
Yeah, you could do it like that. But looking at it further, I think solrcore.properties is actually being loaded in entirely the wrong place - it should be done by whatever is creating the CoreDescriptor, and then passed in as a Properties object to the CD constructor. At the moment, you

Re: SolrCloud 4.8.0 - Snapshots directory take a lot of space

2015-05-29 Thread Vincenzo D'Amore
bump On Fri, May 8, 2015 at 4:45 PM, Vincenzo D'Amore v.dam...@gmail.com wrote: Hi All, Looking at the data directory in my SolrCloud cluster, I have found a lot of old snapshot directories, like these: snapshot.20150506003702765 snapshot.20150506003702760 snapshot.20150507002849492

Re: Index optimize runs in background.

2015-05-29 Thread Modassar Ather
I have not added any timeout in the indexer except zk client time out which is 30 seconds. I am simply calling client.close() at the end of indexing. The same code was not running in background for optimize with solr-4.10.3 and org.apache.solr.client.solrj.impl.CloudSolrServer. On Fri, May 29,

How To: Debuging the whole indexing process

2015-05-29 Thread Aman Tandon
Hi, I want to debug the whole indexing process, i.e. the life cycle of the indexing process (each and every function call, stepping from function to function), from the posting of data.xml to the creation of the various index files (_fnm, _fdt, etc.). So what should I set up, and how do I start? Please help. I will

Help for a field in my schema ?

2015-05-29 Thread Bruno Mannina
Dear Solr-Users, (SOLR 5.0 Ubuntu) I have xml files with tags like this claimXXYYY where XX is a language code like FR EN DE PT etc... (I don't know the number of language code I can have) and YYY is a number [1..999] i.e.: claimen1 claimen2 claimen3 claimfr1 claimfr2 claimfr3 I would like

Re: How to index 20 000 files with a command line ?

2015-05-29 Thread Sergey Shvets
Hello Bruno, You can use the find command with the -exec option. regards Sergey Friday, May 29, 2015, 3:11:37 PM, you wrote: Dear Solr Users, Usually I use this command line to index my files: bin/post -c hbl /data/hbl-201522/*.xml but today I have a big update, so there are 20 000 xml files

Re: How to index 20 000 files with a command line ?

2015-05-29 Thread Bruno Mannina
oh yes like this: find /data/hbl-201522/ -name "*.xml" -exec bin/post -c hbl {} \; ? On 29/05/2015 14:15, Sergey Shvets wrote: Hello Bruno, You can use find command with exec attribute. regards Sergey Friday, May 29, 2015, 3:11:37 PM, you wrote: Dear Solr Users, Usually I use

How to index 20 000 files with a command line ?

2015-05-29 Thread Bruno Mannina
Dear Solr Users, Usually I use this command line to index my files: bin/post -c hbl /data/hbl-201522/*.xml but today I have a big update, so there are 20 000 xml files (each file between 1 KB and 150 KB). I get this error: Error: bin/post argument too long How could I index the whole directory? Thanks
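
The "argument too long" error comes from the shell expanding *.xml into 20 000 arguments before bin/post ever runs. The find approach suggested in the replies avoids that expansion. A minimal sketch, assuming bin/post accepts several file arguments per call (as the original glob usage implies); the index_dir function and the POST_CMD override are hypothetical names added here only so the sketch can be dry-run:

```shell
#!/bin/sh
# index_dir DIR COLLECTION
# POST_CMD is a hypothetical override used for dry runs;
# it defaults to Solr's bin/post.
index_dir() {
  dir=$1
  collection=$2
  # find walks the directory itself, so the shell never builds one huge
  # argument list. "{} +" hands the matching files to the command in
  # batches that stay under the system's argument-length limit.
  find "$dir" -type f -name '*.xml' \
    -exec ${POST_CMD:-bin/post} -c "$collection" {} +
}
```

For the case in this thread the call would be `index_dir /data/hbl-201522 hbl`. Using `{} \;` instead of `{} +` also works but starts bin/post once per file, which is likely to be much slower for 20 000 files.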

Re: Number of clustering labels to show

2015-05-29 Thread Stanislaw Osinski
Hi, The number of clusters primarily depends on the parameters of the specific clustering algorithm. If you're using the default Lingo algorithm, the number of clusters is governed by the LingoClusteringAlgorithm.desiredClusterCountBase parameter. Take a look at the documentation (

Re: docValues: Can we apply synonym

2015-05-29 Thread Alessandro Benedetti
Even if a little bit outdated, that query parser is really really cool to manage synonyms ! +1 ! 2015-05-29 1:01 GMT+01:00 Aman Tandon amantandon...@gmail.com: Thanks chris. Yes we are using it for handling multiword synonym problem. With Regards Aman Tandon On Fri, May 29, 2015 at 12:38

Re: Index optimize runs in background.

2015-05-29 Thread Erick Erickson
I'm not talking about you setting a timeout, but the underlying connection timing out... The "10 minutes then the indexer exits" comment points in that direction. Best, Erick On Thu, May 28, 2015 at 11:43 PM, Modassar Ather modather1...@gmail.com wrote: I have not added any timeout in the

Re: docValues: Can we apply synonym

2015-05-29 Thread Erick Erickson
Do take time for performance testing with that parser. It can be slow depending on your data as I remember. That said it solves the problem it set out to solve so if it meets your SLAs, it can be a life-saver. Best, Erick On Fri, May 29, 2015 at 2:35 AM, Alessandro Benedetti

Re: Ignoring the Document Cache per query

2015-05-29 Thread Bryan Bende
Thanks Erik. I realize this really makes no sense, but I was looking to work around a problem. Here is the scenario... Using Solr 5.1 we have a service that utilizes the new mlt query parser to get recommendations. So we start up the application, ask for recommendations for a document, and

Re: Help for a field in my schema ?

2015-05-29 Thread Erick Erickson
Well yes, but the second doesn't do what you say you want, bq: *claim *equal to all claimXXYYY (all languages, all numbers, indexed=true, stored false) (search not needed but must be displayed) You can search this field, but specifying it in a field list (fl) will return nothing, you need

user interface

2015-05-29 Thread Mustafa KIZILDAĞ
Hi, My name is Mustafa. I'm a master's student at YTU in Turkey. I am building a crawler for a VoIP problem for my job and school. I want to configure Solr's user interface. For example, I want to add an image or add a comment on the user interface. I searched about it but couldn't find a good result.

Re: CLUSTERSTATUS timeout

2015-05-29 Thread Joseph Obernberger
I'm also getting this error with 5.1.0 and a 27 shard setup. null:org.apache.solr.common.SolrException: CLUSTERSTATUS the collection time out:180s at org.apache.solr.handler.admin.CollectionsHandler.handleResponse(CollectionsHandler.java:740) at

RE: When is too many fields in qf is too many?

2015-05-29 Thread Reitzel, Charles
Before giving up, I might try a copyTo fields per field group and see how that works. Won't that get you down to 10-20 fields per query and be stable wrt view changes? But Solr is column oriented, in that the core query logic is a scatter/gather over qf list. Perhaps there is a reason qf

RE: optimal shard assignment with low shard key cardinality using compositeId to enable shard splitting

2015-05-29 Thread Reitzel, Charles
Thanks, Erick. I appreciate the sanity check. -Original Message- From: Erick Erickson [mailto:erickerick...@gmail.com] Sent: Thursday, May 28, 2015 5:50 PM To: solr-user@lucene.apache.org Subject: Re: optimal shard assignment with low shard key cardinality using compositeId to enable

Re: Ignoring the Document Cache per query

2015-05-29 Thread Erick Erickson
This is totally weird. The document cache should really have nothing to do with whether MLT returns documents or not AFAIK. So either I'm totally misunderstanding MLT, you're leaving out a step or there's some bug in Solr. The fact that setting the document cache to 0 changes the behavior, or

Re: Deleting Fields

2015-05-29 Thread Shawn Heisey
On 5/29/2015 5:08 PM, Joseph Obernberger wrote: Hi All - I have a lot of fields to delete, but noticed that once I started deleting them, I quickly ran out of heap space. Is delete-field a memory intensive operation? Should I delete one field, wait a while, then delete the next? I'm not

Re: How to setup solr in cluster

2015-05-29 Thread Erick Erickson
You really have to tell us more about what you mean. You have two problems to solve: 1) putting Solr on all the nodes and starting/stopping it. Puppet or Chef help here, although it's perfectly possible to do this manually. 2) creating collections etc. For this you just need all your Solr instances

RE: How to setup solr in cluster

2015-05-29 Thread Purohit, Sumit
Thanks for the reply. I have tried the example cloud setup using the link I mentioned. I am trying to set up Solr on all 16 nodes + 1 external ZooKeeper on one of the nodes. That's when I found out about Chef and Puppet. My problem is that manually setting up and starting/stopping Solr does not seem that efficient

Re: docValues: Can we apply synonym

2015-05-29 Thread Aman Tandon
Hi Upayavira, How will copyField help in my scenario, when I have to add the synonym in a docValues-enabled field? With Regards Aman Tandon On Sat, May 30, 2015 at 1:18 AM, Upayavira u...@odoko.co.uk wrote: Use copyField to clone the field for faceting purposes. Upayavira On Fri, May 29,

RE: How to setup solr in cluster

2015-05-29 Thread Purohit, Sumit
Sorry for this second email, but another problem of mine is: when I copy the solr folder to each node and start them, should I run it as a 1-node cluster on each node and use the same name for the collection? Or do I have to create an individual shard on each node? Thanks for your help. Thanks sumit

How to setup solr in cluster

2015-05-29 Thread Purohit, Sumit
Hi All, I am trying to set up Solr on a cluster with 16 nodes. The only documentation I could find talks about a local cluster which behaves like a real cluster. https://cwiki.apache.org/confluence/display/solr/Getting+Started+with+SolrCloud I read about using tools like Chef or Puppet to configure

Deleting Fields

2015-05-29 Thread Joseph Obernberger
Hi All - I have a lot of fields to delete, but noticed that once I started deleting them, I quickly ran out of heap space. Is delete-field a memory intensive operation? Should I delete one field, wait a while, then delete the next? Thank you! -Joe
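
For reference, the delete-field operation asked about here is a Schema API command posted as JSON to a collection's /schema endpoint. A minimal sketch of the "delete one field, wait a while, then delete the next" idea from the question; the function names, the URL, and the field names in the usage note are hypothetical, and nothing in this thread establishes that pausing actually relieves the heap pressure:

```shell
#!/bin/sh
# delete_field_payload NAME
# Prints the Schema API command body that removes one field definition.
delete_field_payload() {
  printf '{"delete-field":{"name":"%s"}}\n' "$1"
}

# delete_fields SCHEMA_URL PAUSE_SECONDS NAME...
# Posts one delete-field command per field, sleeping between requests.
# (Hypothetical helper; the pause is an untested assumption.)
delete_fields() {
  schema_url=$1
  pause=$2
  shift 2
  for name in "$@"; do
    curl -s -X POST -H 'Content-type:application/json' \
      --data-binary "$(delete_field_payload "$name")" "$schema_url"
    sleep "$pause"
  done
}
```

A call might look like `delete_fields http://localhost:8983/solr/mycollection/schema 5 old_field_a old_field_b`, with the host, collection, and field names replaced by real ones.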

Re: Deleting Fields

2015-05-29 Thread Joseph Obernberger
Thank you Shawn - I'm referring to fields in the schema. With Solr 5, you can delete fields from the schema. https://cwiki.apache.org/confluence/display/solr/Schema+API#SchemaAPI-DeleteaField -Joe On 5/29/2015 7:30 PM, Shawn Heisey wrote: On 5/29/2015 5:08 PM, Joseph Obernberger wrote: Hi

Re: How to setup solr in cluster

2015-05-29 Thread Erick Erickson
None of the above. You simply start Solr on each node, then use the Collections API to create your collection. Solr will take care of creating the individual replicas on each of the nodes with respect to the parameters you pass to the CREATE command. Best, Erick On Fri, May 29, 2015 at 5:46 PM,
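
The CREATE command mentioned above is an HTTP request to the Collections API on any one running node. A minimal sketch that only builds the request URL; the helper name, host, collection name, and shard counts are hypothetical, and depending on the setup further parameters (for example collection.configName) may be needed:

```shell
#!/bin/sh
# create_collection_url HOST NAME NUM_SHARDS REPLICATION_FACTOR
# Prints a Collections API CREATE request URL (hypothetical helper).
create_collection_url() {
  printf 'http://%s/solr/admin/collections?action=CREATE&name=%s&numShards=%s&replicationFactor=%s\n' \
    "$1" "$2" "$3" "$4"
}

# Example, to be sent to any one node once Solr is running on all of them:
#   curl "$(create_collection_url localhost:8983 mycollection 16 1)"
```

With 16 nodes, `numShards=16` and `replicationFactor=1` would place one shard per node; fewer shards with a higher replication factor is an equally valid layout, and the right split depends on index size and query load.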

Re: Deleting Fields

2015-05-29 Thread Erick Erickson
Yes, but deleting fields from the schema only means that _future_ documents will throw an undefined field error. All the documents currently in the index will retain that field. Why you're hitting an OOM is a mystery though. But delete-field isn't removing the contents of indexed documents.

Re: docValues: Can we apply synonym

2015-05-29 Thread Aman Tandon
Hi Erick, Thanks for the suggestion. We are using this query parser plugin ( *SynonymExpandingExtendedDismaxQParserPlugin*) to manage multi-word synonyms. So it does work slower than edismax; is that why it is not in contrib? (I am asking this question because we are using it for all our searches to handle

Re: user interface

2015-05-29 Thread Erik Hatcher
Which user interface? Do you mean the admin UI? Or perhaps /browse? — Erik Hatcher, Senior Solutions Architect http://www.lucidworks.com On May 29, 2015, at 1:34 PM, Mustafa KIZILDAĞ mustafakizilda...@gmail.com wrote: Hi, My name is Mustafa. I'm a

Re: How To: Debuging the whole indexing process

2015-05-29 Thread Aman Tandon
Thanks Alex, yes, it is for my testing, to understand the code/process flow. Any other ideas? With Regards Aman Tandon On Fri, May 29, 2015 at 12:48 PM, Alexandre Rafalovitch arafa...@gmail.com wrote: In production or in test? I assume in test. This level of detail usually implies some

Re: docValues: Can we apply synonym

2015-05-29 Thread Upayavira
Use copyField to clone the field for faceting purposes. Upayavira On Fri, May 29, 2015, at 08:06 PM, Aman Tandon wrote: Hi Erick, Thanks for the suggestion. We are using this query parser plugin ( *SynonymExpandingExtendedDismaxQParserPlugin*) to manage multi-word synonyms. So it does work slower

Re: [solr 5.1] Looking for full text + collation search field

2015-05-29 Thread TK Solr
On 5/21/15, 5:19 AM, Björn Keil wrote: Thanks for the advice. I have tried the field type and it seems to do what it is supposed to in combination with a lower case filter. However, that raises another slight problem: German umlauts are supposed to be treated slightly different for the