Re: `cat /dev/null > solr-8983-console.log` frees host's memory

2015-10-21 Thread Emir Arnautovic
Hi Eric, As Shawn explained, the memory is freed because it was being used to cache a portion of the log file. Since you are already with Sematext, I guess you are aware, but it doesn't hurt to remind you that we also have Logsene, which you can use to manage your logs: http://sematext.com/logsene/index.html

RE: DevOps question : auto deployment/setup of Solr & Zookeeper on medium-large clusters

2015-10-21 Thread Davis, Daniel (NIH/NLM) [C]
Susheel, Our Puppet stuff is very close to our infrastructure, using specific NetApp volumes and such, and assuming some files come from NFS. It is also personally embarrassing to me that we still use NIS - doh! -----Original Message----- From: Susheel Kumar [mailto:susheel2...@gmail.com]

Re: LIX readability index calculation by solr

2015-10-21 Thread Walter Underwood
Can you reload all the content? If so, I would calculate this in an update request processor and put the result in its own field. wunder Walter Underwood wun...@wunderwood.org http://observer.wunderwood.org/ (my blog) > On Oct 21, 2015, at 2:53 AM, Roland Szűcs

Re: Efficiency of integer storage/use

2015-10-21 Thread Upayavira
What I'd say is that there are *substantial* optimisations done already when indexing terms, especially numerical ones, e.g. looking for common divisors. Look out for a talk by Adrien Grand at Berlin Buzzwords earlier this year for a taste of it. I don't know how much of this kind of optimisation

RE: DIH Caching with Delta Import

2015-10-21 Thread Dyer, James
The DIH Cache feature does not work with delta import. Actually, much of DIH does not work with delta import. The workaround you describe is similar to the approach described here: https://wiki.apache.org/solr/DataImportHandlerDeltaQueryViaFullImport , which in my opinion is the best way to

Re: DevOps question : auto deployment/setup of Solr & Zookeeper on medium-large clusters

2015-10-21 Thread Dhutia, Devansh
We are using AWS, and standardized deployments using Chef. As Jeff points out below, Exhibitor is a good tool to deploy with Zookeeper. We’ve had very good luck with it. On 10/20/15, 7:59 PM, "Jeff Wartes" wrote: > >If you’re using AWS, there’s this:

Re: Efficiency of integer storage/use

2015-10-21 Thread Robert Krüger
Thanks everyone, for your answers. I will probably make a simple parametric test pumping a solr index full of those integers with very limited range and then sorting by vector distances to see how the performance characteristics are. On Sun, Oct 18, 2015 at 9:08 PM, Mikhail Khludnev <

Re: result grouping on all documents

2015-10-21 Thread Emir Arnautovic
Hi Christian, It seems to me that you can use range faceting to get counts. Thanks, Emir -- Monitoring * Alerting * Anomaly Detection * Centralized Log Management Solr & Elasticsearch Support * http://sematext.com/ On 20.10.2015 17:05, Christian Reuschling wrote: Hi, we try to get the

How to get the join data by multiple cores?

2015-10-21 Thread Shuhei Suzuki
Hello, How can I run a query like the following in Solr? SELECT child.*, parent.* FROM child JOIN parent WHERE child.parent_id = parent.id AND parent.tag = 'hoge' child and parent are in a many-to-one relationship (a child has no more than one parent). I tried this but could not.

Index Multiple entity in one collection core

2015-10-21 Thread anurupborah2001
Hi, I am having difficulty indexing multiple entities in one collection. When I try to index, only the entity defined last gets indexed. Please help, as I am having a hard time solving it. Below is the config: -- data-config.xml

Re: How to get the join data by multiple cores?

2015-10-21 Thread cai xingliang
{!join fromIndex=parent from=id to=parent_id}tag:hoge That should work. On Oct 22, 2015 12:35 PM, "Shuhei Suzuki" wrote: > hello, > What can I do to throw a query such as the following in Solr? > > SELECT > child. *, parent. * > FROM child > JOIN parent > WHERE
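For illustration, the join query above could be sent to the child core as an ordinary select request. The sketch below only builds and decodes the URL. The host, port, and the core name `child` in the URL are assumptions, not details from the thread. Note that Solr's join returns only the matching child documents, so the `parent.*` part of the SQL is not covered by this query.

```python
from urllib.parse import parse_qs, urlencode, urlparse


def build_join_url(base_url, from_index, from_field, to_field, parent_query):
    """Build a /select URL whose q parameter is a cross-core join query."""
    # Doubled braces produce the literal { and } of the local-params syntax.
    q = f"{{!join fromIndex={from_index} from={from_field} to={to_field}}}{parent_query}"
    return f"{base_url}/select?{urlencode({'q': q, 'wt': 'json'})}"


# Assumed location of the child core; adjust to the real deployment.
url = build_join_url(
    "http://localhost:8983/solr/child", "parent", "id", "parent_id", "tag:hoge"
)

# Decode the q parameter again to confirm it survived URL encoding intact.
decoded_q = parse_qs(urlparse(url).query)["q"][0]
print(url)
print(decoded_q)
```

The request is issued against the child core because the join maps parent documents matching `tag:hoge` onto child documents through `parent_id`.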

Wildcard "?" ?

2015-10-21 Thread Bruno Mannina
Dear Solr-user, I'm surprised to see in my Solr 5.0 that the wildcard ? always replaces exactly 1 character. My request is: title:magnet? AND tire? Solr finds only titles with a character after magnet and tire, but doesn't find titles with only magnet AND tire. Do you know where I can tell Solr

Re: [newbie] Configuration for SolrCloud + DataImportHandler

2015-10-21 Thread Walter Underwood
Does the collection reload do a rolling reload of each node or does it do them all at once? We were planning on using the core reload on each system, one at a time. That would make sure the collection stays available. I read the documentation, it didn’t say anything about that. wunder Walter

Re: Wildcard "?" ?

2015-10-21 Thread Upayavira
No, you cannot tell Solr to handle wildcards differently. However, you can use regular expressions for searching: title:/magnet.?/ should do it. Upayavira On Wed, Oct 21, 2015, at 11:35 AM, Bruno Mannina wrote: > Dear Solr-user, > > I'm surprise to see in my SOLR 5.0 that the wildward ?
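The difference between the two patterns can be sketched outside Solr. The snippet below uses Python's `fnmatch` and `re` modules as stand-ins for Solr's wildcard and regular expression handling. That substitution is an assumption made for illustration, although for these simple patterns the semantics are the same: `?` requires exactly one character, while `.?` allows zero or one.

```python
import re
from fnmatch import fnmatchcase

# Sample terms chosen for illustration only.
terms = ["magnet", "magnets", "magnetic"]

# Wildcard: "?" stands for exactly one character.
wildcard_hits = [t for t in terms if fnmatchcase(t, "magnet?")]

# Regular expression: ".?" stands for zero or one character.
# fullmatch is used because the pattern must cover the whole term.
regex_hits = [t for t in terms if re.fullmatch(r"magnet.?", t)]

print(wildcard_hits)
print(regex_hits)
```

Here the wildcard selects only `magnets`, while the regular expression selects both `magnet` and `magnets`.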

Re: [newbie] Configuration for SolrCloud + DataImportHandler

2015-10-21 Thread Erick Erickson
Please be very careful using the core admin UI for anything related to SolrCloud. In fact, I try to avoid using it at all. The reason is that it is very low-level, and it is very easy to use it incorrectly. For instance, reloading a core in a multi-replica setup (doesn't matter whether it's

Re: Wildcard "?" ?

2015-10-21 Thread Bruno Mannina
title:/magnet.?/ doesn't work for me because Solr answers: |title = "Magnetic folding system"| but thanks for giving me the idea to use regexp! On 21/10/2015 18:46, Upayavira wrote: No, you cannot tell Solr to handle wildcards differently. However, you can use regular expressions for

Re: LIX readability index calculation by solr

2015-10-21 Thread Roland Szűcs
Hi Wunder, Yes, I can reload the documents; it takes 2-3 hours at most. I have never used the update request processor, but I will check it on the Solr Wiki. Thanks for your help. Cheers, Roland On Oct 21, 2015, at 17:25, Walter Underwood wrote: > Can you reload

[newbie] questions about 3.6.0 and 4.x or 5.x ?

2015-10-21 Thread Robert Hume
Hello, I'm hoping to get some quick advice from the Solr gurus out there ... I’ve inherited a project that uses a Solr 3.6.0 deployment. (Several masters and several slaves – I think there are 6 Solr instances in total.) I’ve been tasked with investigating if upgrading our 3.6.0 deployment

Re: [newbie] questions about 3.6.0 and 4.x or 5.x ?

2015-10-21 Thread Shawn Heisey
On 10/21/2015 12:41 PM, Robert Hume wrote: > I've inherited a project that uses a Solr 3.6.0 deployment. (Several > masters and several slaves – I think there are 6 Solr instances in total.) > > I've been tasked with investigating if upgrading our 3.6.0 deployment will > improve performance –

Re: [newbie] questions about 3.6.0 and 4.x or 5.x ?

2015-10-21 Thread Erick Erickson
To chime in, in certain cases the memory requirements for 4x (and 5x) are _much_ improved, see: https://lucidworks.com/blog/2012/04/06/memory-comparisons-between-solr-3x-and-trunk/ But as Shawn says, it's not a magic bullet. Solr 5 requires Java 7, so that's one thing to be aware of. Plus, you

Re: `cat /dev/null > solr-8983-console.log` frees host's memory

2015-10-21 Thread Rajani Maski
The details in this link [1] might be of help. [1] https://support.lucidworks.com/hc/en-us/articles/207072137 On Wed, Oct 21, 2015 at 7:42 AM, Emir Arnautovic < emir.arnauto...@sematext.com> wrote: > Hi Eric, > As Shawn explained, memory is freed because it was used to cache portion > of log

Re: Wildcard "?" ?

2015-10-21 Thread Upayavira
regexp will match the whole term. So, if you have stemming on, magnetic may well stem to magnet, and that is the term against which the regexp is executed. If you want to do the regexp against the whole field, then you need to do it against a string version of that field. The process of using a
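A toy model may make the point about terms concrete. In the sketch below, the mapping from original tokens to indexed terms is hypothetical (real stemmer output depends on the configured analysis chain), and Python's `re.fullmatch` stands in for the regular expression query. It shows why `magnet.?` can match a title containing "Magnetic": the pattern is tested against each indexed term, not against the original text.

```python
import re

# Hypothetical analysis result: original token -> indexed (stemmed) term.
token_to_term = {"Magnetic": "magnet", "folding": "fold", "system": "system"}


def matching_tokens(pattern, mapping):
    """Return the original tokens whose indexed term fully matches the pattern."""
    return [tok for tok, term in mapping.items() if re.fullmatch(pattern, term)]


# Matching against indexed terms: the stemmed term "magnet" matches.
term_level_hits = matching_tokens(r"magnet.?", token_to_term)

# Matching against the untokenized field value, as a string field would store it.
whole_field_match = re.fullmatch(r"magnet.?", "Magnetic folding system")

print(term_level_hits)
print(whole_field_match)
```

Against the whole field value the pattern does not match at all, so a regular expression aimed at a string copy of the field would have to be written to cover the entire value.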

Re: [newbie] Configuration for SolrCloud + DataImportHandler

2015-10-21 Thread Erick Erickson
Hmmm, not entirely sure. It's perfectly reasonable to use the core admin API, just be careful with it, especially for things like reload; it's pretty easy to end up with your cluster in an inconsistent state. Looks like the collections RELOAD command sends requests out to all replicas at once. Under the

Re: `cat /dev/null > solr-8983-console.log` frees host's memory

2015-10-21 Thread Eric Torti
Thank you Shawn, Timothy, Emir and Rajani. Sorry, Shawn, I ended up cropping out the legend but you were right on your guess. Indeed, Timothy, this log is completely redundant. Will get rid of it soon. I'll look into the resources you all pointed out. Thanks! Best, Eric Torti On Wed, Oct 21,

Re: [newbie] Configuration for SolrCloud + DataImportHandler

2015-10-21 Thread Hangu Choi
Mikhail, I didn't understand that that's what I need to do. Thank you. But at first, I am not doing well. I am testing changing the configuration in SolrCloud through this command: ./zkcli.sh -zkhost localhost:9983 -cmd putfile /synonyms.txt

Re: LIX readability index calculation by solr

2015-10-21 Thread Toke Eskildsen
Roland Szűcs wrote: > My use case is that I have to calculate the LIX readability index for my > documents. [...] > *B* = Number of periods (defined by period, colon or capital first letter) [...] > Does anybody have idea how to get the number of "periods"? As the

Re: LIX readability index calculation by solr

2015-10-21 Thread Roland Szűcs
Thank you, Toke, for your quick response. All your suggestions seem to be very good ideas. I also found the capital letters strange because of names and places, so I will skip this part, as I do not need an absolute measure, just a ranked order among my documents. Cheers, Roland On Oct 21, 2015,

Re: [newbie] Configuration for SolrCloud + DataImportHandler

2015-10-21 Thread Hangu Choi
Mikhail, I solved the problem: I had put the file at the wrong path. /synonyms.txt should be /configs/gettingstarted/synonyms.txt . Regards, Hangu On Wed, Oct 21, 2015 at 4:17 PM, Hangu Choi wrote: > Mikhail, > > I didn't understatnd that's what I need to do. thank you. > > but at

LIX readability index calculation by solr

2015-10-21 Thread Roland Szűcs
Hi all, My use case is that I have to calculate the LIX readability index for my documents. *LIX = A/B + (C x 100)/A*, where *A* = Number of words *B* = Number of periods (defined by period, colon or capital first letter) *C* = Number of long words (More than 6 letters) A can easily be done if
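As a rough sketch of the formula above, LIX can be computed outside Solr before indexing and stored in its own field. The tokenization here is an assumption: words are runs of letters, and "periods" are counted only as `.` and `:` characters, leaving out the capital-first-letter rule. The function name `lix` is illustrative and not part of any Solr API.

```python
import re


def lix(text: str) -> float:
    """Compute LIX = A/B + (C * 100)/A for a piece of text."""
    # A: number of words, taken here as runs of Unicode letters.
    words = re.findall(r"[^\W\d_]+", text)
    a = len(words)
    # B: number of periods, counted here as '.' and ':' characters only.
    b = len(re.findall(r"[.:]", text))
    # C: number of long words (more than 6 letters).
    c = sum(1 for w in words if len(w) > 6)
    if a == 0 or b == 0:
        raise ValueError("text needs at least one word and one period")
    return a / b + (c * 100) / a


sample = "The cat sat. Readability matters: obviously."
score = lix(sample)
print(score)
```

For the sample text, A = 6, B = 3 and C = 3, so the score is 6/3 + 300/6 = 52.0.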