Re: A bad idea to store core data directory over NAS?

2014-11-05 Thread Toke Eskildsen
On Tue, 2014-11-04 at 22:57 +0100, Gili Nachum wrote: My data center is out of SAN or local disk storage - is it a big no-no to store Solr core data folder over NAS? It depends on your NAS speed. Both Walter and David are right: It can perform really badly or quite satisfactorily. We briefly

add and then delete same document before commit,

2014-11-05 Thread Matteo Grolla
Can anyone tell me the behavior of Solr (and whether it's consistent) when I do the following: 1) add document x 2) delete document x 3) commit. I've tried with Solr 4.5.0 and document x gets indexed. Matteo

Re: Analytics result for each Result Group

2014-11-05 Thread Talat Uyarer
I searched the wiki pages about that but did not find any documentation. I would be glad if you could help me. Thanks 2014-11-04 11:34 GMT+02:00 Talat Uyarer ta...@uyarer.com: Hi folks, We use the Analytics Component for median, max etc. I wonder, if I use the group.field parameter with the analytics component, How

Best way to map holidays to corresponding date

2014-11-05 Thread Patrick Kirsch
Hey, maybe someone has already faced this situation and could give me a hint. Given a query that includes "Easter" or "Sylvester" (New Year's Eve), I am looking for the best place to translate the string to the corresponding date. Is there any solr.Mapping*Factory for that? Do I need to implement it in a custom Solr Query

Re: indexing errors when storeOffsetsWithPositions=true in solr 4.9.1

2014-11-05 Thread Alan Woodward
Hi Min, Do you have the specific bit of text that caused this exception to be thrown? Alan Woodward www.flax.co.uk On 4 Nov 2014, at 23:15, Min L wrote: Hi All: I am using Solr 4.9.1 and trying to use PostingsSolrHighlighter, but I got errors during indexing. I thought LUCENE-5111 has

Re: add and then delete same document before commit,

2014-11-05 Thread Alexandre Rafalovitch
Do you have soft commits enabled by any chance in solrconfig.xml? Regards, Alex On 05/11/2014 4:48 am, Matteo Grolla matteo.gro...@gmail.com wrote: Can anyone tell me the behavior of Solr (and whether it's consistent) when I do the following: 1) add document x 2) delete document x 3) commit

Re: Best way to map holidays to corresponding date

2014-11-05 Thread Jack Krupansky
Unfortunately, a date is a non-analyzed field, so you can't do something like a synonym. Further, holidays repeat every year and their dates can vary, so they won't match exactly. Use an update request processor to examine the date field values at index time and look up and store
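
A minimal sketch of the lookup step such a processor might perform, written in Python purely for illustration (an actual update request processor would be Java or a script run by Solr). The function names `easter_sunday` and `holidays_for` and the two-entry holiday list are hypothetical; Easter is computed with the anonymous Gregorian algorithm, and "Sylvester" is assumed to mean 31 December.

```python
from datetime import date
from typing import List


def easter_sunday(year: int) -> date:
    """Gregorian Easter Sunday via the anonymous Gregorian algorithm."""
    a = year % 19
    b, c = divmod(year, 100)
    d, e = divmod(b, 4)
    f = (b + 8) // 25
    g = (b - f + 1) // 3
    h = (19 * a + b - d - g + 15) % 30
    i, k = divmod(c, 4)
    l = (32 + 2 * e + 2 * i - h - k) % 7
    m = (a + 11 * h + 22 * l) // 451
    month, day0 = divmod(h + l - 7 * m + 114, 31)
    return date(year, month, day0 + 1)


def holidays_for(d: date) -> List[str]:
    """Return the (hypothetical) holiday names that fall on the given date."""
    names = []
    if d == easter_sunday(d.year):
        names.append("Easter")
    if (d.month, d.day) == (12, 31):  # assumption: Sylvester = New Year's Eve
        names.append("Sylvester")
    return names
```

The returned names could then be stored in an additional field at index time, so that a query for a holiday name matches documents dated on that day in any year.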

Re: add and then delete same document before commit,

2014-11-05 Thread Jack Krupansky
Document x doesn't exist - in terms of visibility - until the commit, so the delete will no-op since a query of Lucene will not see the uncommitted new document. -- Jack Krupansky -Original Message- From: Matteo Grolla Sent: Wednesday, November 5, 2014 4:47 AM To:

on regards to Solr and NoSQL storages integration

2014-11-05 Thread andrey prokopenko
Greetings Comrades. There have been numerous requests and considerations about using Solr as both a search engine and a NoSQL store at the same time. While it is an excellent search engine, Solr does not look so good when it comes to storing documents and various stored fields, especially with big

Re: Best practice to setup schemas for documents having different structures

2014-11-05 Thread Erick Erickson
It Depends (tm). You have a lot of options, and it all depends on your data and use-case. In general, there is very little cost involved when a doc does _not_ use a field you've defined in a schema. That is, if you have 100's of fields defined and only use 10, the other 90 don't take up space in

Re: on regards to Solr and NoSQL storages integration

2014-11-05 Thread Alexandre Rafalovitch
On 5 November 2014 08:52, andrey prokopenko andrey4...@gmail.com wrote: I assume there might be other developers trying to solve similar problems, so I'd be interested to hear about similar attempts and issues encountered while trying to implement such an integration between Solr and other

Re: A bad idea to store core data directory over NAS?

2014-11-05 Thread Walter Underwood
My experience was with Solr 1.2 and regular old NFS, so that was probably worst case. I was very surprised that it was that bad, though. So benchmark it before you assume it is fast enough. wunder Walter Underwood wun...@wunderwood.org http://observer.wunderwood.org/ On Nov 5, 2014, at 12:27

Re: any difference between using collection vs. shard in URL?

2014-11-05 Thread Shalin Shekhar Mangar
There's no difference between the two. Even if you send updates to a shard url, it will still be forwarded to the right shard leader according to the hash of the id (assuming you're using the default compositeId router). Of course, if you happen to hit the right shard leader then it is just an

EarlyTerminatingCollectorException

2014-11-05 Thread Dirk Högemann
Our production Solr slave cores (about 40 cores, each of moderate size, about 10K to 90K documents) produce many exceptions of type: 2014-11-05 15:06:06.247 [searcherExecutor-158-thread-1] ERROR org.apache.solr.search.SolrCache: Error during auto-warming of

Re: any difference between using collection vs. shard in URL?

2014-11-05 Thread Ian Rose
Awesome, thanks. That's what I was hoping. Cheers, Ian On Wed, Nov 5, 2014 at 10:33 AM, Shalin Shekhar Mangar shalinman...@gmail.com wrote: There's no difference between the two. Even if you send updates to a shard url, it will still be forwarded to the right shard leader according to the

Indexing nested document to support blockjoin queries in solr 4.10.1

2014-11-05 Thread henry cleland
Hello Guys, I'm a noob on this mailing list so bear with me. Could I kindly get some help on this very elaborate problem? http://stackoverflow.com/questions/26759366/solr-blockjoin-indexing-for-solr-4-10-1 Thanks

Re: Best practice to setup schemas for documents having different structures

2014-11-05 Thread Ryan Cooke
We define all fields as wildcard fields with a suffix indicating field type. Then we can use something like Java annotations to map pojo variables to field types to append the correct suffix. This allows us to use one very generic schema among all of our collections and we rarely need to update
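
The suffix approach described above can be sketched roughly as follows. This is Python rather than the Java annotations the post mentions, and the suffix table is an assumption modelled on the dynamic fields in Solr's example schema (`*_s`, `*_i`, `*_f`, `*_b`, `*_dt`); the function name `to_solr_doc` is hypothetical.

```python
from datetime import datetime

# Assumed suffix convention; a real schema must define matching dynamic fields.
SUFFIX_BY_TYPE = {
    bool: "_b",
    int: "_i",
    float: "_f",
    str: "_s",
    datetime: "_dt",
}


def to_solr_doc(record: dict) -> dict:
    """Rename each key by appending the suffix for its value's type."""
    doc = {}
    for name, value in record.items():
        if name == "id":  # the unique key keeps its name
            doc[name] = value
            continue
        # Exact type lookup, so True maps to "_b" rather than "_i".
        suffix = SUFFIX_BY_TYPE.get(type(value))
        if suffix is None:
            raise TypeError(f"no suffix configured for {type(value).__name__}")
        doc[name + suffix] = value
    return doc
```

For example, `to_solr_doc({"id": "1", "title": "x", "price": 9.5})` yields the keys `id`, `title_s` and `price_f`, so one generic schema can serve records of different shapes.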

create new core based on named config set using the admin page

2014-11-05 Thread Andreas Hubold
Hi, I'm trying to use named config sets with a standalone Solr server (4.10.1). But it seems there's no way to create a new core based on a named config set using the Solr admin page. Or did I miss something? Should I open a JIRA issue? Regards, Andreas

Re: A bad idea to store core data directory over NAS?

2014-11-05 Thread Charlie Hull
In our experience yes, it's a bad idea. Charlie On 5 November 2014 10:27, Walter Underwood wun...@wunderwood.org wrote: My experience was with Solr 1.2 and regular old NFS, so that was probably worst case. I was very surprised that it was that bad, though. So benchmark it before you assume

Re: create new core based on named config set using the admin page

2014-11-05 Thread Ramzi Alqrainy
Sorry, I did not get your point, can you please elaborate more

Re: Indexing nested document to support blockjoin queries in solr 4.10.1

2014-11-05 Thread Ramzi Alqrainy
You can model this in different ways, depending on your searching/faceting needs. Usually you'll use multivalued or dynamic fields. In the next examples I'll omit the field type, indexed and stored flags: <field name="name" type="text" indexed="true" stored="true" /> <field name="c_name" type="string"

Schemaless configuration using 4.10.2/API returning 404

2014-11-05 Thread nbosecker
Hi all, I'm working on updating legacy Solr to 4.10.2 to use schemaless configuration. As such, I have added this snippet to solrconfig.xml per the docs: <schemaFactory class="ManagedIndexSchemaFactory"> <bool name="mutable">true</bool> <str

Re: EarlyTerminatingCollectorException

2014-11-05 Thread Mikhail Khludnev
I wondered too, but it seems it warms up queryResultCache https://github.com/apache/lucene-solr/blob/20f9303f5e2378e2238a5381291414881ddb8172/solr/core/src/java/org/apache/solr/search/SolrIndexSearcher.java#L522 At least these ERRORs broke nothing, see

Re: add and then delete same document before commit,

2014-11-05 Thread Matteo Grolla
Perfectly clear, thanks a lot! On 05 Nov 2014, at 13:48, Jack Krupansky wrote: Document x doesn't exist - in terms of visibility - until the commit, so the delete will no-op since a query of Lucene will not see the uncommitted new document. -- Jack Krupansky

Re: A bad idea to store core data directory over NAS?

2014-11-05 Thread Gili Nachum
So NFS is doable, and performance will vary with the grade of storage I'm getting and the volume of other activity on the NAS. Good to know it's not associated with index corruption in Lucene (failures to sync to disk and such). Update: Turns out that someone did find 50TB over SAN lying around

Re: on regards to Solr and NoSQL storages integration

2014-11-05 Thread Jack Krupansky
Take a look at DataStax Enterprise, which is basically Cassandra with Solr tightly integrated as an embedded search engine. Write and update your data in Cassandra and it will automatically be indexed in Solr, all in one cluster, so no need to build and maintain a separate SolrCloud cluster

SolrCloud shard distribution with Collections API

2014-11-05 Thread CTO직속IsabellePhan
Hello, I am testing a small SolrCloud cluster on 2 servers. I started 2 nodes on each server, so that each collection can have 2 shards with a replication factor of 2. I am using the command below from the Collections API to create a collection: curl '

Problem getting ngroups with format simple in Solr Cloud

2014-11-05 Thread Judith Silverman
I am seeing the same problem. I suspect that the patch for SOLR-5634 does not address the sharded case. Cheers, Judith

Is there a way to stop some hyphenated terms from being tokenized

2014-11-05 Thread Tang, Rebecca
Hi there, For some hyphenated terms, I want them to stay as is instead of being tokenized. For example: e-cigarette, e-cig, I-pad. I don't want them to be split into e and cig or I and pad because the single letters e and I produce too many false positive matches. Is there a way to tell

WordDelimiterFilterFactory and PatternReplaceCharFilterFactory

2014-11-05 Thread Jae Joo
Hi, Once I apply PatternReplaceCharFilterFactory to the input string, the position of the token changes. Here is an example. <charFilter class="solr.PatternReplaceCharFilterFactory" pattern="(&lt;/?ce:italic[^]*)" replacement=""/> <filter class="solr.WordDelimiterFilterFactory"

Re: Is there a way to stop some hyphenated terms from being tokenized

2014-11-05 Thread Michael Della Bitta
Pretty sure what you need is called KeywordMarkerFilterFactory. <filter class="solr.KeywordMarkerFilterFactory" protected="protwords.txt" /> On 11/5/14 17:24, Tang, Rebecca wrote: Hi there, For some hyphenated terms, I want them to stay as is instead of being tokenized. For example:

Re: SolrCloud shard distribution with Collections API

2014-11-05 Thread Erick Erickson
They should be pretty well distributed by default, but if you want to take manual control, you can use the createNodeSet param on CREATE (with replication factor of 1) and then ADDREPLICA with the node param to put replicas for shards exactly where you want. Best, Erick On Wed, Nov 5, 2014 at
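
A rough sketch of the two Collections API calls described above, built as URLs in Python. The host, port, collection name and node names are placeholders, and the helper names `create_url` and `add_replica_url` are hypothetical; only the `CREATE` and `ADDREPLICA` actions and the `createNodeSet` and `node` parameters come from the reply.

```python
from urllib.parse import urlencode

BASE = "http://localhost:8983/solr/admin/collections"  # assumed host and port


def create_url(name, num_shards, nodes):
    """CREATE with replicationFactor=1, restricted to the listed nodes."""
    params = {
        "action": "CREATE",
        "name": name,
        "numShards": num_shards,
        "replicationFactor": 1,
        "createNodeSet": ",".join(nodes),
    }
    return BASE + "?" + urlencode(params)


def add_replica_url(collection, shard, node):
    """ADDREPLICA placing one more replica of a shard on a chosen node."""
    params = {
        "action": "ADDREPLICA",
        "collection": collection,
        "shard": shard,
        "node": node,
    }
    return BASE + "?" + urlencode(params)
```

With two servers, one might create the collection on one node per server and then add each shard's second replica on the other server, so that no shard has both copies on the same machine.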

solr security patch

2014-11-05 Thread kuttan palliyalil
Hi, I am trying to apply the security patch (SOLR-4470.patch) on the Solr 4.10.1 tag. SOLR-4470.patch 14/Mar/14 16:15 278 kB. I am getting a hunk failure error. Could anyone confirm whether this is the right patch for 4.10.1? Thank you so much. Regards, Raj

Re: Indexing nested document to support blockjoin queries in solr 4.10.1

2014-11-05 Thread henry cleland
Hi Ramzi, Thanks for the response. I should have pointed out that this is an overly simplified view of my scenario at hand. Denormalisation is not an option for me as advised because of the sheer volume, nature and spread/skewness of the relations/schema of my actual data scenario. Also

Fwd: the solr shard-replica policy

2014-11-05 Thread sunjt
Hi all, I have two machines and each of them has two Solr instances. The problem is: if I set numShards=2 and replicationFactor=2, how can I ensure that the shard leader and its replica are on different machines? Can Solr do this for me, or must I do it myself?

Solr Cloud Cross-Core Joins

2014-11-05 Thread Steve Davids
I have a use-case where I would like to capture click events for individual users so I can answer questions like "show me everything with x text that I have clicked before" and the inverse, "show me everything with x text that I have *not* clicked". I am currently doing this by sticking the event

Re: Solr Cloud Cross-Core Joins

2014-11-05 Thread Walter Underwood
I am curious why you are trying to do this with Solr. This is straightforward with other systems. I would use HBase for this. This could be really hard with Solr. wunder Walter Underwood wun...@wunderwood.org http://observer.wunderwood.org/ On Nov 5, 2014, at 5:08 PM, Steve Davids

Re: Is there a way to stop some hyphenated terms from being tokenized

2014-11-05 Thread Michael Sokolov
You didn't describe your analysis chain, but maybe you are using WordDelimiterFilter to break up hyphenated words? If so, it has a protwords.txt feature that lets you specify exceptions -Mike On 11/5/2014 5:36 PM, Michael Della Bitta wrote: Pretty sure what you need is called

Re: solr security patch

2014-11-05 Thread Shawn Heisey
On 11/5/2014 5:04 PM, kuttan palliyalil wrote: I am trying to apply the security patch (SOLR-4470.patch) on the Solr 4.10.1 tag. SOLR-4470.patch 14/Mar/14 16:15 278 kB. I am getting a hunk failure error. Could anyone confirm whether this is the right patch for 4.10.1? The latest patch is almost 8

Re: solr security patch

2014-11-05 Thread kuttan palliyalil
Got it. Thank you, Shawn. Regards, Raj On Wednesday, November 5, 2014 10:39 PM, Shawn Heisey apa...@elyograg.org wrote: On 11/5/2014 5:04 PM, kuttan palliyalil wrote: I am trying to apply the security patch (SOLR-4470.patch) on the Solr 4.10.1 tag. SOLR-4470.patch 14/Mar/14 16:15 278 kB

Re: SolrCloud shard distribution with Collections API

2014-11-05 Thread CTO직속IsabellePhan
Thanks for the advice, Erick. Would you know what the underlying logic doing the shard distribution is? Does it depend on the order in which each node joined the cluster, or does the Collections API logic actually check the node host IP to ensure even distribution? Best Regards, Isabelle On

Re: A bad idea to store core data directory over NAS?

2014-11-05 Thread Toke Eskildsen
On Wed, 2014-11-05 at 23:04 +0100, Gili Nachum wrote: Update: Turns out that someone did find 50TB over SAN lying around the data center for me to use, so I won't find out for myself what life with NFS/NAS is like in the near future. There seem to be issues especially with NFS that you need to

Re: SolrCloud shard distribution with Collections API

2014-11-05 Thread Lee Chunki
Hi Isabelle, If I understood your question correctly, you can check the shard distribution status on the admin page http://localhost:8983/solr/#/~cloud if you started Solr with a command like $ java -Djetty.port=7574 -DzkHost=localhost:9983 -jar start.jar (

What's the most efficient way to sort by number of terms matched?

2014-11-05 Thread Trey Grainger
Just curious if there are some suggestions here. The use case is fairly simple: Given a query like python OR solr OR hadoop, I want to sort results by number of keywords matched first, and by relevancy separately. I can think of ways to do this, but not efficiently. For example, I could do:

Re: Faceting return value of a function query?

2014-11-05 Thread Tom
Turns out that update processors suit my needs perfectly. I ended up using the StatelessScriptUpdateProcessor with a simple js script :-) On Mon Nov 03 2014 at 10:40:52 PM Yubing (Tom) Dong 董玉冰 tom.tung@gmail.com wrote: I see. Thank you! :-) Sent from my Android phone On Nov 3, 2014 9:35

Re: create new core based on named config set using the admin page

2014-11-05 Thread Andreas Hubold
Hi, Solr 4.8 introduced named config sets with https://issues.apache.org/jira/browse/SOLR-4478. You can create a new core based on a config set with the CoreAdmin API as described in https://cwiki.apache.org/confluence/display/solr/Config+Sets The Solr Admin page allows the creation of new
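
As a sketch of the CoreAdmin call mentioned above, the request can be built like this in Python. The base URL, core name and config set name are placeholders and the helper name `create_core_url` is hypothetical; the `configSet` parameter on the `CREATE` action is the one described on the Config Sets page linked in the message.

```python
from urllib.parse import urlencode


def create_core_url(core_name, config_set, base="http://localhost:8983/solr"):
    """Build a CoreAdmin CREATE request that refers to a named config set."""
    params = {
        "action": "CREATE",
        "name": core_name,
        "configSet": config_set,
    }
    return base + "/admin/cores?" + urlencode(params)
```

Opening the resulting URL (for instance with `urllib.request.urlopen`) against a running standalone Solr should create the core, assuming a config set of that name exists on the server.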