Re: solr 4.7.2 mergeFactor/ Merge policy issue
Hi All,

Here's more of an update on where I am with this. I enabled infoStream logging and quickly figured out that I need to get rid of maxBufferedDocs, so Erick, you were absolutely right on that. I increased my ramBufferSizeMB to 100 and reduced both maxMergeAtOnce and segmentsPerTier to 3. My config looks like this:

  <indexConfig>
    <useCompoundFile>false</useCompoundFile>
    <ramBufferSizeMB>100</ramBufferSizeMB>
    <!-- <maxMergeSizeForForcedMerge>9223372036854775807</maxMergeSizeForForcedMerge> -->
    <mergePolicy class="org.apache.lucene.index.TieredMergePolicy">
      <int name="maxMergeAtOnce">3</int>
      <int name="segmentsPerTier">3</int>
    </mergePolicy>
    <mergeScheduler class="org.apache.lucene.index.ConcurrentMergeScheduler"/>
    <infoStream file="/tmp/INFOSTREAM.txt">true</infoStream>
  </indexConfig>

I am attaching a sample infostream log file. In the infoStream logs you can see how the segments keep on adding up; it shows (just an example):

  allowedSegmentCount=10 vs count=9 (eligible count=9) tooBigCount=0

I looked at TieredMergePolicy.java to see how allowedSegmentCount is calculated:

  // Compute max allowed segs in the index
  long levelSize = minSegmentBytes;
  long bytesLeft = totIndexBytes;
  double allowedSegCount = 0;
  while (true) {
    final double segCountLevel = bytesLeft / (double) levelSize;
    if (segCountLevel < segsPerTier) {
      allowedSegCount += Math.ceil(segCountLevel);
      break;
    }
    allowedSegCount += segsPerTier;
    bytesLeft -= segsPerTier * levelSize;
    levelSize *= maxMergeAtOnce;
  }
  int allowedSegCountInt = (int) allowedSegCount;

and minSegmentBytes is calculated as follows:

  // Compute total index bytes & print details about the index
  long totIndexBytes = 0;
  long minSegmentBytes = Long.MAX_VALUE;
  for (SegmentInfoPerCommit info : infosSorted) {
    final long segBytes = size(info);
    if (verbose()) {
      String extra = merging.contains(info) ? " [merging]" : "";
      if (segBytes >= maxMergedSegmentBytes / 2.0) {
        extra += " [skip: too large]";
      } else if (segBytes < floorSegmentBytes) {
        extra += " [floored]";
      }
      message("  seg=" + writer.get().segString(info) + " size=" +
              String.format(Locale.ROOT, "%.3f", segBytes / 1024 / 1024.) + " MB" + extra);
    }
    minSegmentBytes = Math.min(segBytes, minSegmentBytes);
    // Accum total byte size
    totIndexBytes += segBytes;
  }

Any input is welcome.

Thanks,
Summer

On Mar 5, 2015, at 8:11 AM, Erick Erickson erickerick...@gmail.com wrote:

I would, BTW, either just get rid of maxBufferedDocs altogether or make it much higher, i.e. 10. I don't think this is really your problem, but you're creating a lot of segments here. But I'm kind of at a loss as to what would be different about your setup. Is there _any_ chance that you have some secondary process looking at your index that's maintaining open searchers? Any custom code that's perhaps failing to close searchers? Is this a Unix or Windows system?

And just to be really clear, you're _only_ seeing more segments being added, right? If you're only counting files in the index directory, it's _possible_ that merging is happening and you're just seeing new files take the place of old ones.

Best,
Erick

On Wed, Mar 4, 2015 at 7:12 PM, Shawn Heisey apa...@elyograg.org wrote:

On 3/4/2015 4:12 PM, Erick Erickson wrote:
I _think_, but don't know for sure, that the merging stuff doesn't get triggered until you commit, it doesn't just happen. Shot in the dark...

I believe that new segments are created when the indexing buffer (ramBufferSizeMB) fills up, even without commits. I'm pretty sure that any time a new segment is created, the merge policy is checked to see whether a merge is needed.

Thanks,
Shawn
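The budget computation quoted above can be simulated outside Lucene to see why the log reports allowedSegmentCount=10 with only 9 segments present: a single very small segment drags minSegmentBytes down, which multiplies the number of tiers and inflates the allowed count. The sketch below is a Python re-implementation of that loop; the segment sizes are made up for illustration, not taken from the attached infostream.

```python
import math

def allowed_seg_count(seg_sizes, segs_per_tier, max_merge_at_once):
    """Python mirror of the TieredMergePolicy budget loop quoted above.
    seg_sizes are segment byte sizes; returns the allowed segment count."""
    level_size = min(seg_sizes)   # minSegmentBytes
    bytes_left = sum(seg_sizes)   # totIndexBytes
    allowed = 0.0
    while True:
        seg_count_level = bytes_left / level_size
        if seg_count_level < segs_per_tier:
            allowed += math.ceil(seg_count_level)
            break
        allowed += segs_per_tier
        bytes_left -= segs_per_tier * level_size
        level_size *= max_merge_at_once
    return int(allowed)

# Hypothetical sizes in bytes (NOT from the attached log):
print(allowed_seg_count([1_000_000] * 9, 3, 3))             # 5: nine equal segments exceed the budget, a merge runs
print(allowed_seg_count([10_000] + [1_000_000] * 8, 3, 3))  # 17: one tiny segment raises the budget, no merge
```

With segmentsPerTier=3 and maxMergeAtOnce=3, nine equal 1 MB segments give a budget of 5, so merging kicks in; replace one of them with a 10 KB segment and the budget jumps to 17, so nine segments sit below the threshold and nothing merges.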
Re: solr cloud does not start with many collections
It would be a huge step forward if one could have several hundred Solr collections but only have a small portion of them opened/loaded at the same time. This is similar to Elasticsearch's close index API, documented here: http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/indices-open-close.html . I opened an issue a few months ago to implement the same in Solr: https://issues.apache.org/jira/browse/SOLR-6399

On Thu, Mar 5, 2015 at 4:42 PM, Damien Kamerman dami...@gmail.com wrote:

I've tried a few variations, with 3 x ZK, 6 x nodes, Solr 4.10.3, Solr 5.0, without any success and no real difference. There is a tipping point at around 3,000-4,000 cores (varies depending on hardware): below it I can restart the cloud OK within ~4min; above it the cloud stops working and I get continuous 'conflicting information about the leader of shard' warnings.

On 5 March 2015 at 14:15, Shawn Heisey apa...@elyograg.org wrote:

On 3/4/2015 5:37 PM, Damien Kamerman wrote:
I'm running on Solaris x86. I have plenty of memory and no real limits:

  # plimit 15560
  15560: /opt1/jdk/bin/java -d64 -server -Xss512k -Xms32G -Xmx32G -XX:MaxMetasp
     resource              current    maximum
     time(seconds)         unlimited  unlimited
     file(blocks)          unlimited  unlimited
     data(kbytes)          unlimited  unlimited
     stack(kbytes)         unlimited  unlimited
     coredump(blocks)      unlimited  unlimited
     nofiles(descriptors)  65536      65536
     vmemory(kbytes)       unlimited  unlimited

I've been testing with 3 nodes, and that seems OK up to around 3,000 cores total. I'm thinking of testing with more nodes.

I have opened an issue for the problems I encountered while recreating a config similar to yours, which I have been doing on Linux.

https://issues.apache.org/jira/browse/SOLR-7191

It's possible that the only thing the issue will lead to is improvements in the documentation, but I'm hopeful that there will be code improvements too.

Thanks,
Shawn

--
Damien Kamerman
Solr query to match document templates - sort of a reverse wildcard match
If I have a Solr document with a field value such as:

  a ? c ? e

I want a phrase query such as "a b c d e" to match that document. So:

  q:"a b c d e"  -- returns the doc with "a ? c ? e" as the field value for the q field.

Is this possible, or is there a way it can be done with a plug-in using the lower-level Lucene SDK? Maybe some custom implementation of TermQuery where a value of ? always matches any term in the query?

Thanks!
Robert Stewart
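As far as I know there is no built-in query type for this kind of reverse wildcard match, but the core comparison a custom query would have to perform is simple. This is a hypothetical client-side sketch of that logic; a real implementation would live inside a custom Lucene Query/Scorer, not in Python.

```python
def pattern_matches(stored_value: str, query_phrase: str, wildcard: str = "?") -> bool:
    """Return True if every term of the query phrase matches the stored
    pattern term-by-term, where the wildcard token matches any term.
    Purely illustrative of the matching rule, not a Lucene plug-in."""
    pattern = stored_value.split()
    terms = query_phrase.split()
    if len(pattern) != len(terms):
        return False
    return all(p == wildcard or p == t for p, t in zip(pattern, terms))

print(pattern_matches("a ? c ? e", "a b c d e"))  # True
print(pattern_matches("a ? c ? e", "a b x d e"))  # False
```

Inside Lucene this would amount to a positional check: the non-wildcard pattern terms must appear at their exact positions, and wildcard positions are unconstrained, which is close to what a SpanNearQuery with fixed slop per gap can express.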
Re: ExpandComponent not expanding
I did more testing following your question, and now it all makes sense. I think a clearer explanation in the documentation could help. I was using grouping, where a group is created even if only one element is present, so I had inferred that the expanded section would show ALL collapsed records, not all collapsed records besides the one returned as the group head. At this point ExpandComponent works well; sorry for the false alarm.

Regards.
Dario

On 6/03/2015 14:26, Joel Bernstein wrote:

The expand component only displays the group heads when it finds expanded documents in the group. And it only expands for the current page. Are you finding situations where there are group heads on the page that have child documents that are not being expanded?

Joel Bernstein
Search Engineer at Heliosearch

On Fri, Mar 6, 2015 at 7:17 AM, Dario Rigolin da...@comperio.it wrote:

I'm using Solr 4.10.1 and FieldCollapsing, but when adding expand=true and activating ExpandComponent, the expanded section in the result contains only one group head, not all group heads present in the result. I don't know if this is the intended behaviour. Using the query q=*:*, the expanded section increases the number of group heads, but not all 10 group heads are present. Also, removing the max= parameter on !collapse makes it display a couple more heads, but not all.

Regards

Example of a response with only one group head in the expanded section, although 10 are returned:
  <response>
    <lst name="responseHeader">
      <int name="status">0</int>
      <int name="QTime">20</int>
      <lst name="params">
        <str name="expand.rows">2</str>
        <str name="expand.sort">sdate asc</str>
        <str name="fl">id</str>
        <str name="q">title:(test search)</str>
        <str name="expand">true</str>
        <str name="fq">{!collapse field=group_key max=sdate}</str>
      </lst>
    </lst>
    <result name="response" numFound="120" start="0">
      <doc><str name="id">test:catalog:713515</str></doc>
      <doc><str name="id">test:catalog:126861</str></doc>
      <doc><str name="id">test:catalog:88797</str></doc>
      <doc><str name="id">test:catalog:91760</str></doc>
      <doc><str name="id">test:catalog:14095</str></doc>
      <doc><str name="id">test:catalog:60616</str></doc>
      <doc><str name="id">test:catalog:31539</str></doc>
      <doc><str name="id">test:catalog:29449</str></doc>
      <doc><str name="id">test:catalog:146638</str></doc>
      <doc><str name="id">test:catalog:137554</str></doc>
    </result>
    <lst name="expanded">
      <result name="collapse_value_2342" numFound="3" start="0">
        <doc><str name="id">test:catalog:21</str></doc>
        <doc><str name="id">test:catalog:330659</str></doc>
      </result>
    </lst>
  </response>
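The behaviour Joel describes, where expand only returns groups whose head is on the current page and which actually have other members, can be modelled in a few lines. This is a toy simulation of {!collapse ... max=...} plus expand=true; the tuple layout and tie-breaking are simplifying assumptions, not Solr's actual data structures.

```python
from collections import defaultdict

def collapse_and_expand(docs, page_size=10):
    """Toy model of {!collapse} + expand=true. docs is a list of
    (doc_id, group_key, sort_val); the head of each group is the doc
    with the highest sort_val (like max=field). Returns the page of
    heads plus the expanded members of groups on that page."""
    groups = defaultdict(list)
    for d in docs:
        groups[d[1]].append(d)
    heads = [max(g, key=lambda d: d[2]) for g in groups.values()]
    page = sorted(heads, key=lambda d: d[2], reverse=True)[:page_size]
    expanded = {}
    for head in page:
        rest = [d for d in groups[head[1]] if d is not head]
        if rest:  # single-member groups never appear under "expanded"
            expanded[head[1]] = rest
    return page, expanded

docs = [("d1", "g1", 5), ("d2", "g1", 3), ("d3", "g2", 4)]
page, expanded = collapse_and_expand(docs)
print([d[0] for d in page])  # ['d1', 'd3']
print(sorted(expanded))      # ['g1'] -- only g1 had collapsed members
```

This matches Dario's observation: group g2 has a head on the page but no other members, so it is absent from the expanded section even though grouping would have shown it as a one-element group.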
Re: Solrcloud Index corruption
bq: You say in our case some docs didn't made it to the node, but that's not really true: the docs can be found on the corrupted nodes when I search on ID. The docs are also complete. The problem is that the docs do not appear when I filter on certain fields

This _sounds_ like you somehow don't have indexed=true set for the field in question. But it also sounds like you're saying that search on that field works on some nodes but not on others; I'm assuming you're adding distrib=false to verify this. It shouldn't be possible to have different schema.xml files on the different nodes, but you might try checking through the admin UI.

Network burps shouldn't be related here. If the content is stored, then the info made it to Solr intact, so this issue shouldn't be related to that. Sounds like it may just be the bugs Mark is referencing; sorry, I don't have the JIRA numbers right off.

Best,
Erick

On Thu, Mar 5, 2015 at 4:46 PM, Shawn Heisey apa...@elyograg.org wrote:

On 3/5/2015 3:13 PM, Martin de Vries wrote:
I understand there is not a master in SolrCloud. In our case we use haproxy as a load balancer for every request. So when indexing, every document will be sent to a different Solr server, immediately after each other. Maybe SolrCloud is not able to handle that correctly?

SolrCloud can handle that correctly, but currently sending index updates to a core that is not the leader of the shard will incur a significant performance hit, compared to always sending updates to the correct core. A small performance penalty would be understandable, because the request must be redirected, but what actually happens is a much larger penalty than anyone expected. We have an issue in Jira to investigate that performance issue and make it work as efficiently as possible.

Indexing batches of documents is recommended, not sending one document per update request.

General performance problems with Solr itself can lead to extremely odd and unpredictable behavior from SolrCloud. Most often these kinds of performance problems are related in some way to memory, either the Java heap or available memory in the system.

http://wiki.apache.org/solr/SolrPerformanceProblems

Thanks,
Shawn
Re: Frequency of Suggestion are varying from original Frequency in index
Do you use SolrCloud? Maybe your suggester does not support distributed mode.

2015-03-04 22:39 GMT+08:00 Nitin Solanki nitinml...@gmail.com:

Hi. I have a term (who) whose original frequency is 191, but when I get a suggestion for who it gives me 90. Why?

Example: the *original frequency* comes back like:

  "spellcheck": {
    "suggestions": [
      "who", {
        "numFound": 1,
        "startOffset": 1,
        "endOffset": 4,
        "origFreq": 191
      },
      "correctlySpelled", false
    ]
  }

While in a *suggestion*, it gives:

  "spellcheck": {
    "suggestions": [
      "whs", {
        "numFound": 1,
        "startOffset": 1,
        "endOffset": 4,
        "origFreq": 0,
        "suggestion": [{
          "word": "who",
          "freq": 90
        }]
      },
      "correctlySpelled", false
    ]
  }

Why is that? I am using StandardTokenizerFactory with ShingleFilterFactory in schema.xml.
Re: Core admin: create new core
Try:

  bin/solr create -c inventory

On Mar 6, 2015, at 05:25, manju16832003 manju16832...@gmail.com wrote:

Solr 5 has been released. I was just giving it a try and came across the same issue. As I heard from some documentation, Solr 5 doesn't come with a default core (unlike the example in earlier versions), and this requires us to create a core from the Solr Admin. When I tried to create the core, I got the following error:

Error CREATEing SolrCore 'inventory': Unable to create core [inventory] Caused by: Can't find resource 'solrconfig.xml' in classpath or '/Users/manjunath.reddy/Programming/Solr/solr-5.0.0/server/solr/inventory/conf'

So I had to manually create the core based on my previous experience with Solr 4.10. I guess it's quite misleading for new users of Solr. I like that the older versions of Solr came with default cores, which made them easier to follow. I have attached screenshots for reference. Is there a workaround for this?

http://lucene.472066.n3.nabble.com/file/n4191378/solr-1.png
http://lucene.472066.n3.nabble.com/file/n4191378/solr-2.png

--
View this message in context: http://lucene.472066.n3.nabble.com/Core-admin-create-new-core-tp4099127p4191378.html
Sent from the Solr - User mailing list archive at Nabble.com.
Calling solr Page with search query
Hello, I'm looking for a solution for the following situation: I have a website consisting of two pages. One page is called "home" and one is called "search". On the search page I have embedded Solr via an iframe. On the home page there should be a search field. When the search field is submitted, it should open the search page with Solr showing the search query and results (it should look as if I had used the search field directly on the search page). I would be very thankful for any hints on connecting these parts!

Regards
Jochen
RE: Cores and and ranking (search quality)
Help me understand this better (regarding ranking). Suppose I have two docs that are 100% identical with the exception of uid (which is stored but not indexed). In a single-core setup, if I search xyz, those 2 docs end up ranking as #1 and #2. When I switch over to a two-core setup, doc-A goes to core-A (which has 10 records) and doc-B goes to core-B (which has 100,000 records).

Now, are you saying that in the two-core setup, if I search on xyz (just like in the single-core setup), this time I will not see doc-A and doc-B as #1 and #2 in the ranking? That is, are you saying doc-A may now be somewhere near the top / bottom, far away from doc-B? If so, which will be #1: doc-A off core-A (which has 10 records) or doc-B off core-B (which has 100,000 records)? If I got all this right, are you saying SOLR-1632 will fix this issue such that the end result will be as if I had 1 core?

- MJ

-----Original Message-----
From: Toke Eskildsen [mailto:t...@statsbiblioteket.dk]
Sent: Thursday, March 5, 2015 9:06 AM
To: solr-user@lucene.apache.org
Subject: Re: Cores and and ranking (search quality)

On Thu, 2015-03-05 at 14:34 +0100, johnmu...@aol.com wrote:
My question is this: if I put my data in multiple cores and use distributed search, will the ranking be different than if I had all my data in a single core?

Yes, it will be different. The practical impact depends on how homogeneous your data are across the shards and how large your shards are. If you have small and dissimilar shards, your ranking will suffer a lot. Work is being done to remedy this:
https://issues.apache.org/jira/browse/SOLR-1632

Also, will facet and more-like-this quality / results be the same?

It is not formally guaranteed, but for most practical purposes, faceting on multi-shards will give you the same results as single-shards. I don't know about more-like-this. My guess is that it will be affected in the same way that standard searches are.

Also, reading the distributed search wiki (http://wiki.apache.org/solr/DistributedSearch) it looks like Solr does the search and result merging (all I have to do is issue a search), is this correct?

Yes. From a user perspective, searches are no different.

- Toke Eskildsen, State and University Library, Denmark
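A concrete way to see why sharding changes ranking: classic tf-idf scoring uses the document frequency of the local shard, so an identical document gets a different score depending on which shard it lives in. The sketch below uses Lucene's classic idf formula with made-up counts matching MJ's 10-record vs 100,000-record scenario; the exact numbers are assumptions for illustration only.

```python
import math

def idf(num_docs, doc_freq):
    # Lucene's classic idf (DefaultSimilarity): 1 + ln(numDocs / (docFreq + 1))
    return 1.0 + math.log(num_docs / (doc_freq + 1))

# Hypothetical: term "xyz" appears in 2 docs of a 10-doc shard
# and in 50 docs of a 100,000-doc shard.
idf_small = idf(10, 2)        # ~2.20 on the tiny shard
idf_large = idf(100_000, 50)  # ~8.58 on the big shard
print(idf_small, idf_large)

# With a single core (or distributed IDF per SOLR-1632) the
# statistics are global, so both copies would score identically:
idf_global = idf(100_010, 52)
print(idf_global)
```

So without distributed IDF, the copy on the large shard scores much higher for the same term, which is exactly the per-shard skew Toke describes and SOLR-1632 aims to remove.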
Check the return of suggestions
Hello everyone. I'm working with Solr 4.3. I use the spellchecker component, which gives me suggestions as I expect. I will explain my problem with an example: I query /cartouchhe/ instead of /cartouche/. I obtain these suggestions:

  array (size=5)
    0 => array ('word' => 'cartouche',   'freq' => 1519)
    1 => array ('word' => 'touches',     'freq' => 55)
    2 => array ('word' => 'cartouches',  'freq' => 32)
    3 => array ('word' => 'caoutchoucs', 'freq' => 16)
    4 => array ('word' => 'cartonnees',  'freq' => 15)

This is what I want == OK. The problem is that when I query /cartouche/ or /cartouches/, I get exactly the same results, because for both queries the term that will be searched in my index is /cartouch/. Is there a way with Solr to fix this kind of problem, i.e. to check that two collations will not return exactly the same results?

Thanks for your answers,
Alex.

--
View this message in context: http://lucene.472066.n3.nabble.com/Check-the-return-of-suggestions-tp4191383.html
Sent from the Solr - User mailing list archive at Nabble.com.
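One pragmatic workaround (done client-side, since Solr does not compare collation result sets for you out of the box) is to deduplicate suggestions by their analyzed form, so two surface forms that stem to the same indexed term are not both offered. The sketch below uses a fake stand-in stemmer; in practice you would call your field's real analysis chain (e.g. via Solr's FieldAnalysisRequestHandler) to get the key.

```python
def dedupe_suggestions(suggestions, analyze):
    """Keep only the first suggestion per analyzed form. `analyze`
    stands in for the field's analysis chain; here it is a toy
    stemmer used purely for illustration, NOT Solr's analyzer."""
    seen = set()
    kept = []
    for word in suggestions:
        key = analyze(word)
        if key not in seen:
            seen.add(key)
            kept.append(word)
    return kept

fake_stem = lambda w: w.rstrip("s")  # toy French-ish plural stripper
print(dedupe_suggestions(["cartouche", "touches", "cartouches"], fake_stem))
# ['cartouche', 'touches'] -- 'cartouches' dropped, same stem as 'cartouche'
```

With the real analyzer as the key function, /cartouche/ and /cartouches/ collapse to one entry (/cartouch/), so only one of the two equivalent collations is shown to the user.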
Re: Frequency of Suggestion are varying from original Frequency in index
I think these two frequencies are not the frequency of the term in the same index:
- the original frequency represents the number of results you have in the Lucene index when you query who.
- the suggestion frequency is the number of results for this term in the spellcheck dictionary.

I guess you're using /solr.IndexBasedSpellChecker/!

--
View this message in context: http://lucene.472066.n3.nabble.com/Frequency-of-Suggestion-are-varying-from-original-Frequency-in-index-tp4190927p4191397.html
Sent from the Solr - User mailing list archive at Nabble.com.
Re: ExpandComponent not expanding
The expand component only displays the group heads when it finds expanded documents in the group. And it only expands for the current page. Are you finding situations where there are group heads on the page that have child documents that are not being expanded?

Joel Bernstein
Search Engineer at Heliosearch

On Fri, Mar 6, 2015 at 7:17 AM, Dario Rigolin da...@comperio.it wrote:

I'm using Solr 4.10.1 and FieldCollapsing, but when adding expand=true and activating ExpandComponent, the expanded section in the result contains only one group head, not all group heads present in the result. I don't know if this is the intended behaviour. Using the query q=*:*, the expanded section increases the number of group heads, but not all 10 group heads are present. Also, removing the max= parameter on !collapse makes it display a couple more heads, but not all.

Regards

Example of a response with only one group head in the expanded section, although 10 are returned:

  <response>
    <lst name="responseHeader">
      <int name="status">0</int>
      <int name="QTime">20</int>
      <lst name="params">
        <str name="expand.rows">2</str>
        <str name="expand.sort">sdate asc</str>
        <str name="fl">id</str>
        <str name="q">title:(test search)</str>
        <str name="expand">true</str>
        <str name="fq">{!collapse field=group_key max=sdate}</str>
      </lst>
    </lst>
    <result name="response" numFound="120" start="0">
      <doc><str name="id">test:catalog:713515</str></doc>
      <doc><str name="id">test:catalog:126861</str></doc>
      <doc><str name="id">test:catalog:88797</str></doc>
      <doc><str name="id">test:catalog:91760</str></doc>
      <doc><str name="id">test:catalog:14095</str></doc>
      <doc><str name="id">test:catalog:60616</str></doc>
      <doc><str name="id">test:catalog:31539</str></doc>
      <doc><str name="id">test:catalog:29449</str></doc>
      <doc><str name="id">test:catalog:146638</str></doc>
      <doc><str name="id">test:catalog:137554</str></doc>
    </result>
    <lst name="expanded">
      <result name="collapse_value_2342" numFound="3" start="0">
        <doc><str name="id">test:catalog:21</str></doc>
        <doc><str name="id">test:catalog:330659</str></doc>
      </result>
    </lst>
  </response>
Re: Core admin: create new core
On 3/6/2015 3:25 AM, manju16832003 wrote:
Solr 5 has been released. I was just giving it a try and came across the same issue. As I heard from some documentation, Solr 5 doesn't come with a default core (unlike the example in earlier versions), and this requires us to create a core from the Solr Admin. When I tried to create the core, I got the following error:

Error CREATEing SolrCore 'inventory': Unable to create core [inventory] Caused by: Can't find resource 'solrconfig.xml' in classpath or '/Users/manjunath.reddy/Programming/Solr/solr-5.0.0/server/solr/inventory/conf'

So I had to manually create the core based on my previous experience with Solr 4.10. I guess it's quite misleading for new users of Solr. I like that the older versions of Solr came with default cores, which made them easier to follow.

Unless you are in SolrCloud mode, creating cores via the admin UI (or the /admin/cores HTTP API) requires that the core directory and its conf subdirectory (with solrconfig.xml, schema.xml, and other potential files) already exist in the indicated location. There's a note right on the Add Core screen that says this: "instanceDir and dataDir need to exist before you can create the core." This was the case for 4.x as well as 5.0.

That note is slightly misleading ... the dataDir does not need to exist, just the instanceDir and the conf directory. Solr will create the dataDir and its contents, if the user running Solr has permission.

There is a configsets functionality, new in recent versions, which very likely will make it possible to create a core completely from scratch within the admin UI in non-cloud mode, but I do not know anything about using it, and I do not think the functionality is exposed in the admin UI yet.

Learning about cores and/or collections and how to create them is a hugely important part of using Solr. In 4.x, users did not need to do anything to get their first core, and that fact has led to many problems. New users don't know how to add a core, and many do not even know about cores at all. This means that they must learn about the core/collection concepts, and many of them cannot find any info about the procedure, so they ask for help. I am glad to help out both here and on the IRC channel, but it improves the experience of everyone involved if users become familiar with the concepts and methods on their own.

Thanks,
Shawn
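The pre-CREATE requirement Shawn describes can be scripted: lay down instanceDir/conf with the config files first, then call the CoreAdmin CREATE API. The sketch below only demonstrates the directory layout part with throwaway temp directories; the final HTTP call is shown as a comment, and all paths are illustrative.

```python
import os
import shutil
import tempfile

def prepare_instance_dir(solr_home, core_name, configset_dir):
    """Create <solr_home>/<core_name>/conf and copy solrconfig.xml,
    schema.xml etc. into it -- the layout CoreAdmin CREATE expects to
    already exist in non-cloud mode. Returns the conf directory path."""
    conf_dir = os.path.join(solr_home, core_name, "conf")
    os.makedirs(conf_dir, exist_ok=True)
    for fname in os.listdir(configset_dir):
        shutil.copy(os.path.join(configset_dir, fname), conf_dir)
    return conf_dir

# Demo with throwaway directories standing in for a real Solr home:
src = tempfile.mkdtemp()
open(os.path.join(src, "solrconfig.xml"), "w").close()
home = tempfile.mkdtemp()
conf = prepare_instance_dir(home, "inventory", src)
print(os.path.isfile(os.path.join(conf, "solrconfig.xml")))  # True
# Then: GET /solr/admin/cores?action=CREATE&name=inventory&instanceDir=inventory
```

Note that Solr creates the dataDir itself, so only instanceDir and conf need to be staged, matching Shawn's correction of the admin UI note.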
ExpandComponent not expanding
I'm using Solr 4.10.1 and FieldCollapsing, but when adding expand=true and activating ExpandComponent, the expanded section in the result contains only one group head, not all group heads present in the result. I don't know if this is the intended behaviour. Using the query q=*:*, the expanded section increases the number of group heads, but not all 10 group heads are present. Also, removing the max= parameter on !collapse makes it display a couple more heads, but not all.

Regards

Example of a response with only one group head in the expanded section, although 10 are returned:

  <response>
    <lst name="responseHeader">
      <int name="status">0</int>
      <int name="QTime">20</int>
      <lst name="params">
        <str name="expand.rows">2</str>
        <str name="expand.sort">sdate asc</str>
        <str name="fl">id</str>
        <str name="q">title:(test search)</str>
        <str name="expand">true</str>
        <str name="fq">{!collapse field=group_key max=sdate}</str>
      </lst>
    </lst>
    <result name="response" numFound="120" start="0">
      <doc><str name="id">test:catalog:713515</str></doc>
      <doc><str name="id">test:catalog:126861</str></doc>
      <doc><str name="id">test:catalog:88797</str></doc>
      <doc><str name="id">test:catalog:91760</str></doc>
      <doc><str name="id">test:catalog:14095</str></doc>
      <doc><str name="id">test:catalog:60616</str></doc>
      <doc><str name="id">test:catalog:31539</str></doc>
      <doc><str name="id">test:catalog:29449</str></doc>
      <doc><str name="id">test:catalog:146638</str></doc>
      <doc><str name="id">test:catalog:137554</str></doc>
    </result>
    <lst name="expanded">
      <result name="collapse_value_2342" numFound="3" start="0">
        <doc><str name="id">test:catalog:21</str></doc>
        <doc><str name="id">test:catalog:330659</str></doc>
      </result>
    </lst>
  </response>
Order of defining fields and dynamic fields in schema.xml
Hi,

I am running Solr 5 using basic_configs and have a question about the order of defining fields and dynamic fields in the schema.xml file.

For example, there is a field hierarchy.of.fields.Project I am capturing as text_en_splitting (below), but the rest of the fields in this hierarchy I would like as text_en. Since the dynamicField with * technically spans the Project field, should its definition go above or below the Project field?

  <field name="hierarchy.of.fields.Project" type="text_en_splitting" indexed="true" stored="true" multiValued="true" required="false"/>
  <dynamicField name="hierarchy.of.fields.*" type="text_en" indexed="true" stored="true" multiValued="true" required="false"/>

Or in this case, I have a hierarchy where currently only one field should be captured (another.hierarchy.of.fields.Description); the rest for now should just be ignored. Is there any significance to which definition comes first?

  <dynamicField name="another.hierarchy.of.*" type="text_en" indexed="false" stored="false" multiValued="true" required="false"/>
  <dynamicField name="another.hierarchy.of.fields.Description" type="text_en" indexed="true" stored="true" multiValued="true" required="false"/>

Thanks for any hints,
Tom
Re: Order of defining fields and dynamic fields in schema.xml
I don't believe the order in the file matters for anything apart from the initParams section. The longer, more specific pattern matches first.

Regards,
Alex.

Solr Analyzers, Tokenizers, Filters, URPs and even a newsletter: http://www.solr-start.com/

On 6 March 2015 at 11:21, Tom Devel deve...@gmail.com wrote:

Hi,

I am running Solr 5 using basic_configs and have a question about the order of defining fields and dynamic fields in the schema.xml file.

For example, there is a field hierarchy.of.fields.Project I am capturing as text_en_splitting (below), but the rest of the fields in this hierarchy I would like as text_en. Since the dynamicField with * technically spans the Project field, should its definition go above or below the Project field?

  <field name="hierarchy.of.fields.Project" type="text_en_splitting" indexed="true" stored="true" multiValued="true" required="false"/>
  <dynamicField name="hierarchy.of.fields.*" type="text_en" indexed="true" stored="true" multiValued="true" required="false"/>

Or in this case, I have a hierarchy where currently only one field should be captured (another.hierarchy.of.fields.Description); the rest for now should just be ignored. Is there any significance to which definition comes first?

  <dynamicField name="another.hierarchy.of.*" type="text_en" indexed="false" stored="false" multiValued="true" required="false"/>
  <dynamicField name="another.hierarchy.of.fields.Description" type="text_en" indexed="true" stored="true" multiValued="true" required="false"/>

Thanks for any hints,
Tom
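The "longest pattern wins, file order doesn't matter" rule Alexandre describes can be modelled like this. It is a simplified sketch of Solr's actual resolution (which has extra rules, e.g. a lone `*` ranks lowest), using the field names from Tom's question:

```python
def resolve_field_type(field_name, explicit_fields, dynamic_patterns):
    """Explicit <field> definitions always win; otherwise the longest
    matching dynamicField glob (prefix* or *suffix) is chosen,
    regardless of the order the patterns were declared in.
    Simplified model of schema.xml resolution."""
    if field_name in explicit_fields:
        return explicit_fields[field_name]
    matches = []
    for pattern, ftype in dynamic_patterns.items():
        if pattern.endswith("*") and field_name.startswith(pattern[:-1]):
            matches.append((len(pattern), ftype))
        elif pattern.startswith("*") and field_name.endswith(pattern[1:]):
            matches.append((len(pattern), ftype))
    return max(matches)[1] if matches else None

explicit = {"hierarchy.of.fields.Project": "text_en_splitting"}
dynamic = {"hierarchy.of.fields.*": "text_en", "another.hierarchy.of.*": "text_en_ignored"}
print(resolve_field_type("hierarchy.of.fields.Project", explicit, dynamic))  # text_en_splitting
print(resolve_field_type("hierarchy.of.fields.Summary", explicit, dynamic))  # text_en
```

So in Tom's first case the explicit Project field always takes precedence over the `hierarchy.of.fields.*` dynamicField no matter where in the file each appears; in the second case the longer `another.hierarchy.of.fields.Description` pattern beats the shorter `another.hierarchy.of.*`.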
Re: SolrCloud default shard assignment order not correct
On 3/6/2015 1:34 AM, Shawn Heisey wrote: In Solr 5.0, the cloud graph is sorting the collections by name. The shard names also appear to be sorted -- all the collections I have on the example cloud setup only have two shards, so I really can't be sure. It might also be sorting the replicas within each shard. I built a collection that would tell me what exactly is sorted in Solr 5.0. The collections are sorted and the shards are sorted, but the replicas are NOT sorted. Because there are normally only a few replicas and the leader is clearly marked, I don't see that as a problem, but if you really want them sorted, feel free to open an issue in Jira. Screenshot: https://www.dropbox.com/s/yzkubdbj86dbkda/solr5-cloud-graph-sorting.png?dl=0 SOLR project in Jira: https://issues.apache.org/jira/browse/SOLR Thanks, Shawn
Apache Solr Reference Guide 5.0
Greetings, I was looking at the PDF version of the Apache Solr Reference Guide 5.0 and noticed that it has no TOC nor any section numbering. http://apache.claz.org/lucene/solr/ref-guide/apache-solr-ref-guide-5.0.pdf The lack of a TOC and section headings makes navigation difficult. I have just started making suggestions on the documentation and was wondering if there is a reason why the TOC and section headings are missing? (that isn't apparent from the document) Thanks! Hope everyone is near a great weekend! Patrick
Re: Order of defining fields and dynamic fields in schema.xml
That's good to know. On http://wiki.apache.org/solr/SchemaXml it also states about dynamicFields that you can create field rules that Solr will use to understand what datatype should be used whenever it is given a field name that is not explicitly defined, but matches a prefix or suffix used in a dynamicField.

Thanks

On Fri, Mar 6, 2015 at 10:43 AM, Alexandre Rafalovitch arafa...@gmail.com wrote:

I don't believe the order in the file matters for anything apart from the initParams section. The longer, more specific pattern matches first.

Regards,
Alex.

Solr Analyzers, Tokenizers, Filters, URPs and even a newsletter: http://www.solr-start.com/

On 6 March 2015 at 11:21, Tom Devel deve...@gmail.com wrote:

Hi,

I am running Solr 5 using basic_configs and have a question about the order of defining fields and dynamic fields in the schema.xml file.

For example, there is a field hierarchy.of.fields.Project I am capturing as text_en_splitting (below), but the rest of the fields in this hierarchy I would like as text_en. Since the dynamicField with * technically spans the Project field, should its definition go above or below the Project field?

  <field name="hierarchy.of.fields.Project" type="text_en_splitting" indexed="true" stored="true" multiValued="true" required="false"/>
  <dynamicField name="hierarchy.of.fields.*" type="text_en" indexed="true" stored="true" multiValued="true" required="false"/>

Or in this case, I have a hierarchy where currently only one field should be captured (another.hierarchy.of.fields.Description); the rest for now should just be ignored. Is there any significance to which definition comes first?

  <dynamicField name="another.hierarchy.of.*" type="text_en" indexed="false" stored="false" multiValued="true" required="false"/>
  <dynamicField name="another.hierarchy.of.fields.Description" type="text_en" indexed="true" stored="true" multiValued="true" required="false"/>

Thanks for any hints,
Tom
Re: How to start solr in solr cloud mode using external zookeeper ?
The zkhost=<hostnames> and port=<some port> variables in your solr.xml should work. I have tested this with Tomcat, not with Jetty; this stays within your config.

Rajesh.

On Mar 5, 2015 9:20 PM, Aman Tandon amantandon...@gmail.com wrote:

Thanks shamik :)

With Regards
Aman Tandon

On Fri, Mar 6, 2015 at 3:30 AM, shamik sham...@gmail.com wrote:

The other way you can do it is to specify the startup parameters in solr.in.sh. Example:

  SOLR_MODE=solrcloud
  ZK_HOST=zoohost1:2181,zoohost2:2181,zoohost3:2181
  SOLR_PORT=4567

You can then simply start Solr by running ./solr start

--
View this message in context: http://lucene.472066.n3.nabble.com/How-to-start-solr-in-solr-cloud-mode-using-external-zookeeper-tp4190630p4191286.html
Sent from the Solr - User mailing list archive at Nabble.com.
Re: Core admin: create new core
Solr 5 has been released. I was just giving it a try and came across the same issue. As I heard from some documentation, Solr 5 doesn't come with a default core (unlike the example in earlier versions), and this requires us to create a core from the Solr Admin. When I tried to create the core, I got the following error:

Error CREATEing SolrCore 'inventory': Unable to create core [inventory] Caused by: Can't find resource 'solrconfig.xml' in classpath or '/Users/manjunath.reddy/Programming/Solr/solr-5.0.0/server/solr/inventory/conf'

So I had to manually create the core based on my previous experience with Solr 4.10. I guess it's quite misleading for new users of Solr. I like that the older versions of Solr came with default cores, which made them easier to follow. I have attached screenshots for reference. Is there a workaround for this?

http://lucene.472066.n3.nabble.com/file/n4191378/solr-1.png
http://lucene.472066.n3.nabble.com/file/n4191378/solr-2.png

--
View this message in context: http://lucene.472066.n3.nabble.com/Core-admin-create-new-core-tp4099127p4191378.html
Sent from the Solr - User mailing list archive at Nabble.com.
Re: Apache Solr Reference Guide 5.0
Shawn, Thanks! I was using Document Viewer and not Adobe Acrobat so was unclear. The TOC I meant was as in a traditional print publication with section #s, etc. Not a navigation TOC sans numbering as in Adobe. The Confluence documentation (I can't see the actual stylesheet in use, I don't think) here: https://confluence.atlassian.com/display/DOC/Customising+Exports+to+PDF Says: * Disabling the Table of Contents To prevent the table of contents from being generated in your PDF document, add the div.toc-macro rule to the PDF Stylesheet and set its display property to none: * Which is why I was asking if there was a reason for the TOC and section numbering not appearing. They can be defeated but that doesn't appear to be the default setting. This came up because a section said it would cover topics N - S and I could not determine if all those topics fell in that section or not. Thanks! Hope you are having a great day! Patrick On 03/06/2015 12:28 PM, Shawn Heisey wrote: On 3/6/2015 10:20 AM, Patrick Durusau wrote: I was looking at the PDF version of the Apache Solr Reference Guide 5.0 and noticed that it has no TOC nor any section numbering. http://apache.claz.org/lucene/solr/ref-guide/apache-solr-ref-guide-5.0.pdf The lack of a TOC and section headings makes navigation difficult. I have just started making suggestions on the documentation and was wondering if there is a reason why the TOC and section headings are missing? (that isn't apparent from the document) The TOC is built into the PDF and it's up to the PDF viewer to display it. Here's a screenshot of the ref guide in Adobe Reader with a clickable TOC open. https://www.dropbox.com/s/3ajuri1emj61imu/refguide-5.0-TOC.png?dl=0 Section numbering might be a good idea, if it's not too intrusive or difficult. Thanks, Shawn
RE: Delimited payloads input issue
Well, the only workaround we found to actually work properly is to override the problem-causing tokenizer implementations one by one. Regarding the WordDelimiterFilter, the quickest fix is enabling keepOriginal; if you don't want the original to stick around, the filter implementation must be modified to carry the original PayloadAttribute over to the tokens it emits. Markus

-Original message- From: Markus Jelsma markus.jel...@openindex.io Sent: Friday 27th February 2015 17:28 To: solr-user solr-user@lucene.apache.org Subject: Delimited payloads input issue

Hi - we attempt to use payloads to identify different parts of extracted HTML pages, and use the DelimitedPayloadTokenFilter to assign the correct payload to the tokens. However, we are having issues with some language analyzers and with some types of content for most regular analyzers. If, for example, we want to assign payloads to the text within an H1 field that contains non-alphanumerics such as `Hello, i am a heading!`, and use |5 as delimiter and payload, we send the following to Solr: `Hello,|5 i|5 am|5 a|5 heading!|5`. This is not going to work because, due to a WordDelimiterFilter, the tokens Hello and heading obviously lose their payload. We also cannot put the payload between the last alphanumeric and the following comma or exclamation mark, because then those characters would become part of the payload if we use the identity encoder, or it would fail if we use another encoder. We could solve this using a custom encoder that only takes the first character and ignores the rest, but this seems rather ugly. On the other hand, we have issues using language-specific tokenizers such as Kuromoji, which will immediately dump the delimited payload so it never reaches the DelimitedPayloadTokenFilter. And if we try Chinese and have the StandardTokenizer enabled, we also lose the delimited payload. Any of you have dealt with this before? Hints to share? Many thanks, Markus
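The keepOriginal route maps to the preserveOriginal flag on Solr's stock WordDelimiterFilterFactory. A sketch of a field type for the `Hello,|5 i|5 am|5 a|5 heading!|5` input, where the payload filter runs before the word delimiter so the preserved original token is the one that keeps its payload; the exact chain and flag values here are an assumption for illustration, not our production schema:

```xml
<fieldType name="text_payloads" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <!-- strip "|5" and attach it as payload while the token is still intact -->
    <filter class="solr.DelimitedPayloadTokenFilterFactory"
            delimiter="|" encoder="float"/>
    <!-- preserveOriginal keeps the unsplit token ("Hello,"), which retains
         its payload even though the split-out parts lose theirs -->
    <filter class="solr.WordDelimiterFilterFactory"
            preserveOriginal="1" generateWordParts="1"/>
  </analyzer>
</fieldType>
```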
Re: Apache Solr Reference Guide 5.0
On 3/6/2015 10:20 AM, Patrick Durusau wrote: I was looking at the PDF version of the Apache Solr Reference Guide 5.0 and noticed that it has no TOC nor any section numbering. http://apache.claz.org/lucene/solr/ref-guide/apache-solr-ref-guide-5.0.pdf The lack of a TOC and section headings makes navigation difficult. I have just started making suggestions on the documentation and was wondering if there is a reason why the TOC and section headings are missing? (that isn't apparent from the document) The TOC is built into the PDF and it's up to the PDF viewer to display it. Here's a screenshot of the ref guide in Adobe Reader with a clickable TOC open. https://www.dropbox.com/s/3ajuri1emj61imu/refguide-5.0-TOC.png?dl=0 Section numbering might be a good idea, if it's not too intrusive or difficult. Thanks, Shawn
How to direct SOLR 4.9 log output to regular Tomcat logs
I want Solr 4.9 to log to my rolling Tomcat logs like catalina.2015-03-06.log. Instead I'm just getting a solr.log with no timestamp. Maybe this is just the way it has to be now? I'm also not sure if I need to copy more Solr jars into my tomcat lib. This is my setup.

tomcat6/conf/log4j.properties:

log4j.rootLogger=debug, R
log4j.appender.R=org.apache.log4j.RollingFileAppender
log4j.appender.R.File=${catalina.home}/logs/tomcat.log
log4j.appender.R.MaxFileSize=10MB
log4j.appender.R.MaxBackupIndex=10
log4j.appender.R.layout=org.apache.log4j.PatternLayout
log4j.appender.R.layout.ConversionPattern=%p %t %c - %m%n
log4j.logger.org.apache.catalina=DEBUG, R
log4j.logger.org.apache.catalina.core.ContainerBase.[Catalina].[localhost]=DEBUG, R
log4j.logger.org.apache.catalina.core=DEBUG, R
log4j.logger.org.apache.catalina.session=DEBUG, R

tomcat6/conf/logging.properties:

handlers = 1catalina.org.apache.juli.FileHandler, 2localhost.org.apache.juli.FileHandler, 3manager.org.apache.juli.FileHandler, 4host-manager.org.apache.juli.FileHandler, java.util.logging.ConsoleHandler
.handlers = 1catalina.org.apache.juli.FileHandler, java.util.logging.ConsoleHandler
1catalina.org.apache.juli.FileHandler.level = FINE
1catalina.org.apache.juli.FileHandler.directory = /data/tomcatlogs
1catalina.org.apache.juli.FileHandler.prefix = catalina.
2localhost.org.apache.juli.FileHandler.level = FINE
2localhost.org.apache.juli.FileHandler.directory = /data/tomcatlogs
2localhost.org.apache.juli.FileHandler.prefix = localhost.
3manager.org.apache.juli.FileHandler.level = FINE
3manager.org.apache.juli.FileHandler.directory = /data/tomcatlogs
3manager.org.apache.juli.FileHandler.prefix = manager.
4host-manager.org.apache.juli.FileHandler.level = FINE
4host-manager.org.apache.juli.FileHandler.directory = /data/tomcatlogs
4host-manager.org.apache.juli.FileHandler.prefix = host-manager.
java.util.logging.ConsoleHandler.level = FINE
java.util.logging.ConsoleHandler.formatter = java.util.logging.SimpleFormatter
org.apache.catalina.core.ContainerBase.[Catalina].[localhost].level = INFO
org.apache.catalina.core.ContainerBase.[Catalina].[localhost].handlers = 2localhost.org.apache.juli.FileHandler
org.apache.catalina.core.ContainerBase.[Catalina].[localhost].[/manager].level = INFO
org.apache.catalina.core.ContainerBase.[Catalina].[localhost].[/manager].handlers = 3manager.org.apache.juli.FileHandler
org.apache.catalina.core.ContainerBase.[Catalina].[localhost].[/host-manager].level = INFO
org.apache.catalina.core.ContainerBase.[Catalina].[localhost].[/host-manager].handlers = 4host-manager.org.apache.juli.FileHandler

I copied solr-4.9.0/example/lib/ext/*.jar to tomcat6/lib (not the solrj-lib + dist jars as some tutorials suggested):

jcl-over-slf4j-1.7.6.jar
jul-to-slf4j-1.7.6.jar
log4j-1.2.17.jar
slf4j-api-1.7.6.jar
slf4j-log4j12-1.7.6.jar

I also copied solr-4.9.0/example/resources/log4j.properties to tomcat6/lib and pointed solr.log to my chosen directory. I also have a tomcat6/conf/log4j.properties and don't know if I should delete it:

# Logging level
solr.log=/data/tomcatlogs
log4j.rootLogger=INFO, file, CONSOLE
log4j.appender.CONSOLE=org.apache.log4j.ConsoleAppender
log4j.appender.CONSOLE.layout=org.apache.log4j.PatternLayout
log4j.appender.CONSOLE.layout.ConversionPattern=%-4r [%t] %-5p %c %x \u2013 %m%n
#- size rotation with log cleanup.
log4j.appender.file=org.apache.log4j.RollingFileAppender
log4j.appender.file.MaxFileSize=4MB
log4j.appender.file.MaxBackupIndex=9
#- File to log to and log format
log4j.appender.file.File=${solr.log}/solr.log
log4j.appender.file.layout=org.apache.log4j.PatternLayout
log4j.appender.file.layout.ConversionPattern=%-5p - %d{yyyy-MM-dd HH:mm:ss.SSS}; %C; %m\n
log4j.logger.org.apache.zookeeper=WARN
log4j.logger.org.apache.hadoop=WARN
# set to INFO to enable infostream log messages
log4j.logger.org.apache.solr.update.LoggingInfoStream=OFF

--
View this message in context: http://lucene.472066.n3.nabble.com/How-to-direct-SOLR-4-9-log-output-to-regular-Tomcat-logs-tp4191502.html
Sent from the Solr - User mailing list archive at Nabble.com.
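One note on why solr.log keeps appearing: the log4j.properties copied from example/resources points its rootLogger at the size-rotated "file" appender, and log4j cannot write into JULI's catalina.*.log files directly (those are owned by java.util.logging). What can be done with plain log4j 1.2 is date-based rolling into the same directory. A sketch of that, assuming the appender name "solrfile" and the paths shown; this replaces, not supplements, the shipped file appender:

```properties
# Hypothetical tomcat6/lib/log4j.properties sketch: route Solr's log4j output
# into a daily-rolled file alongside Tomcat's own logs.
log4j.rootLogger=INFO, solrfile

log4j.appender.solrfile=org.apache.log4j.DailyRollingFileAppender
# Rolls once per day: solr.log.2015-03-06 next to catalina.2015-03-06.log
log4j.appender.solrfile.File=/data/tomcatlogs/solr.log
log4j.appender.solrfile.DatePattern='.'yyyy-MM-dd
log4j.appender.solrfile.layout=org.apache.log4j.PatternLayout
log4j.appender.solrfile.layout.ConversionPattern=%-5p %d{yyyy-MM-dd HH:mm:ss.SSS} %c; %m%n
```

With two log4j.properties on the classpath (tomcat6/conf and tomcat6/lib), which one wins depends on classloader order, so keeping only one of them avoids surprises.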
PostBind method for DocumentObjectBinder?
Hello, DocumentObjectBinder would benefit from a post-bind call imo. Something like:

public <T> List<T> getBeans(Class<T> clazz, SolrDocumentList solrDocList, boolean postBind) {
  List<DocField> fields = getDocFields(clazz);
  List<T> result = new ArrayList<T>(solrDocList.size());
  for (SolrDocument sdoc : solrDocList) {
    T bean = getBean(clazz, fields, sdoc);
    if (postBind) {
      runAnnotatedMethod(bean, PostBind.class);
    }
    result.add(bean);
  }
  return result;
}

private void runAnnotatedMethod(final Object instance, Class<? extends Annotation> annotation) {
  for (Method m : instance.getClass().getDeclaredMethods()) {
    if (m.isAnnotationPresent(annotation)) {
      m.setAccessible(true);
      try {
        m.invoke(instance, new Object[] {});
      } catch (Exception e) {
        throw new BindingException("Could not run postbind " + instance.getClass(), e);
      }
    }
  }
}

It can probably take some thinking to keep the API pretty while staying backwards compatible, and the found annotated method should be cached. WDYT?
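A self-contained sketch of the idea using only JDK reflection; the @PostBind annotation and the Product bean are hypothetical stand-ins here, not part of solrj:

```java
import java.lang.annotation.Annotation;
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;

public class PostBindDemo {

    // Hypothetical marker annotation; solrj does not ship a @PostBind today.
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.METHOD)
    @interface PostBind {}

    // Stand-in for a @Field-annotated solrj bean.
    static class Product {
        String name = "widget";
        boolean initialized = false;

        @PostBind
        void afterBind() {
            // would run once the binder has populated all fields
            initialized = true;
        }
    }

    // Same reflection walk as the proposed runAnnotatedMethod()
    static void runAnnotatedMethod(Object instance, Class<? extends Annotation> annotation) throws Exception {
        for (Method m : instance.getClass().getDeclaredMethods()) {
            if (m.isAnnotationPresent(annotation)) {
                m.setAccessible(true);
                m.invoke(instance);
            }
        }
    }

    public static void main(String[] args) throws Exception {
        Product p = new Product();
        runAnnotatedMethod(p, PostBind.class);
        System.out.println(p.initialized);
    }
}
```

As written it re-scans getDeclaredMethods() on every bean, which is the part that would want caching (e.g. a Map from Class to Method) before going into getBeans().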
Re: SolrCloud default shard assignment order not correct
On 3/2/2015 2:12 PM, spillane wrote: Since the order is consistently 1,4,2,3 it sounds like I can start the leaders in 1,4,2,3 order and then replicas in 1,4,2,3 order and expect the relationships to stick leader1 - replica1 leader4 - replica4 leader2 - replica2 leader3 - replica3 In Solr 5.0, the cloud graph is sorting the collections by name. The shard names also appear to be sorted -- all the collections I have on the example cloud setup only have two shards, so I really can't be sure. It might also be sorting the replicas within each shard. I looked for an issue so I would know what version first included the sort, but I could not find one. I only know that 4.2 does not have the sort, and 5.0 does. Thanks, Shawn
Re: Labels for facets on Velocity
You can write a macro in your Velocity template (the one that renders the query response) that maps each raw facet value to a display label.

2015-03-06 1:14 GMT+08:00 Henrique O. Santos hensan...@gmail.com: Hello, I’ve been trying to have a pretty name for my facets on the Velocity Response Writer. Do you know how I can do that? For example, suppose that I am faceting on field1. My query returns 3 facets: uglyfacet1, uglyfacet2 and uglyfacet3. I want to show them to the user with a pretty name, like Pretty Facet 1, Pretty Facet 2 and Pretty Facet 3. The thing is that linking on Velocity should still work, so the user can navigate the results. Thank you. Henrique.
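A sketch of such a macro, with the label map hard-coded in the template; the facet names are the ones from the question, and the placement and surrounding loop are assumptions, not the stock facets.vm:

```velocity
## Hypothetical sketch: map raw facet values to display labels,
## falling back to the raw value for anything unmapped.
#macro(facetLabel $raw)
#set($labels = {"uglyfacet1": "Pretty Facet 1",
                "uglyfacet2": "Pretty Facet 2",
                "uglyfacet3": "Pretty Facet 3"})
#if($labels.containsKey($raw))$labels.get($raw)#else$raw#end
#end

## In the facet rendering loop, keep the raw value in the filter link so
## fq=field1:uglyfacet1 still matches, and prettify only the visible text:
## <a href="...&fq=field1:$facet.key">#facetLabel($facet.key)</a>
```

Because the href carries the raw value and only the anchor text goes through #facetLabel, clicking through still produces a filter query that matches the indexed terms.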