Re: SolrCloud result correctness compared with single core

2015-01-29 Thread Yandong Yao
say, an optimize. > > So called "bottom line" is that yes, the scoring may change, but > IMO not any more radically than was possible with single cores, > and I wouldn't worry about unless I had evidence that it was > biting me. > > Best > Erick > > On

SolrCloud result correctness compared with single core

2015-01-23 Thread Yandong Yao
Hi Guys, As the main scoring mechanism is based on tf/idf, will the same query run against SolrCloud return different results than running it against a single core with the same data set, since idf only counts df inside one core? E.g.: Assume I have 100GB of data: A) Index that data using a single core B)

Re: Index optimize takes more than 40 minutes for 18M documents

2013-02-21 Thread Yandong Yao
and have no updates in between. But even then it may be a > waste of time. > > You need lots of free disk space for merging, whether a forced merge or > automatic. Free space equal to the size of the index is usually enough, but > worst case can need double the size of the index.
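The free-space rule of thumb quoted in this excerpt (free space equal to the index size is usually enough, double the index size in the worst case) can be sketched as a pre-merge check. This is a minimal illustration, not part of the original thread: the function names and the three status labels are assumptions made for the example, and the index directory path is whatever your core actually uses.

```python
import os
import shutil


def dir_size_bytes(path):
    """Sum the sizes of all regular files under path (e.g. a core's data/index)."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            full = os.path.join(root, name)
            # Skip symlinks and anything that vanished mid-walk.
            if os.path.isfile(full) and not os.path.islink(full):
                total += os.path.getsize(full)
    return total


def merge_headroom(index_bytes, free_bytes):
    """Classify free space against the rule of thumb from the thread.

    Labels are hypothetical, chosen for this sketch:
      "ok"             -> free space covers the worst case (2x index size)
      "usually-enough" -> free space covers 1x index size
      "insufficient"   -> less free space than the index itself
    """
    if free_bytes >= 2 * index_bytes:
        return "ok"
    if free_bytes >= index_bytes:
        return "usually-enough"
    return "insufficient"


def check_index_dir(index_dir):
    """Measure index_dir and the free space on its filesystem, then classify."""
    free = shutil.disk_usage(index_dir).free
    return merge_headroom(dir_size_bytes(index_dir), free)
```

Calling `check_index_dir` on the index directory before an optimize gives a quick indication of whether the merge is likely to run out of disk; the thresholds only restate the guidance above and are not a guarantee.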

Re: How to run many MoreLikeThis request efficiently?

2013-01-09 Thread Yandong Yao
out. > > Otis > Solr & ElasticSearch Support > http://sematext.com/ > On Jan 9, 2013 6:07 PM, "Yandong Yao" wrote: > > > Any comments on this? Thanks very much in advance! > > > > 2013/1/9 Yandong Yao > > > > > Hi Solr Guru, > > >

Re: How to run many MoreLikeThis request efficiently?

2013-01-09 Thread Yandong Yao
Any comments on this? Thanks very much in advance! 2013/1/9 Yandong Yao > Hi Solr Guru, > > I have two set of documents in one SolrCore, each set has about 1M > documents with different document type, say 'type1' and 'type2'. > > Many documents in

How to run many MoreLikeThis request efficiently?

2013-01-08 Thread Yandong Yao
Hi Solr Guru, I have two sets of documents in one SolrCore, each set has about 1M documents with a different document type, say 'type1' and 'type2'. Many documents in the first set are very similar to 1 or 2 documents in the second set. What I want to get is: for each document in set 2, return the mo

Re: mergeindex: what happens if there is deletion during index merging

2012-08-21 Thread Yandong Yao
Hi Shalin, Thanks very much for your detailed explanation! Regards, Yandong 2012/8/21 Shalin Shekhar Mangar > On Tue, Aug 21, 2012 at 8:47 AM, Yandong Yao wrote: > > > Hi guys, > > > > From http://wiki.apache.org/solr/MergingSolrIndexes, it said 'Using > >

mergeindex: what happens if there is deletion during index merging

2012-08-20 Thread Yandong Yao
Hi guys, From http://wiki.apache.org/solr/MergingSolrIndexes, it said 'Using "srcCore", care is taken to ensure that the merged index is not corrupted even if writes are happening in parallel on the source index'. What does it mean? If there are deletion requests during merging, will this delet

Count is inconsistent between facet and stats

2012-07-18 Thread Yandong Yao
Hi Guys, Steps to reproduce: 1) Download apache-solr-4.0.0-ALPHA 2) cd example; java -jar start.jar 3) cd exampledocs; ./post.sh *.xml 4) Use statsComponent to get the stats info for field 'popularity' based on facet 'cat'. And the 'count' for 'electronics' is 3 http://localhost:8983/solr/coll

Re: SolrCloud: how to index documents into a specific core and how to search against that core?

2012-05-23 Thread Yandong Yao
rks against the automation in > solrcore, but maybe there's a good reason you want to do it this way. > > > > --- Original Message --- > > On 5/22/2012 07:35 AM Yandong Yao wrote:Hi Darren, > > > > Thanks very much for your reply. > > > > The reason I wa

Re: SolrCloud: how to index documents into a specific core and how to search against that core?

2012-05-22 Thread Yandong Yao
or you, therefore when you try to search a node/core > with no documents, all the results from the "cloud" are retrieved > regardless. This is considered "A Good Thing". > > It requires a change in thinking about indexing and searching > > On Tue, 2012-05

SolrCloud: how to index documents into a specific core and how to search against that core?

2012-05-21 Thread Yandong Yao
Hi Guys, I use the following commands to start SolrCloud according to the SolrCloud wiki. yydzero:example bjcoe$ java -Dbootstrap_confdir=./solr/conf -Dcollection.configName=myconf -DzkRun -DnumShards=2 -jar start.jar yydzero:example2 bjcoe$ java -Djetty.port=7574 -DzkHost=localhost:9983 -jar start.jar

Re: Faster Solr Indexing

2012-03-11 Thread Yandong Yao
I have similar issues using DIH, and org.apache.solr.update.DirectUpdateHandler2.addDoc(AddUpdateCommand) consumes most of the time when indexing 10K rows (each row is about 70K) - DIH nextRow takes about 10 seconds in total - If index uses whitespace tokenizer and lower case filter, th

How to use nested query in fq?

2012-02-07 Thread Yandong Yao
Hi Guys, I am using Solr 3.5, and would like to use a fq like 'getField(getDoc(uuid:workspace_${workspaceId})), "isPublic"):true? - workspace_${workspaceId}: workspaceId is indexed field. - getDoc(uuid:concat("workspace_", workspaceId): return the document whose uuid is "workspace_${workspaceI

Re: Need help for solr searching case insensative item

2010-10-26 Thread yandong yao
Sounds like WordDelimiterFilter config issue, please refer to http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.WordDelimiterFilterFactory . Also it will help if you could provide: 1) Tokenizers/Filters config in schema file 2) analysis.jsp output in admin page. 2010/10/26 wu liu

Re: A question on WordDelimiterFilterFactory

2010-09-14 Thread yandong yao
After upgrading to 1.4.1, it is fixed. Thanks very much for your help! Regards, Yandong Yao 2010/9/14 yandong yao > Hi Robert, > > I am using solr 1.4, will try with 1.4.1 tomorrow. > > Thanks very much! > > Regards, > Yandong Yao > > 2010/9/14 Robert Muir >

Re: A question on WordDelimiterFilterFactory

2010-09-14 Thread yandong yao
Hi Robert, I am using solr 1.4, will try with 1.4.1 tomorrow. Thanks very much! Regards, Yandong Yao 2010/9/14 Robert Muir > did you index with solr 1.4 (or are you using solr 1.4) ? > > at a quick glance, it looks like it might be this: > https://issues.apache.org/jira/brow

A question on WordDelimiterFilterFactory

2010-09-14 Thread yandong yao
Hi Guys, I encountered a problem when enabling WordDelimiterFilterFactory for both index and query (pasted the relevant part of schema.xml at the bottom of the email). *1. Steps to reproduce:* 1.1 The indexed sample document contains only one sentence: "This is a TechNote." 1.2 Query is: q=TechNo

Re: how to support "implicit trailing wildcards"

2010-08-11 Thread yandong yao
> you could satisfy this by making 2 fields: > > 1. exactmatch > > 2. wildcardmatch > > > > use copyfield in your schema to copy 1 --> 2 . > > > > q=exactmatch:mount+wildcardmatch:mount*&q.op=OR > > this would score exact matches above (solely) wildcar

Re: how to support "implicit trailing wildcards"

2010-08-09 Thread yandong yao
rationale is that if search 'mounted', I also want documents with 'mount' match. So seems built-in wildcard search could not satisfy my requirements if i understand correctly. Thanks very much! 2010/8/9 Bastian Spitzer > Wildcard-Search is already built in, just use:

how to support "implicit trailing wildcards"

2010-08-09 Thread yandong yao
Hi everyone, How to support 'implicit trailing wildcard *' using Solr, e.g.: using Google to search 'umoun', 'umount' will be matched; search 'mounta', 'mountain' will be matched. From my point of view, there are several ways, each with disadvantages: 1) Using EdgeNGramFilterFactory, thus 'umou