RE: Index database with SolrJ using xml file directly throws an error

2019-03-01 Thread Dyer, James
Instead of dataConfig=data-config.xml, use config=data-config.xml. From: sami Sent: Friday, March 1, 2019 3:05 AM To: solr-user@lucene.apache.org Subject: RE: Index database with SolrJ using xml file directly throws an error Hi James, Thanks for your reply. I am not absolutely sure I

RE: Index database with SolrJ using xml file directly throws an error

2019-02-28 Thread Dyer, James
The parameter "dataConfig" should hold an actual xml document to override the data-config.xml file you store in zookeeper (cloud) or the configuration directory (standalone). Typically you do not use this parameter. Instead, specify the "config" parameter with the filename (eg.
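The difference between the two parameters can be sketched as a request URL. This is a minimal illustration, not code from the thread: the host and core name are hypothetical, and the `/dataimport` handler path assumes the handler is registered under that name in solrconfig.xml.

```python
from urllib.parse import urlencode

# Hypothetical host and core name, used only for illustration.
SOLR_CORE_URL = "http://localhost:8983/solr/mycore"


def dih_full_import_url(config_file: str) -> str:
    """Build a DIH full-import URL that names the config file via "config"."""
    params = {
        "command": "full-import",
        # "config" takes a filename that Solr resolves on the server side.
        # "dataConfig" would instead carry a whole XML document as its value.
        "config": config_file,
        "clean": "true",
        "commit": "true",
    }
    return f"{SOLR_CORE_URL}/dataimport?{urlencode(params)}"


url = dih_full_import_url("data-config.xml")
print(url)
```

The same parameter names apply when the request is sent from SolrJ rather than over a hand-built URL.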

RE: [External] Setting Spellcheck for solr only for zero result

2018-09-26 Thread Dyer, James
Neel, I do not think there is a way to entirely bypass spellchecking if there are results returned, and I'm not so sure performance would noticeably improve if it did this. Clients can easily check to see if results were returned and can ignore the spellcheck response in these cases, if

RE: [External] [Solr 7.1.0] spellcheck.maxCollationTries > 0 no results

2018-08-09 Thread Dyer, James
It doesn't appear to me that the collator works with "spellcheck.q". Looking at the unit test (SpellCheckCollatorTest.java), this is not a use-case that is being tested. I opened https://issues.apache.org/jira/browse/SOLR-12650 to track this bug. As a workaround, you can remove

RE: Error configuring Spell Checker

2018-04-17 Thread Dyer, James
(moving to solr-user@lucene.apache.org) Gene, I can reproduce your problem if I misspell the "spellcheck.dictionary" parameter in my query. But I see your query has "direct" which matches the "name" element of one of your spellcheckers. I think the actual problem in your case might be that

RE: StringIndexOutOfBoundsException "in" SpellCheckCollator.getCollation

2017-01-17 Thread Dyer, James
This sounds a lot like SOLR-4489. However it looks like this was fixed prior to your version (4.5). So it could be you found another case where this bug still exists. The other thing is the default Query Converter cannot handle all cases, and it could be the query you are sending is beyond

RE: Can't get spelling suggestions to work properly

2017-01-17 Thread Dyer, James
Jimi, Generally speaking, spellcheck does not work well against fields with stemming, or other "heavy" analysis. I would copyField to a field that is tokenized on whitespace with little else, and use that field for spellcheck. By default, the spellchecker does not suggest for words in the index. So

RE: CachedSqlEntityProcessor with delta-import

2016-10-21 Thread Dyer, James
Sowmya, My memory is that the cache feature does not work with Delta Imports. In fact, I believe that nearly all DIH features except straight JDBC imports do not work with Delta Imports. My advice is to not use the Delta Import feature at all as the same result can (often more-efficiently)

RE: Solr 4.3.1 - Spell-Checker with MULTI-WORD PHRASE

2016-07-29 Thread Dyer, James
You need to set the "spellcheck.maxCollationTries" parameter to a value greater than zero. The higher the value, the more queries it checks for hits, and the longer it could potentially take. See

RE: using spell check on phrases

2016-06-10 Thread Dyer, James
Kaveh, If your query has "mm" set to zero or a low value, then you may want to override this when the spellchecker checks possible collations. For example: spellcheck.collateParam.mm=100% You may also want to consider adding "spellcheck.maxResultsForSuggest" to your query, so that it will
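As a rough sketch of the advice above, the request parameters might be assembled like this. The parameter names are real Solr spellcheck parameters; the query text and the numeric values are hypothetical placeholders.

```python
from urllib.parse import urlencode


def collation_params(user_query: str, max_results_for_suggest: int = 5) -> dict:
    """Request parameters that make collation test queries stricter than the main query."""
    return {
        "q": user_query,
        "defType": "edismax",
        "mm": "0",  # the main query stays lenient
        "spellcheck": "true",
        "spellcheck.collate": "true",
        "spellcheck.maxCollationTries": "10",  # illustrative value
        # Override mm only for the queries the spellchecker runs to test collations.
        "spellcheck.collateParam.mm": "100%",
        "spellcheck.maxResultsForSuggest": str(max_results_for_suggest),
    }


params = collation_params("some misspeled phrase")
print(urlencode(params))
```

Note that the percent sign is URL-encoded as `%25` when the parameters are sent over HTTP.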

RE: How get around solr's spellcheck maxEdit limit of 2?

2016-01-22 Thread Dyer, James
of that. It will help me.. Thanks On Fri, Jan 22, 2016 at 1:45 AM Dyer, James <james.d...@ingramcontent.com> wrote: > But if you really need more than 2 edits, I think IndexBasedSpellChecker > supports it. > > James Dyer > Ingram Content Group > > -Original Message-

RE: How get around solr's spellcheck maxEdit limit of 2?

2016-01-21 Thread Dyer, James
But if you really need more than 2 edits, I think IndexBasedSpellChecker supports it. James Dyer Ingram Content Group -Original Message- From: Erick Erickson [mailto:erickerick...@gmail.com] Sent: Thursday, January 21, 2016 11:29 AM To: solr-user Subject: Re: How get around solr's

RE: Spellcheck response format differs between a single core and SolrCloud

2016-01-11 Thread Dyer, James
Ryan, The json response format changed for Solr 5.0. See https://issues.apache.org/jira/browse/SOLR-3029 . Is the single-core solr running a 4.x version with the cloud solr running 5.x ? If they are both on the same major version, then we have a bug. James Dyer Ingram Content Group

RE: DIH Caching w/ BerkleyBackedCache

2015-12-16 Thread Dyer, James
Content Group -Original Message- From: Todd Long [mailto:lon...@gmail.com] Sent: Wednesday, December 16, 2015 10:21 AM To: solr-user@lucene.apache.org Subject: RE: DIH Caching w/ BerkleyBackedCache James, I apologize for the late response. Dyer, James-2 wrote > With the DIH requ

RE: Data Import Handler - Multivalued fields - splitBy

2015-12-04 Thread Dyer, James
Brian, Be sure to have... transformer="RegexTransformer" ...in your <entity/> tag. It’s the RegexTransformer class that looks for "splitBy". See https://wiki.apache.org/solr/DataImportHandler#RegexTransformer for more information. James Dyer Ingram Content Group -Original Message- From:

RE: Spellcheck error

2015-12-03 Thread Dyer, James
Matt, Can you give some information about how your spellcheck field is analyzed and also if you're using a custom query converter. Also, try and place the bare terms you want checked in spellcheck.q (ex, if your query is q=+movie +theatre, then spellcheck.q=movie theatre). Does it work in
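The "bare terms" idea from the example above (q=+movie +theatre becomes spellcheck.q=movie theatre) can be sketched with a naive helper. This is an assumption-laden illustration, not how Solr's query converter works: it only strips simple field prefixes, boolean operators and punctuation, and will not handle every query syntax.

```python
import re

BOOLEAN_OPERATORS = {"AND", "OR", "NOT"}


def bare_terms(query: str) -> str:
    """Reduce a query string to space-delimited raw terms for spellcheck.q."""
    # Drop simple field prefixes such as "title:".
    without_fields = re.sub(r"\b\w+:", "", query)
    # Keep word tokens only; this discards +, -, parentheses and quotes.
    tokens = re.findall(r"\w+", without_fields)
    return " ".join(t for t in tokens if t not in BOOLEAN_OPERATORS)


print(bare_terms("+movie +theatre"))        # movie theatre
print(bare_terms("text:(life AND hope)"))   # life hope
```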

RE: DIH Caching w/ BerkleyBackedCache

2015-11-20 Thread Dyer, James
Todd, With the DIH request, are you specifying "cacheDeletePriorData=false". Looking at the BerkleyBackedCache code if this is set to true, it deletes the cache and assumes the current update is to fully repopulate it. If you want to do an incremental update to the cache, it needs to be

RE: DIH Caching with Delta Import

2015-10-21 Thread Dyer, James
The DIH Cache feature does not work with delta import. Actually, much of DIH does not work with delta import. The workaround you describe is similar to the approach described here: https://wiki.apache.org/solr/DataImportHandlerDeltaQueryViaFullImport , which in my opinion is the best way to

RE: DIH parallel processing

2015-10-15 Thread Dyer, James
Nabil, What we do is have multiple dih request handlers configured in solrconfig.xml. Then in the sql query we put something like "where mod(id, ${partition})=0". Then an external script calls a full import on each request handler at the same time and monitors the response. This isn't the
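The partitioning idea can be illustrated with a small sketch. The SQL in the message is quoted only in part, so this assumes the common modulus scheme in which handler k imports the rows where mod(id, N) = k; that may differ from the exact expression used in the thread. All names here are hypothetical.

```python
def partition_of(doc_id: int, num_partitions: int) -> int:
    """Return the partition (DIH request handler index) that owns this id."""
    return doc_id % num_partitions


def where_clause(partition: int, num_partitions: int) -> str:
    """SQL filter a handler for the given partition could append to its query."""
    return f"WHERE mod(id, {num_partitions}) = {partition}"


num_partitions = 4
ids = range(1, 11)

# Every id lands in exactly one bucket, so the concurrent imports do not overlap.
buckets = {
    p: [i for i in ids if partition_of(i, num_partitions) == p]
    for p in range(num_partitions)
}

for p, members in buckets.items():
    print(where_clause(p, num_partitions), "->", members)
```

An external script would then trigger a full import on each handler and poll its status until all of them finish.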

RE: File-based Spelling

2015-10-13 Thread Dyer, James
Mark, The older spellcheck implementations create an n-gram sidecar index, which is why you're seeing your name split into 2-grams like this. See the IR Book by Manning et al, section 3.3.4 for more information. Based on the results you're getting, I think it is loading your file correctly.

RE: Spell Check and Privacy

2015-10-12 Thread Dyer, James
Arnon, Use "spellcheck.collate=true" with "spellcheck.maxCollationTries" set to a non-zero value. This will give you re-written queries that are guaranteed to return hits, given the original query and filters. If you are using an "mm" value other than 100%, you will also want to specify

RE: String index out of range exception from Spell check

2015-09-28 Thread Dyer, James
This looks similar to SOLR-4489, which is marked fixed for version 4.5. If you're using an older version, the fix is to upgrade. Also see SOLR-3608, which is similar but here it seems as if the user's query is more than spellcheck was designed to handle. This should still be looked at and

RE: Spellcheck / Suggestions : Append custom dictionary to SOLR default index

2015-08-25 Thread Dyer, James
Max, If you know the entire list of words you want to spellcheck against, you can use FileBasedSpellChecker. See http://wiki.apache.org/solr/FileBasedSpellChecker . If, however, you have a field you want to spellcheck against but also want additional words added, consider using a copy of the

RE: exclude folder in dataimport handler.

2015-08-20 Thread Dyer, James
I took a quick look at FileListEntityProcessor#init, and it looks like it applies the excludes regex to the filename element of the path only, and not to the directories. If your filenames do not have a naming convention that would let you use it this way, you might be able to write a

RE: Solr spell check not showing any suggestions for other language

2015-08-05 Thread Dyer, James
Talha, Possibly this english-specific analysis in your text_suggest field is interfering: solr.EnglishPossessiveFilterFactory ? Another guess is you're receiving more than 5 results and maxResultsForSuggest is set to 5. But I'm not sure. Maybe someone can help with more information from

RE: Solr spell check not showing any suggestions for other language

2015-08-05 Thread Dyer, James
Talha, Can you try putting your queried keyword in spellcheck.q ? James Dyer Ingram Content Group -Original Message- From: talha [mailto:talh...@gmail.com] Sent: Wednesday, August 05, 2015 10:13 AM To: solr-user@lucene.apache.org Subject: RE: Solr spell check not showing any

RE: Solr spell check mutliwords

2015-07-30 Thread Dyer, James
Talha, In your configuration, you have this set: <str name="spellcheck.maxResultsForSuggest">5</str> ...which means it will consider the query correctly spelled and offer no suggestions if there are 5 or more results. You could omit this parameter and it will always suggest when possible.

RE: Protwords in solr spellchecker

2015-07-10 Thread Dyer, James
Kamal, Given the constraint that you cannot re-index the data, your best bet might be to simply filter out the suggestions at the application level, or maybe even have a proxy do it. Possibly another option, you might be able to extend DirectSolrSpellchecker and override #getSuggestions(),

RE: Spell checking the synonym list?

2015-07-09 Thread Dyer, James
Ryan, If you use index-time synonyms on the spellcheck field, this will give you what you want. For instance, if the document has lawyer and you index both terms lawyer,attorney, then the spellchecker will see that atorney is 1 edit away from an indexed term and will suggest attorney.

RE: using DirectSpellChecker and FileBasedSpellChecker with Solr 4.10.1

2015-04-14 Thread Dyer, James
Elisabeth, Currently ConjunctionSolrSpellChecker only supports adding WordBreakSolrSpellchecker to IndexBased- FileBased- or DirectSolrSpellChecker. In the future, it would be great if it could handle other Spell Checker combinations. For instance, if you had a (e)dismax query that searches

RE: Solr phonetics with spelling

2015-03-10 Thread Dyer, James
Ashish, I would not recommend using spellcheck against a phonetic-analyzed field. Instead, you can use copyField to create a separate field that is lightly analyzed and use the copy for spelling. James Dyer Ingram Content Group -Original Message- From: Ashish Mukherjee

RE: spellcheck.count v/s spellcheck.alternativeTermCount

2015-02-18 Thread Dyer, James
up to 10 suggestions for hope, but only up to 5 suggestions for life. * On Wed, Feb 18, 2015 at 1:10 AM, Dyer, James james.d...@ingramcontent.com wrote: Here is an example to illustrate what I mean... - query q=text:(life AND hope)&spellcheck.count=10&spellcheck.alternativeTermCount=5 - suppose

RE: Why collations are coming even I set the value of spellcheck.count to zero(0)

2015-02-18 Thread Dyer, James
I think when you set count/alternativeTermCount to zero, the defaults (10?) are used instead. Instead of setting these to zero, just use spellcheck=false. These 2 parameters control suggestions, not collations. To turn off collations, set spellcheck.collate=false. Also, I wouldn't set

RE: spellcheck.count v/s spellcheck.alternativeTermCount

2015-02-17 Thread Dyer, James
See http://wiki.apache.org/solr/SpellCheckComponent#spellcheck.count and the following section, for details. Briefly, count is the # of suggestions it will return for terms that are *not* in your index/dictionary. alternativeTermCount is the # of alternatives you want returned for terms that

RE: spellcheck.count v/s spellcheck.alternativeTermCount

2015-02-17 Thread Dyer, James
: spellcheck.count v/s spellcheck.alternativeTermCount Hi James, How can you say that count doesn't use index/dictionary then from where suggestions come. On Tue, Feb 17, 2015 at 10:29 PM, Dyer, James james.d...@ingramcontent.com wrote: See http://wiki.apache.org/solr

RE: Collations are not working fine.

2015-02-13 Thread Dyer, James
Nitin, Can you post the full spellcheck response when you query: q=gram_ci:gone wthh thes wint&wt=json&indent=true&shards.qt=/spell James Dyer Ingram Content Group -Original Message- From: Nitin Solanki [mailto:nitinml...@gmail.com] Sent: Friday, February 13, 2015 1:05 AM To:

RE: Collations are not working fine.

2015-02-10 Thread Dyer, James
maxShingleSize="5" minShingleSize="2" outputUnigrams="true"/> </analyzer> </fieldType> On Tue, Feb 10, 2015 at 1:23 AM, Dyer, James james.d...@ingramcontent.com wrote: Nitin, My guess here is that your spellcheck field is a field that has stemming. This might be why you get a collation that returns wind

RE: alternativeTermCount and WordBreakSolrSpellChecker combination not working

2015-02-10 Thread Dyer, James
Okke, My first guess is that the additional results from the word break spellchecker are causing additional per-term results and the correct answer is not making the list. So you might need to increase spellcheck.count and/or spellcheck.alternativeTermCount . My second guess is that the

RE: alternativeTermCount and WordBreakSolrSpellChecker combination not working

2015-02-10 Thread Dyer, James
Okke, There is no way to have it correct both spelling and whitespace in the same correction. So unfortunately there is no easy fix for your use-case. The old shingle method of correcting whitespace might work for this, but it might also introduce other problems. I saw your comments on

RE: alternativeTermCount and WordBreakSolrSpellChecker combination not working

2015-02-10 Thread Dyer, James
I think the problem is when it combines suggestions from DirectSolrSpellChecker and WordBreakSolrSpellChecker, it gets two lists of possibilities in edit distance order. And when it combines these lists, all it does is interleave the 2 lists: 1 from the first list, then 1 from the 2nd list,
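The interleaving described above can be sketched as follows. This is an illustration of the merge pattern only, not the Solr source; how leftover entries from the longer list are handled is an assumption, since the message is cut off.

```python
from itertools import chain, zip_longest

_MISSING = object()  # sentinel marking the end of the shorter list


def interleave(first, second):
    """Alternate items from two ranked lists: one from first, one from second, and so on."""
    merged = chain.from_iterable(zip_longest(first, second, fillvalue=_MISSING))
    return [item for item in merged if item is not _MISSING]


direct = ["a1", "a2", "a3"]   # hypothetical suggestions from one checker
word_break = ["b1"]           # hypothetical suggestions from the other

print(interleave(direct, word_break))  # ['a1', 'b1', 'a2', 'a3']
```

Because the merge ignores the relative quality of the two lists, a weak entry from one list can push a better entry from the other below the spellcheck.count cutoff.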

RE: alternativeTermCount and WordBreakSolrSpellChecker combination not working

2015-02-10 Thread Dyer, James
Got it. Took a quick look at the code and I see it uses the maximum frequency of the terms. And in your case, one of these terms (holy and wood), occurs 71,000 times. It wouldn't be too difficult to change this to use the average frequency of the terms or the minimum. But currently the only

RE: alternativeTermCount and WordBreakSolrSpellChecker combination not working

2015-02-10 Thread Dyer, James
I opened LUCENE-6237 for this. I can't promise when I or someone else will actually complete this, but it wouldn't be very difficult to do either. Seeing your use-case, I think this would be a nice little improvement. James Dyer Ingram Content Group -Original Message- From: O. Klein

RE: Collations are not working fine.

2015-02-09 Thread Dyer, James
Nitin, My guess here is that your spellcheck field is a field that has stemming. This might be why you get a collation that returns wind even though the user queried wnd and it does not get any suggestions. Perhaps wnd is stemmed the same as wind ? (Spellcheck usually works best if you

RE: Solr 4.9 Calling DIH concurrently

2015-02-04 Thread Dyer, James
Yes, that is what I mean. In my case, for each /dataimport in the defaults section, I also put something like this: <str name="currentPartition">1</str> ...and then reference it in the data-config.xml with ${dataimporter.request.currentPartition} . This way the same data-config.xml can be used

RE: Solr 4.9 Calling DIH concurrently

2015-02-03 Thread Dyer, James
DIH is single-threaded. There was once a threaded option, but it was buggy and subsequently was removed. What I do is partition my data and run multiple dih request handlers at the same time. It means redundant sections in solrconfig.xml and it's not very elegant but it works. For

RE: Suggesting broken words with solr.WordBreakSolrSpellChecker

2015-02-02 Thread Dyer, James
suggesting I phone for users who searched iphone. Minbreaklength of 1 is just too small isn't it? Il sabato 31 gennaio 2015, Dyer, James-2 [via Lucene] ml-node+s472066n4183176...@n3.nabble.com ha scritto: You need to decrease this to at least 2 because the length of go is 3. int name=minBreakLength3

RE: Suggesting broken words with solr.WordBreakSolrSpellChecker

2015-01-30 Thread Dyer, James
You need to decrease this to at least 2 because the length of go is < 3. <int name="minBreakLength">3</int> James Dyer Ingram Content Group -Original Message- From: fabio.bozzo [mailto:f.bo...@3-w.it] Sent: Wednesday, January 28, 2015 4:55 PM To: solr-user@lucene.apache.org Subject: RE:

RE: Suggesting broken words with solr.WordBreakSolrSpellChecker

2015-01-28 Thread Dyer, James
Try using something larger than 2 for alternativeTermCount. 5 is probably ok here. If that doesn't work, then post the exact query you are using and the full extended spellcheck results. James Dyer Ingram Content Group -Original Message- From: fabio.bozzo [mailto:f.bo...@3-w.it]

RE: Suggesting broken words with solr.WordBreakSolrSpellChecker

2015-01-27 Thread Dyer, James
and 150 documents containing gopro. Suggestions of the other term do not come up in any case. 2015-01-27 16:21 GMT+01:00 Dyer, James-2 [via Lucene] ml-node+s472066n4182254...@n3.nabble.com: I think the word break spellchecker will do what you want. But, if I were you, I'd dial back maxChanges

RE: SpellingQueryConverter and query parsing

2015-01-27 Thread Dyer, James
Having worked with the spellchecking code for the last few years, I've often wondered the same thing, but I never looked seriously into it. I'm sure there's probably some serious hurdles, hence the Query Converter. The easy thing to do here is to use spellcheck.q, and then pass in

RE: Suggesting broken words with solr.WordBreakSolrSpellChecker

2015-01-27 Thread Dyer, James
I think the word break spellchecker will do what you want. But, if I were you, I'd dial back maxChanges to 1 or 2. You don't want it slicing a word into 10 parts or trying to combine 10 adjacent words. You also need the minBreakLength to be no more than 2, if you want it to break go

RE: Stop word suggestions are coming when I indexed sentence using ShingleFilterFactory

2015-01-27 Thread Dyer, James
Can you give a little more information as to how you have the spellchecker configured in solrconfig.xml? Also, it would help if you showed a query and the spell check response and then explain what you wanted it to return vs what it actually returned. My guess is that the stop words you

RE: can't make sense of spellchecker results when using techproducts example

2015-01-09 Thread Dyer, James
Chris, - DirectSpellChecker has a setting for minPrefix which the techproducts example sets to 1 (also the default). So it will never try to correct the first character. I think this is both a performance optimization and is based on the assumption that we rarely misspell the first

RE: Spellchecker delivers far too few suggestions

2014-12-18 Thread Dyer, James
Martin, If you would like to get suggestions even for terms occurring in the index, set spellcheck.alternativeTermCount to a value > 0 . You can use the same value as for spellcheck.count, or a lower value if you want fewer results than for terms not in the index. See

RE: Multiword mispellings

2014-12-18 Thread Dyer, James
Matt, Unfortunately this kind of correction is not supported. The word break spell checker works independently from the distance-based spellcheckers so it cannot correct both whitespace problems and other misspellings together. If you really need this, then you'll need to go with the

RE: WordBreakSolrSpellChecker Usage

2014-12-16 Thread Dyer, James
://gist.github.com/halogenandtoast/76fd5dcfae1c4edeba30 On Thu, Dec 11, 2014 at 1:19 PM, Dyer, James james.d...@ingramcontent.com wrote: Matt, There is no exact number here, but I would think most people would want count to be maybe 10-20. Increasing this incurs a very small performance penalty for each

RE: WordBreakSolrSpellChecker Usage

2014-12-11 Thread Dyer, James
My first guess here, seeing that it works some of the time but not others, is that these values are too low: <str name="spellcheck.maxCollationTries">5</str> <str name="spellcheck.count">5</str> You know spellcheck.count is too low if the suggestion you want is not in the suggestions part of the response,

RE: WordBreakSolrSpellChecker Usage

2014-12-11 Thread Dyer, James
Subject: Re: WordBreakSolrSpellChecker Usage Is there a suggested value for this. I bumped them up to 20 and still nothing has seemed to change. On Thu, Dec 11, 2014 at 9:42 AM, Dyer, James james.d...@ingramcontent.com wrote: My first guess here, is seeing it works some of the time

RE: Word Break Spell Checker Implementation algorithm

2014-10-21 Thread Dyer, James
David, I do not know of a published algorithm for this. All it does is in the case of terms with 0 frequency, it checks the document frequency of the various parts that can be made from the terms by breaking them and/or by combining adjacent terms. There are tuning parameters available that
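The word-breaking half of that description can be sketched in a few lines. This is a simplified illustration, not the Lucene WordBreakSpellChecker implementation: it only tries two-part splits, the document frequencies are hypothetical, and combining adjacent terms is not shown.

```python
def break_candidates(term, doc_freq, min_break_length=1):
    """Suggest two-part splits of a zero-frequency term where both parts occur in the index."""
    if doc_freq.get(term, 0) > 0:
        return []  # the term exists in the index, so no break is attempted here
    candidates = []
    # Each part must be at least min_break_length characters long.
    for i in range(min_break_length, len(term) - min_break_length + 1):
        left, right = term[:i], term[i:]
        if doc_freq.get(left, 0) > 0 and doc_freq.get(right, 0) > 0:
            candidates.append((left, right))
    return candidates


# Hypothetical document frequencies for illustration.
doc_freq = {"go": 120, "pro": 80}

print(break_candidates("gopro", doc_freq, min_break_length=2))  # [('go', 'pro')]
print(break_candidates("gopro", doc_freq, min_break_length=3))  # []
```

The second call shows why a minimum break length of 3 prevents a split that needs the two-letter part "go".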

RE: Data Import Handler for CSV file

2014-10-10 Thread Dyer, James
Nabil, Unfortunately, the out-of-the box functionality for DIH lacks a lot of what the csv handler has to offer. There is a LineEntityProcessor (see http://wiki.apache.org/solr/DataImportHandler#LineEntityProcessor), but this will just output each line in a field called rawLine. It is up to

RE: DIH - cacheImpl=SortedMapBackedCache - empty rows from sub entity

2014-10-02 Thread Dyer, James
Try using the cacheKey/cacheLookup parameters instead: <entity name="en1" pk="id" transformer="DateFormatTransformer" query="SELECT id, product FROM table WHERE product = 'abc'"> <entity name="en2" cacheKey="id" cacheLookup="en1.id"

RE: Spellchecking and suggesting part numbers

2014-09-24 Thread Dyer, James
Alexander, You could use a higher value for spellcheck.count, maybe 20 or so, then in your application pick out the suggestions that make changes on the right side. Another option is to use DirectSolrSpellChecker (usually a better choice anyhow) and set the minPrefix field. This will require

RE: fuzzy terms, DirectSolrSpellChecker and alternativeTermCount

2014-09-22 Thread Dyer, James
Nathaniel, Can you show us all of the parameters you are sending to the spellchecker? When you specify alternativeTermCount with spellcheck.q=quidam, what are the terms you expect to get back? Also, are you getting any query results back? If you are using a q that returns results, or more

RE: fuzzy terms, DirectSolrSpellChecker and alternativeTermCount

2014-09-22 Thread Dyer, James
="defaults"> <str name="spellcheck.dictionary">fuzzy1</str> <str name="spellcheck.count">20</str> <int name="spellcheck.alternativeTermCount">100</int> </lst> <arr name="last-components"> <str>fuzzyterms</str> </arr> </requestHandler> Thanks! Nathaniel On Mon, Sep 22, 2014 at 4:08 , Dyer

RE: fuzzy terms, DirectSolrSpellChecker and alternativeTermCount

2014-09-22 Thread Dyer, James
for small indexes when in fact I need a high value like 0.99, so every term returns suggestions. (Is it possible to set it to 100%? Because 1 gets interpreted as an absolute value.) Nathaniel On Mon, Sep 22, 2014 at 6:17 , Dyer, James james.d...@ingramcontent.com wrote: DirectSpellChecker

RE: Solr Spellcheck suggestions only return from /select handler when returning search results

2014-09-11 Thread Dyer, James
. I get suggestions when I use Transpor and Transpo, even Transpotr, but ransport doesn't yield any suggestions. Maybe it's a question of the beginning of a word and has not really anything to do with stemming. Am 10.09.2014 15:19 schrieb Dyer, James: Thomas, It looks like you've set things up

RE: Solr Spellcheck suggestions only return from /select handler when returning search results

2014-09-10 Thread Dyer, James
Thomas, It looks like you've set things up correctly in that while the user is searching against a stemmed field (name), spellcheck is checking against a lightly-analyzed copy of it (spell). This is the right way to do it as spellcheck against stemmed forms is usually undesirable. But as

RE: Solr spellcheck returns more than 1 word for a 1 word spellcheck

2014-09-02 Thread Dyer, James
This is the WordBreakSolrSpellChecker, which is there to correct spelling errors involving misplaced whitespace (or is it white space ??) To disable it, remove this or similar line from your requestHandler in solrconfig.xml: <str name="spellcheck.dictionary">wordbreak</str> Keep in mind, if you

RE: Spellchecking suggestions won't collate

2014-08-20 Thread Dyer, James
Because my is the 7th suggestion down the list, it is going to need more than 30 tries to figure out the one that can give some hits. You can increase maxCollationTries if you're willing to endure the performance penalty of trying so many replacement queries. This case actually highlights why

RE: Spell check collation

2014-08-14 Thread Dyer, James
DirectSolrSpellChecker defaults to a minimum term length of 4. So you'd need to bring this down with <int name="minQueryLength">1</int>. But you might not like the results from this. See:

RE: When I use minimum match and maxCollationTries parameters together in edismax, Solr gets stuck

2014-08-12 Thread Dyer, James
it gets stuck sometimes I have to restart the server, sometimes I'm able to edit the solrconfig.xml and reload it. Harun Reşit Zafer TÜBİTAK BİLGEM BTE Bulut Bilişim ve Büyük Veri Analiz Sistemleri Bölümü T +90 262 675 3268 W http://www.hrzafer.com On 11.08.2014 17:32, Dyer, James wrote

RE: SqlEntityProcessor

2014-08-11 Thread Dyer, James
I've heard of a user adding a separate entity / section to the end of their data-config.xml with a SqlEntityProcessor and an UPDATE statement. It would run after your main entity / section. I have not tried it myself, and surely DIH was not designed to do this, but it might work. A better

RE: When I use minimum match and maxCollationTries parameters together in edismax, Solr gets stuck

2014-08-11 Thread Dyer, James
Harun, Just to clarify, is this happening during startup when a warmup query is running, or is this once the server is fully started? This might be another instance of https://issues.apache.org/jira/browse/SOLR-5386 . James Dyer Ingram Content Group (615) 213-4311 -Original Message-

RE: Data Import handler and join select

2014-08-07 Thread Dyer, James
Alejandro, You can use a sub-entity with a cache using DIH. This will solve the n+1-select problem and make it run quickly. Unfortunately, the only built-in cache implementation is in-memory so it doesn't scale. There is a fast, disk-backed cache using bdb-je, which I use in production.

RE: Change order of spell checker suggestions issue

2014-08-07 Thread Dyer, James
Corey, Looking more carefully at your responses than I did last time I answered this question, it looks like every correction is 2 edits in this example. unie -> unity (e->t, insert y) unie -> unger (i->g, insert r) unie -> unick (e->c, insert k) unie -> united (delete t , insert d) unie -> unique (delete

RE: Debug DirectSolrSpellChecker Suggestion Sort Order

2014-08-01 Thread Dyer, James
Query results default to score. But spelling suggestions sort by edit distance, with frequency as a secondary sort. unie = unger = 2 edits unie = unick = 2 edits unie = united = 3 edits unie = unique = 3 edits ... etc ... James Dyer Ingram Content Group (615) 213-4311 -Original
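The ordering rule stated above (edit distance first, frequency as the tie-breaker) can be sketched as a sort key. The words, distances and frequencies below are hypothetical placeholders, not values from the thread, and higher frequency is assumed to rank first among ties.

```python
from typing import NamedTuple


class Suggestion(NamedTuple):
    word: str
    edits: int       # edit distance from the queried term
    frequency: int   # how often the suggested term occurs in the index


def rank(suggestions):
    """Order by fewest edits, then by highest frequency."""
    return sorted(suggestions, key=lambda s: (s.edits, -s.frequency))


candidates = [
    Suggestion("alpha", edits=2, frequency=10),
    Suggestion("beta", edits=1, frequency=3),
    Suggestion("gamma", edits=2, frequency=50),
]

print([s.word for s in rank(candidates)])  # ['beta', 'gamma', 'alpha']
```

A rare term that is one edit away therefore outranks a very common term that is two edits away.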

RE: Searching words with spaces for word without spaces in solr

2014-07-31 Thread Dyer, James
@lucene.apache.org Subject: Re: Searching words with spaces for word without spaces in solr I am not clear with this. This link is related to spell check. Can you elaborate it more ? On Wed, Jul 30, 2014 at 9:17 PM, Dyer, James james.d...@ingramcontent.com wrote: In addition to the analyzer

RE: questions on Solr WordBreakSolrSpellChecker and WordDelimiterFilterFactory

2014-07-16 Thread Dyer, James
Jia, I agree that for the spellcheckers to work, you need <arr name="last-components"> instead of <arr name="components">. But the x-box = xbox example ought to be solved by analyzing using WordDelimiterFilterFactory and catenateWords=1 at query-time. Did you re-index after changing your analysis

RE: Endeca to Solr Migration

2014-07-02 Thread Dyer, James
We migrated a big application from Endeca (6.0, I think) several years ago. We were not using any of the business UI tools, but we found that Solr is a lot more flexible and performant than Endeca. But with more flexibility comes more you need to know. The hardest thing was to migrate the

RE: Spell checker - limit on number of misspelt words in a search term.

2014-06-23 Thread Dyer, James
I do not believe there is such a setting. Most likely you will need to increase the value for maxCollationTries to get it to discover the correct combination. Just be sure not to set this too high as queries with a lot of misspelled words (or for something your index simply doesn't have) will

RE: Solr spellcheck - onlyMorePopular threshold?

2014-06-09 Thread Dyer, James
I believe it will return the terms that are most similar to the queried terms but have a greater term frequency than the queried terms. It doesn't actually care what the term frequencies are, only that they are greater than the frequencies of the terms you queried on. I do not know your use

RE: DirectSpellChecker not returning expected suggestions.

2014-06-02 Thread Dyer, James
If wrangle is not in your index, and if it is within the max # of edits, then it should suggest it. Are you getting anything back from spellcheck at all? What is the exact query you are using? How is the spellcheck field analyzed? If you're using stemming, then wrangle and wrangler might be

RE: Wordbreak spellchecker excessive breaking.

2014-05-30 Thread Dyer, James
instance would not even start, can you let me know why ? Thanks. On Tue, May 27, 2014 at 12:21 PM, Dyer, James james.d...@ingramcontent.com wrote: You can do this if you set it up like in the main Solr example: <lst name="spellchecker"> <str name="name">wordbreak</str> <str name

RE: Wordbreak spellchecker excessive breaking.

2014-05-27 Thread Dyer, James
You can do this if you set it up like in the main Solr example: <lst name="spellchecker"> <str name="name">wordbreak</str> <str name="classname">solr.WordBreakSolrSpellChecker</str> <str name="field">name</str> <str name="combineWords">true</str> <str name="breakWords">true</str>

RE: solr 4.2.1 spellcheck strange results

2014-05-16 Thread Dyer, James
To achieve what you want, you need to specify a lightly analyzed field (no stemming) for spellcheck. For instance, if your solr.SpellCheckComponent in solrconfig.xml is set up with field of title_full, then try using title_full_unstemmed. Also, if you are specifying a queryAnalyzerFieldType,

RE: Spell check [or] Did you mean this with Phrase suggestion

2014-05-16 Thread Dyer, James
Have you looked at spellcheck.collate, which re-writes the entire query with one or more corrected words? See http://wiki.apache.org/solr/SpellCheckComponent#spellcheck.collate . There are several options shown at this link that controls how the collate feature works. James Dyer Ingram

RE: spellcheck if docsfound below threshold

2014-05-16 Thread Dyer, James
It's spellcheck.maxResultsForSuggest. http://wiki.apache.org/solr/SpellCheckComponent#spellcheck.maxResultsForSuggest James Dyer Ingram Content Group (615) 213-4311 -Original Message- From: Jan Verweij - Reeleez [mailto:j...@reeleez.nl] Sent: Monday, May 12, 2014 2:12 AM To:

RE: spellcheck.q and local parameters

2014-04-28 Thread Dyer, James
spellcheck.q is supposed to take a list of raw query terms, so what you're trying to do in your example won't work. What you should do instead is space-delimit the actual query terms that exist in qq (and nothing else) and use that for your value of spellcheck.q . James Dyer Ingram Content
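A minimal sketch of that advice, assuming the client already holds the user's raw terms as a list. The helper name `build_spellcheck_params`, the `dismax` parser, and the `title` field are hypothetical placeholders; only `qq` and `spellcheck.q` come from the thread.

```python
from urllib.parse import urlencode


def build_spellcheck_params(raw_terms):
    """Build request parameters where spellcheck.q holds only raw terms.

    The main query may use local parameters, but spellcheck.q gets the
    space-delimited terms and nothing else.
    """
    terms = " ".join(t.strip() for t in raw_terms if t.strip())
    return {
        # Hypothetical main query: local params that dereference qq.
        "q": "{!dismax qf=title v=$qq}",
        "qq": terms,
        "spellcheck": "true",
        # Raw terms only: no local params, no field names, no operators.
        "spellcheck.q": terms,
    }


params = build_spellcheck_params(["wrangle", "jeans"])
query_string = urlencode(params)
```

The resulting `query_string` can be appended to whichever request handler has the spellcheck component attached.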

RE: Volatile spellcheck index

2014-02-05 Thread Dyer, James
Alejandro, Assuming you're using Solr 3.x, under: <searchComponent name="spellcheck" class="solr.SpellCheckComponent"> <lst name="spellchecker"> ... </lst> </searchComponent> ...you can add: <str name="spellcheckIndexDir">./spellchecker</str> ...then the spell check index will be created on-disk and not in

RE: How to override rollback behavior in DIH

2014-01-17 Thread Dyer, James
Peter, I think you can override org.apache.solr.handler.dataimport.SolrWriter to have a custom (no-op) rollback method. Your new writer should implement org.apache.solr.handler.dataimport.DIHWriter. You can specify the writerImpl request parameter to specify the new class. Unfortunately, it

RE: Spellchecking problem

2013-12-20 Thread Dyer, James
If you are using spellcheck.maxCollationTries with a value greater than 0, the *collation* section of your spellcheck response will give query corrections that are proven to produce hits. Possibly you were looking at the first section where it gives individual word suggestions? Or maybe one of

RE: Spellchecking problem

2013-12-20 Thread Dyer, James
="spellcheck.collate">true</str> <float name="thresholdTokenFrequency">.001</float> </lst> </searchComponent> Thanks 2013/12/20 Dyer, James james.d...@ingramcontent.com If you are using spellcheck.maxCollationTries with a value greater than 0, the *collation* section of your spellcheck response will give

RE: DataImport Handler, writing a new EntityProcessor

2013-12-18 Thread Dyer, James
The first thing I would suggest is to try and run it not in debug mode. DIH's debug mode limits the number of documents it will take in, so that might be all that is wrong here. James Dyer Ingram Content Group (615) 213-4311 -Original Message- From: mathias@gmail.com

RE: SOLR DIH - Sub Entity with different datasource not working

2013-12-13 Thread Dyer, James
Without more of the stacktrace I don't think you'll get much help. However, it's my experience that exceptions that begin with "Unable to execute query" mean the db didn't like something about one or both queries. I think it would have listed in there somewhere the actual query it didn't like,

RE: Data Import Handler

2013-11-13 Thread Dyer, James
=${dataimporter.request.url} and all where to mention these my purpose is to config my DB Details(url,uname,password) in properties file -Original Message- From: Dyer, James [mailto:james.d...@ingramcontent.com] Sent: Wednesday, November 06, 2013 7:42 PM To: solr-user@lucene.apache.org Subject

RE: [Spellcheck] NullPointerException on QueryComponent.mergeIds

2013-11-12 Thread Dyer, James
Jean-Marc, This might not solve the particular problem you're having, but to get spellcheck to work properly in a distributed environment, be sure to set the shards.qt parameter to the name of your request handler. See http://wiki.apache.org/solr/SpellCheckComponent#Distributed_Search_Support
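As a rough illustration of the shards.qt advice, the sketch below builds a distributed request URL in which shards.qt always matches the request handler being called. The function name, host names, core name, and the `/spell` handler are hypothetical; `shards` and `shards.qt` are the standard distributed-search parameters.

```python
from urllib.parse import urlencode


def distributed_spellcheck_url(base_url, handler, shards, query):
    """Return a URL whose shards.qt matches the handler being queried."""
    params = [
        ("q", query),
        ("spellcheck", "true"),
        ("shards", ",".join(shards)),
        # Sub-requests to each shard go to the same handler, so the
        # spellcheck component runs on the shards as well.
        ("shards.qt", handler),
    ]
    return f"{base_url}{handler}?{urlencode(params)}"


url = distributed_spellcheck_url(
    "http://localhost:8983/solr/collection1",  # hypothetical core URL
    "/spell",                                   # hypothetical handler name
    ["host1:8983/solr/collection1", "host2:8983/solr/collection1"],
    "wrangle",
)
```

Deriving shards.qt from the same `handler` argument avoids the mismatch where the handler is registered under one name and shards.qt is set to another.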

RE: spellcheck solr 4.3.1

2013-11-11 Thread Dyer, James
There are 2 parameters you want to consider: First is spellcheck.maxResultsForSuggest. Because you have an OR query, you'll get hits if only 1 query term is in the index. This parameter lets you tune it to make it suggest if the query returns n or fewer hits. My memory tells me, however,
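A small sketch of the first parameter in use, assuming an OR query where suggestions are wanted whenever the hit count is at or below a chosen limit. The helper name and the example query are hypothetical; `spellcheck.maxResultsForSuggest` is the parameter named above.

```python
from urllib.parse import urlencode


def suggest_when_few_hits(query, max_results_for_suggest):
    """Request suggestions when the query returns n or fewer hits."""
    if max_results_for_suggest < 0:
        raise ValueError("max_results_for_suggest must be non-negative")
    return {
        "q": query,
        "spellcheck": "true",
        # Suggest even when there are hits, as long as there are few.
        "spellcheck.maxResultsForSuggest": str(max_results_for_suggest),
    }


few_hits_params = suggest_when_few_hits("wrangle OR jeans", 5)
few_hits_query_string = urlencode(few_hits_params)
```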

RE: Data Import Handler

2013-11-06 Thread Dyer, James
If you prepend the variable name with dataimporter.request, you can include variables like these as request parameters: <dataSource name="ds" driver="${dataimporter.request.driver}" url="${dataimporter.request.url}" /> /dih?driver=some.driver.class&url=jdbc:url:something If you want to include these
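The request side of this can be sketched as follows, assuming the handler is registered at `/dih` as in the snippet above. The function name and the core URL are hypothetical, and `command=full-import` is added here only to make the request complete. URL-encoding matters because a JDBC URL contains characters such as `:` and `/`.

```python
from urllib.parse import urlencode


def dih_import_url(core_url, driver, jdbc_url):
    """Build a DIH request that passes driver and url as request parameters.

    The values become available in data-config.xml as
    ${dataimporter.request.driver} and ${dataimporter.request.url}.
    """
    params = {
        "command": "full-import",
        "driver": driver,
        "url": jdbc_url,
    }
    return f"{core_url}/dih?{urlencode(params)}"


import_url = dih_import_url(
    "http://localhost:8983/solr/collection1",  # hypothetical core URL
    "some.driver.class",
    "jdbc:url:something",
)
```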

RE: Need additional data processing in Data Import Handler prior to indexing

2013-10-29 Thread Dyer, James
Would an onImportEnd event listener serve your needs? See http://wiki.apache.org/solr/DataImportHandler#EventListeners James Dyer Ingram Content Group (615) 213-4311 -Original Message- From: Dileepa Jayakody [mailto:dileepajayak...@gmail.com] Sent: Tuesday, October 29, 2013 3:48 PM To:

RE: Spellcheck with Distributed Search (sharding).

2013-10-24 Thread Dyer, James
Is it that your request handler is named /suggest but you are setting shards.qt to /suggestion ? James Dyer Ingram Content Group (615) 213-4311 -Original Message- From: Luis Cappa Banda [mailto:luisca...@gmail.com] Sent: Thursday, October 24, 2013 6:22 AM To:
