RE: problem with data import handler delta import due to use of multiple datasource

2013-10-08 Thread Dyer, James
Bill, I do not believe there is any way to tell it to use a different datasource for the parent delta query. If you used this approach, would it solve your problem: http://wiki.apache.org/solr/DataImportHandlerDeltaQueryViaFullImport ? James Dyer Ingram Content Group (615) 213-4311

RE: How to achieve distributed spelling check in SolrCloud ?

2013-10-08 Thread Dyer, James
Shamik, Are you using a request handler other than /select, and if so, did you set shards.qt in your request? It should be set to the name of the request handler you are using. See http://wiki.apache.org/solr/SpellCheckComponent?#Distributed_Search_Support James Dyer Ingram Content Group
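A minimal sketch of the advice above, assuming a Python client: it builds a request URL in which shards.qt names the same request handler that is being called. The host, collection name, handler name (/spell) and query term are assumptions for illustration, not details from the thread.

```python
from urllib.parse import urlencode


def build_spellcheck_url(base_url, handler, query):
    """Build a request URL whose shards.qt matches the handler being called."""
    params = {
        "q": query,
        "spellcheck": "true",
        # Per the reply above: shards.qt should be the name of the request
        # handler in use, so sub-requests to the shards go to the same handler.
        "shards.qt": handler,
    }
    return f"{base_url}{handler}?{urlencode(params)}"


# Hypothetical host, collection and handler name.
url = build_spellcheck_url("http://localhost:8983/solr/collection1", "/spell", "delll")
print(url)
```

urlencode percent-encodes the slash in the handler name, which Solr decodes again on receipt.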

RE: Using CachedSqlEntityProcessor with delta imports in DIH

2013-09-24 Thread Dyer, James
I think delta imports only work on the parent entity and cached child entities will load in full, even if you only need to look up a few rows for the delta. Others though might have a way to get this to work. Here are two possible workarounds. On the child entity, specify: entity

RE: Solr DIH call a java class

2013-09-24 Thread Dyer, James
You probably want to write a custom Transformer. See: http://wiki.apache.org/solr/DIHCustomTransformer Or maybe a custom Evaluator. See: http://wiki.apache.org/solr/DataImportHandler#Evaluators_-_Custom_formatting_in_queries_and_urls Possibly one or more of the built-in Transformers will do

RE: Spellchecking

2013-09-20 Thread Dyer, James
If you're using spellcheck.collate you can also set spellcheck.maxCollationTries to validate each collation against the index before suggesting it. This validation takes into account any fq parameters on your query, so if your original query has fq=Product:Book, then the collations returned

RE: Spellcheck compounded words

2013-09-16 Thread Dyer, James
Which version of Solr are you running? (the post you replied to was about Solr 3.3, but the latest version now is 4.4.) Please provide configuration details and the query you are running that causes the problem. Also explain exactly what the problem is (query never returns?). Also explain

RE: Spellcheck compounded words

2013-09-16 Thread Dyer, James
thread... do you want me to copy it here? or could you see that? The subject is *spellcheck causing Core Reload to hang*. On Mon, Sep 16, 2013 at 5:50 PM, Dyer, James james.d...@ingramcontent.com wrote: Which version of Solr are you running? (the post you replied to was about Solr 3.3

RE: Spell check SOLR 3.6.1 not working for numbers

2013-07-26 Thread Dyer, James
name="suggestion"><str>89566325415</str></arr></lst><str name="collation">89566325415</str></lst></lst></response> Regards, Poornima From: Dyer, James james.d...@ingramcontent.com To: solr-user@lucene.apache.org solr-user@lucene.apache.org Sent: Thursday, 25 July 2013 9:03 PM

RE: Spell check SOLR 3.6.1 not working for numbers

2013-07-25 Thread Dyer, James
I think the default SpellingQueryConverter has a hard time with terms that contain numbers. Can you provide a failing case: the query you're executing (with all the spellcheck.xxx params) and the spellcheck response (or lack thereof)? Is it producing any hits? James Dyer Ingram Content

RE: spellcheck and search in a same solr request

2013-07-23 Thread Dyer, James
Solr doesn't support any kind of short-circuiting the original query and returning the results of the corrected query or collation. You just re-issue the query in a second request. This would be a nice feature to add though. James Dyer Ingram Content Group (615) 213-4311 -Original
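The two-request pattern described above can be sketched on the client side as follows. This is only an illustration in Python: the parameter values and the example collation are hypothetical, and how the collation is pulled out of the spellcheck response is left out because the response layout differs between Solr versions.

```python
from urllib.parse import urlencode


def follow_up_params(original_params, collation):
    """Return parameters for a second request that searches with the collation."""
    if not collation:
        return None  # no collation was offered, so there is nothing to retry
    retry = dict(original_params)  # copy so the first request is left untouched
    retry["q"] = collation
    # The corrected query does not need another round of suggestions.
    retry["spellcheck"] = "false"
    return retry


# Hypothetical first request and the collation Solr returned for it.
first = {"q": "provinciall courtt", "rows": "10",
         "spellcheck": "true", "spellcheck.collate": "true"}
second = follow_up_params(first, "provincial court")
print(urlencode(second))
```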

RE: Use same spell check dictionary across different collections

2013-07-23 Thread Dyer, James
DirectSolrSpellChecker does not prepare any kind of dictionary. It just uses the term dictionary from the indexed field. So what you are trying to do is impossible. You would think it would be possible with IndexBasedSpellChecker because it creates a dictionary as a sidecar lucene index.

RE: Spellcheck field element and collation issues

2013-07-23 Thread Dyer, James
For this query: http://localhost:8981/solr/articles/select?indent=true&q=Perfrm%20HVC&rows=0 ...do you get anything back in the spellcheck response? Is it correcting the individual words and not giving collations? Or are you getting no individual word suggestions also? James Dyer Ingram

RE: Spellcheck field element and collation issues

2013-07-23 Thread Dyer, James
Thanks Brendan On Tue, Jul 23, 2013 at 3:19 PM, Dyer, James james.d...@ingramcontent.com wrote: For this query: http://localhost:8981/solr/articles/select?indent=true&q=Perfrm%20HVC&rows=0 ...do you get anything back in the spellcheck response? Is it correcting the individual words

RE: Spellcheck field element and collation issues

2013-07-23 Thread Dyer, James
is this for: <str name="field">spellcheck</str> is it even needed if I've specified how the spelling index terms should be analyzed with: <str name="queryAnalyzerFieldType">text_spell</str> Thanks again Brendan On Tue, Jul 23, 2013 at 3:58 PM, Dyer, James james.d...@ingramcontent.com wrote: Try

RE: Spellcheck field element and collation issues

2013-07-23 Thread Dyer, James
or spellcheck.q parameter and this tokenized text is the input to the spellchecking instance. Does that sound right? Thanks Brendan On Tue, Jul 23, 2013 at 5:15 PM, Dyer, James james.d...@ingramcontent.com wrote: I don't believe you can specify more than 1 field on df (default field

RE: Config changes in solr.DirectSolrSpellCheck after index is built?

2013-07-17 Thread Dyer, James
DirectSolrSpellChecker does not create a dictionary. It uses the field you specify and uses the Lucene term dictionary. It uses some of the same code that fuzzy search uses to calculate the distance between user input and indexed terms. If you're wondering about the effect of configuration changes

RE: Question about weighted spell check

2013-07-05 Thread Dyer, James
The current implementation doesn't sort strictly on hit-counts. Rather it gives you collations that have corrections with the nearest distance from the original terms. Sorting on query result score sounds like an interesting and doable alternative, although not supported currently. The

RE: DataImportHandler: Problems with delta-import and CachedSqlEntityProcessor

2013-06-20 Thread Dyer, James
Instead of specifying CachedSqlEntityProcessor, you can specify SqlEntityProcessor with cacheImpl='SortedMapBackedCache'. If you parameterize this to use SortedMapBackedCache for full updates but blank for deltas, I think it will cache only on the full import. Another option is to

RE: How spell checker used if indexed document is containing misspelled words

2013-06-18 Thread Dyer, James
There are two newer parameters that work better than onlyMorePopular: spellcheck.alternativeTermCount - This is the # of suggestions you want for terms that exist in the index. You can set it the same as spellcheck.count, or less if you don't want as many suggestions for these.

RE: Spell Checker (DirectSolrSpellChecker) correct settings

2013-06-03 Thread Dyer, James
My first guess is that no documents match the query provinical court. Because you have spellcheck.maxCollationTries set to a non-zero value, it will not return these as collations unless the correction will return hits. You can test my theory out by removing spellcheck.maxCollationTries from

RE: Spell Checker (DirectSolrSpellChecker) correct settings

2013-06-03 Thread Dyer, James
for court and returns result. 4) Search for Provinciall Courtt = correct suggestions.. On Mon, Jun 3, 2013 at 7:55 PM, Dyer, James james.d...@ingramcontent.com wrote: My first guess is that no documents match the query provinical court. Because you have spellcheck.maxCollationTries set to a non

RE: [DIH] Using SqlEntity to get a list of files and read files in XpathEntityProcessor

2013-05-30 Thread Dyer, James
I don't want to dissuade you from trying, but I believe FileListEntityProcessor has something special coded into it to allow for its unique usage. I'm not sure whether your approach is doable. I would imagine that fixing FLEP to handle a row-at-a-time or page-at-a-time in memory wouldn't be

RE: Why do FQs make my spelling suggestions so slow?

2013-05-29 Thread Dyer, James
Andy, I opened this ticket so that someone can eventually investigate: https://issues.apache.org/jira/browse/SOLR-4874 Just a sanity check: I see I had misspelled maxCollations as maxCollation in my prior response. When you tested with this set the same as maxCollationTries, did you

RE: Choosing specific fields for suggestions in SpellCheckerComponent

2013-05-29 Thread Dyer, James
I assume here you've got a spellcheck field like this: <field name="Spelling_Dictionary" type="text_general"/> <copyField source="field1" dest="Spelling_Dictionary" /> <copyField source="field2" dest="Spelling_Dictionary" /> <copyField source="field3" dest="Spelling_Dictionary" /> <copyField source="field4"

RE: Why do FQs make my spelling suggestions so slow?

2013-05-29 Thread Dyer, James
a...@petdance.com wrote: On May 29, 2013, at 9:46 AM, Dyer, James james.d...@ingramcontent.com wrote: Just a sanity check: I see I had misspelled maxCollations as maxCollation in my prior response. When you tested with this set the same as maxCollationTries, did you correct my spelling

RE: Why do FQs make my spelling suggestions so slow?

2013-05-29 Thread Dyer, James
, 2013 12:41 PM To: solr-user@lucene.apache.org Subject: Re: Why do FQs make my spelling suggestions so slow? James, this is very useful information. Can you please add this to the wiki? On Wed, May 29, 2013 at 10:36 PM, Dyer, James james.d...@ingramcontent.com wrote: Instead of maxCollationTries

RE: Why do FQs make my spelling suggestions so slow?

2013-05-28 Thread Dyer, James
Andy, What are the QTimes for the 0fq,1fq,2fq,4fq 4fq cases with spellcheck entirely turned off? Is it about (or a little more than) half the total when maxCollationTries=1 ? Also, with the varying # of fq's, how many collation tries does it take to get 10 collations? Possibly, a better

RE: Bug in spellcheck.alternativeTermCount

2013-05-23 Thread Dyer, James
Can you give instructions on how to reproduce the problem? James Dyer Ingram Content Group (615) 213-4311 -Original Message- From: Rounak Jain [mailto:rouna...@gmail.com] Sent: Thursday, May 23, 2013 7:36 AM To: solr-user@lucene.apache.org Subject: Bug in spellcheck.alternativeTermCount I

RE: How do I use CachedSqlEntityProcessor?

2013-05-22 Thread Dyer, James
as: <field name="Category1" type="string" indexed="true" stored="true" multiValued="true"/> I am curious as to what I am doing wrong. I should mention that I am using Solr 4.0.0. I know a more recent version is out – but I don’t think it should make a difference. Thank you again for your help. O. O. Dyer

RE: How do I use CachedSqlEntityProcessor?

2013-05-22 Thread Dyer, James
column="CategoryName" name="Category1" /> </entity> Similarly for other Categories i.e. Category2, Category3, etc. I am now going to try this for a larger dataset. I hope this works. O.O. Dyer, James-2 wrote There was a mistake in my last reply. Your child entities need to SELECT

RE: How do I use CachedSqlEntityProcessor?

2013-05-21 Thread Dyer, James
First remove the where condition from the child entities, then use the cacheKey and cacheLookup parameters to instruct DIH how to do the join. Example: <entity name="Cat1" cacheKey="SKU" cacheLookup="Product.SKU" query="SELECT CategoryName from CAT_TABLE where CategoryLevel=1" /> See
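To illustrate what a keyed cache join amounts to conceptually, here is a small Python sketch. It is not DIH code: the function names and sample rows are hypothetical, and it assumes the child rows include the key column (SKU) so there is something to join on.

```python
from collections import defaultdict


def build_cache(child_rows, cache_key):
    """Index child rows by the cache key; the child query runs only once."""
    cache = defaultdict(list)
    for row in child_rows:
        cache[row[cache_key]].append(row)
    return cache


def join(parent_rows, cache, lookup_column):
    """Pair each parent row with its cached child rows via an in-memory lookup."""
    for parent in parent_rows:
        yield parent, cache.get(parent[lookup_column], [])


# Hypothetical data standing in for the Product and CAT_TABLE rows.
products = [{"SKU": "A1"}, {"SKU": "B2"}]
categories = [
    {"SKU": "A1", "CategoryName": "Books"},
    {"SKU": "A1", "CategoryName": "Fiction"},
]

cache = build_cache(categories, "SKU")
joined = [(p["SKU"], [c["CategoryName"] for c in kids])
          for p, kids in join(products, cache, "SKU")]
print(joined)
```

The point of the design is that the child table is read once rather than once per parent row, at the cost of holding it in memory.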

RE: Speed up import of Hierarchical Data

2013-05-17 Thread Dyer, James
these basics? O. O. Dyer, James-2 wrote See https://issues.apache.org/jira/browse/SOLR-2943 . You can set up 2 DIH handlers. The first would query the CAT_TABLE and save it to a disk-backed cache, using DIHCacheWriter. You then would replace your 3 child entities in the 2nd DIH handler to use

RE: Speed up import of Hierarchical Data

2013-05-16 Thread Dyer, James
See https://issues.apache.org/jira/browse/SOLR-2943 . You can set up 2 DIH handlers. The first would query the CAT_TABLE and save it to a disk-backed cache, using DIHCacheWriter. You then would replace your 3 child entities in the 2nd DIH handler to use DIHCacheProcessor to read back the

RE: delta-import and cache (a story in conflict)

2013-05-14 Thread Dyer, James
The reason it is writing all the input fields for that document is this particular error message appends doc to the end, which is a subclass of SolrInputDocument, which has a toString that shows all the fields. Not sure if this in particular changed, but I suspect this is a symptom not a

RE: Spellchecker: Is it possible to return search results with the first suggestion as query string instead of a list of suggestions?

2013-05-14 Thread Dyer, James
To get a re-written query with the top suggestions, specify spellcheck.collate=true. Begin reading from here (http://wiki.apache.org/solr/SpellCheckComponent#spellcheck.collate) to see all the options you have related to collate. Solr cannot return results from a collation automatically.

RE: delta-import and cache (a story in conflict)

2013-05-14 Thread Dyer, James
by cacheKey to lookup all records. -Original Message- From: Dyer, James [mailto:james.d...@ingramcontent.com] Sent: Tuesday, May 14, 2013 4:08 PM To: solr-user@lucene.apache.org Subject: RE: delta-import and cache (a story in conflict) The reason it is writing all the input fields

RE: Looking for Best Practice of Spellchecker

2013-05-13 Thread Dyer, James
? Thanks On Fri, May 10, 2013 at 11:34 AM, Dyer, James james.d...@ingramcontent.com wrote: Good point, Jason. In fact, even if you use WordBreakSolrSpellChecker wall mart will not correct to walmart. The reason is the spellchecker cannot both correct a token's spelling *and* fix the wordbreak

RE: Negative Boosting at Recent Versions of Solr?

2013-05-10 Thread Dyer, James
Despite the discussion in SOLR-3823/SOLR-3278, my experience with Solr 4.2 is that it does indeed allow negative boosts on both bf and qf. I think the functionality was added under the radar possibly with SOLR-4093, not sure though. In disbelief, I did some testing and it seems to really

RE: Looking for Best Practice of Spellchecker

2013-05-10 Thread Dyer, James
Nicholas, It sounds like you might want to use WordBreakSolrSpellChecker, which gets obscure mention in the wiki. Read through this section: http://wiki.apache.org/solr/SpellCheckComponent#Configuration and you will see some information. Also, the Solr Example shows how to configure this.

RE: Looking for Best Practice of Spellchecker

2013-05-10 Thread Dyer, James
, at 7:32 AM, Dyer, James james.d...@ingramcontent.com wrote: Nicholas, It sounds like you might want to use WordBreakSolrSpellChecker, which gets obscure mention in the wiki. Read through this section: http://wiki.apache.org/solr/SpellCheckComponent#Configuration and you will see some

RE: spellcheker and exact match

2013-05-08 Thread Dyer, James
Try setting spellcheck.alternativeTermCount to a nonzero value. See http://wiki.apache.org/solr/SpellCheckComponent#spellcheck.alternativeTermCount The issue may be that by default, the spellchecker will never try to offer suggestions for a term that exists in the dictionary. So if some other

RE: java.lang.NullPointerException. I am trying to use CachedSqlEntityProcessor

2013-05-01 Thread Dyer, James
If I remember correctly, 3.6 DIH had bugs related to CachedSqlEntityProcessor and some were fixed in 3.6.1, 3.6.2, but some were not fixed until 4.0. You might want to use a 3.5 DIH jar with your 3.6 Solr. Or, post your data-config.xml and maybe someone can figure something out. James Dyer

RE: java.lang.NullPointerException. I am trying to use CachedSqlEntityProcessor

2013-04-29 Thread Dyer, James
This sounds like https://issues.apache.org/jira/browse/SOLR-3791, which was resolved in 3.6.2 / 4.0. James Dyer Ingram Content Group (615) 213-4311 -Original Message- From: srinalluri [mailto:nallurisr...@yahoo.com] Sent: Monday, April 29, 2013 11:41 AM To: solr-user@lucene.apache.org

RE: Using another way instead of DIH

2013-04-26 Thread Dyer, James
yes, I misspoke. James Dyer Ingram Content Group (615) 213-4311 -Original Message- From: xiaoqi [mailto:belivexia...@gmail.com] Sent: Thursday, April 25, 2013 8:37 PM To: solr-user@lucene.apache.org Subject: RE: Using another way instead of DIH Thanks for help . data-config.xml ? i

RE: Using another way instead of DIH

2013-04-26 Thread Dyer, James
Here are some things I would try: 1. Make sure the parent entity is only returning 1 row per solr document. If not, move the problem joins to their own queries in child entities. 2. For the child entities, use caching. This prevents the n+1 select problem. The changes

RE: What is the difference between a Join Query and Embedded Entities in Solr DIH?

2013-04-25 Thread Dyer, James
Gustav, DIH should give you the same results in both scenarios. The performance trade-offs depend on your data. In your case, it looks like there is a 1-to-1 or many-to-1 relationship between item and member, so use the SQL Join. You'll get all of your data in one query and you'll be using

RE: Using another way instead of DIH

2013-04-25 Thread Dyer, James
If you post your data-config.xml here, someone might be able to find something you could change to speed things up. If the issue is parallelization, then you could possibly partition your data somehow and then run multiple DIH request handlers at the same time. This might be easier than

RE: DirectSolrSpellChecker : vastly varying spellcheck QTime times.

2013-04-24 Thread Dyer, James
When getting collations there are two steps. First, the spellchecker gets individual word choices for each misspelled word. By default, these are sorted by string distance first, then document frequency second. You can override this by specifying <str name="comparatorClass">freq</str> in your

RE: DirectSolrSpellChecker : vastly varying spellcheck QTime times.

2013-04-23 Thread Dyer, James
If you enable debug-level logging for class org.apache.solr.spelling.SpellCheckCollator, you should get a log message for every collation it tries like this: Collation: will return zzz hits. James Dyer Ingram Content Group (615) 213-4311 -Original Message- From: SandeepM

RE: DirectSolrSpellChecker : vastly varying spellcheck QTime times.

2013-04-22 Thread Dyer, James
On both queries, set spellcheck.extendedResults=true and also spellcheck.collateExtendedResults=true, then post the full spelling response. Also, how long does each query take on average with spellcheck turned off? James Dyer Ingram Content Group (615) 213-4311 -Original Message-

RE: DirectSolrSpellChecker : vastly varying spellcheck QTime times.

2013-04-22 Thread Dyer, James
This doesn't make a lot of sense to me as in both cases the very first collation it tries is the one it is returning. So you're getting a very optimized spellcheck in both cases. But it does have to issue both queries 2 times: the first time, it tries the user's main query anding there are

RE: DirectSolrSpellChecker : vastly varying spellcheck QTime times.

2013-04-19 Thread Dyer, James
I guess the first thing I'd do is to set maxCollationTries to zero. This means it will only run your main query once and not re-run it to check the collations. Now see if your queries have consistent qtime. One easy explanation is that with maxCollationTries=10, it may be running your query

RE: DirectSolrSpellChecker : vastly varying spellcheck QTime times.

2013-04-19 Thread Dyer, James
I do not know what it would take to have the collation tests make better use of the QueryResultCache. However, outside of a test scenario, I do not know if this would help a lot. Hopefully you wouldn't have a lot of users issuing the exact same query with the exact same misspelled words over

RE: Spellchecker not working for Solr 4.1

2013-04-17 Thread Dyer, James
Spellcheck is broken when using both distributed and grouping. The fix is here: https://issues.apache.org/jira/browse/SOLR-3758 . This will be part of 4.3, which likely will be released within the next few weeks. In the meantime you can apply the patch to 4.2 or, as a workaround, re-issue a

RE: FileBasedSpellchecker with Frequency comaprator

2013-04-08 Thread Dyer, James
I do not believe that FileBasedSpellChecker takes frequency into account at all. That would be a nice enhancement though. To get what you wanted, you could index one or more documents containing the words in your file then create a spellchecker using IndexBasedSpellChecker or

RE: Solr Multiword Search

2013-04-05 Thread Dyer, James
To get did-you-mean suggestions, use both spellcheck.alternativeTermCount > 0 along with spellcheck.maxResultsForSuggest > 0. Set this latter parameter to the max # of hits you want to trigger did-you-mean suggestions. See

RE: Solr Multiword Search

2013-04-04 Thread Dyer, James
If you are using dismax/edismax with mm=0 (or some other low number), you should override this in the spellchecker. Specify spellcheck.collateParam.mm=100%, or something high like that. Likewise if you're using the default lucene/solr query parser with q.op=OR, then you can specify
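As a hedged illustration of the override described above, this Python sketch assembles request parameters where the main query runs with mm=0 while collations are validated with mm=100%. The query text and the value of spellcheck.maxCollationTries are assumptions chosen for the example; the override only matters when collations are actually tested against the index.

```python
from urllib.parse import urlencode


def collation_params(query):
    """Parameters for an edismax query whose collations are checked strictly."""
    return {
        "q": query,
        "defType": "edismax",
        "mm": "0",  # lenient matching for the user's own query
        "spellcheck": "true",
        "spellcheck.collate": "true",
        # Assumed value; collations are only validated when this is above zero.
        "spellcheck.maxCollationTries": "5",
        # Applied only while testing collations, overriding mm above.
        "spellcheck.collateParam.mm": "100%",
    }


query_string = urlencode(collation_params("delll laptop"))
print(query_string)
```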

RE: Spell check component does not return any suggestions

2013-04-04 Thread Dyer, James
Make sure you also set spellcheck.onlyMorePopular=false (or leave it out as false is the default) when using spellcheck.alternativeTermCount. You may also need to set spellcheck.maxResultsForSuggest=0. See http://wiki.apache.org/solr/SpellCheckComponent#spellcheck.maxResultsForSuggest to

RE: Solr Multiword Search

2013-04-04 Thread Dyer, James
Use IndexBasedSpellChecker instead of DirectSolrSpellChecker if you need more than 2 edits. You may need to set the accuracy parameter lower than the default of 0.5. Keep in mind that while this might get the correct responses for your test cases, in the wild your users might find their

RE: how to avoid single character to get indexed for directspellchecker dictionary

2013-04-04 Thread Dyer, James
I assume if your user queries delll and it breaks it into pieces like de l l l, then you're probably using WordBreakSolrSpellChecker in addition to DirectSolrSpellChecker, right? If so, then you can specify minBreakLength in solrconfig.xml like this: <searchComponent name="spellcheck"

RE: Solr Multiword Search

2013-04-03 Thread Dyer, James
You have specified spellcheck.q in your query. The whole purpose of spellcheck.q is to bypass any query converter you've configured giving it raw keywords instead. But possibly a custom query converter is not your best answer? I agree that charles charlie is an edit distance of 2, so if

RE: SOLR - Unable to execute query error - DIH

2013-03-28 Thread Dyer, James
You may want to run your jdbc driver in trace mode just to see if it is picking up these different options. I know from experience that the selectMethod parameter can sometimes be important to prevent SQLServer drivers from caching the entire resultset in memory. But something seems very

RE: Is deltaQuery mandatory ?

2013-03-28 Thread Dyer, James
You do not need deltaQuery unless you're doing delta (incremental) updates. To configure a full import, try starting with this example: http://wiki.apache.org/solr/DataImportHandler#A_shorter_data-config James Dyer Ingram Content Group (615) 213-4311 -Original Message- From: A. Lotfi

RE: SOLR - Unable to execute query error - DIH

2013-03-25 Thread Dyer, James
With MS SqlServer, try adding selectMethod=cursor to your connection string and set your batch size to a reasonable amount (possibly just omit it and DIH has a default value it will use.) James Dyer Ingram Content Group (615) 213-4311 -Original Message- From: kobe.free.wo...@gmail.com
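For illustration only, here is a Python helper that assembles a connection string carrying selectMethod=cursor. It assumes the semicolon-separated URL syntax of the Microsoft JDBC driver; the host, port and database name are made up, and other SQL Server drivers may use a different URL format.

```python
def sqlserver_jdbc_url(host, port, database, **props):
    """Assemble a JDBC URL in the Microsoft driver's semicolon-separated style."""
    parts = [f"jdbc:sqlserver://{host}:{port}", f"databaseName={database}"]
    # Extra driver properties, such as selectMethod, are appended as key=value.
    parts += [f"{key}={value}" for key, value in props.items()]
    return ";".join(parts)


# Hypothetical server and database names.
jdbc_url = sqlserver_jdbc_url("dbhost", 1433, "catalog", selectMethod="cursor")
print(jdbc_url)
```

The resulting string would go in the url attribute of the DIH dataSource element.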

RE: strange behaviour of wordbreak spellchecker in solr cloud

2013-03-22 Thread Dyer, James
to the class that may be responsible to this issue? Thanks. Alex. -Original Message- From: Dyer, James james.d...@ingramcontent.com To: solr-user solr-user@lucene.apache.org Sent: Thu, Mar 21, 2013 11:23 am Subject: RE: strange behaviour of wordbreak spellchecker in solr cloud The shard

RE: strange behaviour of wordbreak spellchecker in solr cloud

2013-03-22 Thread Dyer, James
is not clear. Thanks. Alex. -Original Message- From: Dyer, James james.d...@ingramcontent.com To: solr-user solr-user@lucene.apache.org Sent: Fri, Mar 22, 2013 2:08 pm Subject: RE: strange behaviour of wordbreak spellchecker in solr cloud Alex, I added your comments to SOLR-3758 (https

RE: strange behaviour of wordbreak spellchecker in solr cloud

2013-03-21 Thread Dyer, James
<lst name="grouped"> <lst name="site"> <int name="matches">0</int> <int name="ngroups">0</int> <arr name="groups"/> </lst> </lst> <lst name="highlighting"/> <lst name="spellcheck"> <lst name="suggestions"/> </lst> </response> Thanks. Alex. ---Original Message- From: Dyer, James james.d...@ingramcontent.com To: solr

RE: strange behaviour of wordbreak spellchecker in solr cloud

2013-03-19 Thread Dyer, James
Can you try including in your request the shards.qt parameter? In your case, I think you should set it to testhandler. See http://wiki.apache.org/solr/SpellCheckComponent?highlight=%28shards\.qt%29#Distributed_Search_Support for a brief discussion. James Dyer Ingram Content Group (615)

RE: strange behaviour of wordbreak spellchecker in solr cloud

2013-03-19 Thread Dyer, James
as well. - Mark On Mar 19, 2013, at 12:18 PM, Dyer, James james.d...@ingramcontent.com wrote: Can you try including in your request the shards.qt parameter? In your case, I think you should set it to testhandler. See http://wiki.apache.org/solr/SpellCheckComponent?highlight=%28shards\.qt%29

RE: strange behaviour of wordbreak spellchecker in solr cloud

2013-03-19 Thread Dyer, James
request but it did not solve the issue. Thanks. Alex. -Original Message- From: Dyer, James james.d...@ingramcontent.com To: solr-user solr-user@lucene.apache.org Sent: Tue, Mar 19, 2013 10:30 am Subject: RE: strange behaviour of wordbreak spellchecker in solr cloud Mark, I wasn't

RE: SOLR - Define fields in DIH configuration file dynamically

2013-03-18 Thread Dyer, James
There are 3 approaches I can think of: 1. You can generate a new data-config.xml for each import. With Solr 4.0 and later, DIH re-parses your data-config.xml and picks up any changes automatically. 2. You can parameterize nearly anything in data-config.xml, add the parameters to your request

RE: solr-dih does multiple queries for sub-entities

2013-03-04 Thread Dyer, James
You can cache the subentity, then it will retrieve all the data for that entity in 1 query. See http://wiki.apache.org/solr/DataImportHandler#CachedSqlEntityProcessor for more information. This section focuses on caching data from SQLEntityProcessor. However, it is now possible to cache

RE: Get page number of searchresult of a pdf in solr

2013-03-01 Thread Dyer, James
Is there an easy (enough) way to do this, storing the page number as a payload on each term? James Dyer Ingram Content Group (615) 213-4311 -Original Message- From: Michael Della Bitta [mailto:michael.della.bi...@appinions.com] Sent: Thursday, February 28, 2013 3:33 PM To:

RE: can we configure spellcheck to be invoked after request processing?

2013-03-01 Thread Dyer, James
I'm a little confused here because if you are searching q=jeap OR denim , then you should be getting both documents back. Having spellcheck configured does not affect your search results at all. Having it in your request will sometimes result in spelling suggestions, usually if one or more

RE: If we Open Source our platform, would it be interesting to you?

2013-02-20 Thread Dyer, James
I only looked at your link super fast, but this seems like a very viable alternative to Solr's DIH. DIH does the job fairly well but we've struggled to have developers who are willing to maintain it. The problem, I think, is that DIH appeals to non-programmers who want to index their data

RE: Solr 4.1.0 not using solrcore.properties ?

2013-02-14 Thread Dyer, James
be documented in a way that persons can find this documentation, i guess it would be better to just allow periods by changing the implementation of the VariableResolver just a little... 00.43 now... off to bed. Let me know what you think, Daniel On Wed, Feb 13, 2013 at 6:45 PM, Dyer, James james.d

RE: Implement price range filter: DataImportHandler started. Not Initialized. No commands can be run

2013-02-14 Thread Dyer, James
This looks like https://issues.apache.org/jira/browse/SOLR-2115 , which was fixed for 4.0-Alpha . Basically, if you do not put a data-config.xml file in the defaults section in solrconfig.xml, or if your config file has any errors, you won't be able to use DIH unless you fix the problem and

RE: Implement price range filter: DataImportHandler started. Not Initialized. No commands can be run

2013-02-14 Thread Dyer, James
No, you still have to fix problems with data-config.xml. It's just that prior to 4.0-alpha, if you started Solr with a problem in the config, you had no way to fix it and refresh without restarting Solr (or at least doing a core reload). With 4.0, you can fix your config file and just retry. I

RE: Solr 4.1.0 not using solrcore.properties ?

2013-02-13 Thread Dyer, James
The code that resolves variables in DIH was refactored extensively in 4.1.0. So if you've got a case where it does not resolve the variables properly, please give the details. We can open a JIRA issue and get this fixed. James Dyer Ingram Content Group (615) 213-4311 -Original

RE: suggest only from certain documents

2013-02-13 Thread Dyer, James
The key to get this working is to set spellcheck.maxCollationTries > 0. It will generate collations even if there is only 1 term. James Dyer Ingram Content Group (615) 213-4311 -Original Message- From: Chris Hostetter [mailto:hossman_luc...@fucit.org] Sent: Wednesday, February 13,

RE: DIH and splitBy

2013-01-31 Thread Dyer, James
In your unit test, you have: "<field column=\"type\" name=\"type\" splitBy=\"\\|\" />" + And also: runner.update("INSERT INTO test VALUES 1, 'foo,bar,baz'"); So you need to decide if you want to delimit with a pipe or a comma. James Dyer Ingram Content Group (615) 213-4311 -Original Message-
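The mismatch pointed out above is easy to see in isolation. The following sketch uses Python's re module as a stand-in for the Java regex split that splitBy performs; for patterns this simple the two behave the same way.

```python
import re

value = "foo,bar,baz"  # the test data quoted above is comma-delimited

# Splitting on an escaped pipe finds no delimiter, so the value stays whole.
by_pipe = re.split(r"\|", value)

# Splitting on a comma yields the three separate values.
by_comma = re.split(r",", value)

print(by_pipe)
print(by_comma)
```

Either the pattern or the test data has to change so that both use the same delimiter.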

RE: Variable expansion in DIH SimplePropertiesWriter's filename?

2013-01-30 Thread Dyer, James
This is a bug. Can you paste what you've said here into a new JIRA issue? https://issues.apache.org/jira/browse/SOLR James Dyer Ingram Content Group (615) 213-4311 -Original Message- From: Jonas Birgander [mailto:jonas.birgan...@prisjakt.nu] Sent: Wednesday, January 30, 2013 4:54 AM

RE: SOLR 4 getting stuck during restart

2013-01-25 Thread Dyer, James
I think you really need to see a thread dump when it gets stuck to know what's going on. My original thought was that this was a problem with the index-based spellchecker and wouldn't affect direct-. (For DirectSolrSpellChecker, spellcheck.build is a no-op as there is no separate index or dictionary

RE: Error in DIH after upgrading from 4.0 to 4.1

2013-01-25 Thread Dyer, James
This is a bug. Thank you for reporting it. I opened this ticket: https://issues.apache.org/jira/browse/SOLR-4361 Until there is a fix, here are two workarounds: 1. If you do not need any 4.1 DIH functionality, use the 4.0 DIH jar with your 4.1 Solr. -or- 2. Use request parameters without

RE: Deletion from database

2013-01-24 Thread Dyer, James
This post on stackoverflow has a good run-down on your options: http://stackoverflow.com/questions/1555610/solr-dih-how-to-handle-deleted-documents/1557604#1557604 If you're using DIH, you can get more information from: http://wiki.apache.org/solr/DataImportHandler The easiest thing, if using a

RE: Problems with DataImportHandler in SOLR 1.4.0

2013-01-22 Thread Dyer, James
I'm not sure why you have this problem. I use DIH 1.4.1 in production with Jboss 5 (based on Tomcat) and seldom restart the JVMs and haven't experienced anything like this. As for the warnings with ThreadLocals, I doubt these are causing a severe memory leak: in 1.4.1, the DataImporter class

RE: SOLR 4 getting stuck during restart

2013-01-21 Thread Dyer, James
Are you trying to build the dictionary using a warming query? I think I saw this happen once before a long time ago. I think if you are issuing a warming query with spellcheck.build=true, then you might also want to use spellcheck.collate=false. If this fixes it, could you open a bug report
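The parameter combination suggested above can be sketched as a request query string. This is only an illustration: the helper name and the query text "solr" are hypothetical, and in practice a warming query would normally be configured in solrconfig.xml rather than built in client code.

```python
from urllib.parse import urlencode


def warming_params(query):
    """Return request parameters that build the spellcheck dictionary
    while leaving collation switched off (hypothetical helper)."""
    return {
        "q": query,
        "spellcheck": "true",
        "spellcheck.build": "true",
        "spellcheck.collate": "false",
    }


# "solr" is a placeholder query, not taken from the original thread.
query_string = urlencode(warming_params("solr"))
print(query_string)
```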

RE: DataImportHandlerException: Unable to execute query with OPTIM

2013-01-15 Thread Dyer, James
I think your JDBC driver is complaining because it doesn't like what is being set for the fetch size on the Statement. Fetch size is controlled by the batchSize parameter on <dataSource />. Using batchSize=-1, I believe, is a workaround for MySQL but I suspect your driver requires it to be 0

RE: how to perform a delta-import when related table is updated

2013-01-11 Thread Dyer, James
Peter, See http://wiki.apache.org/solr/DataImportHandler#Using_delta-import_command , then scroll down to where it says "The deltaQuery in the above example only detects changes in item but not in other tables..." It shows you two ways to do it. Option 1: add a reference to the

RE: Transformers and Nested entities - order of execution

2013-01-10 Thread Dyer, James
Alexandre, Unfortunately this is poorly documented and it takes a little trial-and-error to figure out what is going on. I believe the order is this: 1. Get data from the EntityProcessor (in your case, MailEntityProcessor) 2. Run transformers on the data. 3. Run any post-transform operations

RE: OR query

2013-01-10 Thread Dyer, James
If the fields you're querying are of type String (1 token), then you need to escape the whitespace with a backslash, like this: label:ian\ paisley If they are of type Text (multiple tokens), sometimes you need to explicitly insert AND between each token, either with: label:(ian AND paisley)
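The two query forms described above can be sketched with a pair of small helpers. The function names are hypothetical, and the sketch escapes whitespace only; it does not handle the other characters that are special in Solr query syntax.

```python
import re


def string_field_query(field, phrase):
    """Query a single-token (String) field: backslash-escape each
    whitespace character so the phrase stays one term."""
    escaped = re.sub(r"(\s)", r"\\\1", phrase)
    return f"{field}:{escaped}"


def text_field_query(field, phrase):
    """Query a tokenized (Text) field: require every token by
    joining the tokens with an explicit AND."""
    return f"{field}:({' AND '.join(phrase.split())})"


print(string_field_query("label", "ian paisley"))
print(text_field_query("label", "ian paisley"))
```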

RE: OR query

2013-01-10 Thread Dyer, James
<str name="q">*:* OR (constituencies:(ian paisley) OR label:(ian paisley) OR office:(ian paisley))</str> </lst> I do get results, but I'm not sure if putting *:* at the start will break things down the line with other queries. On Thu, Jan 10, 2013 at 6:36 PM, Dyer, James james.d...@ingramcontent.com wrote

RE: search features Endeca vs Solr

2013-01-04 Thread Dyer, James
Sachin, You might get more response on this list if you can describe in a little more detail what your application needs to do. A lot of us haven't used Endeca and won't understand exactly what you mean here. With that said, I migrated a few apps from Endeca to Solr a few years back and will try to

RE: Converting fq params to Filter object

2012-12-27 Thread Dyer, James
Nalini, You could take the code from SpellCheckCollator#collate and have it issue a test query for each word individually instead of for each collation. This would do exactly what you want. See
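The idea of testing each word individually, rather than each collation, might look roughly like the sketch below. It is not the SpellCheckCollator code: the function names, the has_hits callback, and the sample data are all hypothetical, and a real implementation would issue a test query (with the fq parameters applied) where the stub does a set lookup.

```python
def filter_suggestions(suggestions, has_hits):
    """Keep only the candidate corrections that return hits.

    suggestions: dict mapping each misspelled word to its candidates.
    has_hits:    hypothetical callback that runs a test query for a
                 single word (with the filters applied) and returns
                 True if any document matches.
    """
    return {
        word: [candidate for candidate in candidates if has_hits(candidate)]
        for word, candidates in suggestions.items()
    }


# Stub standing in for the filtered index: words that would return hits.
words_matching_filters = {"paisley", "book"}

filtered = filter_suggestions(
    {"paisly": ["paisley", "parsley"], "bok": ["book", "box"]},
    lambda word: word in words_matching_filters,
)
print(filtered)
```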

RE: Converting fq params to Filter object

2012-12-27 Thread Dyer, James
are default OR. Here's the original thread about this - http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201212.mbox/%3ccamqozyftgiwyrbvwsdf0hfz1sznkq9gnbjfdb_obnelsmvr...@mail.gmail.com%3E Thanks, Nalini On Thu, Dec 27, 2012 at 2:46 PM, Dyer, James james.d...@ingramcontent.com wrote: https

RE: [DIH] Script Transformer: Is there a way to import js file?

2012-12-26 Thread Dyer, James
I'm not very familiar with using scripting languages with Java, but having seen the DIH code for this, my guess is that all script code needs to be in the <script /> section of data-config.xml. So I don't think what you want is possible. This seems like the kind of thing that would be useful if

RE: Can DataImportHandler ignore Missing Tags in XML?

2012-12-21 Thread Dyer, James
It looks from your stack trace that your XML document has a value for ChemicalNameOfSubstance yet you do not have such a column defined in schema.xml. Is this your problem? The easiest way to get Solr to ignore extra fields that you do not wish to index or store is to add a catch-all dynamic

RE: Ensuring SpellChecker returns corrections which satisfy fq params for default OR query

2012-12-20 Thread Dyer, James
the spellcheck.maxResultsForSuggest param helps with making sure that the suggestions returned satisfy the fq params? That's the main problem we're trying to solve, how often suggestions are being returned is not really an issue for us at the moment. Thanks, Nalini On Wed, Dec 19, 2012 at 4:35 PM, Dyer, James james.d

RE: order question on solr multi value field

2012-12-19 Thread Dyer, James
, Dyer, James james.d...@ingramcontent.com wrote: I would say such a guarantee is implied by the javadoc to Analyzer#getPositionIncrementGap . It says this value is an increment to be added to the next token emitted from tokenStream. http://lucene.apache.org/core/4_0_0-ALPHA/core/org/apache

RE: Ensuring SpellChecker returns corrections which satisfy fq params for default OR query

2012-12-19 Thread Dyer, James
Let me try and get a better idea of what you're after. Is it that your users might query a combination of irrelevant terms and misspelled terms, so you want the ability to ignore the irrelevant terms but still get suggestions for the misspelled terms? For instance if someone wanted
