Solr 4.x EdgeNGramFilterFactory and highlighting

2014-01-31 Thread Dmitriy Shvadskiy
…though EdgeNGramFilterFactory generates multiple ngrams (see image attached). Questions: 1. How do I accomplish highlighting on partial matches in Solr 4.x? 2. Did the behavior of EdgeNGramFilterFactory change between 3.6 and 4.x? Highlighting on partial matches works fine in Solr 3.6. Thank you, Dmitriy Shvads…
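
A minimal SolrJ sketch of the kind of query that can be used to compare the 3.6 and 4.x highlighting behavior, assuming a hypothetical field title_ngram whose index-time analyzer includes EdgeNGramFilterFactory:

    import org.apache.solr.client.solrj.SolrQuery;
    import org.apache.solr.client.solrj.impl.HttpSolrServer;
    import org.apache.solr.client.solrj.response.QueryResponse;

    public class HighlightCheck {
      public static void main(String[] args) throws Exception {
        HttpSolrServer server = new HttpSolrServer("http://localhost:8983/solr/collection1");
        // Partial term; the edge n-grams produced at index time should match it.
        SolrQuery q = new SolrQuery("title_ngram:shva");
        q.setHighlight(true);
        q.addHighlightField("title_ngram");
        q.setHighlightSnippets(1);
        QueryResponse rsp = server.query(q);
        // Compare this output between Solr 3.6 and 4.x.
        System.out.println(rsp.getHighlighting());
      }
    }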

Re: Problem running Solr indexing in Amazon EMR

2013-08-12 Thread Dmitriy Shvadskiy
Michael, We replaced the Lucene jars but ran into a problem with an incompatible version of Apache HttpComponents. Still figuring it out. Dmitriy

Re: Problem running Solr indexing in Amazon EMR

2013-08-12 Thread Dmitriy Shvadskiy
Michael, The Amazon Hadoop distribution has Lucene 2.9.4 jars in its /lib directory, and they conflict with the Solr 4.4 we are using. Once we get past that problem, we run into the Apache HttpComponents conflict you describe. I think the best bet would be for us to build our own AMI to avoid these dependencies.

Re: Problem running Solr indexing in Amazon EMR

2013-08-11 Thread Dmitriy Shvadskiy
Erick, It is actually supposed to be just one version of Solr, the one bundled with our map/reduce jar. To be clear: the map/reduce job is generating a new index, not reading an existing one. But it fails even earlier, as an instance of EmbeddedSolrServer is created at the first line of the following code.
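
For reference, a minimal sketch of the EmbeddedSolrServer setup described here, written against the Solr 4.4 API; the solr home path and core name are placeholders:

    import org.apache.solr.client.solrj.embedded.EmbeddedSolrServer;
    import org.apache.solr.core.CoreContainer;

    public class EmbeddedSetup {
      public static EmbeddedSolrServer open(String solrHome, String coreName) {
        // solrHome must contain solr.xml plus the core's conf/ directory,
        // all of which has to be available on the node running the reducer.
        CoreContainer container = new CoreContainer(solrHome);
        container.load();
        return new EmbeddedSolrServer(container, coreName);
      }
    }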

Re: Problem running Solr indexing in Amazon EMR

2013-08-11 Thread Dmitriy Shvadskiy
Erick, Thank you for the reply. The Cloudera image includes Solr 4.3. I'm not sure what version Amazon EMR includes. We are not directly referencing or using their version of Solr; instead we build our jar against Solr 4.4 and include all dependencies in our jar file. Also, the error occurs not while read…

Problem running Solr indexing in Amazon EMR

2013-08-09 Thread Dmitriy Shvadskiy
Hello, We are trying to utilize Amazon Elastic MapReduce to build Solr indexes. We are using embedded Solr in the Reduce phase to create the actual index. However, we run into the following error and are not sure what is causing it. The Solr version is 4.4. The job runs fine locally in the Cloudera CDH 4.3 VM. T…

Re: Dismax mm per field

2011-08-03 Thread Dmitriy Shvadskiy
…e an mm with dismax, but mm=100% might be what you mean. Of course, one of those queries could also not be dismax at all, but the ordinary Lucene query parser or anything else. And of course you could have the same query text for nested queries, repeating e.g. "blah blah" in both…

Dismax mm per field

2011-08-03 Thread Dmitriy Shvadskiy
Hello, Is there a way to apply the (e)dismax mm parameter per field? If I have the query field1:(blah blah) AND field2:(foo bar), is there a way to apply mm only to field2? Thanks, Dmitriy
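
One possible approach (a sketch, not something confirmed in the thread): keep field1 as a plain Lucene clause and push only field2 through edismax as a nested query, where mm can be given as a local parameter; mm=100% is an assumed value.

    import org.apache.solr.client.solrj.SolrQuery;

    public class PerFieldMm {
      public static SolrQuery build() {
        // Raw form: q=field1:(blah blah) AND _query_:"{!edismax qf=field2 mm=100% v='foo bar'}"
        SolrQuery q = new SolrQuery();
        q.setQuery("field1:(blah blah) AND _query_:\"{!edismax qf=field2 mm=100% v='foo bar'}\"");
        return q;
      }
    }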

Boosting non synonyms result

2011-05-17 Thread Dmitriy Shvadskiy
Hello, Is there a way to boost a result that is an exact match, as opposed to a synonym match, when using query-time synonyms? Given the query John Smith and the synonyms Jonathan,Jonathan,John,Jon,Nat,Nathan, I'd like results containing John Smith to be ranked higher than Jonathan Smith. My thinking was t…
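
A common workaround, sketched under assumptions rather than taken from the thread: copy the text into a second field whose analyzer has no SynonymFilterFactory and boost that field, so literal matches outscore synonym expansions. The field names name_syn and name_exact are hypothetical.

    import org.apache.solr.client.solrj.SolrQuery;

    public class ExactOverSynonym {
      public static SolrQuery build(String userQuery) {
        SolrQuery q = new SolrQuery(userQuery);   // e.g. "John Smith"
        q.set("defType", "edismax");
        // name_exact has no synonym filter, so a literal "John Smith" match
        // contributes extra score over a synonym expansion like "Jonathan Smith".
        q.set("qf", "name_syn name_exact^10");
        q.set("pf", "name_exact^20");             // optional phrase boost on the exact field
        return q;
      }
    }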

Re: Specifying returned fields

2011-01-12 Thread Dmitriy Shvadskiy
Thanks, Gora. The workaround of loading fields via LukeRequestHandler and building fl from it will work for what we need. However, it takes 15 seconds per core, and we have 15 cores. The query I'm running is /admin/luke?show=schema. Is there a way to limit the query to return just the fields? Thanks, Dmitriy
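
A SolrJ sketch of the Luke call; setting numTerms=0 skips the per-field top-terms computation, which is usually the slow part (that this accounts for the 15 seconds here is an assumption):

    import org.apache.solr.client.solrj.SolrServer;
    import org.apache.solr.client.solrj.impl.HttpSolrServer;
    import org.apache.solr.client.solrj.request.LukeRequest;
    import org.apache.solr.client.solrj.response.LukeResponse;

    public class ListFields {
      public static void main(String[] args) throws Exception {
        SolrServer server = new HttpSolrServer("http://localhost:8983/solr/core0");
        LukeRequest luke = new LukeRequest();   // hits /admin/luke
        luke.setShowSchema(true);
        luke.setNumTerms(0);                    // do not walk the index for top terms
        LukeResponse rsp = luke.process(server);
        for (String field : rsp.getFieldInfo().keySet()) {
          System.out.println(field);
        }
      }
    }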

Specifying returned fields

2011-01-12 Thread Dmitriy Shvadskiy
Hello, I know you can explicitly specify the list of fields returned via fl=field1,field2,field3. Is there a way to specify "return all fields but field1 and field2"? Thanks, Dmitriy

Best way to check Solr index for completeness

2010-09-28 Thread Dmitriy Shvadskiy
Hello, What would be the best way to check a Solr index against the original system (database) to make sure the index is up to date? I can use Solr fields like Id and timestamp to check against the appropriate fields in the database. Our index currently contains over 2 million documents across several cores. Pulling all…
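
A sketch of the Solr side of such a check, written against the SolrJ 4.x API for illustration: pull only Id and timestamp in sorted pages so the 2+ million documents never have to be fetched whole (field names follow the post; the page size is arbitrary).

    import org.apache.solr.client.solrj.SolrQuery;
    import org.apache.solr.client.solrj.impl.HttpSolrServer;
    import org.apache.solr.client.solrj.response.QueryResponse;
    import org.apache.solr.common.SolrDocument;

    public class DumpIdsAndTimestamps {
      public static void main(String[] args) throws Exception {
        HttpSolrServer server = new HttpSolrServer("http://localhost:8983/solr/core0");
        int rows = 10000;
        for (int start = 0; ; start += rows) {
          SolrQuery q = new SolrQuery("*:*");
          q.setFields("Id", "timestamp");
          q.setSort("Id", SolrQuery.ORDER.asc);  // stable order for paging and for the DB comparison
          q.setStart(start);
          q.setRows(rows);
          QueryResponse rsp = server.query(q);
          if (rsp.getResults().isEmpty()) break;
          for (SolrDocument doc : rsp.getResults()) {
            System.out.println(doc.getFieldValue("Id") + "\t" + doc.getFieldValue("timestamp"));
          }
        }
      }
    }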