Re: SolrCloud new....
Hi, I'm busy doing the exact same thing. I figured things out all by myself - the wiki page is a nice 'first view', but doesn't go into depth... Let's go ahead:

1) Should I copy the libraries from cloud to trunk? 2) Should I keep the cloud module in every system?
A: Yes, you should. You should get yourself the latest dev trunk and compile it. The steps I followed:
+ grab the latest trunk and build solr
+ back up all solr config files
+ in the dir tomcat6/webapps/ remove the dir 'solr'
+ copy the new solr.war (which you built in the first step) to tomcat6/webapps
+ in your Solr_home/conf dir, solrconfig.xml needs to be replaced by a new one (take it from the example dir of your build) -- some other config files (like schema.xml) you may keep using the old ones
+ adapt the new files to represent the old configuration
+ restart tomcat and it will install the new version of solr
It seems the index isn't compatible - so you need to flush your whole index and re-index all data. And finally you have your solr system back, with zookeeper integrated in the /admin zone :)

3) I am not using any cores in solr. It is a single solr in every system. Can solrcloud support it?
A: Actually you are using one core - so that gives no problem. But be sure to check that you have a solr.xml file in your solr_home dir. This file just mentions all cores - in your case just one core. (You can find examples of the layout of this file easily on http://wiki.apache.org/solr/CoreAdmin )

4) The example is given in jetty. Is it the same way to make it in tomcat?
A: Right now - it is the same way. You have to edit your /etc/init.d/tomcat6 startup script. In the start) section you can specify all the JAVA_OPTS (the ones the solrcloud wiki mentions). Be sure to set the following one: export JAVA_OPTS="$JAVA_OPTS -DhostPort=8080" (if tomcat runs on port 8080). At first I didn't -- my zookeeper pointed to the standard 8983 port, which gave errors.

In the above I gave you a quick peek at how to get the SolrCloud feature.
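For question 4, the relevant lines in the start) section of /etc/init.d/tomcat6 might look roughly like this (a sketch only: the bootstrap_confdir path is invented, the property names are the ones the SolrCloud wiki mentions, and only hostPort strictly has to match Tomcat's port):

```sh
# start) section of /etc/init.d/tomcat6
# -DzkRun         : run the embedded ZooKeeper inside this Solr node
# -DhostPort=8080 : must match the port Tomcat listens on (not Solr's 8983 default)
JAVA_OPTS="$JAVA_OPTS -DzkRun -Dbootstrap_confdir=/opt/solr/conf -DhostPort=8080"
export JAVA_OPTS
```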
In the above, ZooKeeper is embedded in one of your solr machines. If you don't want this, you may place ZooKeeper on a different machine (like I'm doing right now). If you need more help - you can contact me. Stijn Vanhoorelbeke, -- View this message in context: http://lucene.472066.n3.nabble.com/SolrCloud-new-tp1528872p2526080.html Sent from the Solr - User mailing list archive at Nabble.com.
[solrCloud] Distributed IDF - scoring in the cloud
Hi all, doing the solrCloud examples, one thing I am not clear about is the scoring in a distributed search. I did a small test where I used the "Example A: Simple two shard cluster" from wiki:SolrCloud and additionally added

java -Durl=http://localhost:7574/solr/collection1/update -jar post.jar ipod_other.xml
java -Durl=http://localhost:8983/solr/collection1/update -jar post.jar monitor2.xml

Now requesting

http://localhost:8983/solr/collection1/select?distrib=true&q=electronics&fl=score&shards=localhost:8983/solr,localhost:7574/solr

for both hosts will return the same result. Here we get the score for each hit based on the shard-specific score and merge them into one result doc. However, when I add monitor2.xml to 7574 as well, which previously did not contain it, the scoring changes depending on the server I request.

The score returned for 8983 is always <float name="score">0.09289607</float>, being distrib=true|false
The score returned for 7574 is always <float name="score">0.121383816</float>, being distrib=true|false

So is it correct to assume that if a document is indexed in both shards, the score which will predominate is the one from the host which has been requested?

My client plans to distribute the current index into different shards. For example each Consejería (counseling) should be hosted in a shard. The critical point for the client is that, for a distributed search, the scoring is the same as in the big unique index they use right now. As I understand the current solrCloud implementation, there is no concern about harmonizing the score. In my research I came across http://markmail.org/message/bhhfwymz5y7lvoj7

"The IDF part of the relevancy score is the only place that distributed search scoring won't match up with non-distributed scoring, because the document frequency used for the term is local to every core instead of global. If you distribute your documents fairly randomly to the different shards, this won't matter."
"There is a patch in the works to add global idf, but I think that even when it's committed, it will default to off because of the higher cost associated with it." The patch is https://issues.apache.org/jira/browse/SOLR-1632. However, the last comment is from 26/Jul/10 reporting that the patch failed, and a comment from Yonik gives the impression that it is not ready to use: "It looks like the issue is this: rewrite() doesn't work for function queries (there is no propagation mechanism to go through value sources). This is a problem when real queries are embedded in function queries." Is there a general interest in bringing 1632 to the trunk (especially for solrCloud)? Or may it be better to look into something that aims to scale the index into hbase, so they do not lose the scoring? TIA for your feedback -- Thorsten Scherler thorsten.at.apache.org codeBusters S.L. - web based systems consulting, training and solutions http://www.codebusters.es/
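To make the quoted point about local document frequencies concrete, here is a toy calculation (plain Python, not Solr code, using the classic Lucene idf formula 1 + ln(numDocs / (docFreq + 1)) with made-up shard counts):

```python
import math

def idf(num_docs, doc_freq):
    """Classic Lucene DefaultSimilarity idf term."""
    return 1.0 + math.log(num_docs / (doc_freq + 1.0))

# Two made-up shards of 10 docs each; "electronics" is common on A, rare on B.
idf_a = idf(10, 8)        # shard A's local view
idf_b = idf(10, 2)        # shard B's local view
idf_global = idf(20, 10)  # what one big index would compute

print(round(idf_a, 3), round(idf_b, 3), round(idf_global, 3))
# → 1.105 2.204 1.598

# A doc indexed in both shards is scored with whichever shard's local idf
# applies to the request, and neither value matches the single-index one.
assert idf_a < idf_global < idf_b
```

This matches the behavior observed above: the score of the doubly indexed document depends on which shard answers, and without global idf neither score equals the one from the unsharded index.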
Re: GET or POST for large queries?
Thanks for the tip. No, I did not know about that. Unfortunately, we use Oracle OLS which does not appear to be supported. Jan Høydahl / Cominvent wrote: Hi, There are better ways to combat row level security in search than sending huge lists of users over the wire. Have you checked out the ManifoldCF project with which you can integrate security to Solr? http://incubator.apache.org/connectors/ -- Jan Høydahl, search solution architect Cominvent AS - www.cominvent.com -- View this message in context: http://lucene.472066.n3.nabble.com/GET-or-POST-for-large-queries-tp2521700p2527765.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: GET or POST for large queries?
OK. I would ask on the mailing list of ManifoldCF to see if they have some experience with OLS. -- Jan Høydahl, search solution architect Cominvent AS - www.cominvent.com On 18. feb. 2011, at 17.29, mrw wrote: Thanks for the tip. No, I did not know about that. Unfortunately, we use Oracle OLS which does not appear to be supported. Jan Høydahl / Cominvent wrote: Hi, There are better ways to combat row level security in search than sending huge lists of users over the wire. Have you checked out the ManifoldCF project with which you can integrate security to Solr? http://incubator.apache.org/connectors/ -- Jan Høydahl, search solution architect Cominvent AS - www.cominvent.com -- View this message in context: http://lucene.472066.n3.nabble.com/GET-or-POST-for-large-queries-tp2521700p2527765.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Validate Query Syntax of Solr Request Before Sending
Hi, FYI, I found out. I'm using the SolrQueryParser (tadaa...). It needs the solrconfig.xml and solr.xml files in order to validate the query. Then I'm able to validate any query before sending it to the Solr server, thereby preventing unnecessary requests. /Christian -- View this message in context: http://lucene.472066.n3.nabble.com/Validate-Query-Syntax-of-Solr-Request-Before-Sending-tp2515797p2528183.html Sent from the Solr - User mailing list archive at Nabble.com.
Best way for a query-expander?
Hello Solr-friends, I want to implement a query-expander, one that enriches the input by the usage of extra parameters that, for example, a form may provide. Is the right way to subclass SearchHandler? Or rather to subclass QueryComponent? thanks in advance paul
Dih sproc call
I am trying to call a stored procedure using query= in DIH. I tried 'exec name', 'call name', and 'name', and none works. This is SQL Server 2008. Bill Bell Sent from mobile On Feb 18, 2011, at 10:27 AM, Paul Libbrecht p...@hoplahup.net wrote: Hello Solr-friends, I want to implement a query-expander, one that enriches the input by the usage of extra parameters that, for example, a form may provide. Is the right way to subclass SearchHandler? Or rather to subclass QueryComponent? thanks in advance paul
Re: Best way for a query-expander?
Hi Paul, what do you understand by saying extra parameters? Regards Paul Libbrecht-4 wrote: Hello Solr-friends, I want to implement a query-expander, one that enriches the input by the usage of extra parameters that, for example, a form may provide. Is the right way to subclass SearchHandler? Or rather to subclass QueryComponent? thanks in advance paul -- View this message in context: http://lucene.472066.n3.nabble.com/Best-way-for-a-query-expander-tp2528194p2528736.html Sent from the Solr - User mailing list archive at Nabble.com.
Dih sproc does not work
I am trying to call a stored procedure using query= in DIH. I tried 'exec name', 'call name', and 'name', and none works. This is SQL Server 2008. Bill Bell Sent from mobile
Re: Best way for a query-expander?
Erm... extra web-request-parameters simply. paul Le 18 févr. 2011 à 19:37, Em a écrit : Hi Paul, what do you understand by saying extra parameters? Regards Paul Libbrecht-4 wrote: Hello Solr-friends, I want to implement a query-expander, one that enriches the input by the usage of extra parameters that, for example, a form may provide. Is the right way to subclass SearchHandler? Or rather to subclass QueryComponent? thanks in advance
Re: Best way for a query-expander?
Hi Paul, a colleague and I worked on a QParserPlugin to expand alias field names to many existing field names, e.g. q=mockfield:val ==> q=actualfield1:val OR actualfield2:val, but if you want to be able to use other params that come from the HTTP request, you should use a custom RequestHandler, I think. My 2 cents, Tommaso 2011/2/18 Em mailformailingli...@yahoo.de Hi Paul, what do you understand by saying extra parameters? Regards Paul Libbrecht-4 wrote: Hello Solr-friends, I want to implement a query-expander, one that enriches the input by the usage of extra parameters that, for example, a form may provide. Is the right way to subclass SearchHandler? Or rather to subclass QueryComponent? thanks in advance paul -- View this message in context: http://lucene.472066.n3.nabble.com/Best-way-for-a-query-expander-tp2528194p2528736.html Sent from the Solr - User mailing list archive at Nabble.com.
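For anyone curious, the rewriting Tommaso describes is, at its core, simple clause expansion. A rough sketch of the idea outside Solr (plain Python with a made-up alias table; a real QParserPlugin works on parsed queries rather than raw strings):

```python
import re

# Hypothetical alias table; the field names are invented for illustration.
ALIASES = {"mockfield": ["actualfield1", "actualfield2"]}

def expand_aliases(q):
    """Rewrite alias:value clauses into an OR over the real fields."""
    def repl(m):
        field, value = m.group(1), m.group(2)
        if field not in ALIASES:
            return m.group(0)  # not an alias: leave the clause alone
        return "(" + " OR ".join(f"{f}:{value}" for f in ALIASES[field]) + ")"
    return re.sub(r"(\w+):(\w+)", repl, q)

print(expand_aliases("mockfield:val"))  # → (actualfield1:val OR actualfield2:val)
print(expand_aliases("title:val"))      # → title:val
```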
Understanding multi-field queries with q and fq
After searching this list, Google, and looking through the Pugh book, I am a little confused about the right way to structure a query. The Packt book uses the example of the MusicBrainz DB full of song metadata. What if they also had the song lyrics in English and German as files on disk, and wanted to index them along with the metadata, so that each document would basically have song title, artist, publisher, date, ..., All_Metadata (copy field of all metadata fields), Text_English, and Text_German fields? There can only be one default field, correct? So if we want to search for all songs containing (zeppelin AND (dog OR merle)), do we repeat the entire query text for all three major fields in the 'q' clause (assuming we don't want to use the cache):

q=(+All_Metadata:(zeppelin AND (dog OR merle)) +Text_English:(zeppelin AND (dog OR merle)) +Text_German:(zeppelin AND (dog OR merle)))

or repeat the entire query text for all three major fields in the 'fq' clause (assuming we want to use the cache):

q=*:*&fq=(+All_Metadata:(zeppelin AND (dog OR merle)) +Text_English:(zeppelin AND (dog OR merle)) +Text_German:(zeppelin AND (dog OR merle)))

? Thanks! -- View this message in context: http://lucene.472066.n3.nabble.com/Understanding-multi-field-queries-with-q-and-fq-tp2528866p2528866.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Dih sproc does not work
When I use 'call sprocname' it does call the procedure, but I am not getting the select into Solr. It shows 0 docs added. I am only returning 1 result set. Bill Bell Sent from mobile On Feb 18, 2011, at 11:49 AM, Bill Bell billnb...@gmail.com wrote: I am trying to call a stored procedure using query= in DIH. I tried 'exec name', 'call name', and 'name', and none works. This is SQL Server 2008. Bill Bell Sent from mobile
Re: Best way for a query-expander?
Using rb.req.getParams().get("blip") inside prepare(ResponseBuilder) of a subclass of QueryComponent, I could easily get the extra HTTP request param. However, how would I change the query? Using rb.setQuery(xxx) within that same prepare method seems to have no effect. paul Le 18 févr. 2011 à 19:51, Tommaso Teofili a écrit : Hi Paul, me and a colleague worked on a QParserPlugin to expand alias field names to many existing field names ex: q=mockfield:val == q=actualfield1:val OR actualfield2:val but if you want to be able to use other params that come from the HTTP request you should use a custom RequestHandler I think, My 2 cents, Tommaso 2011/2/18 Em mailformailingli...@yahoo.de Hi Paul, what do you understand by saying extra parameters? Regards Paul Libbrecht-4 wrote: Hello Solr-friends, I want to implement a query-expander, one that enriches the input by the usage of extra parameters that, for example, a form may provide. Is the right way to subclass SearchHandler? Or rather to subclass QueryComponent? thanks in advance paul -- View this message in context: http://lucene.472066.n3.nabble.com/Best-way-for-a-query-expander-tp2528194p2528736.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: solr.KeepWordsFilterFactory confusion
Thanks for your response. After making that change it seemed at first like it made no difference; after restarting the jetty server and reindexing the test object, the display still shows:

<arr name="format_facet">
  <str>Video</str>
  <str>Streaming Video</str>
  <str>Online</str>
  <str>Gooberhead</str>
  <str>Book of the Month</str>
</arr>

But it turns out that I had been making an incorrect assumption. I was looking at the returned stored values for the solr document, seeing the Gooberhead entry listed, and thinking that the analyzer wasn't running. However, as I have subsequently figured out, the analyzers are not run on the data that is to be stored, only on the data that is to be indexed. So after making your change to that field type, if I search for format_facet:Gooberhead I get results = 0, which is what I'd expect. But seeing that the unexpected values are still stored with the solr document, it seems that I will have to take a different approach. Thanks again. -Bob Haschart

Ahmet Arslan wrote: I've added a new field type in schema.xml:

<fieldType name="formatFacet" class="solr.StrField" sortMissingLast="true" omitNorms="true">
  <analyzer type="index">
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.KeepWordFilterFactory" words="format_facet.txt" ignoreCase="false"/>
  </analyzer>
</fieldType>

class="solr.StrField" should be class="solr.TextField"
Re: Best way for a query-expander?
it does work! Le 18 févr. 2011 à 20:48, Paul Libbrecht a écrit : using rb.req.getParams().get(blip) inside prepare(ResponseBuilder)'s subclass of QueryComponent I could easily get the extra http request param. However, how would I change the query? using rb.setQuery(xxx) within that same prepare method seems to have no effect. Sorry for the noise, it does have the exact desired effect. Nice pattern. I believe everyone needs query expansion except maybe if using Dismax. paul Le 18 févr. 2011 à 19:51, Tommaso Teofili a écrit : Hi Paul, me and a colleague worked on a QParserPlugin to expand alias field names to many existing field names ex: q=mockfield:val == q=actualfield1:val OR actualfield2:val but if you want to be able to use other params that come from the HTTP request you should use a custom RequestHandler I think, My 2 cents, Tommaso
XML Stripping from DIH
Hi all- I have some XML in a database that I am trying to index and store; I am interested in the various pieces of text, but none of the tags. I've been trying to figure out a way to strip all the tags out, but haven't found anything within Solr to do so; the XML parser seems to want XPath to get the various element values, when all I want is to turn the whole thing into one blob of text, regardless of whether it makes any contextual sense. Is there something in Solr to do this, or is it something I'd have to write myself (which I'm willing to do if necessary)? Thanks for any info, Ron
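If you do end up writing it yourself, flattening the XML before (or while) it reaches Solr is only a few lines. A sketch with Python's standard library (assuming the column holds well-formed XML; the sample markup is invented):

```python
from xml.etree import ElementTree

def strip_tags(xml_blob):
    """Flatten an XML fragment into one blob of text, discarding all tags."""
    root = ElementTree.fromstring(xml_blob)
    # itertext() walks every text node in document order, ignoring markup.
    return " ".join(t.strip() for t in root.itertext() if t.strip())

blob = "<doc><title>Hello</title><body>world <b>again</b></body></doc>"
print(strip_tags(blob))  # → Hello world again
```

The same logic could presumably also live in a DIH ScriptTransformer if you want to keep it inside the import config rather than preprocessing the database column.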
Re: solr.KeepWordsFilterFactory confusion
--- On Fri, 2/18/11, Robert Haschart rh...@virginia.edu wrote: From: Robert Haschart rh...@virginia.edu Subject: Re: solr.KeepWordsFilterFactory confusion To: solr-user@lucene.apache.org Date: Friday, February 18, 2011, 10:19 PM Thanks for your response. After making that change it seemed at first like it made no difference; after restarting the jetty server and reindexing the test object, the display still shows: <arr name="format_facet"><str>Video</str><str>Streaming Video</str><str>Online</str><str>Gooberhead</str><str>Book of the Month</str></arr> But it turns out that I had been making an incorrect assumption. I was looking at the returned stored values for the solr document, seeing the Gooberhead entry listed, and thinking that the analyzer wasn't running. However, as I have subsequently figured out, the analyzers are not run on the data that is to be stored, only on the data that is to be indexed. So after making your change to that field type, if I search for format_facet:Gooberhead I get results = 0, which is what I'd expect. But seeing that the unexpected values are still stored with the solr document, it seems that I will have to take a different approach. Facets are populated from indexed values. However deleted documents (and their terms) are not really deleted until an optimize. Issuing an optimize may help in your case.
Re: Index Design Question
Thank you. These are good general suggestions. Regarding the optimization for indexing vs. querying: are there any specific recommendations for each of those cases available somewhere? A link, for example, would be fabulous. I'm also still curious about solutions that go further. For example, there is a 2007 Lucene overview presentation by Aaron Bannert claiming that "Lucene provides built-in methods to allow queries to span multiple remote Lucene indexes." and "A much more involved way to achieving high levels of update performance can be had by dividing the data into separate 'columns', or 'silos'. Each column will hold a subset of the overall data, and will only receive updates for data that it controls. By taking advantage of the remote index merging query utility mentioned on an earlier slide, the data can still be searched in its entirety without any loss of accuracy and with negligible performance impact." Is this possible using Solr? How could this be accomplished? Again, any link would be fabulous. The wiki page http://wiki.apache.org/solr/MergingSolrIndexes seems to describe a somewhat different approach to merging. Is this something that could be integrated into master/slave replication by having two masters and one merged slave (in the above sense of separate 'columns', or 'silos')? If yes, what are the performance considerations when using it?
DIH threads
Has anyone applied the DIH threads patch on 1.4.1 (https://issues.apache.org/jira/browse/SOLR-1352)? Does anyone know if this works and/or does it improve performance? Thanks
Removing duplicates
I know that I can use the SignatureUpdateProcessorFactory to remove duplicates but I would like the duplicates in the index but remove them conditionally at query time. Is there any easy way I could accomplish this?
Re: solr current workding directory or reading config files
: I have a class (in a jar) that reads from properties (text) files. I have these : files in the same jar file as the class. : : However, when my class reads those properties files, those files cannot be found : since solr reads from tomcat's bin directory. Can you elaborate a bit more on what these jars are? ... are these Solr plugins you've written (ie: that know about the internal Solr APIs)? ... how does your jar relate to solr? are you building your own solr.war containing those jars, or are you loading it using a solr plugin lib directory? ... what do you mean by "my class reads those properties files"? ... what code are you using to read them? what log/error messages are you getting? : I don't really want to put the config files in tomcat's bin directory. in an ideal world, solr would never use the current working directory, and would only ever pay attention to the Solr Home dir and paths specifically mentioned by config directives -- but the world is not ideal, and solr definitely has some historic behavior that does utilize the CWD. But if you are using Solr's ResourceLoader API in your plugin, it should actively try to find your resource in a multitude of places (if it's not an absolute path). need more specifics to understand exactly what is going wrong for you though. -Hoss
Re: Help migrating from Lucene
: to our indexing service are defined in a central interface. Here is an : example of a query executed from a programmatically constructed Lucene : query. ... : solrQuery.setQuery(query.toString()); first of all, be advised that Query.toString() is not guaranteed to produce a string that the Lucene QueryParser can parse back into a real query. If you are programmatically building up a Lucene query just to format it back as a string, you should probably consider just programmatically building up the Solr query string. Second: you should also consider the fact that there may be better ways to express your query to solr that are more efficient, or do what you want better than what you had before (ie: some of those MUST clauses you had are probably meant to act as filters, which don't need to influence the scores, and are most likely reused on many queries -- in which case specifying them using fq instead of q is going to make things simpler/faster and give you better relevancy scores on your real user input). : How can I set the sort into the java client? Did you look at the SolrQuery.addSortField method? : Also, with the annotations of Pojo's outlined here. ... : How are sets handled? For instance, how are Lists of other POJO's added to : the document? i had no idea, but a google search for solrj annotation beans led me... http://lucene.472066.n3.nabble.com/Does-SolrJ-support-nested-annotated-beans-td868375.html ...and then to... https://issues.apache.org/jira/browse/SOLR-1945 -Hoss
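Hoss's two points (build the request parameters directly instead of round-tripping through Query.toString(), and move reusable MUST clauses into fq) can be sketched outside of SolrJ like this (plain Python with invented field names, just to show the shape of the request):

```python
from urllib.parse import urlencode

def build_params(user_input, filters):
    """Keep the user's words in q (they should influence scoring) and
    push the reusable MUST clauses into fq (cached, no score impact)."""
    params = [("q", user_input)]
    for field, value in filters.items():
        params.append(("fq", f"{field}:{value}"))
    return urlencode(params)

print(build_params("zeppelin dog", {"type": "song", "lang": "en"}))
# → q=zeppelin+dog&fq=type%3Asong&fq=lang%3Aen
```

In SolrJ the same split is a matter of calling setQuery for the user input and addFilterQuery for each filter clause, rather than concatenating everything into one parsed string.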
Re: Dih sproc call
: References: ce2ecd6b-7a3f-4669-972d-492ab89c8...@hoplahup.net : In-Reply-To: ce2ecd6b-7a3f-4669-972d-492ab89c8...@hoplahup.net : Subject: Dih sproc call http://people.apache.org/~hossman/#threadhijack Thread Hijacking on Mailing Lists When starting a new discussion on a mailing list, please do not reply to an existing message, instead start a fresh email. Even if you change the subject line of your email, other mail headers still track which thread you replied to and your question is hidden in that thread and gets less attention. It makes following discussions in the mailing list archives particularly difficult. -Hoss
Re: Best way for a query-expander?
: I want to implement a query-expander, one that enriches the input by the : usage of extra parameters that, for example, a form may provide. : : Is the right way to subclass SearchHandler? : Or rather to subclass QueryComponent? This smells like the poster child for an X/Y problem (or maybe an X/(Y OR Z) problem)... if you can elaborate a bit more on the type of enrichment you want to do, it's highly likely that your goal can be met w/o needing to write a custom plugin (i'm thinking particularly of the multitudes of parsers solr already has, local params, and variable substitution) http://people.apache.org/~hossman/#xyproblem XY Problem Your question appears to be an XY Problem ... that is: you are dealing with X, you are assuming Y will help you, and you are asking about Y without giving more details about the X so that we can understand the full issue. Perhaps the best solution doesn't involve Y at all? See Also: http://www.perlmonks.org/index.pl?node_id=542341 -Hoss
Re: DIH threads
I used it on 4.0 and it did not help us. We were bound on SQL I/O. Bill Bell Sent from mobile On Feb 18, 2011, at 4:47 PM, Mark static.void@gmail.com wrote: Has anyone applied the DIH threads patch on 1.4.1 (https://issues.apache.org/jira/browse/SOLR-1352)? Does anyone know if this works and/or does it improve performance? Thanks
adding a TimerTask
Hi, How can I add a TimerTask to Solr? Tri
Remove part of keywords from existing index and merging new index
Hello, I am not sure if it is possible. 1. I have a document of 100MB. I want to remove keywords starting with a specific pattern, e.g. abc*, so all keywords starting with abc* in the index will be removed, and I don't need to reindex the document again. 2. I have another document of 100KB. I want to append the new document to an existing one, without the need to reindex the existing document again. I believe (2) is possible, but not sure about (1). Thanks.
Indexing AutoCAD files
Hi team, Is there a way lucene can index AutoCAD files - *.dwg files? If so, please let me know. Can you please provide some insight on the same? Thanks in advance.. Regards Vignesh
Index Autocad
Hi team, Is there a way lucene can index AutoCAD files - "*.dwg" files? If so, please let me know. Can you please provide some insight on the same? Thanks in advance.. Regards Vignesh