Re: deleteById without solrj?

2009-12-04 Thread Erik Hatcher
Also note that the XML that can be POSTed to /solr/update can also be
sent as a content stream on the URL for a plain GET request:


/solr/update?stream.body=<delete><id>...</id></delete>&commit=true
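
For example, the same mechanism handles delete-by-query (a sketch; the query
value is illustrative, and the XML would be URL-encoded in a real request):

/solr/update?stream.body=<delete><query>category:discontinued</query></delete>&commit=true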

Erik


On Dec 3, 2009, at 3:05 PM, Tom Hill wrote:


http://wiki.apache.org/solr/UpdateXmlMessages#A.22delete.22_by_ID_and_by_Query

On Thu, Dec 3, 2009 at 11:57 AM, Joel Nylund jnyl...@yahoo.com  
wrote:



Is there a URL-based approach to delete a document?

thanks
Joel






Debian Lenny + Apache Tomcat 5.5 + Solr 1.4

2009-12-04 Thread rajan chandi
Hi All,

We've deployed 4 instances of Solr on a Debian server.

It takes only 1.5 GB of RAM on a local Ubuntu machine, but 2.0 GB plus on the
Debian Lenny server.

Any ideas/pointers will help.

Regards
Rajan


Re: Issues with alphanumeric search terms

2009-12-04 Thread AHMET ARSLAN
 I have added
     <filter class="solr.WordDelimiterFilterFactory" catenateAll="1"/>
 to both index and query, but am still getting the same behaviour.
 
 Is there anything else that I am missing?
 

Did you re-start tomcat and re-index? Why not use StandardTokenizerFactory?





Re: creating Lucene document from an external XML file.

2009-12-04 Thread Phanindra Reva
Hello..,
  You have mentioned that I can make use of the UpdateProcessor API.
May I know when the flow of execution enters the
UpdateRequestProcessor class? To be brief, it would be perfect for
my case if it happens after analysis but exactly before the document is added
to the index.
Thanks a lot.

On Wed, Dec 2, 2009 at 8:56 PM, Chris Hostetter
hossman_luc...@fucit.org wrote:

 : //   one possibility to think about is that instead of modifying the
 documents
 : before sending them to Solr, you could write an UpdateProcessor that runs
 : directly in Solr and gets access to those Documents after Solr has already
 : parsed that XML (or even if the documents come from someplace else, like
 : DIH, or a CSV file) and then make your changes.  //
 :        I have not decided to modify documents; instead I go for
 : modifying them at run time (modifying the Java object's variables that
 : contain information extracted from the document file).
 : my question is : Is there any part of the API which takes a document-file
 : path as input, returns a Java object, and gives us a way to modify it
 : in between, before sending the same object for indexing (to the
 : IndexWriter - Lucene API).

 Yes ... as I mentioned, the UpdateProcessor API is where you have access to
 the Documents as Lucene objects inside of Solr before they are indexed.
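
 For reference, a minimal processor sketch (class and field names here are
 illustrative, not from this thread; the factory must also be registered in
 an updateRequestProcessorChain in solrconfig.xml):

 import java.io.IOException;
 import org.apache.solr.common.SolrInputDocument;
 import org.apache.solr.request.SolrQueryRequest;
 import org.apache.solr.request.SolrQueryResponse;
 import org.apache.solr.update.AddUpdateCommand;
 import org.apache.solr.update.processor.UpdateRequestProcessor;
 import org.apache.solr.update.processor.UpdateRequestProcessorFactory;

 public class MyProcessorFactory extends UpdateRequestProcessorFactory {
   @Override
   public UpdateRequestProcessor getInstance(SolrQueryRequest req,
       SolrQueryResponse rsp, UpdateRequestProcessor next) {
     return new UpdateRequestProcessor(next) {
       @Override
       public void processAdd(AddUpdateCommand cmd) throws IOException {
         // The parsed document, before it is turned into a Lucene Document.
         SolrInputDocument doc = cmd.solrDoc;
         doc.addField("processed_b", true); // modify it here
         super.processAdd(cmd);             // then pass it down the chain
       }
     };
   }
 }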



 -Hoss




Re: Issues with alphanumeric search terms

2009-12-04 Thread Erick Erickson
As Ahmet says, you need to re-index.

Nothing about WordDelimiterFilterFactory alters case, as far as I can tell
from
http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.WordDelimiterFilterFactory

Are you applying this in addition to the LowerCaseTokenizerFactory? In
that case it's too late. The numbers have already been stripped...

Please get a copy of Luke and examine your index to see what actually
gets indexed, it'll give you a *much* better idea of what the various
analyzers actually put in your index.

Best
Erick

On Fri, Dec 4, 2009 at 6:57 AM, AHMET ARSLAN iori...@yahoo.com wrote:

  I have added
      <filter class="solr.WordDelimiterFilterFactory" catenateAll="1"/>
  to both index and query, but am still getting the same behaviour.
 
  Is there anything else that I am missing?
 

 Did you re-start tomcat and re-index? Why not use StandardTokenizerFactory?






how to get list of unique terms for a field

2009-12-04 Thread Joel Nylund

Hi,

Let's say I have a field called countryName. Is there a way to get a
list of all the countries for this field? I'm trying to figure out a nice
way to keep my categories and the Solr results in sync; it would be nice
to get these from Solr instead of the database.


thanks
Joel



Re: how to get list of unique terms for a field

2009-12-04 Thread Erik Hatcher


On Dec 4, 2009, at 8:59 AM, Joel Nylund wrote:
Let's say I have a field called countryName. Is there a way to get a
list of all the countries for this field? I'm trying to figure out a
nice way to keep my categories and the Solr results in sync; it would
be nice to get these from Solr instead of the database.


A couple of ways, depending on what you want:

  1) faceting, as part of the search results, filtered within
constraints (q/fq's):  facet=on&facet.field=countryName


  2) TermsComponent: http://wiki.apache.org/solr/TermsComponent
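
For example, as URLs (sketches against a default setup; facet.limit=-1 and
terms.limit=-1 ask for all values):

  /solr/select?q=*:*&rows=0&facet=on&facet.field=countryName&facet.limit=-1
  /solr/terms?terms.fl=countryName&terms.limit=-1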

Erik



Re: Debian Lenny + Apache Tomcat 5.5 + Solr 1.4

2009-12-04 Thread Yonik Seeley
Are you explicitly setting the heap sizes?  If not, the JVM is
deciding for itself based on what the box looks like (RAM, CPUs, OS,
etc).  Are they both the same architecture (32-bit or 64-bit)?
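
For example, on Tomcat an explicit heap can be pinned via JAVA_OPTS (values
illustrative):

  JAVA_OPTS="-Xms512m -Xmx512m"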

-Yonik
http://www.lucidimagination.com

p.s. in general cross-posting to both solr-user and solr-dev is discouraged.


On Fri, Dec 4, 2009 at 5:27 AM, rajan chandi chandi.ra...@gmail.com wrote:
 Hi All,

 We've deployed 4 instances of Solr on a Debian server.

 It takes only 1.5 GB of RAM on a local Ubuntu machine, but 2.0 GB plus on
 the Debian Lenny server.

 Any ideas/pointers will help.

 Regards
 Rajan



Re: dismax query syntax to replace standard query

2009-12-04 Thread javaxmlsoapdev

Thanks. When I do it that way, it gives me the following query:

params={indent=on&start=0&q=risk+test&qt=dismax&fq=statusName:(Male+OR+Female)+name:Joe&hl=on&rows=10&version=2.2}
hits=63 status=0 QTime=54

I typed 'Risk test' (no quotes in the text) into the text field in the UI. I
want the search to AND the statusName and name attributes (all
attributes in the fq param).

Following is my dismax configuration in solrconfig.xml:

<requestHandler name="dismax" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="defType">dismax</str>
    <str name="echoParams">explicit</str>
    <float name="tie">0.01</float>
    <str name="qf">
      title^2 description
    </str>
    <str name="pf">
      title description
    </str>
    <str name="mm">
      2&lt;-1 5&lt;-2 6&lt;90%
    </str>
    <int name="ps">100</int>
    <str name="q.alt">*:*</str>
    <str name="hl.fl">title description</str>
    <str name="f.name.hl.fragsize">10</str>
    <str name="f.name.hl.alternateField">title</str>
    <str name="f.text.hl.fragmenter">regex</str>
  </lst>
</requestHandler>

And schema.xml has:

<defaultSearchField>title</defaultSearchField>
<solrQueryParser defaultOperator="OR"/> <-- when I change this to AND, it
ANDs all the params in fq but also ANDs the words in the text field
(e.g. risk+test) and doesn't return me results.

Basically I want ORing between the words in the q param and ANDing between
the params in the fq list.
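
For reference, separate fq parameters are always intersected (ANDed), while
mm controls how the q words combine, so something like this sketch against
the handler above may be what's wanted:

  /solr/issues/select/?qt=dismax&q=risk+test&fq=statusName:(Male+OR+Female)&fq=name:Joe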

Any pointers would be appreciated.

Thanks,


isugar wrote:
 
 I believe you need to use the fq parameter with dismax (not to be confused
 with qf) to add a filter query in addition to the q parameter.
 
 So your text search value goes in q parameter (which searches on the
 fields
 you configure) and the rest of the query goes in the fq.
 
 Would that work?
 
 On Thu, Dec 3, 2009 at 7:28 PM, javaxmlsoapdev vika...@yahoo.com wrote:
 

 I have configured the dismax handler to search against both title &
 description fields; now I have some other attributes on the page, e.g.
 status, name etc. On the search page I have three fields for the user to
 input search values:

 1) Free text search field (which searches against both title &
 description)
 2) Status (multi select dropdown)
 3) Name (single select dropdown)

 I want to form a query like textField1:value AND status:(Male OR Female)
 AND
 name:abc. I know the first (textField1:value) searches against both title &
 description, as that's how I have configured dismax in the
 configuration,
 but I am not sure how I can AND the other attributes (in my case status &
 name).

 note: the standard query looks like the following (w/o using the dismax
 handler):
 title:test description:test name:Joe statusName:(Male OR Female)
 --
 View this message in context:
 http://old.nabble.com/dismax-query-syntax-to-replace-standard-query-tp26631725p26631725.html
 Sent from the Solr - User mailing list archive at Nabble.com.


 
 

-- 
View this message in context: 
http://old.nabble.com/dismax-query-syntax-to-replace-standard-query-tp26631725p26635928.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: latency in solr response is observed after index is updated

2009-12-04 Thread Bharath Venkatesh

Hi Kay Kay ,
 We have commented out the autoCommit frequency in solrconfig.xml.

Below is the cache configuration:

<filterCache
  class="solr.LRUCache"
  size="512"
  initialSize="512"
  autowarmCount="256"/>

<!-- queryResultCache caches results of searches - ordered lists of
     document ids (DocList) based on a query, a sort, and the range
     of documents requested. -->
<queryResultCache
  class="solr.LRUCache"
  size="512"
  initialSize="512"
  autowarmCount="256"/>

<!-- documentCache caches Lucene Document objects (the stored fields for each
     document). Since Lucene internal document ids are transient, this cache
     will not be autowarmed. -->
<documentCache
  class="solr.LRUCache"
  size="512"
  initialSize="512"
  autowarmCount="0"/>

Will further requests after the index is updated wait for auto-warming to
complete?

Thanks,
Bharath


Kay Kay wrote:
 What is the average doc size?  What is the autoCommit frequency set in
 solrconfig.xml?

 Another place to look at is the field cache size and the nature of warmup 
 queries run after a new searcher is created ( happens due to a commit ).



 Bharath Venkatesh wrote:
 Hi Kalidoss,
  I am not aware of using solr-config for committing the document, but I
 have mentioned below how we update and commit documents:

 curl http://solr_url/update --data-binary @feeds.xml -H
 'Content-type:text/xml; charset=utf-8'
 curl http://solr_url/update --data-binary '<commit/>' -H
 'Content-type:text/xml; charset=utf-8'

 where feeds.xml contains the document in xml format

 we have master and slave replication for solr server.

 updates happen on the master; snappuller and snapinstaller are run on the
 slaves periodically.
 Queries don't happen on the master, only on the slaves.

 Is there anything that can be said from the above information?

 Thanks,
 Bharath



 -Original Message-
 From: kalidoss [mailto:kalidoss.muthuramalin...@sifycorp.com]
 Sent: Tue 12/1/2009 2:38 PM
 To: solr-user@lucene.apache.org
 Subject: Re: latency in solr response  is observed  after index is updated
  
 Are you using solr-config for committing the document?

 bharath venkatesh wrote:
  
 Hi,

  We are observing latency (sometimes huge, up to 10-20 secs) in the
 solr response after the index is updated. What is the reason for this
 latency, and how can it be minimized?
 Note: our index size is pretty large.

 Any help would be appreciated, as we are largely affected by it.

 Thanks in advance.
 Bharath
 



edismax using bigrams instead of phrases?

2009-12-04 Thread Bill Dueber
I've started trying edismax, and have noticed that my relevancy ranking is
messed up with edismax because, according to the debug output, it's using
bigrams instead of phrases and inexplicably ignoring a couple of the pf
fields. While the hit count isn't changing,  this kills my ability to boost
exact title matches (or, I would guess, exact-anything-else matches, too).

debugQuery output can be seen at:

http://paste.lisp.org/display/91582

That's the exact same query except for the defType.

Note that instead of looking in the 'pf' fields for the search string "gone
with the wind", it's looking individually for "gone with", "with the", and
"the wind".

edismax is also completely ignoring the title_a and title_ab fields, which
are defined as exactmatcher as follows.

<!-- Full string, stripped of \W and lowercased, for exact and left-anchored
matching -->
<fieldType name="exactmatcher" class="solr.TextField" omitNorms="true">
  <analyzer>
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="schema.UnicodeNormalizationFilterFactory"
      version="icu4j" composed="false" remove_diacritics="true"
      remove_modifiers="true" fold="true"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.TrimFilterFactory"/>
    <filter class="solr.PatternReplaceFilterFactory"
      pattern="[^\p{L}\p{N}]" replacement=" " replace="all"
    />
  </analyzer>
</fieldType>


Any help on this would be much appreciated.


-- 
Bill Dueber
Library Systems Programmer
University of Michigan Library


Re: how to get list of unique terms for a field

2009-12-04 Thread Bill Dueber
Here's a pretty simple Perl script. Call it as scriptname facetindex (or
scriptname facetindex maxnum):

#

#!/usr/local/bin/perl
use strict;
use JSON::XS;
use LWP::Simple;

### CHANGE THIS TO YOUR URL!! ###

my $select = 'http://solr-vufind:8026/solr/biblio/select';


# Get facet and (optional) maxnum from the command line
my ($facet, $num) = @ARGV;
$num ||= -1; # all values

my $url =
  "$select?q=*:*&rows=0&facet=true&facet.limit=$num&facet.field=$facet&wt=json&json.nl=arrarr";
my $json = decode_json(get($url));

foreach my $a (@{$json->{facet_counts}{facet_fields}{$facet}}) {
    print $a->[0], "\n";
}
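
With json.nl=arrarr, each facet entry comes back as a [term, count] pair, so
printing $a->[1] as well would give the counts. For the countryName example
from the earlier thread, an invocation might look like (script name
illustrative):

  perl facet_values.pl countryName 100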


RE: search on tomcat server

2009-12-04 Thread Jill Han
I went through all the links on http://wiki.apache.org/solr/#Search_and_Indexing
and still have no clue as to how to proceed.
1. Do I have to do some implementation in order to get Solr to search documents
on the Tomcat server?
2. If I have files such as .doc, .docx, .pdf, .jsp, .html, etc. under Windows
XP in c:/tomcat/webapps/test1, /webapps/test2,
   what should I do to make Solr search those directories?
3. Since I am using Tomcat instead of Jetty, is there any demo that shows the
Solr searching features and real searching results?

Thanks,
Jill 


-Original Message-
From: Shalin Shekhar Mangar [mailto:shalinman...@gmail.com] 
Sent: Monday, November 30, 2009 10:40 AM
To: solr-user@lucene.apache.org
Subject: Re: search on tomcat server

On Mon, Nov 30, 2009 at 9:55 PM, Jill Han jill@alverno.edu wrote:

 I got solr running on the tomcat server,
 http://localhost:8080/solr/admin/

 After I enter a search word, such as solr, and hit the Search button, it
 will go to

 http://localhost:8080/solr/select/?q=solr&version=2.2&start=0&rows=10&indent=on

 and display

 <?xml version="1.0" encoding="UTF-8" ?>
 <response>
   <lst name="responseHeader">
     <int name="status">0</int>
     <int name="QTime">0</int>
     <lst name="params">
       <str name="rows">10</str>
       <str name="start">0</str>
       <str name="indent">on</str>
       <str name="q">solr</str>
       <str name="version">2.2</str>
     </lst>
   </lst>
   <result name="response" numFound="0" start="0" />
 </response>

 My question is: what is the next step to search files on the Tomcat server?



Looks like you have not added any documents to Solr. See the Indexing
Documents section at http://wiki.apache.org/solr/#Search_and_Indexing

-- 
Regards,
Shalin Shekhar Mangar.


Re: edismax using bigrams instead of phrases?

2009-12-04 Thread Yonik Seeley
On Fri, Dec 4, 2009 at 11:26 AM, Bill Dueber b...@dueber.com wrote:
 I've started trying edismax, and have noticed that my relevancy ranking is
 messed up with edismax because, according to the debug output, it's using
 bigrams instead of phrases and inexplicably ignoring a couple of the pf
 fields. While the hit count isn't changing,  this kills my ability to boost
 exact title matches (or, I would guess, exact-anything-else matches, too).

It's a feature in general - the problem with putting all the terms in
a single phrase query is that you get no boosting at all if all of the
terms don't appear.

But since it may be useful as an option, perhaps we should add the
single-phrase option to extended dismax as well.

 edismax is also completely ignoring the title_a and title_ab fields, which
 are defined as exactmatcher as follows.

I believe this is because extended dismax only adds phrases for
boosting... hence if a field type outputs a single token, it's
considered redundant with the main query.  This is an optimization to
speed up queries (esp single-word queries).
Perhaps one way to fix this would be to check if the pf is in the qf
list before removing single term phrases?

-Yonik
http://www.lucidimagination.com


Re: edismax using bigrams instead of phrases?

2009-12-04 Thread Bill Dueber
I see that edismax already defines pf (bigrams) and pf3 (trigrams) -- how
would folks think about just calling them pf / pf1 (aliases for each
other?), pf2, and pf3? The pf would then behave exactly as it does in
dismax.

And it sounds like the solution to my single-token fields is to just move
them into the query itself.

Thanks!

On Fri, Dec 4, 2009 at 11:58 AM, Yonik Seeley yo...@lucidimagination.comwrote:

 On Fri, Dec 4, 2009 at 11:26 AM, Bill Dueber b...@dueber.com wrote:
  I've started trying edismax, and have noticed that my relevancy ranking
 is
  messed up with edismax because, according to the debug output, it's using
  bigrams instead of phrases and inexplicably ignoring a couple of the pf
  fields. While the hit count isn't changing,  this kills my ability to
 boost
  exact title matches (or, I would guess, exact-anything-else matches,
 too).

 It's a feature in general - the problem with putting all the terms in
 a single phrase query is that you get no boosting at all if all of the
 terms don't appear.

 But since it may be useful as an option, perhaps we should add the
 single-phrase option to extended dismax as well.

  edismax is also completely ignoring the title_a and title_ab fields,
 which
  are defined as exactmatcher as follows.

 I believe this is because extended dismax only adds phrases for
 boosting... hence if a field type outputs a single token, it's
 considered redundant with the main query.  This is an optimization to
 speed up queries (esp single-word queries).
 Perhaps one way to fix this would be to check if the pf is in the qf
 list before removing single term phrases?

 -Yonik
 http://www.lucidimagination.com




-- 
Bill Dueber
Library Systems Programmer
University of Michigan Library


Re: question about schemas

2009-12-04 Thread solr-user


Lance Norskog-2 wrote:
 
 But, in general, this is a shopping cart database and Solr/Lucene may
 not be the best fit for this problem.
 

True, every tool has strengths and weaknesses. Given how powerful Solr
appears to be, I would be surprised if I were not able to handle this use
case.


Lance Norskog-2 wrote:
 
 You can make a separate facet field which contains a range of buckets:
 10, 20, 50, or 100 means that the field has a value 0-10, 11-20, 21-50, or
 51-100. You could use a separate filter query with values for these
 buckets. Filter queries are very fast in Solr 1.4 and this would limit
 your range query execution to documents which match the buckets.
 

Thank you for this suggestion.  I will look into this.

-- 
View this message in context: 
http://old.nabble.com/question-about-schemas-tp26600956p26636155.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: High add/delete rate and index fragmentation

2009-12-04 Thread Rodrigo De Castro
On Thu, Dec 3, 2009 at 3:59 PM, Lance Norskog goks...@gmail.com wrote:

 #2: The standard architecture is with a master that only does indexing
 and one or more slaves that only handle queries. The slaves poll the
 master for index updates regularly. Solr 1.4 has a built-in system for
 this.



How do you achieve durability with the standard architecture? For one of our
use cases (which does not have much churn), we are considering this
architecture, but I don't want an update to be lost if the master goes down
before the slaves update. What I was thinking initially is that this could be
achieved by having a master per datacenter, which would synchronously update
the other masters through a RequestHandler. That way I could guarantee
durability, but of course this architecture would have issues of its own,
like how to handle masters no longer being in sync when there is a network
partition. Is there some work being done to address this use case?



 An alternate architecture has multiple servers which do both indexing
 and queries in the same index. This provides the shortest pipeline
 time from recieving the data to making it available for search.



For our use case where there is a high add/delete rate, I was thinking of
using this architecture, as I noticed that records become available right
away. But in this case we have the concern about how well it performs when
adding/deleting. I did an initial test adding many thousands of elements and
did not see any degradation; that's why I asked about its performance when
deleting records (since deletion only marks documents as deleted, and we have
some control over the automatic segment merging, I guess this is not much of
a problem).

Rodrigo



 On Wed, Dec 2, 2009 at 2:43 PM, Jason Rutherglen
 jason.rutherg...@gmail.com wrote:
  Rodrigo,
 
  It sounds like you're asking about near realtime search support,
  I'm not sure.  So here's few ideas.
 
  #1 How often do you need to be able to search on the latest
  updates (as opposed to updates from lets say, 10 minutes ago)?
 
  To topic #2, Solr provides master slave replication. The
  optimize would happen on the master and the new index files
  replicated to the slave(s).
 
  #3 is a mixed bag at this point, and there is no official
  solution, yet. Shell scripts, and a load balancer could kind of
  work. Check out SOLR-1277 or SOLR-1395 for progress along these
  lines.
 
  Jason
  On Wed, Dec 2, 2009 at 11:53 AM, Rodrigo De Castro rodr...@sacaluta.com
 wrote:
  We are considering Solr to store events which will be added and deleted
 from
  the index in a very fast rate. Solr will be used, in this case, to find
 the
  right event we need to process (since they may have several attributes
 and
  we may search the best match based on the query attributes). Our
  understanding is that the common use cases are those wherein the read
 rate
  is much higher than writes, and deletes are not as frequent, so we are
 not
  sure Solr handles our use case very well or if it is the right fit.
 Given
  that, I have a few questions:
 
  1 - How does Solr/Lucene degrade with the fragmentation? That would
 probably
  determine the rate at which we would need to optimize the index. I
 presume
  that it depends on the rate of insertions and deletions, but would you
 have
  any benchmark on this degradation? Or, in general, how has been your
  experience with this use case?
 
  2 - Optimizing seems to be a very expensive process. While optimizing
 the
  index, how much does search performance degrade? In this case, having a
 huge
  degradation would not allow us to optimize unless we switch to another
 copy
  of the index while optimize is running.
 
  3 - In terms of high availability, what has been your experience
 detecting
  failure of master and having a slave taking over?
 
  Thanks,
  Rodrigo
 
 



 --
 Lance Norskog
 goks...@gmail.com



Solr 1.4: StringIndexOutOfBoundsException in SpellCheckComponent with HTMLStripCharFilterFactory

2009-12-04 Thread Robin Wojciki
I am running a search in Solr 1.4 and I am getting the
StringIndexOutOfBoundsException pasted below. The spell check field
uses HTMLStripCharFilterFactory. However, the search works fine if I
do not use the HTMLStripCharFilterFactory.

If I set a breakpoint at SpellCheckComponent.java:248, the value of
the variable best is as shown in this screenshot:
http://yfrog.com/j5solrdebuginspectp

At the end of the first iteration, offset = 5 - (24 - 0) = -19.
This causes the index out of bounds exception.

The spell check field is defined as:

<fieldType name="text_spell" class="solr.TextField"
    positionIncrementGap="100">
  <analyzer>
    <charFilter class="solr.HTMLStripCharFilterFactory"/>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.StandardFilterFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.StopFilterFactory"
        ignoreCase="true" words="stopwords.txt"
        enablePositionIncrements="true"/>
    <filter class="solr.SynonymFilterFactory"
        synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
</fieldType>



Stack Trace:
=
String index out of range: -19

java.lang.StringIndexOutOfBoundsException: String index out of range: -19
at java.lang.AbstractStringBuilder.replace(Unknown Source)
at java.lang.StringBuilder.replace(Unknown Source)
at 
org.apache.solr.handler.component.SpellCheckComponent.toNamedList(SpellCheckComponent.java:248)
at 
org.apache.solr.handler.component.SpellCheckComponent.process(SpellCheckComponent.java:143)
at 
org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:195)
at 
org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
at org.apache.solr.core.SolrCore.execute(SolrCore.java:1316)
at 
org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:338)
at 
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:241)
at 
org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1089)
at 
org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:365)
at 
org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
at 
org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:181)
at 
org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:712)
at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:405)
at 
org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:211)
at 
org.mortbay.jetty.handler.HandlerCollection.handle(HandlerCollection.java:114)
at 
org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:139)
at org.mortbay.jetty.Server.handle(Server.java:285)
at 
org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:502)
at 
org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:821)
at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:513)
at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:208)
at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:378)
at 
org.mortbay.jetty.bio.SocketConnector$Connection.run(SocketConnector.java:226)
at 
org.mortbay.thread.BoundedThreadPool$PoolThread.run(BoundedThreadPool.java:442)


Re: question about schemas (and SOLR-1131?)

2009-12-04 Thread wojtekpia

Could this be solved with a multi-valued custom field type (including a
custom comparator)? The OP's situation deals with multi-valuing products for
each customer. If products contain strictly numeric fields then it seems
like a custom field implementation (or extension of BinaryField?) *should*
be easy - only the comparator part needs work. I'm not clear on how the
existing query parsers would handle this though, so there's probably some
work there too. SOLR-1131 seems like a more general solution that supports
analysis that numeric fields don't need.


gdeconto wrote:
 
 I saw an interesting thread in the solr-dev forum about multiple fields
 per fieldtype (https://issues.apache.org/jira/browse/SOLR-1131)
 
 from the sounds of it, it might be of interest and/or use in these types
 of problems;  for your example, you might be able to define a fieldtype
 that houses the product data.
 
  note that I only skimmed the thread. Hopefully, I'll get some time to
  look at it more closely.
 

-- 
View this message in context: 
http://old.nabble.com/question-about-schemas-tp26600956p26636170.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: HTML Stripping slower in Solr 1.4?

2009-12-04 Thread Robin Wojciki
Thanks, Koji, for logging the ticket. I noticed its priority is set to
minor. Is there any workaround? I feel like I am half as
productive, as every iteration is taking twice as much time.

Thanks
Robin

On Tue, Dec 1, 2009 at 11:47 AM, Koji Sekiguchi k...@r.email.ne.jp wrote:
 Robin,

 Thank you for reporting this. Performance degradation of HTML Stripper
 could be in 1.4. I opened a ticket in Lucene:

 https://issues.apache.org/jira/browse/LUCENE-2098

 Koji

 --
 http://www.rondhuit.com/en/




Grouping

2009-12-04 Thread Bruno
Is there a way to make a "group by" or "distinct" query?

-- 
Bruno Morelli Vargas
Mail: brun...@gmail.com
Msn: brun...@hotmail.com
Icq: 165055101
Skype: morellibmv


Re: creating Lucene document from an external XML file.

2009-12-04 Thread Otis Gospodnetic
I think you'd have to dig into Solr (Lucene actually) to inject yourself after
analysis.  The UpdateRequestProcessor, as the name implies, is at the request
level, so pretty high up/early on.

Otis
--
Sematext -- http://sematext.com/ -- Solr - Lucene - Nutch



- Original Message 
 From: Phanindra Reva reva.phanin...@gmail.com
 To: solr-user@lucene.apache.org
 Sent: Fri, December 4, 2009 7:48:46 AM
 Subject: Re: creating Lucene document from an external XML file.
 
 Hello..,
   You have mentioned that I can make use of the UpdateProcessor API.
 May I know when the flow of execution enters the
 UpdateRequestProcessor class? To be brief, it would be perfect for
 my case if it happens after analysis but exactly before the document is
 added to the index.
 Thanks a lot.
 
 On Wed, Dec 2, 2009 at 8:56 PM, Chris Hostetter
 wrote:
 
  : //   one possibility to think about is that instead of modifying the
 documents
  : before sending them to Solr, you could write an UpdateProcessor that runs
  : directly in Solr and gets access to those Documents after Solr has already
  : parsed that XML (or even if the documents come from someplace else, like
  : DIH, or a CSV file) and then make your changes.  //
  :    I have not decided to modify documents; instead I go for
  : modifying them at run time (modifying the Java object's variables that
  : contain information extracted from the document file).
  : my question is : Is there any part of the API which takes a document-file
  : path as input, returns a Java object, and gives us a way to modify it
  : in between, before sending the same object for indexing (to the
  : IndexWriter - Lucene API).

  Yes ... as I mentioned, the UpdateProcessor API is where you have access to
  the Documents as Lucene objects inside of Solr before they are indexed.
 
 
 
  -Hoss
 
 



Re: Grouping

2009-12-04 Thread Otis Gospodnetic
Not out of the box.  You could group by using SOLR-236 perhaps?

Otis
--
Sematext -- http://sematext.com/ -- Solr - Lucene - Nutch



- Original Message 
 From: Bruno brun...@gmail.com
 To: solr-user@lucene.apache.org
 Sent: Fri, December 4, 2009 1:08:59 PM
 Subject: Grouping
 
 Is there a way to make a group by or distinct query?
 
 -- 
 Bruno Morelli Vargas
 Mail: brun...@gmail.com
 Msn: brun...@hotmail.com
 Icq: 165055101
 Skype: morellibmv



Re: High add/delete rate and index fragmentation

2009-12-04 Thread Otis Gospodnetic
Hello,

 You are right that we would need near realtime support. The problem is not
 so much about new records becoming available, but guaranteeing that deleted
 records will not be returned. For this reason, our plan would be to update
 and search a master index, provided that: (1) search while updating records
 is ok, 

It is in general, though I haven't fully tested NRT under high load.

 (2) performance is not degraded substantially due to fragmentation,

You can control that somewhat via mergeFactor.
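
e.g. via solrconfig.xml (10 is the usual default; higher defers merging and
leaves more segments, lower merges more aggressively):

  <mergeFactor>10</mergeFactor>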

 (3) optimization does not impact search, 

It will - disk IO, OS cache, and such will be affected, and that will affect 
search.

 and (4) we ensure durability - if a
 node goes down, an update was replicated to another node who can take over.

Maybe just index to >1 masters?  For example, another non-search tool I'm
using (Voldemort) has the notion of required writes, which represents how
many copies of the data should be written at insert/add time.

 It seems that 1 and 2 are not so much of a problem, 3 would need to be
 tested. I would like know more about how 4 has been addressed, so we don't
 lose updates if a master goes down between updates and index replication.

Lucene buffers documents while indexing, to avoid constant disk writes.  The
HDD itself does some of that, too.  So I think you can always lose some data
if whatever is in the buffers doesn't get flushed when somebody trips over
the power cord in the data center.

Otis

  #3 is a mixed bag at this point, and there is no official
  solution, yet. Shell scripts, and a load balancer could kind of
  work. Check out SOLR-1277 or SOLR-1395 for progress along these
  lines.
 
 
 Thanks for the links.
 
 Rodrigo
 
 
  On Wed, Dec 2, 2009 at 11:53 AM, Rodrigo De Castro 
  wrote:
   We are considering Solr to store events which will be added and deleted
  from
   the index in a very fast rate. Solr will be used, in this case, to find
  the
   right event we need to process (since they may have several attributes
  and
   we may search the best match based on the query attributes). Our
   understanding is that the common use cases are those wherein the read
  rate
   is much higher than writes, and deletes are not as frequent, so we are
  not
   sure Solr handles our use case very well or if it is the right fit. Given
   that, I have a few questions:
  
   1 - How does Solr/Lucene degrade with the fragmentation? That would
  probably
   determine the rate at which we would need to optimize the index. I
  presume
   that it depends on the rate of insertions and deletions, but would you
  have
   any benchmark on this degradation? Or, in general, how has been your
   experience with this use case?
  
   2 - Optimizing seems to be a very expensive process. While optimizing the
   index, how much does search performance degrade? In this case, having a
  huge
   degradation would not allow us to optimize unless we switch to another
  copy
   of the index while optimize is running.
  
   3 - In terms of high availability, what has been your experience
  detecting
   failure of master and having a slave taking over?
  
   Thanks,
   Rodrigo
  
 



how to do auto-suggest case-insensitive match and return original case field values

2009-12-04 Thread hermida

Hi everyone,

New to the forum and to Solr; doing my first major project with it and
enjoying it so far, great software.

In my web application I want to set up auto-suggest-as-you-type
functionality which will search case-insensitively yet return the original-
case terms.  It doesn't seem like TermsComponent can do this, as it can only
return the lowercased indexed terms you are searching against, not the
original-case terms.

There was one post on this forum,
http://old.nabble.com/Auto-suggest..-how-to-do-mixed-case-td2410.html#a24143981
where someone asked the same question, and the advice given was:

"There is no way to do this right now using TermsComponent. You can index
lower case terms and store the mixed case terms. Then you can use a prefix
query which will return documents (and hence stored field values)."

So this got me started, I set out to use Solr Query instead of
TermsComponent to try to do this.  I did the following as mentioned:

<fieldType name="test" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.KeywordTokenizerFactory"/>
  </analyzer>
</fieldType>

<fieldType name="test_lc" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

<field name="test" type="test" indexed="false" stored="true"
multiValued="true" />
<field name="test_lc" type="test_lc" indexed="true" stored="false"
multiValued="true" />

And used copyField to populate the test_lc field:

<copyField source="test" dest="test_lc"/>

This is the easy part (the forum user didn't explain the hard part!). It is
very hard to get the same information that TermsComponent returns using the
regular Solr query functionality!  For example:

http://localhost:8983/solr/terms?terms.fl=test_lc&terms.prefix=a&terms.sort=count&terms.limit=5&omitHeader=true

<int name="a-kinase anchor protein 13">15</int>
<int name="accn5">6</int>
<int name="actin-binding">3</int>
<int name="activator">1</int>
<int name="agie-bp1">1</int>

which provides useful sorting by, and returning of, term frequency counts in
your index.  How does one get this same information with a regular Solr
query?  I set up the following prefix query, searching on the indexed
lowercased field and returning the other:

http://localhost:8983/solr/select?fl=test&q=test_lc%3Aa*&sort=score+desc&rows=5&omitHeader=true

<doc>
  <arr name="test">
    <str>3D-structure</str>
    <str>acetylation</str>
    <str>alternative promoter usage</str>
    <str>HLC-7</str>
  </arr>
</doc>
<doc>
  <arr name="test">
    <str>alternative splicing</str>
    <str>complete proteome</str>
    <str>DNA-binding</str>
    <str>RACK1</str>
  </arr>
</doc>
<doc>
  <arr name="test">
    <str>acetylation</str>
    <str>AIG21</str>
    <str>WD repeat</str>
    <str>GNB2L1</str>
  </arr>
</doc>
<doc>
  <arr name="test">
    <str>3D-structure</str>
    <str>apoptosis</str>
    <str>cathepsin G-like 1</str>
    <str>ATSGL1</str>
    <str>CTLA-1</str>
  </arr>
</doc>
<doc>
  <arr name="test">
    <str>autoantigen Ge-1</str>
    <str>autoantigen RCD-8</str>
    <str>HERV-H LTR-associating protein 3</str>
    <str>HHLA3</str>
  </arr>
</doc>

I can see how to process this in my front-end app to extract the original
terms starting with the prefix letter(s) used in the query, but there are
still some major problems when compared to TermsComponent:

- How do I make sure my auto-suggest list is at least a certain number of
terms long?  Using rows of course doesn't work like terms.limit, because
the same term can appear across the returned docs, and duplicates get
collapsed.
- How do I get term frequency counts like TermsComponent does?  I looked at
faceting but I don't understand how to get the TermsComponent behavior using
it.
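
(For reference, prefix faceting can produce counts much like TermsComponent;
a sketch assuming the fields above:

  /solr/select?q=*:*&rows=0&facet=on&facet.field=test_lc&facet.prefix=a&facet.limit=5

though, like TermsComponent, it returns the lowercased indexed terms rather
than the stored original-case values.)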

Sorry for the long message, just wanted to fully explain, thanks for any
help!

leandro

-- 
View this message in context: 
http://old.nabble.com/how-to-do-auto-suggest-case-insensitive-match-and-return-original-case-field-values-tp26636365p26636365.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: search on tomcat server

2009-12-04 Thread William Pierce

Have you gone through the solr tomcat wiki?

http://wiki.apache.org/solr/SolrTomcat

I found this very helpful when I did our solr installation on tomcat.

- Bill

--
From: Jill Han jill@alverno.edu
Sent: Friday, December 04, 2009 8:54 AM
To: solr-user@lucene.apache.org
Subject: RE: search on tomcat server

I went through all the links on
http://wiki.apache.org/solr/#Search_and_Indexing
and still have no clue as to how to proceed.
1. Do I have to do some implementation in order to get Solr to search
documents on the Tomcat server?
2. If I have files such as .doc, .docx, .pdf, .jsp, .html, etc. under
Windows XP in c:/tomcat/webapps/test1, /webapps/test2,

   what should I do to make Solr search those directories?
3. Since I am using Tomcat instead of Jetty, is there any demo that shows
the Solr searching features and real searching results?


Thanks,
Jill


-Original Message-
From: Shalin Shekhar Mangar [mailto:shalinman...@gmail.com]
Sent: Monday, November 30, 2009 10:40 AM
To: solr-user@lucene.apache.org
Subject: Re: search on tomcat server

On Mon, Nov 30, 2009 at 9:55 PM, Jill Han jill@alverno.edu wrote:


I got solr running on the tomcat server,
http://localhost:8080/solr/admin/

After I enter a search word, such as solr, and hit the Search button, it
will go to

http://localhost:8080/solr/select/?q=solr&version=2.2&start=0&rows=10&indent=on

and display

<?xml version="1.0" encoding="UTF-8" ?>
<response>
  <lst name="responseHeader">
    <int name="status">0</int>
    <int name="QTime">0</int>
    <lst name="params">
      <str name="rows">10</str>
      <str name="start">0</str>
      <str name="indent">on</str>
      <str name="q">solr</str>
      <str name="version">2.2</str>
    </lst>
  </lst>
  <result name="response" numFound="0" start="0" />
</response>

My question is: what is the next step to search files on the Tomcat server?




Looks like you have not added any documents to Solr. See the Indexing
Documents section at http://wiki.apache.org/solr/#Search_and_Indexing

--
Regards,
Shalin Shekhar Mangar.



Dumping solr requests for indexing

2009-12-04 Thread Teruhiko Kurosaka
Is there any way to dump all incoming requests to Solr
into a file?

My customer is seeing a strange problem of disappearing
docs from the index, and I'd like to ask them to capture all
incoming requests.

Thanks.

-kuro 


Re: Dumping solr requests for indexing

2009-12-04 Thread Otis Gospodnetic
The solr log, as well as the servlet container log should have them all.

Otis
--
Sematext -- http://sematext.com/ -- Solr - Lucene - Nutch



- Original Message 
 From: Teruhiko Kurosaka k...@basistech.com
 To: solr-user@lucene.apache.org solr-user@lucene.apache.org
 Sent: Fri, December 4, 2009 2:23:17 PM
 Subject: Dumping solr requests for indexing
 
 Is there any way to dump all incoming requests to Solr
 into a file?
 
 My customer is seeing a strange problem of disappearing
 docs from index and I'd like to ask them to capture all
 incoming requests.
 
 Thanks.
 
 -kuro 



Best way to handle bitfields in solr...

2009-12-04 Thread William Pierce
Folks:

In my db I currently have fields that represent bitmasks.  Thus, for example,
a mask value of 48 might represent an undergraduate (value = 16) and a
graduate (value = 32).  Currently, the corresponding field in Solr is a
multi-valued string field called EdLevel which will have
<value>Undergraduate</value> and <value>Graduate</value> as its two values
(for this example).  I do the conversion from the int into the list of values
as I do the indexing.
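
A minimal sketch of that index-time conversion (field and label names are
from the example above; the bit assignments are illustrative):

import java.util.ArrayList;
import java.util.List;

public class EdLevelMask {
    // Each set bit in the mask maps to one EdLevel value.
    static List<String> decode(int mask) {
        List<String> values = new ArrayList<String>();
        if ((mask & 16) != 0) values.add("Undergraduate");
        if ((mask & 32) != 0) values.add("Graduate");
        return values;
    }
    // decode(48) -> [Undergraduate, Graduate]
}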

Ideally, I'd like solr to have bitwise operations so that I could store the int 
value, and then simply search by using bit operations.  However, given that 
this is not possible,  and that there have been recent threads speaking to 
performance issues with multi-valued fields,  is there something better I could 
do?

TIA,

- Bill

Re: Best way to handle bitfields in solr...

2009-12-04 Thread Otis Gospodnetic
Would http://wiki.apache.org/solr/FunctionQuery#fieldvalue help?

 Otis
--
Sematext -- http://sematext.com/ -- Solr - Lucene - Nutch



- Original Message 
 From: William Pierce evalsi...@hotmail.com
 To: solr-user@lucene.apache.org
 Sent: Fri, December 4, 2009 2:43:25 PM
 Subject: Best way to handle bitfields in solr...
 
 Folks:
 
  In my db I currently have fields that represent bitmasks.  Thus, for
 example, a mask value of 48 might represent an undergraduate (value = 16)
 and a graduate (value = 32).  Currently, the corresponding field in Solr is
 a multi-valued string field called EdLevel which will have
 "Undergraduate" and "Graduate" as its two values (for
 this example).  I do the conversion from the int into the list of values as
 I do the indexing.
 
 Ideally, I'd like solr to have bitwise operations so that I could store the 
 int 
 value, and then simply search by using bit operations.  However, given that 
 this 
 is not possible,  and that there have been recent threads speaking to 
 performance issues with multi-valued fields,  is there something better I 
 could 
 do?
 
 TIA,
 
 - Bill



RE: Dumping solr requests for indexing

2009-12-04 Thread Teruhiko Kurosaka
The log only gives high-level descriptions of what was done.
I'd like to capture the exact XML requests with data, so that
I could re-feed them to Solr to reproduce the issue my
customer is encountering.

-kuro  

 -Original Message-
 From: Otis Gospodnetic [mailto:otis_gospodne...@yahoo.com] 
 Sent: Friday, December 04, 2009 11:41 AM
 To: solr-user@lucene.apache.org
 Subject: Re: Dumping solr requests for indexing
 
 The solr log, as well as the servlet container log should 
 have them all.
 
 Otis
 --
 Sematext -- http://sematext.com/ -- Solr - Lucene - Nutch
 
 
 
 - Original Message 
  From: Teruhiko Kurosaka k...@basistech.com
  To: solr-user@lucene.apache.org solr-user@lucene.apache.org
  Sent: Fri, December 4, 2009 2:23:17 PM
  Subject: Dumping solr requests for indexing
  
  Is there any way to dump all incoming requests to Solr into a file?
  
  My customer is seeing a strange problem of disappearing docs from 
  index and I'd like to ask them to capture all incoming requests.
  
  Thanks.
  
  -kuro
 
 

Question: Write to Solr but not via http, and still store date_format

2009-12-04 Thread Peter 4U

Hi Solr team,

 

Has anyone been able to write to Solr, keeping things like 'date_format', but 
indexing directly, rather than via http?

 

I've been indexing using Lucene Java, and this works well and is very fast,
except that any data indexed this way doesn't store date_format et al
information (date.format results always return 0).

I like indexing directly into Lucene, rather than via http requests, as it is 
much faster, particularly at very high input rates.

 

Anyone encountered this and managed to solve it?

 

Many thanks,

peter

 
  

Re: Question: Write to Solr but not via http, and still store date_format

2009-12-04 Thread Otis Gospodnetic
Are you looking for http://wiki.apache.org/solr/EmbeddedSolr ?

 Otis
--
Sematext -- http://sematext.com/ -- Solr - Lucene - Nutch



- Original Message 
 From: Peter 4U pete...@hotmail.com
 To: Solr solr-user@lucene.apache.org
 Sent: Fri, December 4, 2009 3:09:19 PM
 Subject: Question: Write to Solr but not via http, and still store date_format
 
 
 Hi Solr team,
 
 
 
 Has anyone been able to write to Solr, keeping things like 'date_format', but 
 indexing directly, rather than via http?
 
 
 
 I've been indexing using Lucene Java, and this works well and is very fast, 
 except that any data indexed this way doesn't store date_format et al 
  information (date.format results always return 0).
 
 I like indexing directly into Lucene, rather than via http requests, as it is 
 much faster, particularly at very high input rates.
 
 
 
 Anyone encountered this and managed to solve it?
 
 
 
 Many thanks,
 
 peter
 
 
   



Answer: RE: Question: Write to Solr but not via http, and still store date_format

2009-12-04 Thread Peter 4U

Oops, of course the answer was staring me in the face!

   -- Use the EmbeddedSolrServer, rather than the CommonsHttpSolrServer.
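
A minimal sketch along the lines of the EmbeddedSolr wiki page (the solr home
path is illustrative; initialize() declares checked exceptions, so handle or
declare them in real code):

import org.apache.solr.client.solrj.SolrServer;
import org.apache.solr.client.solrj.embedded.EmbeddedSolrServer;
import org.apache.solr.core.CoreContainer;

// Point at an existing solr home (conf/solrconfig.xml, conf/schema.xml).
System.setProperty("solr.solr.home", "/path/to/solr/home");
CoreContainer.Initializer initializer = new CoreContainer.Initializer();
CoreContainer coreContainer = initializer.initialize();
// "" selects the default core; updates go through the full Solr
// update chain (schema, analysis, date handling) without HTTP.
SolrServer server = new EmbeddedSolrServer(coreContainer, "");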

 

Live and learn. Live. and learn.

 

Thanks,

Peter

 


 
 From: pete...@hotmail.com
 To: solr-user@lucene.apache.org
 Subject: Question: Write to Solr but not via http, and still store date_format
 Date: Fri, 4 Dec 2009 20:09:19 +
 
 
 Hi Solr team,
 
 
 
 Has anyone been able to write to Solr, keeping things like 'date_format', but 
 indexing directly, rather than via http?
 
 
 
 I've been indexing using Lucene Java, and this works well and is very fast, 
 except that any data indexed this way doesn't store date_format et al 
  information (date.format results always return 0).
 
 I like indexing directly into Lucene, rather than via http requests, as it is 
 much faster, particularly at very high input rates.
 
 
 
 Anyone encountered this and managed to solve it?
 
 
 
 Many thanks,
 
 peter
 
 
 

Re: Dumping solr requests for indexing

2009-12-04 Thread Otis Gospodnetic
Aha!
Sounds like a job for a simple, custom UpdateRequestProcessor.  Actually, I
think the URP doesn't get access to the actual XML, but what it does have
access to may be enough for you:
http://wiki.apache.org/solr/UpdateRequestProcessor

Alternatively, unpack the war, add a custom logging servlet filter, chain it
in web.xml, and that might do the trick.
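
A minimal filter sketch for the second approach (class name illustrative; it
logs only the request line, since capturing the POST body would additionally
need an HttpServletRequestWrapper that tees the input stream):

import java.io.IOException;
import javax.servlet.*;
import javax.servlet.http.HttpServletRequest;

public class RequestLogFilter implements Filter {
    public void init(FilterConfig cfg) {}
    public void destroy() {}
    public void doFilter(ServletRequest req, ServletResponse res,
            FilterChain chain) throws IOException, ServletException {
        HttpServletRequest r = (HttpServletRequest) req;
        // Log method, URI, and query string for every incoming request.
        System.err.println(r.getMethod() + " " + r.getRequestURI()
            + (r.getQueryString() != null ? "?" + r.getQueryString() : ""));
        chain.doFilter(req, res);
    }
}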

Otis
--
Sematext -- http://sematext.com/ -- Solr - Lucene - Nutch



- Original Message 
 From: Teruhiko Kurosaka k...@basistech.com
 To: solr-user@lucene.apache.org solr-user@lucene.apache.org
 Sent: Fri, December 4, 2009 3:05:57 PM
 Subject: RE: Dumping solr requests for indexing
 
 Log only tells high-level descriptions of what were done.
 I'd like to capture the exact XML requests with data, so that
 I could re-feed it to Solr to reproduce the issue my
 customer is encountering.
 
 -kuro  
 
  -Original Message-
  From: Otis Gospodnetic [mailto:otis_gospodne...@yahoo.com] 
  Sent: Friday, December 04, 2009 11:41 AM
  To: solr-user@lucene.apache.org
  Subject: Re: Dumping solr requests for indexing
  
  The solr log, as well as the servlet container log should 
  have them all.
  
  Otis
  --
  Sematext -- http://sematext.com/ -- Solr - Lucene - Nutch
  
  
  
  - Original Message 
   From: Teruhiko Kurosaka 
   To: solr-user@lucene.apache.org 
   Sent: Fri, December 4, 2009 2:23:17 PM
   Subject: Dumping solr requests for indexing
   
   Is there any way to dump all incoming requests to Solr into a file?
   
   My customer is seeing a strange problem of disappearing docs from 
   index and I'd like to ask them to capture all incoming requests.
   
   Thanks.
   
   -kuro
  
  



how to set multiple fq while building a query in solrj

2009-12-04 Thread javaxmlsoapdev

How do I create a query string with multiple fq params using the solrj
SolrQuery API?

e.g. I want to build a query as follows:

http://servername:port/solr/issues/select/?q=testing&fq=statusName:(Female
OR Male)&fq=name:Joe

I am using solrj client APIs to build the query, using SolrQuery as follows:

solrQuery.setParam("fq", statusString);
solrQuery.setParam("fq", nameString);

It only sets the last fq (fq=nameString) in the string. If I switch the above
setParam order, it sets fq=statusString. How do I set multiple fq params in
the SolrQuery object?
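
For reference, a sketch using SolrQuery's append-style method, which adds a
new fq parameter on each call instead of replacing the existing one:

solrQuery.addFilterQuery("statusName:(Female OR Male)");
solrQuery.addFilterQuery("name:Joe");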

Thanks,
-- 
View this message in context: 
http://old.nabble.com/how-to-set-multiple-fq-while-building-a-query-in-solrj-tp26638650p26638650.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: how is score computed with hsin functionquery?

2009-12-04 Thread gdeconto

Thanks, Lance, I appreciate your response.

I know what a DIH is and have already written custom transformers.  I just
misunderstood your response to my message (I wasn't aware that we could use
JS to create transformers).

Anyhow, my intent is to change the tool (create a variation of hsin that
supports degrees) rather than change the data (which introduces other issues,
such as having to support most systems in degrees and this one system in
radians).

Any ideas/advice in that regard?
-- 
View this message in context: 
http://old.nabble.com/how-is-score-computed-with-hsin-functionquery--tp26504265p26638720.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: HTML Stripping slower in Solr 1.4?

2009-12-04 Thread Yonik Seeley
Is BaseCharFilter required for the html strip filter?

-Yonik
http://www.lucidimagination.com


On Tue, Dec 1, 2009 at 1:17 AM, Koji Sekiguchi k...@r.email.ne.jp wrote:
 Robin,

 Thank you for reporting this. Performance degradation of HTML Stripper
 could be in 1.4. I opened a ticket in Lucene:

 https://issues.apache.org/jira/browse/LUCENE-2098

 Koji

 --
 http://www.rondhuit.com/en/




Query time boosting with dismax

2009-12-04 Thread Girish Redekar
Hi,

Is it possible to weigh specific query terms with a Dismax query parser? Is
it possible to write queries of the sort ...
field1:(term1)^2.0 + (term2^3.0)
with dismax?

Thanks,
Girish Redekar
http://girishredekar.net


Re: Debian Lenny + Apache Tomcat 5.5 + Solr 1.4

2009-12-04 Thread rajan chandi
We are using a 64-bit VM with a 64-bit JDK on it.
It is a 2.00 GB RAM Xen instance.

We're setting a max JVM heap size of 1800 MB.

- Rajan


On Fri, Dec 4, 2009 at 8:19 PM, Yonik Seeley yo...@lucidimagination.comwrote:

 Are you explicitly setting the heap sizes?  If not, the JVM is
 deciding for itself based on what the box looks like (ram, cpus, OS,
 etc).  Are they both the same architecture (32 bit or 64 bit?)

 -Yonik
 http://www.lucidimagination.com

 p.s. in general cross-posting to both solr-user and solr-dev is
 discouraged.


 On Fri, Dec 4, 2009 at 5:27 AM, rajan chandi chandi.ra...@gmail.com
 wrote:
  Hi All,
 
  We've deployed 4 instances of Solr on a debian server.
 
  It is taking only 1.5 GB of RAM on local ubuntu machine but it is taking
 2.0
  GB plus on Debian Lenny server.
 
  Any ideas/pointers will help.
 
  Regards
  Rajan
 



Re: Debian Lenny + Apache Tomcat 5.5 + Solr 1.4

2009-12-04 Thread rajan chandi
My local Ubuntu 9.04 64-bit machine taking 1.5 GB is not a VM; the Debian
Lenny 64-bit machine taking 2 GB is a Xen instance.

- Rajan

On Sat, Dec 5, 2009 at 10:51 AM, rajan chandi chandi.ra...@gmail.comwrote:

 We are using a 64-bit VM with a 64-bit JDK on it.
 It is a 2.00 GB RAM Xen instance.

 We're setting a max JVM heap size of 1800 MB.

 - Rajan



 On Fri, Dec 4, 2009 at 8:19 PM, Yonik Seeley 
 yo...@lucidimagination.comwrote:

 Are you explicitly setting the heap sizes?  If not, the JVM is
 deciding for itself based on what the box looks like (ram, cpus, OS,
 etc).  Are they both the same architecture (32 bit or 64 bit?)

 -Yonik
 http://www.lucidimagination.com

 p.s. in general cross-posting to both solr-user and solr-dev is
 discouraged.


 On Fri, Dec 4, 2009 at 5:27 AM, rajan chandi chandi.ra...@gmail.com
 wrote:
  Hi All,
 
  We've deployed 4 instances of Solr on a debian server.
 
  It is taking only 1.5 GB of RAM on local ubuntu machine but it is taking
 2.0
  GB plus on Debian Lenny server.
 
  Any ideas/pointers will help.
 
  Regards
  Rajan
 





Re: Debian Lenny + Apache Tomcat 5.5 + Solr 1.4

2009-12-04 Thread rajan chandi
The local Solr machine doesn't look like 64-bit:

ra...@rajan-desktop:~$ uname -a
Linux rajan-desktop 2.6.28-16-server #55-Ubuntu SMP Tue Oct 20 20:50:00 UTC
2009 i686 GNU/Linux


But the Xen Solr server does:

ra...@rajan-desktop:~$ uname -a
Linux rajan-desktop 2.6.28-16-server #55-Ubuntu SMP Tue Oct 20 20:50:00 UTC
2009 i686 GNU/Linux


Maybe that is the reason why the server is taking more RAM.

Thanks all for your responses.

Regards
Rajan

On Sat, Dec 5, 2009 at 11:06 AM, rajan chandi chandi.ra...@gmail.comwrote:

 My local ubuntu 9.04 64 bit taking 1.5 GB is not a VM and Debian Lenny 64
 bit taking 2 GB is a Xen Instance.

 - Rajan


 On Sat, Dec 5, 2009 at 10:51 AM, rajan chandi chandi.ra...@gmail.comwrote:

  We are using a 64-bit VM with a 64-bit JDK on it.
  It is a 2.00 GB RAM Xen instance.

  We're setting a max JVM heap size of 1800 MB.

 - Rajan



 On Fri, Dec 4, 2009 at 8:19 PM, Yonik Seeley 
 yo...@lucidimagination.comwrote:

 Are you explicitly setting the heap sizes?  If not, the JVM is
 deciding for itself based on what the box looks like (ram, cpus, OS,
 etc).  Are they both the same architecture (32 bit or 64 bit?)

 -Yonik
 http://www.lucidimagination.com

 p.s. in general cross-posting to both solr-user and solr-dev is
 discouraged.


 On Fri, Dec 4, 2009 at 5:27 AM, rajan chandi chandi.ra...@gmail.com
 wrote:
  Hi All,
 
  We've deployed 4 instances of Solr on a debian server.
 
  It is taking only 1.5 GB of RAM on local ubuntu machine but it is
 taking 2.0
  GB plus on Debian Lenny server.
 
  Any ideas/pointers will help.
 
  Regards
  Rajan
 






Re: Query time boosting with dismax

2009-12-04 Thread Otis Gospodnetic
Terms no, but fields (with terms) and phrases, yes.
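
e.g. per-field boosts go in the qf parameter and phrase boosts in pf (field1
is from this thread; field2 and the weights are illustrative):

  qf=field1^2.0 field2^3.0
  pf=field1^5.0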


Otis
--
Sematext -- http://sematext.com/ -- Solr - Lucene - Nutch



- Original Message 
 From: Girish Redekar girish.rede...@aplopio.com
 To: solr-user@lucene.apache.org
 Sent: Fri, December 4, 2009 11:42:16 PM
 Subject: Query time boosting with dismax
 
 Hi,
 
 Is it possible to weigh specific query terms with a Dismax query parser? Is
 it possible to write queries of the sort ...
 field1:(term1)^2.0 + (term2^3.0)
 with dismax?
 
 Thanks,
 Girish Redekar
 http://girishredekar.net