Wildcard Search
Hi, I am facing a weird issue while searching. I am searching for the word *system*, and it displays all the records which contain system, systems etc. But when I try to search *systems*, it only returns those records which have systems-, systems/ etc. It is treating the wildcard as one or more characters and not zero characters, so it is not returning records which have systems as a single word. Is there any way to resolve this? Please suggest. Thanks, Amit Garg -- View this message in context: http://www.nabble.com/Wildcard-Search-tp23440795p23440795.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Is it possible to writing solr result on disk from the server side?
Thanks Paul! Yes, I saw it. In fact, if I understand correctly, it could be a solution, but a little bit too complicated for what I want to do. Currently my client is not in Java, and I still need a client/server model because it's a web application and Solr has to keep running and waiting for queries continuously. So even if it seems possible with EmbeddedSolrServer, sharing the XML results in a file will be a faster solution for me, also with regard to development time.

Noble Paul നോബിള് नोब्ळ्-2 wrote: did you consider using an EmbeddedSolrServer?

On Thu, May 7, 2009 at 8:25 PM, arno13 arnaud.gaudi...@healthonnet.org wrote: Do you know if it's possible to write Solr results directly to a hard disk on the server side, rather than using an HTTP connection to transfer the results? While the query time is very fast in Solr, I want to do this because of the time taken to transfer the results between the client and the Solr server when you have a lot of 'rows'. For instance, for 10'000 rows the query time could be 50 ms, but it takes 19 s to get the results from the server. As my client and server are on the same system, I could get the results faster directly from the hard disk (or better, a RAM disk). Is it possible to configure Solr for that? Regards,
Organizing multiple searchers around overlapping subsets of data
I have one type of document, but different searchers, each of which is interested in a different subset of the documents, which are different configurations of TV channels {A,B,C,D}.

* Application S1 is interested in all channels, i.e. {A,B,C,D}.
* Application S2 is interested in {A,B,C}.
* Application S3 is interested in {A,C,D}.
* Application S4 is interested in {B,D}.

As can be seen from this simplified example, the subsets are not disjoint, but do have considerable overlaps. The total data volume is only about 200 MB. There are four searchers, and they may become ten or a dozen. The set elements an application may or may not be interested in (the channels, {A,B,C,D} in this example) number not just four but about 150, each of which has about 1000 documents.

What is the best way to organize this?

(a) Set up a different core for each application, i.e. going multi-core, thereby incurring a good deal of redundancy but simplifying searches?
(b) Apply filter queries to select documents from only, say, 60, 80 or 110 out of 150 channels?
(c) Something else I'm not aware of?

Am I right in suspecting that multi-core makes less sense with increasing overlap and hence redundancy? Michael Ludwig
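For option (b), each application's channel set can be expressed as an ordinary filter query against a single shared index; Solr caches each distinct fq in its filterCache, so even a filter over 60-110 channels is computed once and reused. A minimal sketch, assuming each document carries an indexed field named "channel" (a hypothetical name, not from the original post):

```ruby
# Hedged sketch of option (b): one shared index, one filter query per
# application. The field name "channel" is illustrative.
def channel_filter(channels)
  # Produces e.g. "channel:(A OR B OR C)" for use as an fq parameter.
  "channel:(#{channels.join(' OR ')})"
end
```

Application S2 would then send its queries with fq=channel:(A OR B OR C) while sharing the same 200 MB index with every other application, avoiding the multi-core redundancy entirely.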
Re: StatsComponent and 1.3
I'm guessing that manipulating the client end, acts_as_solr, is an easier approach than backporting server-side functionality, especially as you will have to migrate forward at some point. Out of curiosity, which version of acts_as_solr are you using? The plugin has moved homes a couple of times, and I have heard and found that the version by Mathias Meyer at http://github.com/mattmatt/acts_as_solr/tree/master is the best. I've used it with 1.4 trunk with no issues, and Mathias has been very responsive. Eric

On May 7, 2009, at 10:25 PM, David Shettler wrote: Foreword: I'm not a Java developer :) OSVDB.org and datalossdb.org make use of Solr pretty extensively via acts_as_solr. I found myself with a real need for some of the StatsComponent stuff (mainly the sum feature), so I pulled down a nightly build and played with it. StatsComponent proved perfect, but... the nightly build output seems to be different, and thus incompatible with acts_as_solr. Now, I realize this is more or less an acts_as_solr issue, but is it possible, with some degree of effort (obviously), for me to essentially port some of the functionality of StatsComponent to 1.3 myself? It's that, or waiting for 1.4 to come out and someone developing support for it in acts_as_solr, or fixing what I have myself so acts_as_solr works with the output. I'm just trying to gauge the easiest solution :) Any feedback or suggestions would be grand. Thanks, Dave Open Security Foundation

- Eric Pugh | Principal | OpenSource Connections, LLC | 434.466.1467 | http://www.opensourceconnections.com Free/Busy: http://tinyurl.com/eric-cal
Re: Core Reload issue
As long as your indexed documents contain the stop words, you will continue to see the stop words in the results.

On Fri, May 8, 2009 at 11:24 AM, Sagar Khetkade sagar.khetk...@hotmail.com wrote: From my understanding, re-indexing the documents is a different thing. If you have the stop word filter for a field type, say text, then after reloading the core, if I type in a query which is a stop word only, it would get parsed by the stop word filter, which would eventually not search against the index. But in my case I am getting results containing the stop word; hence the issue. ~Sagar

From: noble.p...@gmail.com Date: Tue, 5 May 2009 10:09:29 +0530 Subject: Re: Core Reload issue To: solr-user@lucene.apache.org If you change the conf files and re-index the documents, the change must be reflected. Are you sure you re-indexed?

On Tue, May 5, 2009 at 10:00 AM, Sagar Khetkade sagar.khetk...@hotmail.com wrote: Hi, I came across a strange problem while reloading the core in a multicore scenario. In the config of one of the cores I am making changes to the synonym and stopword files and then reloading the core. The core gets reloaded, but the changes in the stopword and synonym files do not get reflected when I query. The filters for index and query are the same. I face this problem even if I re-index the documents. But when I restart the servlet container in which Solr is embedded, the problem does not resurface. My ultimate goal is/was to have the changes made to the text files inside the config folder take effect. Is this the expected behaviour or some problem on my side? Could anyone suggest a possible workaround? Thanks in advance! Regards, Sagar Khetkade

-- Noble Paul | Principal Engineer | AOL | http://aol.com
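For reference, the behaviour described here follows from where analysis happens: a stop word filter configured on the index and query analyzers only affects newly parsed queries and newly indexed documents. A hypothetical schema.xml fragment (field type and file names are illustrative, not Sagar's actual config):

```xml
<!-- Hedged sketch: stop words applied at both index and query time.
     Reloading the core re-reads stopwords.txt for future queries and
     updates, but terms already written to the index stay searchable
     until the documents are re-indexed. -->
<fieldType name="text" class="solr.TextField">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" words="stopwords.txt" ignoreCase="true"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" words="stopwords.txt" ignoreCase="true"/>
  </analyzer>
</fieldType>
```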
Re: Is it possible to writing solr result on disk from the server side?
Solr does not need any special configuration to do this. Just fire your query once and save the results XML/JSON into a file or in memory. When you need them again, just read it back from disk/memory.

On Fri, May 8, 2009 at 1:21 PM, arno13 arnaud.gaudi...@healthonnet.org wrote: Thanks Paul! Yes, I saw it. In fact, if I understand correctly, it could be a solution, but a little bit too complicated for what I want to do. Currently my client is not in Java, and I still need a client/server model because it's a web application and Solr has to keep running and waiting for queries continuously. So even if it seems possible with EmbeddedSolrServer, sharing the XML results in a file will be a faster solution for me, also with regard to development time.

-- Noble Paul | Principal Engineer | AOL | http://aol.com
Re: What are the Unicode encodings supported by Solr?
KK wrote: I'd like to know about the different Unicode [or any other?] encodings supported by Solr for posting docs [through Solrj in my case]. Is it just UTF-8 and UCN that are supported, or are other character encodings like NCR (decimal), NCR (hex) etc. supported as well?

Any numerical character reference (NCR), decimal or hexadecimal, is valid UTF-8 as long as it maps to a valid Unicode character.

I found that for most of the pages the encoding is UTF-8 [in this case searching works fine], but for others the encoding is some other character encoding [like NCR (dec), NCR (hex), or maybe something else; I don't have much idea about this].

Whatever the encoding is, your application needs to know what it is when dealing with bytes read from the network.

So when I fetch the page content through Java methods using InputStreamReaders, and after stripping various tags, what I obtain is raw text with some encoding not supported by Solr.

Did you make sure not to rely on your platform default encoding (Charset) when constructing the InputStreamReader? If in doubt, take a look at the InputStreamReader constructors. Michael Ludwig
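As a concrete illustration of the point about NCRs: numeric character references are just markup for Unicode code points, so they can be decoded into plain UTF-8 text on the client before posting to Solr. A minimal sketch (the helper name is made up, not part of Solr or Solrj):

```ruby
# Hedged sketch: turn decimal ("&#27979;") and hexadecimal ("&#x6D4B;")
# numeric character references into UTF-8 characters before indexing.
# Array#pack('U') encodes a Unicode code point as UTF-8.
def decode_ncrs(text)
  text.gsub(/&#x([0-9a-fA-F]+);/) { [$1.to_i(16)].pack('U') }
      .gsub(/&#(\d+);/)           { [$1.to_i].pack('U') }
end
```

After this pass the document is plain UTF-8, which is what should be handed to the InputStreamReader/Solr pipeline.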
Re: Solr spring application context error
Paul/Erik, thanks for your reply. I have the jar file containing the plugin code and the applicationContext.xml in the solr/home/lib directory. It is instantiating the plugin code, but it is not loading the applicationContext.xml file from the solr/home/lib dir. But when I copied the jar file containing the applicationContext.xml file into the solr.war file's WEB-INF/lib dir and placed the solr.war file in Tomcat's webapps dir, it worked. As Erik said, Solr may only load the xml from the solr.war file? Please let me know if there is any way to do this by placing the applicationContext.xml file in solr/home/lib. Thanks, Raju

Noble Paul നോബിള് नोब्ळ्-2 wrote: a point to keep in mind is that all the plugin code and everything else must be put into the solrhome/lib directory. Where have you placed the file com/mypackage/applicationContext.xml?

On Fri, May 8, 2009 at 12:19 AM, Raju444us gudipal...@gmail.com wrote: I have configured Solr using Tomcat. Everything works fine. I overrode QParserPlugin and configured it. The overridden QParserPlugin has a dependency on another project, say project1, so I made a jar of that project and copied the jar to the solr/home lib dir. The project1 project uses Spring. It has a factory class which loads the beans. I am using this factory class in QParserPlugin to get a bean. When I start my Tomcat, the factory class loads fine, but the problem is it's not loading the beans, and I am getting the exception org.springframework.beans.factory.BeanDefinitionStoreException: IOException parsing XML document from class path resource [com/mypackage/applicationContext.xml]; nested exception is java.io.FileNotFoundException: class path resource [com/mypackage/applicationContext.xml] cannot be opened because it does not exist. Do I need to do something else? Can anybody please help me?

Thanks, Raju

-- Noble Paul | Principal Engineer | AOL | http://aol.com
Re: Solr spring application context error
I've run into this in the past as well. It's fairly annoying. Anyone know why the limitation? Why aren't we passing the ClassLoader that's loading Solr classes as the parent to the lib dir plugin classloader? - Mark

Erik Hatcher wrote: This is probably because Solr loads its extensions from a custom class loader, but if that class then needs to access things from the classpath, it is only going to see the built-in WEB-INF/lib classes, not solr/home lib JAR files. Maybe there is a Spring way to point it at that lib directory also? This is the kinda pain we get, it seems, when reinventing a container, unfortunately. Erik

On May 7, 2009, at 2:49 PM, Raju444us wrote: I have configured Solr using Tomcat. Everything works fine. I overrode QParserPlugin and configured it. The overridden QParserPlugin has a dependency on another project, say project1, so I made a jar of that project and copied the jar to the solr/home lib dir. The project1 project uses Spring. It has a factory class which loads the beans. I am using this factory class in QParserPlugin to get a bean. When I start my Tomcat, the factory class loads fine, but the problem is it's not loading the beans, and I am getting the exception org.springframework.beans.factory.BeanDefinitionStoreException: IOException parsing XML document from class path resource [com/mypackage/applicationContext.xml]; nested exception is java.io.FileNotFoundException: class path resource [com/mypackage/applicationContext.xml] cannot be opened because it does not exist. Do I need to do something else? Can anybody please help me? Thanks, Raju

-- Mark http://www.lucidimagination.com
Re: Is it possible to writing solr result on disk from the server side?
It's what I do from the client side; however, I don't know how to do this from the server side (Solr). Sorry if I wasn't clear enough.

Noble Paul നോബിള് नोब्ळ्-2 wrote: Solr does not need any special configuration to do this. Just fire your query once and save the results XML/JSON into a file or in memory. When you need them again, just read it back from disk/memory.

-- Noble Paul | Principal Engineer | AOL | http://aol.com
Re: Wildcard Search
Are you by any chance stemming the field when you index? Erick

On Fri, May 8, 2009 at 2:29 AM, dabboo ag...@sapient.com wrote: Hi, I am facing a weird issue while searching. I am searching for the word *system*, and it displays all the records which contain system, systems etc. But when I try to search *systems*, it only returns those records which have systems-, systems/ etc. It is treating the wildcard as one or more characters and not zero characters, so it is not returning records which have systems as a single word. Is there any way to resolve this? Please suggest. Thanks, Amit Garg
Re: Wildcard Search
Yes, that's correct. I have applied EnglishPorterFilterFactory at index time as well. Do you think I should remove it and do the indexing again?

Erick Erickson wrote: Are you by any chance stemming the field when you index? Erick
Re: Wildcard Search
My *guess* is that what you're seeing is that wildcard searches are not analyzed, in this case not run through the stemmer. So your index only contains system and the funky variants (e.g. systems/). I don't really understand why you'd get systems/ in your index, but I'm assuming that your filter chain doesn't remove things like slashes. So you have system and systems/ in your index, but not systems due to stemming, so searching for systems* translates into something like systems OR systems/ OR ..., and since no documents contain systems, you don't get them as hits.

All that said, you need to revisit your indexing parameters to make what happens fit your expectations; you may need to introduce filters that remove odd stuff like slashes. I'd advise getting a copy of Luke and pointing it at your index in order to see what *really* gets put in it.

Best Erick

On Fri, May 8, 2009 at 9:25 AM, dabboo ag...@sapient.com wrote: Yes, that's correct. I have applied EnglishPorterFilterFactory at index time as well. Do you think I should remove it and do the indexing again?
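One common way to reconcile stemming with wildcard search is a parallel unstemmed field populated via copyField, with wildcard queries directed at that field only. A hypothetical schema.xml sketch (all field and type names here are illustrative, not from Amit's actual schema):

```xml
<!-- Hedged sketch: keep the stemmed field for ordinary search, and copy
     the raw text into an unstemmed, lowercased field used only for
     wildcard queries, so "systems" survives intact in the index.
     The fieldType belongs in <types>, field/copyField in <fields>. -->
<fieldType name="text_unstemmed" class="solr.TextField">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- no EnglishPorterFilterFactory here -->
  </analyzer>
</fieldType>

<field name="body_wild" type="text_unstemmed" indexed="true" stored="false"/>
<copyField source="body" dest="body_wild"/>
```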
Re: Solr spring application context error
I am having the same problem. Please let me know if anyone finds an answer to this. Thank you, Sachin

markrmiller wrote: I've run into this in the past as well. It's fairly annoying. Anyone know why the limitation? Why aren't we passing the ClassLoader that's loading Solr classes as the parent to the lib dir plugin classloader? - Mark

Erik Hatcher wrote: This is probably because Solr loads its extensions from a custom class loader, but if that class then needs to access things from the classpath, it is only going to see the built-in WEB-INF/lib classes, not solr/home lib JAR files. Maybe there is a Spring way to point it at that lib directory also? This is the kinda pain we get, it seems, when reinventing a container, unfortunately. Erik

-- Mark http://www.lucidimagination.com
Re: bug? No highlighting results with dismax and q.alt=*:*
Possibly this issue is related: https://issues.apache.org/jira/browse/SOLR-825 Though it seems that might affect the standard handler, while what I'm seeing is more specific to the dismax handler. -Peter

On Thu, May 7, 2009 at 8:27 PM, Peter Wolanin peter.wola...@acquia.com wrote: For the Drupal Apache Solr Integration module, we are exploring the possibility of doing facet browsing - since we are using dismax as the default handler, this would mean issuing a query with an empty q and falling back to q.alt='*:*' or some other q.alt that matches all docs. However, I notice when I do this that we do not get any highlights back in the results despite defining a highlight alternate field. In contrast, if I force the standard request handler then I do get text back from the highlight alternate field:

select/?q=*:*&qt=standard&hl=true&hl.fl=body&hl.alternateField=body&hl.maxAlternateFieldLength=256

However, I then lose the nice dismax features of weighting the results using bq and bf parameters. So, is this a bug or the intended behavior? The relevant fragment of the solrconfig.xml is this:

<requestHandler name="partitioned" class="solr.SearchHandler" default="true">
  <lst name="defaults">
    <str name="defType">dismax</str>
    <str name="q.alt">*:*</str>
    <!-- example highlighter config, enable per-query with hl=true -->
    <str name="hl">true</str>
    <str name="hl.fl">body</str>
    <int name="hl.snippets">3</int>
    <str name="hl.mergeContiguous">true</str>
    <!-- instructs Solr to return the field itself if no query terms are found -->
    <str name="f.body.hl.alternateField">body</str>
    <str name="f.body.hl.maxAlternateFieldLength">256</str>
  </lst>
</requestHandler>

Full solrconfig.xml and other files: http://cvs.drupal.org/viewvc.py/drupal/contributions/modules/apachesolr/?pathrev=DRUPAL-6--1

-- Peter M. Wolanin, Ph.D. Momentum Specialist, Acquia. Inc. peter.wola...@acquia.com
Re: Backups using Java-based Replication (forced snapshot)
I was thinking the same last week, as I was tailoring the snapshooter.sh script. The data directory should be kept for the temp snapshot, as a way to ensure linking occurs on the same device. snapshooter.sh line 87:

name=${data_dir}/${snap_name}

I think only this needs to be configurable for the final move.

Grant Ingersoll-6 wrote: On the page http://wiki.apache.org/solr/SolrReplication, it says the following: Force a snapshot on master. This is useful to take periodic backups. Command: http://master_host:port/solr/replication?command=snapshoot This then puts the snapshot under the data directory. Perfectly reasonable thing to do. However, is it possible to have it take in a directory location and store the snapshot there? For instance, I may want to have it write to a specific directory that is being watched for backup data. Thanks, Grant -- Grant Ingersoll http://www.lucidimagination.com/ Search the Lucene ecosystem (Lucene/Solr/Nutch/Mahout/Tika/Droids) using Solr/Lucene: http://www.lucidimagination.com/search
solr + wordpress
Somebody has written an article on integrating Solr with WordPress: http://www.ipros.nl/2008/12/15/using-solr-with-wordpress/ -- Noble Paul | Principal Engineer | AOL | http://aol.com
Re: bug? No highlighting results with dismax and q.alt=*:*
I have experienced it before... maybe you can manage something similar to your q.alt using the params q and qf. Highlighting will work in that case (I sorted it out doing that).

Peter Wolanin-2 wrote: Possibly this issue is related: https://issues.apache.org/jira/browse/SOLR-825 Though it seems that might affect the standard handler, while what I'm seeing is more specific to the dismax handler. -Peter

On Thu, May 7, 2009 at 8:27 PM, Peter Wolanin peter.wola...@acquia.com wrote: For the Drupal Apache Solr Integration module, we are exploring the possibility of doing facet browsing - since we are using dismax as the default handler, this would mean issuing a query with an empty q and falling back to q.alt='*:*' or some other q.alt that matches all docs. However, I notice when I do this that we do not get any highlights back in the results despite defining a highlight alternate field. In contrast, if I force the standard request handler then I do get text back from the highlight alternate field: select/?q=*:*&qt=standard&hl=true&hl.fl=body&hl.alternateField=body&hl.maxAlternateFieldLength=256 However, I then lose the nice dismax features of weighting the results using bq and bf parameters. So, is this a bug or the intended behavior?

-- Peter M. Wolanin, Ph.D. Momentum Specialist, Acquia. Inc. peter.wola...@acquia.com
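The q/qf workaround suggested here amounts to keeping dismax but supplying an explicit query instead of falling through to q.alt. A hedged example request (parameter values are illustrative, not from the Drupal module's actual config):

```
select/?q=drupal&qf=body&defType=dismax&hl=true&hl.fl=body&f.body.hl.alternateField=body
```

With a non-empty q, the highlighter has query terms to work with, so both the bq/bf weighting and the highlighting apply.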
Re: preImportDeleteQuery
I'm using full-import, not delta-import. I tried it with delta-import, and it would work, except that I'm querying for a large number of documents, so I can't afford the cost of deltaImportQuery for each document. It sounds like $deleteDocId will work; I just need to update from 1.3 to trunk. Thanks!

Noble Paul നോബിള് नोब्ळ्-2 wrote: are you doing a full-import or a delta-import? for delta-import there is an option of deletedPkQuery which should meet your needs
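For reference, the deletion hook discussed above sits on the DataImportHandler entity. A hypothetical data-config.xml sketch (table and column names are made up):

```xml
<!-- Hedged sketch of a DIH entity. deletedPkQuery runs during
     delta-import and returns only the primary keys of rows whose
     documents should be removed, avoiding a per-document
     deltaImportQuery for the surviving rows. -->
<entity name="item" pk="id"
        query="SELECT id, title FROM item WHERE deleted = 0"
        deletedPkQuery="SELECT id FROM item WHERE deleted = 1">
  <field column="id" name="id"/>
  <field column="title" name="title"/>
</entity>
```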
Re: solr + wordpress
I actually wrote a plugin that integrates Solr with WordPress. http://www.mattweber.org/2009/04/21/solr-for-wordpress/ http://wordpress.org/extend/plugins/solr-for-wordpress/ https://launchpad.net/solr4wordpress Thanks, Matt Weber eSr Technologies http://www.esr-technologies.com On May 8, 2009, at 10:10 AM, Noble Paul നോബിള് नोब्ळ् wrote: Somebody has writte an articles on integrating Solr with wordpress http://www.ipros.nl/2008/12/15/using-solr-with-wordpress/ -- - Noble Paul | Principal Engineer| AOL | http://aol.com
how to pronounce solr
Hi, My company is evaluating different open-source indexing and search software, and we are seriously considering Solr. One of my colleagues pronounces it differently than I do, and I have no basis for correcting him. Is Solr pronounced SOLerrr (emphasis on the first syllable), or pirate-like, SolAhhRrr (emphasis on the R)? This coworker has just come from a big meeting with various managers where the technology came up, and I'm afraid my battle over this very important matter may already have been lost. thank you, Charles
RE: Initialising of CommonsHttpSolrServer in Spring framwork
Ranjeeth, did you figure out how to do this? If yes, can you share how you did it? An example bean definition in XML would be helpful. --Sachin

Funtick wrote: Use a constructor and pass the URL parameter. Nothing Spring-related... Create a Spring bean with attributes 'MySolr', 'MySolrUrl', and an 'init' method... 'init' will create an instance of CommonsHttpSolrServer. Configure Spring...

I am using Solr 1.3 and Solrj as a Java client. I am integrating Solrj into the Spring framework and facing a problem: Spring is not initializing the CommonsHttpSolrServer class. How can I define this class to get an instance of SolrServer to invoke further methods on?
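A sketch of the wiring Funtick describes, passing the URL straight to the constructor rather than via a separate init method (the bean id and URL are illustrative):

```xml
<!-- Hypothetical Spring bean definition for SolrJ 1.3's
     CommonsHttpSolrServer; its constructor takes the Solr base URL. -->
<bean id="solrServer"
      class="org.apache.solr.client.solrj.impl.CommonsHttpSolrServer">
  <constructor-arg value="http://localhost:8983/solr"/>
</bean>
```

Any class that needs to query Solr can then have the solrServer bean injected instead of constructing CommonsHttpSolrServer itself.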
Re: how to pronounce solr
This is the funniest e-mail I've had all day. SOLer is the typical pronunciation, but I've heard solAR as well. It's the description of pirate-like that made me chuckle. -Sean

Charles Federspiel wrote: Hi, My company is evaluating different open-source indexing and search software, and we are seriously considering Solr. One of my colleagues pronounces it differently than I do, and I have no basis for correcting him. Is Solr pronounced SOLerrr (emphasis on the first syllable), or pirate-like, SolAhhRrr (emphasis on the R)? This coworker has just come from a big meeting with various managers where the technology came up, and I'm afraid my battle over this very important matter may already have been lost. thank you, Charles
Re: StatsComponent and 1.3
On May 7, 2009, at 10:25 PM, David Shettler wrote: I found myself with a real need for some of the StatsComponent stuff (mainly the sum feature), so I pulled down a nightly build and played with it. StatsComponent proved perfect, but... the nightly build output seems to be different, and thus incompatible with acts_as_solr. Could you give some more details on what seems different and incompatible with acts_as_solr? You can query the StatsComponent from Ruby using the solr-ruby library. Using the example from the wiki at http://wiki.apache.org/solr/StatsComponent , it points to http://localhost:8983/solr/select?q=*:*&stats=true&stats.field=price&stats.field=popularity&rows=0&indent=true

require 'solr'
solr = Solr::Connection.new
solr.send(Solr::Request::Select.new(:standard,
  :q => '*:*',
  :stats => true,
  'stats.field' => ['price', 'popularity'],
  :rows => 0))

Which outputs this (in irb; raw response trimmed for readability):

=> #<Solr::Response::Select:0x141692c
  @header={"QTime"=>1, "params"=>{"stats"=>"true", "qt"=>"standard", "stats.field"=>["price", "popularity"], "q"=>"*:*", "rows"=>"0", "wt"=>"ruby"}, "status"=>0},
  @data={"response"=>{"start"=>0, "docs"=>[], "numFound"=>26},
    "stats"=>{"stats_fields"=>{
      "price"=>{"min"=>0.0, "max"=>2199.0, "sum"=>5251.26995, "count"=>15, "missing"=>11, "sumOfSquares"=>6038619.160315, "mean"=>350.084664, "stddev"=>547.737557906113},
      "popularity"=>{"min"=>0.0, "max"=>10.0, "sum"=>90.0, "count"=>26, "missing"=>0, "sumOfSquares"=>628.0, "mean"=>3.4615384615384617, "stddev"=>3.5578731762756157}}},
    "responseHeader"=>{"QTime"=>1, "params"=>{"stats"=>"true", "qt"=>"standard", "stats.field"=>["price", "popularity"], "q"=>"*:*", "rows"=>"0", "wt"=>"ruby"}, "status"=>0}}>

Is it possible, with some degree of effort (obviously), for me to essentially port some of the functionality of StatsComponent to 1.3 myself? It's that, or waiting for 1.4 to come out and someone developing support for it into acts_as_solr, or fixing what I have in acts_as_solr to work with the new output. I'm just trying to gauge the easiest solution :) I'm unclear on what the discrepancies are, so not quite sure how to help just yet. As Eric asked, what version/branch of acts_as_solr are you using? Erik
Re: how to pronounce solr
On Fri, May 8, 2009 at 2:07 PM, Charles Federspiel charles.federsp...@gmail.com wrote: Hi, My company is evaluating different open-source indexing and search software and we are seriously considering Solr. One of my colleagues pronounces it differently than I do, and I have no basis for correcting him. Is Solr pronounced SOL-er (emphasis on the first syllable), or, pirate-like, sol-ARRR (emphasis on the R)? This coworker has just come from a big meeting with various managers where the technology came up, and I'm afraid my battle over this very important matter may already have been lost. Thank you, It's pronounced Solar. However you choose to pronounce solar, of course, is up to you or your regionalism. But that's what explains the sun logo and the people who make puns about solr energy ;). It's also the third question in the FAQ: http://wiki.apache.org/solr/FAQ#head-0076d43a3911cf40a231e9ecf7df5303ccee0dc7. Just in case you need documented proof to argue your point. Unless of course you wanted it to be a pirate word. Perhaps you should send him or her an eye-patch in case a correction in this matter would hurt your colleague's feelings. On second thought, maybe not. Jon Gorman
Re: JVM exception_access_violation
I updated to Java 6 update 13 and have been running problem-free for just over a month. I'll continue this thread if I run into any problems that seem to be related. Yonik Seeley-2 wrote: I assume that you're not using any Tomcat native libs? If you are, try removing them... if not (and the crash happened more than once in the same place), then it looks like a JVM bug rather than flaky hardware, and the easiest path forward would be to try the latest Java 6 (update 12).
Re: Is it possible to writing solr result on disk from the server side?
On Thu, May 7, 2009 at 10:55 AM, arno13 arnaud.gaudi...@healthonnet.org wrote: Do you know if it's possible to write Solr results directly to hard disk on the server side, rather than transferring them over an HTTP connection? If you have something like a CSV file (or any other file type that Solr accepts over HTTP), you can instruct that the body be read directly from disk instead. http://wiki.apache.org/solr/UpdateCSV -Yonik http://www.lucidimagination.com
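To make Yonik's suggestion concrete, here is a small sketch of how such a request could be constructed. The URL shape follows the UpdateCSV wiki page; the base URL and file path are illustrative assumptions, and `stream.file` only works when remote streaming is enabled in solrconfig.xml:

```python
from urllib.parse import urlencode

def csv_update_url(base_url, server_side_path):
    # "stream.file" asks Solr to read the file from the *server's* own disk,
    # so the CSV body is never transferred over HTTP.
    # (Requires enableRemoteStreaming="true" in solrconfig.xml.)
    params = urlencode({
        "stream.file": server_side_path,
        "stream.contentType": "text/csv;charset=utf-8",
        "commit": "true",
    })
    return base_url + "/update/csv?" + params

print(csv_update_url("http://localhost:8983/solr", "/data/books.csv"))
```

You would then fetch this URL with any HTTP client; only the small response travels over the wire, not the file itself.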
Re: Autocommit blocking adds? AutoCommit Speedup?
Any pointers to this newer, more concurrent behavior in Lucene? I can try an experiment where I downgrade the iwCommit lock to the iwAccess lock to allow updates to happen during commit. Would you expect that to work? Thanks for bootstrapping me on this. Jim Yonik Seeley-2 wrote: On Thu, May 7, 2009 at 8:37 PM, Jim Murphy jim.mur...@pobox.com wrote: Interesting. So is there a JIRA ticket open for this already? Any chance of getting it into 1.4? No ticket currently open, but IMO it could make it for 1.4. It's seriously kicking our butts right now. We write into our masters with ~50ms response times until we hit the autocommit; then add/update response time is 10-30 seconds. Ouch. It's probably been made a little worse lately since Lucene now does fsync on index files before writing the segments file that points to those files. A necessary evil to prevent index corruption. I'd be willing to work on submitting a patch given a better understanding of the issue. Great, go for it! -Yonik http://www.lucidimagination.com
Re: Autocommit blocking adds? AutoCommit Speedup?
On Fri, May 8, 2009 at 4:27 PM, Jim Murphy jim.mur...@pobox.com wrote: Any pointers to this newer more concurrent behavior in lucene? At the API level we care about, IndexWriter.commit() instead of close(). Also, we shouldn't have to worry about other parts of the code closing the writer on us, since things like deleteByQuery no longer need to close the writer to work. core.getSearcher()... if we don't lock until it's finished, then what could currently happen is that you could wind up with a newer version of the index than you thought you might. I think this should be fine though. We'd need to think about what type of synchronization may be needed for postCommit and postOptimize hooks too. Here's the relevant code:

iwCommit.lock();
try {
  log.info("start " + cmd);
  if (cmd.optimize) {
    openWriter();
    writer.optimize(cmd.maxOptimizeSegments);
  }
  closeWriter();
  callPostCommitCallbacks();
  if (cmd.optimize) {
    callPostOptimizeCallbacks();
  }
  // open a new searcher in the sync block to avoid opening it
  // after a deleteByQuery changed the index, or in between deletes
  // and adds of another commit being done.
  core.getSearcher(true, false, waitSearcher);

-Yonik http://www.lucidimagination.com
Re: no subject aka Replication Stall
2009/5/7 Noble Paul നോബിള് नोब्ळ् noble.p...@corp.aol.com: BTW, if the timeout occurs it resumes from the point where the failure happened. It retries 5 times before giving up. Sweet! I was just going to ask what happens on a timeout. Have you tested this out (say kill the master in the middle of replicating a big file)? -Yonik http://www.lucidimagination.com
Re: Autocommit blocking adds? AutoCommit Speedup?
Created issue: https://issues.apache.org/jira/browse/SOLR-1155 Jim Murphy wrote: Any pointers to this newer more concurrent behavior in lucene? I can try an experiment where I downgrade the iwCommit lock to the iwAccess lock to allow updates to happen during commit. Would you expect that to work? Thanks for bootstrapping me on this. Jim
Re: Autocommit blocking adds? AutoCommit Speedup?
Yonik Seeley-2 wrote: ...your code snippet elided and edited below... Don't take this code as correct (or even compiling), but is this the essence? I moved shared access to the writer inside the read lock and kept the other non-commit bits under the write lock. I'd need to rethink the locking in a more fundamental way, but is this close to the idea?

public void commit(CommitUpdateCommand cmd) throws IOException {
  if (cmd.optimize) {
    optimizeCommands.incrementAndGet();
  } else {
    commitCommands.incrementAndGet();
  }

  Future[] waitSearcher = null;
  if (cmd.waitSearcher) {
    waitSearcher = new Future[1];
  }

  boolean error = true;

  iwCommit.lock();
  try {
    log.info("start " + cmd);
    if (cmd.optimize) {
      closeSearcher();
      openWriter();
      writer.optimize(cmd.maxOptimizeSegments);
    }
  } finally {
    iwCommit.unlock();
  }

  iwAccess.lock();
  try {
    writer.commit();
  } finally {
    iwAccess.unlock();
  }

  iwCommit.lock();
  try {
    callPostCommitCallbacks();
    if (cmd.optimize) {
      callPostOptimizeCallbacks();
    }
    // open a new searcher in the sync block to avoid opening it
    // after a deleteByQuery changed the index, or in between deletes
    // and adds of another commit being done.
    core.getSearcher(true, false, waitSearcher);
    // reset commit tracking
    tracker.didCommit();
    log.info("end_commit_flush");
    error = false;
  } finally {
    iwCommit.unlock();
    addCommands.set(0);
    deleteByIdCommands.set(0);
    deleteByQueryCommands.set(0);
    numErrors.set(error ? 1 : 0);
  }

  // if we are supposed to wait for the searcher to be registered, then we should do it
  // outside of the synchronized block so that other update operations can proceed.
  if (waitSearcher != null && waitSearcher[0] != null) {
    try {
      waitSearcher[0].get();
    } catch (InterruptedException e) {
      SolrException.log(log, e);
    } catch (ExecutionException e) {
      SolrException.log(log, e);
    }
  }
}
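The iwAccess/iwCommit split being discussed is essentially a read-write lock: many concurrent adds may share the writer (readers of the lock), while a commit needs it exclusively (the writer of the lock). A minimal, language-neutral sketch of that pattern, illustrative only and not Solr's actual code:

```python
import threading

class ReadWriteLock:
    """Many holders in shared mode (like iwAccess for adds/updates) or
    exactly one holder in exclusive mode (like iwCommit for commits)."""

    def __init__(self):
        self._cond = threading.Condition()
        self._readers = 0       # number of shared holders
        self._writer = False    # exclusive holder present?

    def acquire_read(self):
        with self._cond:
            while self._writer:          # adds wait only while a commit runs
                self._cond.wait()
            self._readers += 1

    def release_read(self):
        with self._cond:
            self._readers -= 1
            if self._readers == 0:
                self._cond.notify_all()  # a waiting commit may proceed

    def acquire_write(self):
        with self._cond:
            while self._writer or self._readers:
                self._cond.wait()
            self._writer = True

    def release_write(self):
        with self._cond:
            self._writer = False
            self._cond.notify_all()

# Usage sketch: two "adds" hold the lock together; a "commit" then
# acquires it exclusively once both have released.
rw = ReadWriteLock()
rw.acquire_read()
rw.acquire_read()
rw.release_read()
rw.release_read()
rw.acquire_write()
rw.release_write()
```

The subtlety Jim's patch wrestles with is exactly what this sketch glosses over: which steps of commit() (searcher open, post-commit callbacks, counter resets) must still happen under the exclusive lock to keep the index view consistent.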
Re: Autocommit blocking adds? AutoCommit Speedup?
Can we move this to patch files within the JIRA issue please? It will make it easier to review and help out as a patch to current trunk. --j Jim Murphy wrote: Don't take this code as correct (or even compiling), but is this the essence? I moved shared access to the writer inside the read lock and kept the other non-commit bits under the write lock. [proposed commit() code quoted in full in the previous message]
Re: Initialising of CommonsHttpSolrServer in Spring framework
I am giving you a detailed sample of my Spring usage:

<bean id="solrHttpClient" class="org.apache.commons.httpclient.HttpClient">
  <property name="httpConnectionManager">
    <bean class="org.apache.commons.httpclient.MultiThreadedHttpConnectionManager">
      <property name="maxConnectionsPerHost" value="10"/>
      <property name="maxTotalConnections" value="10"/>
    </bean>
  </property>
</bean>

<bean id="mySearchImpl" class="com.me.search.MySearchSolrImpl">
  <property name="core1">
    <bean class="org.apache.solr.client.solrj.impl.CommonsHttpSolrServer">
      <constructor-arg value="http://localhost/solr/core1"/>
      <constructor-arg ref="solrHttpClient"/>
    </bean>
  </property>
  <property name="core2">
    <bean class="org.apache.solr.client.solrj.impl.CommonsHttpSolrServer">
      <constructor-arg value="http://localhost/solr/core2"/>
      <constructor-arg ref="solrHttpClient"/>
    </bean>
  </property>
</bean>

Hope this helps. Cheers, Avlesh On Sat, May 9, 2009 at 12:39 AM, sachin78 tendulkarsachi...@gmail.com wrote: Ranjeeth, did you figure out how to do this? If yes, can you share how you did it? An example bean definition in XML would be helpful. --Sachin
Re: Solrconfig.xml
On Thu, May 7, 2009 at 10:06 AM, Francis Yakin fya...@liquid.com wrote: No error, attached is solrconfig.xml files( one is from 1.2.0 that works and the other is 1.3.0 that doesn't work) Francis, it seems the attached files were eaten up by the mailing list. Can you re-send or put them up online somewhere (e.g. on pastebin.com)? -- Regards, Shalin Shekhar Mangar.
Re: Control segment size
On Fri, May 8, 2009 at 1:30 AM, vivek sar vivex...@gmail.com wrote: I did set maxMergeDocs to 10M, but I still see a couple of index files over 30G, which doesn't match the max number of documents. Here are some numbers:
1) My total index size = 66GB
2) Total number of documents = 200M
3) 1M docs = ~300MB
4) So 10M docs should be roughly 3-4GB.
As you can see, a couple of files are huge. Are those documents or index files? How can I control the file size so that no single file grows beyond 10GB? No, there is no way to limit an individual file to a specific size. -- Regards, Shalin Shekhar Mangar.
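For reference, the merge-related knobs mentioned in this thread live in the indexDefaults/mainIndex section of solrconfig.xml. A sketch with illustrative values (the numbers are assumptions, not recommendations):

```
<indexDefaults>
  <mergeFactor>10</mergeFactor>
  <!-- Cap the number of documents per merged segment. This bounds segment
       growth indirectly, but (as noted above) it does not cap the byte
       size of any individual index file. -->
  <maxMergeDocs>10000000</maxMergeDocs>
  <ramBufferSizeMB>32</ramBufferSizeMB>
</indexDefaults>
```

Because maxMergeDocs counts documents rather than bytes, segments holding unusually large documents can still produce files far bigger than the back-of-the-envelope 3-4GB estimate.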
Re: no subject aka Replication Stall
On Sat, May 9, 2009 at 2:23 AM, Yonik Seeley yo...@lucidimagination.com wrote: 2009/5/7 Noble Paul നോബിള്‍ नोब्ळ् noble.p...@corp.aol.com: BTW, if the timeout occurs it resumes from the point where the failure happened. It retries 5 times before giving up. Sweet! I was just going to ask what happens on a timeout. Have you tested this out (say, kill the master in the middle of replicating a big file)? Actually yes, though not by killing the master (then the replication will abort). I modified the master code to close the connection after transferring x MB of data. The slave retried and completed the replication. -Yonik http://www.lucidimagination.com -- - Noble Paul | Principal Engineer| AOL | http://aol.com
Re: Autocommit blocking adds? AutoCommit Speedup?
First cut of the updated handler now in: https://issues.apache.org/jira/browse/SOLR-1155 Needs review from those that know Lucene better, and a double check for errors in locking or other areas of the code. Thanks. --j jayson.minard wrote: Can we move this to patch files within the JIRA issue please? It will make it easier to review and help out as a patch to current trunk. --j Jim Murphy wrote: Don't take this code as correct (or even compiling), but is this the essence? [proposed commit() code quoted in full in an earlier message]
Re: French and SpellingQueryConverter
On Fri, May 8, 2009 at 2:14 AM, Jonathan Mamou ma...@il.ibm.com wrote: Hi, It does not seem to be related to FrenchStemmer; the stemmer does not split a word into 2 words. I have checked with other words, and SpellingQueryConverter always splits words containing a special character. I think the issue is in the SpellingQueryConverter class: Pattern.compile("(?:(?!(\\w+:|\\d+)))\\w+"). According to http://java.sun.com/javase/6/docs/api/java/util/regex/Pattern.html, \w is a word character: [a-zA-Z_0-9]. I think special characters should also be added to the regex. If you use the spellcheck.q parameter for specifying the spelling query, then the field's analyzer will be used (in this case, FrenchAnalyzer). If you use the q parameter, then the SpellingQueryConverter is used. -- Regards, Shalin Shekhar Mangar.
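The splitting behavior is easy to reproduce: Java's \w matches only [a-zA-Z_0-9] by default, so an accented character terminates a \w+ match. A small Python sketch, using re.ASCII to mimic Java's default \w semantics:

```python
import re

# With an ASCII-only \w (Java's default), the accented character
# splits the word into two tokens.
ascii_tokens = re.findall(r"\w+", "système", re.ASCII)
print(ascii_tokens)

# When \w includes Unicode word characters (Python's default, or
# Java's UNICODE_CHARACTER_CLASS flag), the word stays whole.
unicode_tokens = re.findall(r"\w+", "système")
print(unicode_tokens)
```

This is why the converter breaks "système" into "syst" and "me": the fix is to widen the character class (or enable Unicode-aware \w) rather than change the stemmer.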