Re: [jira] Commented: (SOLR-572) Spell Checker as a Search Component

2008-06-04 Thread Grant Ingersoll
There are working patches available on the issue without the "advanced  
features" and everyone is free to fix the current one.  It's not like  
it is that far off from being able to have proper spellchecking,  
pluggability, and context information about where the mistakes are.  I  
frankly don't get what all the fuss is about.


Is it that you disagree with the approach?  That hasn't come across in  
the discussions, but if it is, say so.  I thought we were working on  
it quite well together and made some good progress and are pretty darn  
close.  I don't see that I've taken away any functionality that the  
original patch offers, but I did change it so that it fits a broader  
audience, namely those who are interested in other spell checkers and  
those who want info about where in the query the problem occurs.   
Which, is what the comments suggest people are interested in and also  
what I am interested in for 1.3.


And, I'm sorry, but I said I'd have to let it lie for a few days and  
then I would be back to it.  Cut me some slack.  I don't get paid to  
work on Solr full time.   Is it truly that important that someone  
can't wait a few days for a patch on the trunk version for something  
they never had before?  It ain't like we're talking some core bug here  
that has everyone broken.  Besides, others are perfectly welcome to  
work on it in the meantime.


Sorry for the rant, but I am not going to be pressured into committing  
a patch that I don't think is ready and one that I said I am going to  
be working on to see it through so that we all are happy.


-Grant

On Jun 4, 2008, at 1:14 AM, Noble Paul നോബിള്‍  
नोब्ळ् wrote:


On Wed, Jun 4, 2008 at 2:15 AM, Grant Ingersoll  
<[EMAIL PROTECTED]> wrote:
I will be back on it tomorrow and will see this through before 1.3 with the
abstractions.  In other words, -1 on cutting this off prematurely.  :-)
Since I don't think this is the only thing holding up 1.3, let's just play
it out and get it right so all of us are happy.


This feature may not be holding back 1.3 release. The potential users
of this issue are very much interested in a basic working version.
They may be able to live without these advanced features. Maybe we
can have another jira issue for enhancements which may/may not go into
1.3 (depending on when it happens).





-Grant

On Jun 3, 2008, at 3:53 PM, Shalin Shekhar Mangar wrote:

The current patch has been broken for some days now and implementing a
correct query parsing logic may take time to get right. Let's not aim for
everything to get into the 1.3 release.

I would like to cut down the scope of this issue to an implementation that
indexes files and Lucene indices (both Solr and arbitrary) and gives
suggestions while using the correct analyzer for multi-word queries. Let's
get a spell checker working and commit it. We can deal with more
enhancements like abstractions for custom spellcheckers and query parsing
etc. in another issue which can be dealt with separately (in 1.3 or
after).
Thoughts? If there is a general consensus, I can give a new patch which
can be good enough to go in.

On Sat, May 31, 2008 at 2:44 AM, Oleg Gnatovskiy (JIRA) <[EMAIL PROTECTED]>
wrote:



[ https://issues.apache.org/jira/browse/SOLR-572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12601256#action_12601256 ]


Oleg Gnatovskiy commented on SOLR-572:
--

I installed the latest patch. Still getting an NPE. Here is my config:




  
<searchComponent name="spellcheck" class="org.apache.solr.handler.component.SpellCheckComponent">
  <lst name="defaults">
    <str name="spellcheck.onlyMorePopular">false</str>
    <str name="spellcheck.extendedResults">false</str>
    <str name="spellcheck.count">1</str>
  </lst>

  <lst name="spellchecker">
    <str name="classname">org.apache.solr.spelling.FileBasedSpellChecker</str>
    <str name="name">external</str>
    <str name="sourceLocation">spellings.txt</str>
    <str name="characterEncoding">UTF-8</str>
    <str name="fieldType">text_ws</str>
    <str name="indexDir">/usr/local/apache/lucene/solr2home/solr/data/spellIndex</str>
  </lst>
</searchComponent>


Here is the URL I am hitting:

http://localhost:8983/solr/select/?q=pizza&spellcheck=true&spellcheck.dictionary=external&spellcheck.build=true

Here is the error:

HTTP Status 500 - null

java.lang.NullPointerException
        at org.apache.lucene.index.Term.<init>(Term.java:39)
        at org.apache.lucene.index.Term.<init>(Term.java:36)
        at org.apache.lucene.search.spell.SpellChecker.suggestSimilar(SpellChecker.java:228)
        at org.apache.solr.spelling.AbstractLuceneSpellChecker.getSuggestions(AbstractLuceneSpellChecker.java:71)
        at org.apache.solr.handler.component.SpellCheckComponent.process(SpellCheckComponent.java:177)
        at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:153)
        at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:125)
        at org.apache.solr.core.SolrCore.execute(SolrCore.java:965)
        at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:339)
        at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:274)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
        at org.apache.c

[jira] Assigned: (SOLR-590) Limitation in pgrep on Linux platform breaks script-utils fixUser

2008-06-04 Thread Bill Au (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-590?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bill Au reassigned SOLR-590:


Assignee: Bill Au

> Limitation in pgrep on Linux platform breaks script-utils fixUser 
> --
>
> Key: SOLR-590
> URL: https://issues.apache.org/jira/browse/SOLR-590
> Project: Solr
>  Issue Type: Bug
>Affects Versions: 1.2
> Environment: Linux 2.6.18-53.1.14.el5 #1 SMP Wed Mar 5 11:37:38 EST 
> 2008 x86_64 x86_64 x86_64 GNU/Linux
> procps-3.2.7-8.1.el5
>Reporter: Hannes Schmidt
>Assignee: Bill Au
>Priority: Minor
> Attachments: fixUser.patch
>
>
> The fixUser function in script-utils uses two methods to determine the 
> username of the parent process (oldwhoami). If the first method fails for 
> certain reasons it will fall back to the second method. For most people the 
> first method will succeed but I know that in my particular installation the 
> first method fails so I need the second method to succeed. Unfortunately, 
> that fallback method doesn't work because it uses pgrep to look up the current 
> script's name and on my Linux 2.6.18 platform pgrep is limited to 15 
> characters. The names of many scripts in the SOLR distribution are longer 
> than that, causing pgrep to return nothing and the subsequent ps invocation 
> to fail with an error:
> ERROR: List of process IDs must follow -p.
> You can easily reproduce that behaviour with
> /app/solr/solr/bin/snappuller-enable < /dev/null
> The redirection of stdin from /dev/null causes fixUser to fall back to the 
> second method but there are other, more realistic scenarios in which the 
> fallback happens, like
> ssh [EMAIL PROTECTED] /app/solr/solr/bin/snappuller-enable
> The fix is to use the -f option which causes pgrep to compare the full path 
> of the executable. Interestingly, that method is not subject to the 15 
> character length limit. The limit is not actually enforced by jetty but 
> rather by the procfs file system of the linux kernel. If you look at 
> /proc/*/stat you will notice that the second column is limited to 15 
> characters.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Resolved: (SOLR-590) Limitation in pgrep on Linux platform breaks script-utils fixUser

2008-06-04 Thread Bill Au (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-590?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bill Au resolved SOLR-590.
--

Resolution: Fixed

Patch committed and CHANGES.txt updated.  Thanks Hannes.

Sending        CHANGES.txt
Sending        src/scripts/scripts-util
Transmitting file data ..
Committed revision 663089.


> Limitation in pgrep on Linux platform breaks script-utils fixUser 
> --
>
> Key: SOLR-590
> URL: https://issues.apache.org/jira/browse/SOLR-590
> Project: Solr
>  Issue Type: Bug
>Affects Versions: 1.2
> Environment: Linux 2.6.18-53.1.14.el5 #1 SMP Wed Mar 5 11:37:38 EST 
> 2008 x86_64 x86_64 x86_64 GNU/Linux
> procps-3.2.7-8.1.el5
>Reporter: Hannes Schmidt
>Assignee: Bill Au
>Priority: Minor
> Attachments: fixUser.patch
>
>
> The fixUser function in script-utils uses two methods to determine the 
> username of the parent process (oldwhoami). If the first method fails for 
> certain reasons it will fall back to the second method. For most people the 
> first method will succeed but I know that in my particular installation the 
> first method fails so I need the second method to succeed. Unfortunately, 
> that fallback method doesn't work because it uses pgrep to look up the current 
> script's name and on my Linux 2.6.18 platform pgrep is limited to 15 
> characters. The names of many scripts in the SOLR distribution are longer 
> than that, causing pgrep to return nothing and the subsequent ps invocation 
> to fail with an error:
> ERROR: List of process IDs must follow -p.
> You can easily reproduce that behaviour with
> /app/solr/solr/bin/snappuller-enable < /dev/null
> The redirection of stdin from /dev/null causes fixUser to fall back to the 
> second method but there are other, more realistic scenarios in which the 
> fallback happens, like
> ssh [EMAIL PROTECTED] /app/solr/solr/bin/snappuller-enable
> The fix is to use the -f option which causes pgrep to compare the full path 
> of the executable. Interestingly, that method is not subject to the 15 
> character length limit. The limit is not actually enforced by jetty but 
> rather by the procfs file system of the linux kernel. If you look at 
> /proc/*/stat you will notice that the second column is limited to 15 
> characters.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



Solr Maven Artifacts

2008-06-04 Thread Andrew Savory
Hi,

I see from http://issues.apache.org/jira/browse/SOLR-19 that some tentative
work has been done on mavenisation of solr, and from
https://issues.apache.org/jira/browse/SOLR-586 that publishing maven artifacts
has been discussed ... is it possible to push solr 1.2 maven artifacts out to
the repo?


Andrew.
--
[EMAIL PROTECTED] / [EMAIL PROTECTED]
http://www.andrewsavory.com/


Re: [jira] Commented: (SOLR-572) Spell Checker as a Search Component

2008-06-04 Thread Shalin Shekhar Mangar
Hi Grant,

I did not intend to offend you or put pressure on you in any way. Please
accept my apologies if I came off as rude. In fact, I've been having a lot
of fun working with you and Bojan on this issue. We've definitely covered a
lot of ground very fast.

I'm completely in favor of the goals for this piece. I was merely suggesting
that with the 1.3 release being a priority, we should go one step at a time
and commit per the initial scope for this issue as written in the issue's
description and then handle the enhancements in another issue. But I'm all
for it if you want to add extra functionality within the same issue.

Once again, I'm deeply sorry if you found my comment offending in any way.

Regards,
Shalin

On Wed, Jun 4, 2008 at 4:33 PM, Grant Ingersoll <[EMAIL PROTECTED]> wrote:

> There are working patches available on the issue without the "advanced
> features" and everyone is free to fix the current one.  It's not like it is
> that far off from being able to have proper spellchecking, pluggability, and
> context information about where the mistakes are.  I frankly don't get what
> all the fuss is about.
>
> Is it that you disagree with the approach?  That hasn't come across in the
> discussions, but if it is, say so.  I thought we were working on it quite
> well together and made some good progress and are pretty darn close.  I
> don't see that I've taken away any functionality that the original patch
> offers, but I did change it so that it fits a broader audience, namely those
> who are interested in other spell checkers and those who want info about
> where in the query the problem occurs.  Which, is what the comments suggest
> people are interested in and also what I am interested in for 1.3.
>
> And, I'm sorry, but I said I'd have to let it lie for a few days and then I
> would be back to it.  Cut me some slack.  I don't get paid to work on Solr
> full time.   Is it truly that important that someone can't wait a few days
> for a patch on the trunk version for something they never had before?  It
> ain't like we're talking some core bug here that has everyone broken.
>  Besides, others are perfectly welcome to work on it in the meantime.
>
> Sorry for the rant, but I am not going to be pressured into committing a
> patch that I don't think is ready and one that I said I am going to be
> working on to see it through so that we all are happy.
>
> -Grant
>
>
> On Jun 4, 2008, at 1:14 AM, Noble Paul നോബിള്‍ नोब्ळ् wrote:
>
>  On Wed, Jun 4, 2008 at 2:15 AM, Grant Ingersoll <[EMAIL PROTECTED]>
>> wrote:
>>
>>> I will be back on it tomorrow and will see this through before 1.3 with
>>> the
>>> abstractions.  In other words, -1 on cutting this off prematurely.  :-)
>>> Since I don't think this is the only thing holding up 1.3, let's just
>>> play
>>> it out and get it right so all of us are happy.
>>>
>>
>> This feature may not be holding back 1.3 release. The potential users
>> of this issue are very much interested in a basic working version.
>> They may be able to live without these advanced features. Maybe we
>> can have another jira issue for enhancements which may/may not go into
>> 1.3 (depending on when it happens).
>>
>>
>>
>>
>>> -Grant
>>>
>>> On Jun 3, 2008, at 3:53 PM, Shalin Shekhar Mangar wrote:
>>>
>>>  The current patch has been broken for some days now and implementing a
 correct query parsing logic may take time to get right. Let's not aim
 for
 everything to get into the 1.3 release.

 I would like to cut down the scope of this issue to an implementation
 that
 indexes files and Lucene indices (both Solr and arbitrary) and gives
 suggestions while using the correct analyzer for multi-word queries.
 Let's
 get a spell checker working and commit it. We can deal with more
 enhancements like abstractions for custom spellcheckers and query
 parsing
 etc. in another issue which can be dealt with separately (in 1.3 or
 after).
 Thoughts? If there is a general consensus, I can give a new patch which
 can
 be good enough to go in.

 On Sat, May 31, 2008 at 2:44 AM, Oleg Gnatovskiy (JIRA) <
 [EMAIL PROTECTED]>
 wrote:


> [
>
>
> https://issues.apache.org/jira/browse/SOLR-572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12601256
> #action_12601256]
>
> Oleg Gnatovskiy commented on SOLR-572:
> --
>
> I installed the latest patch. Still getting an NPE. Here is my config:
>
> <searchComponent name="spellcheck" class="org.apache.solr.handler.component.SpellCheckComponent">
>   <lst name="defaults">
>     <str name="spellcheck.onlyMorePopular">false</str>
>     <str name="spellcheck.extendedResults">false</str>
>     <str name="spellcheck.count">1</str>
>   </lst>
>
>   <lst name="spellchecker">
>     <str name="classname">org.apache.solr.spelling.FileBasedSpellChecker</str>
>     <str name="name">external</str>
>     <str name="sourceLocation">spellings.txt</str>
>     <str name="characterEncoding">UTF-8</str>
>     <str name="fieldType">text_ws</str>
>     <str name="indexDir">/usr/local/apache/lucene/solr2home/solr/data/spellIndex</str>
>   </lst>
> </searchComponent>
>

Re: [jira] Commented: (SOLR-572) Spell Checker as a Search Component

2008-06-04 Thread Otis Gospodnetic
Yeah, as an observer I sensed no bad intentions here.

Anyhow, 1.3 is not scheduled yet, my guess is we are still at least a few weeks 
away from 1.3 (and if I had to bet I'd bet at 1.3 being released close to the 
end of summer).  Grant is very eager about this and will get it all in.  Case 
closed, I think.  Nothing to see here, move along.


Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch


- Original Message 
> From: Shalin Shekhar Mangar <[EMAIL PROTECTED]>
> To: solr-dev@lucene.apache.org
> Sent: Wednesday, June 4, 2008 1:32:48 PM
> Subject: Re: [jira] Commented: (SOLR-572) Spell Checker as a Search Component
> 
> Hi Grant,
> 
> I did not intend to offend you or put pressure on you in any way. Please
> accept my apologies if I came off as rude. In fact, I've been having a lot
> of fun working with you and Bojan on this issue. We've definitely covered a
> lot of ground very fast.
> 
> I'm completely in favor of the goals for this piece. I was merely suggesting
> that with the 1.3 release being a priority, we should go one step at a time
> and commit per the initial scope for this issue as written in the issue's
> description and then handle the enhancements in another issue. But I'm all
> for it if you want to add extra functionality within the same issue.
> 
> Once again, I'm deeply sorry if you found my comment offending in any way.
> 
> Regards,
> Shalin
> 
> On Wed, Jun 4, 2008 at 4:33 PM, Grant Ingersoll wrote:
> 
> > There are working patches available on the issue without the "advanced
> > features" and everyone is free to fix the current one.  It's not like it is
> > that far off from being able to have proper spellchecking, pluggability, and
> > context information about where the mistakes are.  I frankly don't get what
> > all the fuss is about.
> >
> > Is it that you disagree with the approach?  That hasn't come across in the
> > discussions, but if it is, say so.  I thought we were working on it quite
> > well together and made some good progress and are pretty darn close.  I
> > don't see that I've taken away any functionality that the original patch
> > offers, but I did change it so that it fits a broader audience, namely those
> > who are interested in other spell checkers and those who want info about
> > where in the query the problem occurs.  Which, is what the comments suggest
> > people are interested in and also what I am interested in for 1.3.
> >
> > And, I'm sorry, but I said I'd have to let it lie for a few days and then I
> > would be back to it.  Cut me some slack.  I don't get paid to work on Solr
> > full time.   Is it truly that important that someone can't wait a few days
> > for a patch on the trunk version for something they never had before?  It
> > ain't like we're talking some core bug here that has everyone broken.
> >  Besides, others are perfectly welcome to work on it in the meantime.
> >
> > Sorry for the rant, but I am not going to be pressured into committing a
> > patch that I don't think is ready and one that I said I am going to be
> > working on to see it through so that we all are happy.
> >
> > -Grant
> >
> >
> > On Jun 4, 2008, at 1:14 AM, Noble Paul നോബിള്‍ नोब्ळ् wrote:
> >
> >  On Wed, Jun 4, 2008 at 2:15 AM, Grant Ingersoll 
> >> wrote:
> >>
> >>> I will be back on it tomorrow and will see this through before 1.3 with
> >>> the
> >>> abstractions.  In other words, -1 on cutting this off prematurely.  :-)
> >>> Since I don't think this is the only thing holding up 1.3, let's just
> >>> play
> >>> it out and get it right so all of us are happy.
> >>>
> >>
> >> This feature may not be holding back 1.3 release. The potential users
> >> of this issue are very much interested in a basic working version.
> >> They may be able to live without these advanced features. Maybe we
> >> can have another jira issue for enhancements which may/may not go into
> >> 1.3 (depending on when it happens).
> >>
> >>
> >>
> >>
> >>> -Grant
> >>>
> >>> On Jun 3, 2008, at 3:53 PM, Shalin Shekhar Mangar wrote:
> >>>
> >>>  The current patch has been broken for some days now and implementing a
>  correct query parsing logic may take time to get right. Let's not aim
>  for
>  everything to get into the 1.3 release.
> 
>  I would like to cut down the scope of this issue to an implementation
>  that
>  indexes files and Lucene indices (both Solr and arbitrary) and gives
>  suggestions while using the correct analyzer for multi-word queries.
>  Let's
>  get a spell checker working and commit it. We can deal with more
>  enhancements like abstractions for custom spellcheckers and query
>  parsing
>  etc. in another issue which can be dealt with separately (in 1.3 or
>  after).
>  Thoughts? If there is a general consensus, I can give a new patch which
>  can
>  be good enough to go in.
> 
>  On Sat, May 31, 2008 at 2:44 AM, Oleg Gnatovskiy (JIRA) <
>  [EMAIL PROTECTED]>
>  wro

[jira] Updated: (SOLR-536) Automatic binding of results to Beans (for solrj)

2008-06-04 Thread Ryan McKinley (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-536?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ryan McKinley updated SOLR-536:
---

Attachment: SOLR-536.patch

Here is an updated version of the patch that also lets you use the same 
annotation to convert from an object to a SolrInputDocument.

I still wonder if we should have the utility function in QueryResponse.java:
{code:java}
public <T> List<T> getBeans(Class<T> klass) {
  return new DocumentObjectBinder().getBeans(klass, getResults());
}
{code}
it seems like hanging on to a DocumentObjectBinder outside of the response will 
be more efficient.  (Though perhaps not a big deal)
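
For what it's worth, here is a rough client-side sketch of the "hanging on to a 
binder" alternative (my illustration, not part of the patch; it assumes the 
DocumentObjectBinder and getBeans(Class, SolrDocumentList) names used above, 
which may differ from what finally gets committed):

{code:java}
import java.util.List;
import org.apache.solr.client.solrj.beans.DocumentObjectBinder;
import org.apache.solr.client.solrj.response.QueryResponse;

class BinderReuseSketch {
  // One binder built up front; its annotation metadata is reused for every response.
  private final DocumentObjectBinder binder = new DocumentObjectBinder();

  <T> List<T> bind(QueryResponse rsp, Class<T> clazz) {
    return binder.getBeans(clazz, rsp.getResults());
  }
  // The convenience method above would instead do "new DocumentObjectBinder()"
  // on every call, rebuilding that metadata each time.
}
{code}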

> Automatic binding of results to Beans (for solrj)
> -
>
> Key: SOLR-536
> URL: https://issues.apache.org/jira/browse/SOLR-536
> Project: Solr
>  Issue Type: New Feature
>  Components: clients - java
>Affects Versions: 1.3
>Reporter: Noble Paul
>Assignee: Ryan McKinley
>Priority: Minor
> Attachments: SOLR-536.patch, SOLR-536.patch, SOLR-536.patch
>
>
> as we are using java5, we can use annotations to bind SolrDocument to java 
> beans directly.
> This can make the usage of solrj a bit simpler.
> The QueryResponse class in solrj can have an extra method as follows
> public <T> List<T> getResultBeans(Class<T> klass)
> and the bean can have annotations as
> class MyBean{
> @Field("id") //name is optional
> String id;
> @Field("category")
> List<String> categories
> }

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



3 TokenFilter factories not compatible with 1.2

2008-06-04 Thread Chris Hostetter


As pointed out in this thread...

http://www.nabble.com/NullPointerException-at-lucene.analysis.StopFilter-with-1.3-to17564627.html#a17564627

3 of the TokenFilterFactories currently on the trunk are not actually 
backwards compatible with Solr 1.2...

  - StopFilterFactory
  - SynonymFilterFactory
  - EnglishPorterFilterFactory

This is because they were changed so that "setup" code formerly in the 
init(Map) method was moved to the new inform(ResourceLoader) method in 
order to make them no longer depend on the SolrCore.getSolrCore() 
singleton.


Introducing the ResourceLoaderAware interface allowed us to ensure that 
any existing custom analysis Factories people might have registered in 
their schema could still continue to work as they did -- but by changing 
these 3 factory implementations we broke the somewhat unexpected use case 
of people with code that explicitly constructs a StopFilterFactory and 
calls init on it expecting it to now be ready for use.


Three possible ways of dealing with this incompatibility come to mind...

1) Delayed Initialization on First Use.
We can add code to the create method of each of these factories that does 
a quick check to see if the inform method was ever called, and if it 
wasn't then use the SolrCore singleton to do so...

  if(null == stopWords) { // :TODO: remove when singleton is removed
   this.inform(SolrCore.getSolrCore().getSolrConfig().getResourceLoader());
  }
...this could be made more robust using various lazy initialization 
techniques (fun fact I learned from Josh Bloch last week: double-checked 
locking does work in Java 1.5 if you use volatile and cut/paste it exactly 
from his book so that you use the appropriate temporary variable)
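
For reference, a minimal self-contained sketch of that Java 5 idiom (the class 
and field names below are illustrative stand-ins, not the actual factory code):

  import java.util.Arrays;
  import java.util.HashSet;
  import java.util.Set;

  class LazyStopWords {
    // volatile is what makes double-checked locking safe on Java 1.5+
    private volatile Set<String> stopWords;

    Set<String> get() {
      Set<String> words = stopWords;     // the "temporary variable": one volatile read
      if (words == null) {
        synchronized (this) {
          words = stopWords;             // re-check while holding the lock
          if (words == null) {
            words = loadStopWords();     // expensive one-time initialization
            stopWords = words;
          }
        }
      }
      return words;
    }

    private Set<String> loadStopWords() {
      // stand-in for this.inform(SolrCore.getSolrCore()...) in the real factories
      return new HashSet<String>(Arrays.asList("a", "an", "the"));
    }
  }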


2) Superclass Insertion
Rename the current factory implementations using new names, create new 
(deprecated) subclasses using the existing names that make the same 
"this.inform(...)" call as mentioned above but during the init method. 
Change the example schema to advocate the new class names and advocate 
in CHANGES.txt that existing users change their schemas to refer to the 
new names.


3) Documentation and Education
Since this wasn't exactly a use case we ever advertised, we could punt on 
the problem by putting a disclaimer in the CHANGES.txt that anyone directly 
constructing those 3 classes should explicitly call inform() on the 
instances after calling init.



#3 is obviously the simplest approach as developers, and to be quite 
honest: probably impacts the fewest total number of people (since there 
are probably very few people constructing Factory instances themselves) 
compared to the potential performance impacts of #1, or the need for many 
people to change their schemas in order to benefit from MultiCores if 
we go with #2 (particularly since with option #2, users with existing 
schemas that don't change them, but do start using multicores will 
silently get stopwords and synonyms from the "last" core loaded in all 
other cores).




Opinions?




-Hoss



Re: Marking myself inactive (was: svn commit: r662366...)

2008-06-04 Thread Chris Hostetter

: The few projects that I did during my short involvement with Solr are
: still running strong, keep up the good work!

I think it's safe to say I speak for all of the other committers when I 
thank you for all your past contributions that really helped Solr thrive.

Not just your code contributions, but the way you really helped "seed" the 
Solr community while we were incubating:  Your involvement with Solr 
attracted interest from a broader set of contributors within the ASF 
(besides just Lucene folks), and your 2006 XML.com article really put Solr 
on the map so to speak, gaining us a lot of attention and credibility in 
the eyes of people that would have never noticed Solr otherwise.

I look forward to the day when you get bored with all the stuff you 
are doing now, and start catching up on all your Solr mail :)

-Hoss



Re: 3 TokenFilter factories not compatible with 1.2

2008-06-04 Thread Yonik Seeley
On Wed, Jun 4, 2008 at 7:03 PM, Chris Hostetter
<[EMAIL PROTECTED]> wrote:
> 3) Documentation and Education
> Since this wasn't exactly a use case we ever advertised, we could punt on
> the problem by putting a disclaimer in the CHANGES.txt that anyone directly
> constructing those 3 classes should explicitly call inform() on the
> instances after calling init.
>
>
> #3 is obviously the simplest approach as developers, and to be quite honest:
> probably impacts the fewest total number of people (since there are probably
> very few people constructing Factory instances themselves)

+1

-Yonik


Re: 3 TokenFilter factories not compatible with 1.2

2008-06-04 Thread Mike Klaas

On 4-Jun-08, at 5:24 PM, Yonik Seeley wrote:


On Wed, Jun 4, 2008 at 7:03 PM, Chris Hostetter
<[EMAIL PROTECTED]> wrote:

3) Documentation and Education
Since this wasn't exactly a use case we ever advertised, we could punt on
the problem by putting a disclaimer in the CHANGES.txt that anyone directly
constructing those 3 classes should explicitly call inform() on the
instances after calling init.


#3 is obviously the simplest approach as developers, and to be quite honest:
probably impacts the fewest total number of people (since there are probably
very few people constructing Factory instances themselves)


+1


+1, perhaps also pinging -user to see if there is a sizable group of  
people doing this.


-Mike


Re: 3 TokenFilter factories not compatible with 1.2

2008-06-04 Thread Grant Ingersoll


On Jun 4, 2008, at 7:03 PM, Chris Hostetter wrote:



As pointed out in this thread...

http://www.nabble.com/NullPointerException-at-lucene.analysis.StopFilter-with-1.3-to17564627.html#a17564627

3) Documentation and Education
Since this wasn't exactly a use case we ever advertised, we could  
punt on the problem by putting a disclaimer in the CHANGES.txt that  
anyone directly constructing those 3 classes should explicitly call  
inform() on the instances after calling init.






+1.




[jira] Commented: (SOLR-561) Solr replication by Solr (for windows also)

2008-06-04 Thread Noble Paul (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12602536#action_12602536
 ] 

Noble Paul commented on SOLR-561:
-

The next step is to replicate the files in the conf folder.
The strategy is as follows:
* Mention the files to be replicated from the master, in the 
_ReplicationHandler_
{code:xml}
<lst name="master">
  <str name="confFiles">schema.xml,stopwords.txt,elevate.xml</str>
</lst>
{code}
* For the CMD_FILE_LIST command, include these files in the response as well
* The slave can compare each file with its local copy and, if it has been 
modified, download it (see the sketch after this list)
* If a conf file is changed, the SolrCore must be reloaded
* There must be separate strategies for reloading the core for single core or 
multicore
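
A rough sketch of the comparison step mentioned above (purely illustrative: the 
checksum map and method names are hypothetical, not the actual CMD_FILE_LIST 
response format):

{code:java}
import java.io.File;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

class ConfFileSync {
  /** Conf files whose local copy is missing or whose checksum differs from the master's. */
  static List<String> filesToDownload(Map<String, Long> masterChecksums, File confDir) {
    List<String> stale = new ArrayList<String>();
    for (Map.Entry<String, Long> e : masterChecksums.entrySet()) {
      File local = new File(confDir, e.getKey());
      if (!local.exists() || checksumOf(local) != e.getValue().longValue()) {
        stale.add(e.getKey());
      }
    }
    return stale;  // if non-empty: download these, back up the old files, reload the SolrCore
  }

  static long checksumOf(File f) {
    // stand-in for whatever checksum the real handler would report (e.g. a CRC over the bytes)
    return f.length() ^ f.lastModified();
  }
}
{code}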

> Solr replication by Solr (for windows also)
> ---
>
> Key: SOLR-561
> URL: https://issues.apache.org/jira/browse/SOLR-561
> Project: Solr
>  Issue Type: New Feature
>  Components: replication
>Affects Versions: 1.3
> Environment: All
>Reporter: Noble Paul
> Attachments: SOLR-561.patch, SOLR-561.patch
>
>
> The current replication strategy in solr involves shell scripts. The 
> following are the drawbacks with the approach
> *  It does not work with windows
> * Replication works as a separate piece not integrated with solr.
> * Cannot control replication from solr admin/JMX
> * Each operation requires manual telnet to the host
> Doing the replication in java has the following advantages
> * Platform independence
> * Manual steps can be completely eliminated. Everything can be driven from 
> solrconfig.xml.
> ** Adding the url of the master in the slaves should be good enough to enable 
> replication. Other things like frequency of
> snapshoot/snappull can also be configured . All other information can be 
> automatically obtained.
> * Start/stop can be triggered from solr/admin or JMX
> * Can get the status/progress while replication is going on. It can also 
> abort an ongoing replication
> * No need to have a login into the machine 
> This issue can track the implementation of solr replication in java

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Issue Comment Edited: (SOLR-561) Solr replication by Solr (for windows also)

2008-06-04 Thread Noble Paul (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12602536#action_12602536
 ] 

noble.paul edited comment on SOLR-561 at 6/4/08 9:31 PM:
-

The next step is to replicate the files in the conf folder.
The strategy is as follows:
* Mention the files to be replicated from the master, in the 
_ReplicationHandler_
{code:xml}
<lst name="master">
  <str name="confFiles">schema.xml,stopwords.txt,elevate.xml</str>
</lst>
{code}
* For the CMD_FILE_LIST command, include these files in the response as well
* The slave can compare each file with its local copy and, if it has been 
modified, download it
* A backup of the current files is taken and the new files are placed into the 
conf folder
* If a conf file is changed, the SolrCore must be reloaded
* There must be separate strategies for reloading the core for single core or 
multicore

  was (Author: noble.paul):
The next step is to replicate the files in the conf folder.
The strategy is as follows:
* Mention the files to be replicated from the master, in the 
_ReplicationHandler_
{code:xml}
<lst name="master">
  <str name="confFiles">schema.xml,stopwords.txt,elevate.xml</str>
</lst>
{code}
* For the CMD_FILE_LIST command, include these files in the response as well
* The slave can compare each file with its local copy and, if it has been 
modified, download it
* If a conf file is changed, the SolrCore must be reloaded
* There must be separate strategies for reloading the core for single core or 
multicore
  
> Solr replication by Solr (for windows also)
> ---
>
> Key: SOLR-561
> URL: https://issues.apache.org/jira/browse/SOLR-561
> Project: Solr
>  Issue Type: New Feature
>  Components: replication
>Affects Versions: 1.3
> Environment: All
>Reporter: Noble Paul
> Attachments: SOLR-561.patch, SOLR-561.patch
>
>
> The current replication strategy in solr involves shell scripts. The 
> following are the drawbacks with the approach
> *  It does not work with windows
> * Replication works as a separate piece not integrated with solr.
> * Cannot control replication from solr admin/JMX
> * Each operation requires manual telnet to the host
> Doing the replication in java has the following advantages
> * Platform independence
> * Manual steps can be completely eliminated. Everything can be driven from 
> solrconfig.xml.
> ** Adding the url of the master in the slaves should be good enough to enable 
> replication. Other things like frequency of
> snapshoot/snappull can also be configured . All other information can be 
> automatically obtained.
> * Start/stop can be triggered from solr/admin or JMX
> * Can get the status/progress while replication is going on. It can also 
> abort an ongoing replication
> * No need to have a login into the machine 
> This issue can track the implementation of solr replication in java

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (SOLR-536) Automatic binding of results to Beans (for solrj)

2008-06-04 Thread Noble Paul (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-536?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12602539#action_12602539
 ] 

Noble Paul commented on SOLR-536:
-

The objective was to hide _DocumentObjectBinder_ from users altogether.

The process of reading the annotations and caching the information is expensive.
If we make the cache field *static* in DocumentObjectBinder this should be 
fine. Otherwise we must remove this method.
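
A minimal sketch of that static cache idea (illustrative only; it assumes the 
@Field annotation from this issue, and the real DocumentObjectBinder internals 
may look different):

{code:java}
import java.lang.reflect.Field;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class BeanFieldCache {
  // Shared by all binder instances, so the reflection work is done once per class.
  private static final Map<Class<?>, List<Field>> CACHE =
      new ConcurrentHashMap<Class<?>, List<Field>>();

  static List<Field> annotatedFields(Class<?> clazz) {
    List<Field> fields = CACHE.get(clazz);
    if (fields == null) {
      fields = new ArrayList<Field>();
      for (Field f : clazz.getDeclaredFields()) {
        if (f.isAnnotationPresent(org.apache.solr.client.solrj.beans.Field.class)) {
          f.setAccessible(true);
          fields.add(f);
        }
      }
      CACHE.put(clazz, fields);  // a rare race just recomputes the same list
    }
    return fields;
  }
}
{code}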

> Automatic binding of results to Beans (for solrj)
> -
>
> Key: SOLR-536
> URL: https://issues.apache.org/jira/browse/SOLR-536
> Project: Solr
>  Issue Type: New Feature
>  Components: clients - java
>Affects Versions: 1.3
>Reporter: Noble Paul
>Assignee: Ryan McKinley
>Priority: Minor
> Attachments: SOLR-536.patch, SOLR-536.patch, SOLR-536.patch
>
>
> as we are using java5, we can use annotations to bind SolrDocument to java 
> beans directly.
> This can make the usage of solrj a bit simpler.
> The QueryResponse class in solrj can have an extra method as follows
> public <T> List<T> getResultBeans(Class<T> klass)
> and the bean can have annotations as
> class MyBean{
> @Field("id") //name is optional
> String id;
> @Field("category")
> List<String> categories
> }

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (SOLR-556) Highlighting of multi-valued fields returns snippets which span multiple different values

2008-06-04 Thread Mike Klaas (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-556?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12602541#action_12602541
 ] 

Mike Klaas commented on SOLR-556:
-

Ah, I see what the problem is:  Although it is impossible for tokens from 
different values to appear in the same fragment (due to the semantics of 
MultiValuedTokenFilter), the non-token text (typically, punctuation) from 
different values can bleed into the same fragment, since lucene's highlighter 
can only create a new fragment on token boundaries.

Unfortunately SOLR-553 was committed a day after you submitted your patch, and 
rearranges the code slightly so that it no longer applies.  Could you sync the 
patch with trunk?  I think the basic approach is sound.

> Highlighting of multi-valued fields returns snippets which span multiple 
> different values
> -
>
> Key: SOLR-556
> URL: https://issues.apache.org/jira/browse/SOLR-556
> Project: Solr
>  Issue Type: Bug
>  Components: highlighter
>Affects Versions: 1.3
> Environment: Tomcat 5.5
>Reporter: Lars Kotthoff
>Assignee: Mike Klaas
>Priority: Minor
> Fix For: 1.3
>
> Attachments: solr-highlight-multivalued-example.xml, 
> solr-highlight-multivalued.patch
>
>
> When highlighting multi-valued fields, the highlighter sometimes returns 
> snippets which span multiple values, e.g. with values "foo" and "bar" and 
> search term "ba" the highlighter will create the snippet "foobar". 
> Furthermore it sometimes returns smaller snippets than it should, e.g. with 
> value "foobar" and search term "oo" it will create the snippet "oo" 
> regardless of hl.fragsize.
> I have been unable to determine the real cause for this, or indeed what 
> actually goes on at all. To reproduce the problem, I've used the following 
> steps:
> * create an index with multi-valued fields, one document should have at least 
> 3 values for these fields (in my case strings of length between 5 and 15 
> Japanese characters -- as far as I can tell plain old ASCII should produce 
> the same effect though)
> * search for part of a value in such a field with highlighting enabled, the 
> additional parameters I use are hl.fragsize=70, hl.requireFieldMatch=true, 
> hl.mergeContiguous=true (changing the parameters does not seem to have any 
> effect on the result though)
> * highlighted snippets should show effects described above

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (SOLR-502) Add search time out support

2008-06-04 Thread Sean Timm (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-502?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Timm updated SOLR-502:
---

Attachment: SOLR-502.patch

* Adds partialResults support to the binary response, which is used by 
distributed search.
* Really removes the System.out.println() this time.
* timeallowed param is now camelcase (timeAllowed).

> Add search time out support
> ---
>
> Key: SOLR-502
> URL: https://issues.apache.org/jira/browse/SOLR-502
> Project: Solr
>  Issue Type: New Feature
>  Components: search
>Reporter: Sean Timm
>Assignee: Otis Gospodnetic
>Priority: Minor
> Fix For: 1.3
>
> Attachments: SOLR-502.patch, SOLR-502.patch, SOLR-502.patch, 
> solrTimeout.patch, solrTimeout.patch, solrTimeout.patch, solrTimeout.patch, 
> solrTimeout.patch
>
>
> Uses LUCENE-997 to add time out support to Solr.  

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (SOLR-502) Add search time out support

2008-06-04 Thread Sean Timm (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-502?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12602550#action_12602550
 ] 

Sean Timm commented on SOLR-502:


Sorry about the timeallowed parameter.  For some reason I had in my head that 
the parameters were not supposed to be camel case and I only switched the 
parameter variable names.

You should be seeing a log message similar to:
{noformat}
WARNING: Query: title:s*; Elapsed time: 20Exceeded allowed search time: 1 ms.
{noformat}
even with the previous patch.  Though, when using distributed search, the new 
binary response is used which I hadn't modified to include partial results 
support.  It should work with this new patch.

{noformat}

 0
 39
 
  naan.office.aol.com:8973/solr,naan.office.aol.com:8993/solr
  on

  0
  headline:s*
  1
  2.2
  100
 



{noformat}

bq. If timeallowed=1, should I ever be seeing QTime over 1?

Yes, the TimeLimitedCollector can only interrupt searches during the collect() 
calls.  Other, sometimes substantial, work is done outside of the collect().

Also, see the note in the TimeLimitedCollector.setResolution(long) Javadocs 
http://hudson.zones.apache.org/hudson/job/Lucene-trunk/javadoc//org/apache/lucene/search/TimeLimitedCollector.html#setResolution(long)
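
For illustration, a rough sketch of how the time limit hooks in, based on my 
recollection of the Lucene TimeLimitedCollector API from LUCENE-997 (class and 
method names may not match the exact version used by this patch):

{code:java}
import java.io.IOException;
import org.apache.lucene.search.HitCollector;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.TimeLimitedCollector;
import org.apache.lucene.search.TopDocCollector;

class TimeoutSketch {
  static boolean searchWithTimeout(IndexSearcher searcher, Query query, long timeAllowed)
      throws IOException {
    TopDocCollector results = new TopDocCollector(10);
    HitCollector limited = new TimeLimitedCollector(results, timeAllowed);
    boolean partial = false;
    try {
      // The deadline is only checked inside collect(), once per collected hit.
      searcher.search(query, limited);
    } catch (TimeLimitedCollector.TimeExceededException e) {
      partial = true;  // whatever was collected so far becomes the partial result
    }
    // Work done after collection (sorting, faceting, writing the response, ...)
    // is not covered by the timer, which is why QTime can exceed timeAllowed.
    return partial;
  }
}
{code}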

> Add search time out support
> ---
>
> Key: SOLR-502
> URL: https://issues.apache.org/jira/browse/SOLR-502
> Project: Solr
>  Issue Type: New Feature
>  Components: search
>Reporter: Sean Timm
>Assignee: Otis Gospodnetic
>Priority: Minor
> Fix For: 1.3
>
> Attachments: SOLR-502.patch, SOLR-502.patch, SOLR-502.patch, 
> solrTimeout.patch, solrTimeout.patch, solrTimeout.patch, solrTimeout.patch, 
> solrTimeout.patch
>
>
> Uses LUCENE-997 to add time out support to Solr.  

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (SOLR-502) Add search time out support

2008-06-04 Thread Noble Paul (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-502?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12602559#action_12602559
 ] 

Noble Paul commented on SOLR-502:
-

Sean: For the namedListCodec changes to be backward compatible (within 1.3), add a 
check on the list size before calling _list.get()_
{code:java}
if(list.size() > 3)  solrDocs.setPartialResult((Boolean)list.get(3));
{code}

> Add search time out support
> ---
>
> Key: SOLR-502
> URL: https://issues.apache.org/jira/browse/SOLR-502
> Project: Solr
>  Issue Type: New Feature
>  Components: search
>Reporter: Sean Timm
>Assignee: Otis Gospodnetic
>Priority: Minor
> Fix For: 1.3
>
> Attachments: SOLR-502.patch, SOLR-502.patch, SOLR-502.patch, 
> solrTimeout.patch, solrTimeout.patch, solrTimeout.patch, solrTimeout.patch, 
> solrTimeout.patch
>
>
> Uses LUCENE-997 to add time out support to Solr.  

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.