[jira] Updated: (SOLR-1527) shareSChema does not work with absolute paths

2009-10-27 Thread Noble Paul (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1527?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Noble Paul updated SOLR-1527:
-

Fix Version/s: (was: 1.4)
   1.5

too late for 1.4

> shareSChema does not work with absolute paths
> -
>
> Key: SOLR-1527
> URL: https://issues.apache.org/jira/browse/SOLR-1527
> Project: Solr
>  Issue Type: Bug
>Affects Versions: 1.3
>Reporter: Noble Paul
>Assignee: Noble Paul
>Priority: Minor
> Fix For: 1.5
>
> Attachments: SOLR-1527.patch
>
>
> shareSchema does not work if schema is passed on as absolute
> mail thread http://markmail.org/thread/k6cztofj4gnrjhsh

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



Re: [VOTE] Release Solr 1.4.0

2009-10-27 Thread Noble Paul നോബിള്‍ नोब्ळ्
OK. Let us push it to 1.5.


On Wed, Oct 28, 2009 at 10:01 AM, Ryan McKinley  wrote:
>
> On Oct 28, 2009, at 12:07 AM, Chris Hostetter wrote:
>
>>
>> : It's not a regression, but a new, non-core feature.  If we delay every
>> : time we find a bug, this release will never end.
>>
>> agreed.
>>
>
> agreed.  And assuming lucene 3.0 comes out in the somewhat near future, we
> will have an easy place for minor bug fixes soon enough.
>
>
>



-- 
-
Noble Paul | Principal Engineer| AOL | http://aol.com


Re: [VOTE] Release Solr 1.4.0

2009-10-27 Thread Ryan McKinley


On Oct 28, 2009, at 12:07 AM, Chris Hostetter wrote:



: It's not a regression, but a new, non-core feature.  If we delay every
: time we find a bug, this release will never end.

agreed.



agreed.  And assuming lucene 3.0 comes out in the somewhat near  
future, we will have an easy place for minor bug fixes soon enough.





Re: [VOTE] Release Solr 1.4.0

2009-10-27 Thread Chris Hostetter

: It's not a regression, but a new, non-core feature.  If we delay every
: time we find a bug, this release will never end.

agreed.



-Hoss



[jira] Commented: (SOLR-1527) shareSChema does not work with absolute paths

2009-10-27 Thread Jeremy Hinegardner (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1527?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12770773#action_12770773
 ] 

Jeremy Hinegardner commented on SOLR-1527:
--

I just tested this out and it appears to work for me.  Can we have this 
committed to solr 1.4 ?

> shareSChema does not work with absolute paths
> -
>
> Key: SOLR-1527
> URL: https://issues.apache.org/jira/browse/SOLR-1527
> Project: Solr
>  Issue Type: Bug
>Affects Versions: 1.3
>Reporter: Noble Paul
>Assignee: Noble Paul
>Priority: Minor
> Fix For: 1.4
>
> Attachments: SOLR-1527.patch
>
>
> shareSchema does not work if schema is passed on as absolute
> mail thread http://markmail.org/thread/k6cztofj4gnrjhsh




Re: [VOTE] Release Solr 1.4.0

2009-10-27 Thread Yonik Seeley
2009/10/27 Noble Paul നോബിള്‍  नोब्ळ् :
> I shall fix https://issues.apache.org/jira/browse/SOLR-1527 for 1.4 I guess

Perhaps we should just document this?
It's not a regression, but a new, non-core feature.  If we delay every
time we find a bug, this release will never end.

-Yonik
http://www.lucidimagination.com


[jira] Assigned: (SOLR-1527) shareSChema does not work with absolute paths

2009-10-27 Thread Noble Paul (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1527?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Noble Paul reassigned SOLR-1527:


Assignee: Noble Paul

> shareSChema does not work with absolute paths
> -
>
> Key: SOLR-1527
> URL: https://issues.apache.org/jira/browse/SOLR-1527
> Project: Solr
>  Issue Type: Bug
>Affects Versions: 1.3
>Reporter: Noble Paul
>Assignee: Noble Paul
>Priority: Minor
> Fix For: 1.4
>
> Attachments: SOLR-1527.patch
>
>
> shareSchema does not work if schema is passed on as absolute
> mail thread http://markmail.org/thread/k6cztofj4gnrjhsh




Re: [VOTE] Release Solr 1.4.0

2009-10-27 Thread Noble Paul നോബിള്‍ नोब्ळ्
I shall fix https://issues.apache.org/jira/browse/SOLR-1527 for 1.4 I guess

On Wed, Oct 28, 2009 at 8:20 AM, Erik Hatcher  wrote:
> +1 also
>
> I'll have a look at the duplicate libs issue as soon as I can, but won't be
> until after/during next week (ApacheCon).
>
>        Erik
>
> On Oct 27, 2009, at 10:41 PM, Chris Hostetter wrote:
>
>>
>> : OK, new artifacts are up.
>>
>> +1
>>
>> And for the record, these are the artifacts my vote is based on...
>>
>> 8166f7f23637fa8a7d84c3cd30aa21ab  apache-solr-1.4.0.tgz
>> f7ffa8669e12271981c212733bee1ec0  apache-solr-1.4.0.zip
>>
>>
>>
>>
>>
>> -Hoss
>>
>
>



-- 
-
Noble Paul | Principal Engineer| AOL | http://aol.com


[jira] Updated: (SOLR-1527) shareSChema does not work with absolute paths

2009-10-27 Thread Noble Paul (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1527?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Noble Paul updated SOLR-1527:
-

Attachment: SOLR-1527.patch

> shareSChema does not work with absolute paths
> -
>
> Key: SOLR-1527
> URL: https://issues.apache.org/jira/browse/SOLR-1527
> Project: Solr
>  Issue Type: Bug
>Affects Versions: 1.3
>Reporter: Noble Paul
>Priority: Minor
> Fix For: 1.4
>
> Attachments: SOLR-1527.patch
>
>
> shareSchema does not work if schema is passed on as absolute
> mail thread http://markmail.org/thread/k6cztofj4gnrjhsh




[jira] Created: (SOLR-1527) shareSChema does not work with absolute paths

2009-10-27 Thread Noble Paul (JIRA)
shareSChema does not work with absolute paths
-

 Key: SOLR-1527
 URL: https://issues.apache.org/jira/browse/SOLR-1527
 Project: Solr
  Issue Type: Bug
Affects Versions: 1.3
Reporter: Noble Paul
Priority: Minor
 Fix For: 1.4


shareSchema does not work if the schema is passed as an absolute path.

mail thread http://markmail.org/thread/k6cztofj4gnrjhsh




Re: [VOTE] Release Solr 1.4.0

2009-10-27 Thread Erik Hatcher

+1 also

I'll have a look at the duplicate libs issue as soon as I can, but  
won't be until after/during next week (ApacheCon).


Erik

On Oct 27, 2009, at 10:41 PM, Chris Hostetter wrote:



: OK, new artifacts are up.

+1

And for the record, these are the artifacts my vote is based on...

8166f7f23637fa8a7d84c3cd30aa21ab  apache-solr-1.4.0.tgz
f7ffa8669e12271981c212733bee1ec0  apache-solr-1.4.0.zip





-Hoss





Re: [VOTE] Release Solr 1.4.0

2009-10-27 Thread Chris Hostetter

: OK, new artifacts are up.

+1  

And for the record, these are the artifacts my vote is based on...

8166f7f23637fa8a7d84c3cd30aa21ab  apache-solr-1.4.0.tgz
f7ffa8669e12271981c212733bee1ec0  apache-solr-1.4.0.zip





-Hoss



[jira] Issue Comment Edited: (SOLR-1516) DocumentList and Document QueryResponseWriter

2009-10-27 Thread Chris A. Mattmann (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1516?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12770756#action_12770756
 ] 

Chris A. Mattmann edited comment on SOLR-1516 at 10/28/09 2:27 AM:
---

I haven't really heard any comments on this issue, and I've got the impression 
that not many folks write these QueryResponseWriters. To me, writing one was 
invaluable. The use case was:

* I make the choice to make SOLR the gold source for search index data (I'm 
dealing with planetary science and earth science data on 4-5 projects)
* I want to drive search but _also_ met output from SOLR (treating SOLR as a 
search web service, with customizable output [2])
* the default SOLR XML and the 5-7 output formats didn't do it for me since I 
have some specialized earth and planetary science use cases. E.g., on a few 
different projects, I need to be able to:
   * output FGDC XML (yes it's a standard for earth science metadata, and also 
relevant for the GeoSOLR stuff)
   * output custom RDF metadata 
   * output a particular style of JSON to plug in to some external web client, 
e.g., an auto-suggest that requires its own JSON format, not SOLR's

  To illustrate the reason that the 5-7 output formats didn't do it for me 
either, I'll use an example. There may be the sense of, "well why didn't I 
write some Java/Ruby/PHP/Python client that called SOLR and one of its 
existing wt's and then output a custom format from my favorite programming 
language (PL)"? The reasons are threefold:

  1. SOLR advertises that the QueryResponseWriter interface is an official SOLR 
plugin and interface, at least according to:
  * the Wiki documentation [1]
  * the advertised published book on SOLR [3]
  * Chris Hostetter's ApacheCon08 slides as part of the core SOLR 
architecture in his 50K foot view diagram [2]

2. If SOLR is truly a search web service, and allows for changeable output 
formats (evidenced by exposing the wt parameter), then why force people to use 
one of the existing wt's and then ask them to transform (either via a PL, or 
via XSLT) instead of allowing them to natively generate the specific output 
format type?

3. Why make o.a.s.request.QueryResponseWriter an interface and not a concrete class 
if it is never intended to be implemented by others, or more importantly, is 
kind of non-intuitive to implement?

Besides 1-3 for me, I have external COTS and OTS tools that cannot be changed 
and that expect data to be loaded into them in a particular format, and I'd 
like to plug them into SOLR and the easiest way for me to do that is with a 
curl/wget type operation and then a pipe into the COTS/OTS tool, and wt's are 
the way to go for that.

So, given the above, when I went to write a "wt" I was surprised how hard it 
was for me to understand the NamedList structure which is just a bag of objects 
that you have to unpack with unfriendly instanceof checks and recursive 
unmarshalling (walking the NamedList tree). All I wanted for my wt was to be 
able to format the output Document List or on a Doc-by-doc basis. 

Anyways just wanted to provide some further fodder and discussion for this 
issue. To me this is important, and clearly, based on [1-3], 
QueryResponseWriters by definition seem to be a big piece of the SOLR 
architecture.


Chris

---
[1] http://wiki.apache.org/solr/QueryResponseWriter
[2] http://people.apache.org/~hossman/apachecon2008us/btb/apache-solr-beyond-the-box.pdf
[3] SOLR 1.4 Enterprise Search Server, Packt Publishing, 2009.




[jira] Commented: (SOLR-1516) DocumentList and Document QueryResponseWriter

2009-10-27 Thread Chris A. Mattmann (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1516?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12770756#action_12770756
 ] 

Chris A. Mattmann commented on SOLR-1516:
-

I haven't really heard any comments on this issue, and I've got the impression 
that not many folks write these QueryResponseWriters. To me, writing one was 
invaluable. The use case was:

* I make the choice to make SOLR the gold source for search index data (I'm 
dealing with planetary science and earth science data on 4-5 projects)
* I want to drive search but _also_ met output from SOLR (treating SOLR as a 
search web service, with customizable output [2])
* the default SOLR XML and the 5-7 output formats didn't do it for me since I 
have some specialized earth and planetary science use cases. E.g., on a few 
different projects, I need to be able to:
   * output FGDC XML (yes it's a standard for earth science metadata, and also 
relevant for the GeoSOLR stuff)
   * output custom RDF metadata 
   * output a particular style of JSON to plug in to some external web client, 
e.g., an auto-suggest that requires its own JSON format, not SOLR's

  To illustrate the reason that the 5-7 output formats didn't do it for me 
either, I'll use an example. There may be the sense of, "well why didn't I 
write some Java/Ruby/PHP/Python client that called SOLR and one of its 
existing wt's and then output a custom format from my favorite programming 
language (PL)"? The reasons are threefold:

  1. SOLR advertises that the QueryResponseWriter interface is an official SOLR 
plugin and interface, at least according to:
  * the Wiki documentation [1]
  * the advertised published book on SOLR [3]
  * Chris Hostetter's ApacheCon08 slides as part of the core SOLR 
architecture in his 50K foot view diagram [2]
  2. If SOLR is truly a search web service, and allows for changeable output 
formats (evidenced by exposing the wt parameter), then why force people to use 
one of the existing wt's and then ask them to transform (either via a PL, or 
via XSLT) instead of allowing them to natively generate the specific output 
format type?
  3. Why make o.a.s.request.QueryResponseWriter an interface and not a concrete class 
if it is never intended to be implemented by others, or more importantly, is 
kind of non-intuitive to implement?

Besides 1-3 for me, I have external COTS and OTS tools that cannot be changed 
and that expect data to be loaded into them in a particular format, and I'd 
like to plug them into SOLR and the easiest way for me to do that is with a 
curl/wget type operation and then a pipe into the COTS/OTS tool, and wt's are 
the way to go for that.

So, given the above, when I went to write a "wt" I was surprised how hard it 
was for me to understand the NamedList structure which is just a bag of objects 
that you have to unpack with unfriendly instanceof checks and recursive 
unmarshalling (walking the NamedList tree). All I wanted for my wt was to be 
able to format the output Document List or on a Doc-by-doc basis. 
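
A rough sketch of the kind of unpacking described above, for readers who have not written a response writer. It deliberately uses plain java.util collections as stand-ins and is not Solr's actual NamedList API; the class name ResponseTreeWalker and the sample keys are made up for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in: nested Map/List values play the role of the
// loosely typed response tree that a writer has to unpack.
class ResponseTreeWalker {

  // Recursively visits the tree and records every leaf value with its path.
  static void walk(String path, Object node, List<String> out) {
    if (node instanceof Map) {
      for (Map.Entry<?, ?> e : ((Map<?, ?>) node).entrySet()) {
        walk(path + "/" + e.getKey(), e.getValue(), out);
      }
    } else if (node instanceof List) {
      List<?> items = (List<?>) node;
      for (int i = 0; i < items.size(); i++) {
        walk(path + "[" + i + "]", items.get(i), out);
      }
    } else {
      // Anything that is not a container is treated as a leaf.
      out.add(path + "=" + node);
    }
  }

  public static void main(String[] args) {
    Map<String, Object> doc = new LinkedHashMap<>();
    doc.put("id", "doc1");
    doc.put("title", "Mars survey");
    List<Object> docs = new ArrayList<>();
    docs.add(doc);
    Map<String, Object> response = new LinkedHashMap<>();
    response.put("numFound", 1);
    response.put("docs", docs);

    List<String> out = new ArrayList<>();
    walk("response", response, out);
    for (String line : out) {
      System.out.println(line);
    }
  }
}
```

Every writer that wants only the document list still has to carry this sort of type-switching walk, which is the burden the proposed base classes aim to hide.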

Anyways just wanted to provide some further fodder and discussion for this 
issue. To me this is important, and clearly, based on [1-3], 
QueryResponseWriters by definition seem to be a big piece of the SOLR 
architecture.


Chris

---
[1] http://wiki.apache.org/solr/QueryResponseWriter
[2] http://people.apache.org/~hossman/apachecon2008us/btb/apache-solr-beyond-the-box.pdf
[3] SOLR 1.4 Enterprise Search Server, Packt Publishing, 2009.



> DocumentList and Document QueryResponseWriter
> -
>
> Key: SOLR-1516
> URL: https://issues.apache.org/jira/browse/SOLR-1516
> Project: Solr
>  Issue Type: New Feature
>  Components: search
>Affects Versions: 1.3
> Environment: My MacBook Pro laptop.
>Reporter: Chris A. Mattmann
>Priority: Minor
> Fix For: 1.5
>
> Attachments: SOLR-1516.Mattmann.101809.patch.txt
>
>
> I tried to implement a custom QueryResponseWriter the other day and was 
> amazed at the level of unmarshalling and weeding through objects that was 
> necessary just to format the output o.a.l.Document list. As a user, I wanted 
> to be able to implement either of 2 functions:
> * process a document at a time, and format it (for speed/efficiency)
> * process all the documents at once, and format them (in case an aggregate 
> calculation is necessary for outputting)
> So, I've decided to contribute 2 simple classes that I think are sufficiently 
> generic and reusable. The first is o.a.s.request.DocumentResponseWriter -- it 
> handles the first bullet above. The second is 
> o.a.s.request.DocumentListResponseWriter. Both are abstract base classes and 
> require the user to implement either an #emitDoc function (in the case of 
> bullet 1), or an #emitDocList function (in the case of 

Re: [VOTE] Release Solr 1.4.0

2009-10-27 Thread Grant Ingersoll

OK, new artifacts are up.



On Oct 27, 2009, at 9:51 PM, Chris Hostetter wrote:



: By other issues, I mean the duplicate libs.  I think we can live with them.


I agree ... i would have voted +1 already but i figured based on your
commit you were in the process of doing a new one.



-Hoss





Re: [VOTE] Release Solr 1.4.0

2009-10-27 Thread Chris Hostetter

: By other issues, I mean the duplicate libs.  I think we can live with them.

I agree ... i would have voted +1 already but i figured based on your 
commit you were in the process of doing a new one.



-Hoss



Re: [VOTE] Release Solr 1.4.0

2009-10-27 Thread Grant Ingersoll


On Oct 27, 2009, at 9:08 PM, Grant Ingersoll wrote:



On Oct 27, 2009, at 7:31 PM, Chris Hostetter wrote:



: Hmm, I thought I removed that one.  beta5 should not be in there.

I see you changed this on the branch .. were you planning on making a new
release candidate?


I'll cut another one.  Although I'm not fixing the other issues.


By other issues, I mean the duplicate libs.  I think we can live with  
them.


Re: [VOTE] Release Solr 1.4.0

2009-10-27 Thread Grant Ingersoll


On Oct 27, 2009, at 7:31 PM, Chris Hostetter wrote:



: Hmm, I thought I removed that one.  beta5 should not be in there.

I see you changed this on the branch .. were you planning on making a new
release candidate?


I'll cut another one.  Although I'm not fixing the other issues. 


Re: [VOTE] Release Solr 1.4.0

2009-10-27 Thread Chris Hostetter

: Hmm, I thought I removed that one.  beta5 should not be in there.

I see you changed this on the branch .. were you planning on making a new 
release candidate?


-Hoss



Re: [VOTE] Release Solr 1.4.0

2009-10-27 Thread Grant Ingersoll


On Oct 27, 2009, at 3:14 PM, Chris Hostetter wrote:



: OK, take two is up in the same place.  Please vote.

I thought we'd fixed this, but we still have several dependency jars included
multiple times in the release artifacts...

hoss...@brunner:~/tmp/solr1.4$ find apache-solr-1.4.0 -name \*.jar | perl -ple 's{.*/}{}' | sort | uniq -c | sort -r | perl -nle 'print unless /^\s*1\s/'
  3 commons-io-1.4.jar
  3 commons-codec-1.3.jar
  2 wstx-asl-3.2.7.jar
  2 slf4j-api-1.5.5.jar
  2 log4j-1.2.14.jar
  2 jcl-over-slf4j-1.5.5.jar
  2 geronimo-stax-api_1.0_spec-1.0.1.jar
  2 commons-lang-2.4.jar
  2 commons-httpclient-3.1.jar

...by itself i wouldn't let that stop us from releasing, but what does
concern me a little bit is that we're also including different versions of
some jars, which seems like it could easily cause some weird errors for
people in some situations...

Using extraction combined with clustering and/or velocity...
apache-solr-1.4.0/contrib/extraction/lib/commons-lang-2.1.jar
apache-solr-1.4.0/contrib/clustering/lib/commons-lang-2.4.jar
apache-solr-1.4.0/contrib/velocity/src/main/solr/lib/commons-lang-2.4.jar

Using extraction in solr...
apache-solr-1.4.0/dist/solrj-lib/geronimo-stax-api_1.0_spec-1.0.1.jar
apache-solr-1.4.0/lib/geronimo-stax-api_1.0_spec-1.0.1.jar
apache-solr-1.4.0/contrib/extraction/lib/geronimo-stax-api_1.0_spec-1.0.jar

Using extraction ... at all ...
apache-solr-1.4.0/contrib/extraction/lib/poi-ooxml-3.5-beta5.jar
apache-solr-1.4.0/contrib/extraction/lib/poi-ooxml-3.5-beta6.jar


Hmm, I thought I removed that one.  beta5 should not be in there.

This one shouldn't actually cause any problems, since the 2.5 copy is just
what jetty uses...
apache-solr-1.4.0/lib/servlet-api-2.4.jar
apache-solr-1.4.0/example/lib/servlet-api-2.5-6.1.3.jar


...does anyone think these are likely to be problematic enough that
they *must* be fixed before releasing?

?

(At a minimum: can someone who is familiar with the velocity code
test it out in combination with extraction? ... i'm not even sure what to
look for)


-Hoss



--
Grant Ingersoll
http://www.lucidimagination.com/

Search the Lucene ecosystem (Lucene/Solr/Nutch/Mahout/Tika/Droids)  
using Solr/Lucene:

http://www.lucidimagination.com/search



Re: [VOTE] Release Solr 1.4.0

2009-10-27 Thread Chris Hostetter

: OK, take two is up in the same place.  Please vote.

I thought we'd fixed this, but we still have several dependency jars included 
multiple times in the release artifacts...

hoss...@brunner:~/tmp/solr1.4$ find apache-solr-1.4.0 -name \*.jar | perl -ple 's{.*/}{}' | sort | uniq -c | sort -r | perl -nle 'print unless /^\s*1\s/'
   3 commons-io-1.4.jar
   3 commons-codec-1.3.jar
   2 wstx-asl-3.2.7.jar
   2 slf4j-api-1.5.5.jar
   2 log4j-1.2.14.jar
   2 jcl-over-slf4j-1.5.5.jar
   2 geronimo-stax-api_1.0_spec-1.0.1.jar
   2 commons-lang-2.4.jar
   2 commons-httpclient-3.1.jar

...by itself i wouldn't let that stop us from releasing, but what does 
concern me a little bit is that we're also including different versions of 
some jars, which seems like it could easily cause some weird errors for 
people in some situations...

Using extraction combined with clustering and/or velocity...
apache-solr-1.4.0/contrib/extraction/lib/commons-lang-2.1.jar
apache-solr-1.4.0/contrib/clustering/lib/commons-lang-2.4.jar
apache-solr-1.4.0/contrib/velocity/src/main/solr/lib/commons-lang-2.4.jar

Using extraction in solr...
apache-solr-1.4.0/dist/solrj-lib/geronimo-stax-api_1.0_spec-1.0.1.jar
apache-solr-1.4.0/lib/geronimo-stax-api_1.0_spec-1.0.1.jar
apache-solr-1.4.0/contrib/extraction/lib/geronimo-stax-api_1.0_spec-1.0.jar

Using extraction ... at all ...
apache-solr-1.4.0/contrib/extraction/lib/poi-ooxml-3.5-beta5.jar
apache-solr-1.4.0/contrib/extraction/lib/poi-ooxml-3.5-beta6.jar

This one shouldn't actually cause any problems, since the 2.5 copy is just 
what jetty uses...
apache-solr-1.4.0/lib/servlet-api-2.4.jar
apache-solr-1.4.0/example/lib/servlet-api-2.5-6.1.3.jar


...does anyone think these are likely to be problematic enough that 
they *must* be fixed before releasing?   

?

(At a minimum: can someone who is familiar with the velocity code 
test it out in combination with extraction? ... i'm not even sure what to 
look for)
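
For anyone who would rather run the duplicate check without the find/perl pipeline, the same idea (strip the directory, count file names, keep those seen more than once) can be sketched in Java. This is only an illustration: the class name DuplicateJarReport is made up, and the sample paths are copied from the listing in this message.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Sketch only: mirrors "strip the directory, count names, keep counts > 1".
class DuplicateJarReport {

  // Returns file name -> count for every jar name that occurs more than once.
  static Map<String, Integer> duplicates(List<String> jarPaths) {
    Map<String, Integer> counts = new TreeMap<>();
    for (String path : jarPaths) {
      String name = path.substring(path.lastIndexOf('/') + 1);
      counts.merge(name, 1, Integer::sum);
    }
    // Drop every name that was seen only once.
    counts.values().removeIf(c -> c < 2);
    return counts;
  }

  public static void main(String[] args) {
    List<String> paths = new ArrayList<>();
    paths.add("apache-solr-1.4.0/dist/solrj-lib/geronimo-stax-api_1.0_spec-1.0.1.jar");
    paths.add("apache-solr-1.4.0/lib/geronimo-stax-api_1.0_spec-1.0.1.jar");
    paths.add("apache-solr-1.4.0/contrib/extraction/lib/geronimo-stax-api_1.0_spec-1.0.jar");
    System.out.println(duplicates(paths));
  }
}
```

Note that this only catches identical file names. Spotting two different versions of the same library, as with commons-lang above, would need the version stripped from the name first.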


-Hoss



[jira] Commented: (SOLR-1283) Mark Invalid error on indexing

2009-10-27 Thread David Bowen (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12770586#action_12770586
 ] 

David Bowen commented on SOLR-1283:
---

It seems to me that the code should bail out and just assume that a "<" did not 
begin an HTML tag if it still isn't sure after reading the DEFAULT_READ_AHEAD 
(8,192) characters.  It looks like the code was intended to do that (see the 
checks against safeReadAheadLimit) but must be missing some case.
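
To make the failure mode concrete, here is a small standalone sketch. It is not the HTMLStripReader code; the class and method names (MarkLimitDemo, readThenReset, looksLikeTag) and the tiny 8-character buffer are assumptions chosen so the effect shows on short input. The first method reads past the limit given to mark() and then calls reset(); the second shows the bail-out idea of never reading further than the limit allows.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;

class MarkLimitDemo {
  static final int BUFFER_SIZE = 8; // deliberately tiny, for demonstration only

  // Marks the stream, reads up to `toRead` chars, then tries to rewind.
  // Returns "ok" on success or the IOException message on failure.
  static String readThenReset(String text, int readAheadLimit, int toRead) {
    try (BufferedReader in = new BufferedReader(new StringReader(text), BUFFER_SIZE)) {
      in.mark(readAheadLimit);
      for (int i = 0; i < toRead; i++) {
        if (in.read() == -1) {
          break;
        }
      }
      in.reset();
      return "ok";
    } catch (IOException e) {
      return e.getMessage();
    }
  }

  // Bail-out variant: scan for '>' but stop after `readAheadLimit` chars and
  // treat "not found in time" as "this '<' did not start a tag".
  static boolean looksLikeTag(String text, int readAheadLimit) {
    try (BufferedReader in = new BufferedReader(new StringReader(text), BUFFER_SIZE)) {
      in.mark(readAheadLimit + 1);
      boolean closed = false;
      for (int i = 0; i < readAheadLimit && !closed; i++) {
        int c = in.read();
        if (c == -1) {
          break;
        }
        closed = (c == '>');
      }
      in.reset(); // fewer chars were read than the limit passed to mark()
      return closed;
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    String text = "<aaaaaaaaaaaaaaaaaaaaaaa";
    System.out.println(readThenReset(text, 4, 2));
    System.out.println(readThenReset(text, 4, 20));
    System.out.println(looksLikeTag(text, 4));
    System.out.println(looksLikeTag("<b>rest", 4));
  }
}
```

The second call in main should report the same "Mark invalid" message seen in the stack trace, since the reader has refilled its buffer past the mark by then. If some code path in the real reader keeps reading after the safe limit without giving up, it would hit the same exception.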



> Mark Invalid error on indexing
> --
>
> Key: SOLR-1283
> URL: https://issues.apache.org/jira/browse/SOLR-1283
> Project: Solr
>  Issue Type: Bug
>Affects Versions: 1.3
> Environment: Ubuntu 8.04, Sun Java 6
>Reporter: solrize
>
> When indexing large (1 megabyte) documents I get a lot of exceptions with 
> stack traces like the below.  It happens both in the Solr 1.3 release and in 
> the July 9 1.4 nightly.  I believe this to NOT be the same issue as SOLR-42.  
> I found some further discussion on solr-user: 
> http://www.nabble.com/IOException:-Mark-invalid-while-analyzing-HTML-td17052153.html
>  
> In that discussion, Grant asked the original poster to open a Jira issue, but 
> I didn't see one so I'm opening one; please feel free to merge or close if 
> it's redundant. 
> My stack trace follows.
> Jul 15, 2009 8:36:42 AM org.apache.solr.core.SolrCore execute
> INFO: [] webapp=/solr path=/update params={} status=500 QTime=3 
> Jul 15, 2009 8:36:42 AM org.apache.solr.common.SolrException log
> SEVERE: java.io.IOException: Mark invalid
>     at java.io.BufferedReader.reset(BufferedReader.java:485)
>     at org.apache.solr.analysis.HTMLStripReader.restoreState(HTMLStripReader.java:171)
>     at org.apache.solr.analysis.HTMLStripReader.read(HTMLStripReader.java:728)
>     at org.apache.solr.analysis.HTMLStripReader.read(HTMLStripReader.java:742)
>     at java.io.Reader.read(Reader.java:123)
>     at org.apache.lucene.analysis.CharTokenizer.next(CharTokenizer.java:108)
>     at org.apache.lucene.analysis.StopFilter.next(StopFilter.java:178)
>     at org.apache.lucene.analysis.standard.StandardFilter.next(StandardFilter.java:84)
>     at org.apache.lucene.analysis.LowerCaseFilter.next(LowerCaseFilter.java:53)
>     at org.apache.solr.analysis.WordDelimiterFilter.next(WordDelimiterFilter.java:347)
>     at org.apache.lucene.index.DocInverterPerField.processFields(DocInverterPerField.java:159)
>     at org.apache.lucene.index.DocFieldConsumersPerField.processFields(DocFieldConsumersPerField.java:36)
>     at org.apache.lucene.index.DocFieldProcessorPerThread.processDocument(DocFieldProcessorPerThread.java:234)
>     at org.apache.lucene.index.DocumentsWriter.updateDocument(DocumentsWriter.java:765)
>     at org.apache.lucene.index.DocumentsWriter.updateDocument(DocumentsWriter.java:748)
>     at org.apache.lucene.index.IndexWriter.updateDocument(IndexWriter.java:2512)
>     at org.apache.lucene.index.IndexWriter.updateDocument(IndexWriter.java:2484)
>     at org.apache.solr.update.DirectUpdateHandler2.addDoc(DirectUpdateHandler2.java:240)
>     at org.apache.solr.update.processor.RunUpdateProcessor.processAdd(RunUpdateProcessorFactory.java:61)
>     at org.apache.solr.handler.XMLLoader.processUpdate(XMLLoader.java:140)
>     at org.apache.solr.handler.XMLLoader.load(XMLLoader.java:69)
>     at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:54)
>     at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
>     at org.apache.solr.core.SolrCore.execute(SolrCore.java:1292)
>     at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:338)
>     at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:241)
>     at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1089)
>     at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:365)
>     at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
>     at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:181)
>     at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:712)
>     at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:405)
>     at org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:211)
>     at org.mortbay.jetty.handler.HandlerCollection.handle(HandlerCollection.java:114)
>     at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:139)
>     at org.mortbay.jetty.Server.handle(Server.java:285)
>     at org.mortbay.jetty.Http

[jira] Issue Comment Edited: (SOLR-236) Field collapsing

2009-10-27 Thread Martijn van Groningen (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12769878#action_12769878
 ] 

Martijn van Groningen edited comment on SOLR-236 at 10/27/09 4:34 PM:
--

I have attached a new patch which includes a major refactoring which makes the 
code more flexible and cleaner. The patch also includes a new aggregate 
functionality and a bug fix.

h3. Aggregate function and bug fix
The new patch allows you to execute aggregate functions on the collapsed 
documents (for example, summing the stock amount or calculating the minimum 
price of a collapsed group). Currently there are four aggregate functions 
available: sum(), min(), max() and avg(). To execute one or more functions, the 
_collapse.aggregate_ parameter has to be added to the request url. The 
parameter expects the following syntax: _function_name(field_name)[, 
function_name(field_name)]_. For example, collapse.aggregate=sum(stock), 
min(price) might produce a result like this:
{code:xml}

   
  10
  ...
   
   
  5.99
  ...
   

{code}
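
As an illustration of the parameter syntax only, the value of _collapse.aggregate_ could be split into (function, field) pairs roughly as follows. This is a hypothetical sketch and not the parsing code from the patch; the class name AggregateParamSketch and the restriction of field names to word characters are assumptions.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical parser for the described syntax; not the patch's code.
class AggregateParamSketch {

  // One call such as "sum(stock)", with optional surrounding whitespace.
  private static final Pattern CALL =
      Pattern.compile("\\s*(sum|min|max|avg)\\((\\w+)\\)\\s*");

  // Splits the parameter value on commas and returns {function, field} pairs.
  static List<String[]> parse(String value) {
    List<String[]> calls = new ArrayList<>();
    for (String part : value.split(",")) {
      Matcher m = CALL.matcher(part);
      if (!m.matches()) {
        throw new IllegalArgumentException("Bad aggregate expression: " + part.trim());
      }
      calls.add(new String[] {m.group(1), m.group(2)});
    }
    return calls;
  }

  public static void main(String[] args) {
    for (String[] call : parse("sum(stock), min(price)")) {
      System.out.println(call[0] + " over field " + call[1]);
    }
  }
}
```

Rejecting unknown function names up front keeps a typo in the request from silently producing no aggregate at all.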

The patch also fixes a bug inside the {{NonAdjacentDocumentCollapser}} that was 
reported on the solr-user mailing list a few days ago. An index out of bounds 
exception was thrown when documents were removed from an index and a field 
collapse search was done afterwards.  

h3. Code refactoring
The code refactoring includes the following things:
* The notion of a {{CollapseGroup}}. A collapse group defines what a unique 
group is in the search result. For the adjacent and non adjacent document 
collapser this is different. For adjacent field collapsing a group is defined 
by its field value and the document id of the most relevant document in that 
group. More than one collapse group may have the same field value. For normal 
field collapsing (non adjacent) the group is defined just by the field value. 
* The notion of a {{CollapseCollector}} that receives the collapsed documents 
from a {{DocumentCollapser}} and does something with them. For example, it keeps 
a count of how many documents were collapsed per collapse group or computes an 
average of a certain field like price. As you can see in the code, a collapse 
group, rather than a field value or document id, is used to identify the group 
a document was collapsed under.
{code}
/**
 * A CollapseCollector is responsible for receiving collapse callbacks from
 * the DocumentCollapser. An implementation can choose what to do with the
 * received callbacks and data. Whatever an implementation collects, it is
 * responsible for adding its results to the response.
 *
 * Implementations of this interface don't need to be thread safe!
 */
public interface CollapseCollector {

  /**
   * Informs the CollapseCollector that a document has been collapsed under
   * the specified collapseGroup.
   *
   * @param docId The id of the document that has been collapsed
   * @param collapseGroup The collapse group the docId has been collapsed under
   * @param collapseContext The collapse context
   */
  void documentCollapsed(int docId, CollapseGroup collapseGroup,
                         CollapseContext collapseContext);

  /**
   * Informs the CollapseCollector about the document head.
   * The document head is the most relevant id for the specified collapseGroup.
   *
   * @param docHeadId The identifier of the document head
   * @param collapseGroup The collapse group of the document head
   * @param collapseContext The collapse context
   */
  void documentHead(int docHeadId, CollapseGroup collapseGroup,
                    CollapseContext collapseContext);

  /**
   * Adds the CollapseCollector implementation specific result data to the
   * result.
   *
   * @param result The response result
   * @param docs The documents to be added to the response
   * @param collapseContext The collapse context
   */
  void getResult(NamedList result, DocList docs,
                 CollapseContext collapseContext);

}
{code}
There is also a {{CollapseContext}} that allows you to store data that can be 
shared between {{CollapseCollectors}}. 
* A {{CollapseCollectorFactory}} is responsible for creating a 
{{CollapseCollector}}. It does this based on the {{SolrQueryRequest}}. All the 
logic for when to enable a certain {{CollapseCollector}} must be placed in the 
factory. 
{code}
/**
 * A concrete CollapseCollectorFactory implementation is responsible for creating
 * {@link CollapseCollector} instances based on the {@link SolrQueryRequest}.
 */
public interface CollapseCollectorFactory {

  /**
   * Creates an instance of a CollapseCollector specified by the concrete subclass.
   * The concrete subclass decides, based on the specified request, whether a new
   * instance has to be created and can return null if not.
   * 
   * @param request The specified request
   * @return an instance of a CollapseCollector or null
   */
  CollapseCollector createCollapseCollector(SolrQueryRequest request);

}
{code}
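To make the two interfaces above more concrete, here is a minimal, self-contained sketch of a collector that counts collapsed documents per group, plus a factory that only enables it when the request asks for it. It is not code from the patch: the {{CollapseGroup}} and {{CollapseContext}} classes below are simplified stand-ins, a plain {{Map}} replaces both {{NamedList}} and {{SolrQueryRequest}}, the {{DocList}} argument is omitted, and the parameter name {{sketch.includeCounts}} is made up for illustration.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for the patch's CollapseGroup: identifies a group by a key.
final class CollapseGroup {
  final String key;
  CollapseGroup(String key) { this.key = key; }
  @Override public boolean equals(Object o) {
    return o instanceof CollapseGroup && ((CollapseGroup) o).key.equals(key);
  }
  @Override public int hashCode() { return key.hashCode(); }
}

// Hypothetical stand-in for the patch's CollapseContext: data shared between collectors.
final class CollapseContext {
  final Map<String, Object> shared = new HashMap<>();
}

// Simplified variant of the CollapseCollector interface (Map instead of NamedList, no DocList).
interface SketchCollapseCollector {
  void documentCollapsed(int docId, CollapseGroup group, CollapseContext ctx);
  void documentHead(int docHeadId, CollapseGroup group, CollapseContext ctx);
  void getResult(Map<String, Object> result, CollapseContext ctx);
}

// Counts how many documents were collapsed under each group and reports
// the count keyed by the id of the group's document head.
final class CountingCollapseCollector implements SketchCollapseCollector {
  private final Map<CollapseGroup, Integer> counts = new LinkedHashMap<>();
  private final Map<CollapseGroup, Integer> heads = new LinkedHashMap<>();

  @Override public void documentCollapsed(int docId, CollapseGroup group, CollapseContext ctx) {
    counts.merge(group, 1, Integer::sum);
  }

  @Override public void documentHead(int docHeadId, CollapseGroup group, CollapseContext ctx) {
    heads.put(group, docHeadId);
  }

  @Override public void getResult(Map<String, Object> result, CollapseContext ctx) {
    Map<String, Integer> perHead = new LinkedHashMap<>();
    for (Map.Entry<CollapseGroup, Integer> e : heads.entrySet()) {
      perHead.put(String.valueOf(e.getValue()), counts.getOrDefault(e.getKey(), 0));
    }
    result.put("collapseCounts", perHead);
  }
}

// The factory holds the enable/disable logic: it returns null unless the
// (hypothetical) request parameter is set to true.
final class CountingCollapseCollectorFactory {
  static final String PARAM = "sketch.includeCounts";

  SketchCollapseCollector createCollapseCollector(Map<String, String> requestParams) {
    return Boolean.parseBoolean(requestParams.get(PARAM)) ? new CountingCollapseCollector() : null;
  }
}

final class CollapseSketchDemo {
  public static void main(String[] args) {
    Map<String, String> params = new HashMap<>();
    params.put(CountingCollapseCollectorFactory.PARAM, "true");
    SketchCollapseCollector collector =
        new CountingCollapseCollectorFactory().createCollapseCollector(params);

    CollapseContext ctx = new CollapseContext();
    CollapseGroup group = new CollapseGroup("example.com");
    collector.documentHead(1, group, ctx);
    collector.documentCollapsed(2, group, ctx);
    collector.documentCollapsed(3, group, ctx);

    Map<String, Object> result = new LinkedHashMap<>();
    collector.getResult(result, ctx);
    System.out.println(result);
  }
}
```

Keeping the decision in the factory means the collapser only has to skip null collectors and never needs to know which request parameters a given collector cares about.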
Curren

[jira] Updated: (SOLR-236) Field collapsing

2009-10-27 Thread Martijn van Groningen (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-236?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Martijn van Groningen updated SOLR-236:
---

Attachment: field-collapse-5.patch

I have updated the patch to fix the bug that was reported yesterday on the 
solr-user mailing list:
{quote}
found another exception, i cant find specific steps to reproduce
besides starting with an unfiltered result and then given an int field
with values (1,2,3) filtering by 3 triggers it sometimes, this is in
an index with very frequent updates and deletes


--joe


java.lang.NullPointerException
   at org.apache.solr.search.fieldcollapse.collector.FieldValueCountCollapseCollectorFactory$FieldValueCountCollapseCollector.getResult(FieldValueCountCollapseCollectorFactory.java:84)
   at org.apache.solr.search.fieldcollapse.AbstractDocumentCollapser.getCollapseInfo(AbstractDocumentCollapser.java:191)
   at org.apache.solr.handler.component.CollapseComponent.doProcess(CollapseComponent.java:179)
   at org.apache.solr.handler.component.CollapseComponent.process(CollapseComponent.java:121)
   at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:195)
   at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
   at org.apache.solr.core.RequestHandlers$LazyRequestHandlerWrapper.handleRequest(RequestHandlers.java:233)
   at org.apache.solr.core.SolrCore.execute(SolrCore.java:1316)
   at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:338)
   at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:241)
   at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1148)
   at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:387)
   at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
   at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:181)
   at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:765)
   at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:417)
   at org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:230)
   at org.mortbay.jetty.handler.HandlerCollection.handle(HandlerCollection.java:114)
   at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
   at org.mortbay.jetty.Server.handle(Server.java:326)
   at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:534)
   at org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:864)
   at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:539)
   at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:212)
   at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:404)
   at org.mortbay.io.nio.SelectChannelEndPoint.run(SelectChannelEndPoint.java:409)
   at org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:520)
{quote}

> Field collapsing
> 
>
> Key: SOLR-236
> URL: https://issues.apache.org/jira/browse/SOLR-236
> Project: Solr
>  Issue Type: New Feature
>  Components: search
>Affects Versions: 1.3
>Reporter: Emmanuel Keller
> Fix For: 1.5
>
> Attachments: collapsing-patch-to-1.3.0-dieter.patch, 
> collapsing-patch-to-1.3.0-ivan.patch, collapsing-patch-to-1.3.0-ivan_2.patch, 
> collapsing-patch-to-1.3.0-ivan_3.patch, field-collapse-3.patch, 
> field-collapse-4-with-solrj.patch, field-collapse-5.patch, 
> field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, 
> field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, 
> field-collapse-5.patch, field-collapse-5.patch, 
> field-collapse-solr-236-2.patch, field-collapse-solr-236.patch, 
> field-collapsing-extended-592129.patch, field_collapsing_1.1.0.patch, 
> field_collapsing_1.3.patch, field_collapsing_dsteigerwald.diff, 
> field_collapsing_dsteigerwald.diff, field_collapsing_dsteigerwald.diff, 
> SOLR-236-FieldCollapsing.patch, SOLR-236-FieldCollapsing.patch, 
> SOLR-236-FieldCollapsing.patch, solr-236.patch, SOLR-236_collapsing.patch, 
> SOLR-236_collapsing.patch
>
>
> This patch includes a new feature called "Field collapsing".
> "Used in order to collapse a group of results with similar value for a given 
> field to a single entry in the result set. Site collapsing is a special case 
> of this, where all results for a given web site is collapsed into one or two 
> entries in the result set, typically with an associated "more documents from 
> this site" link. See also Duplicate detection."
> http://www.fastsearch.com/glossary.aspx?m=48&amid=299
> The implementation adds 3 new query parameters (SolrParams):
> "collap

[jira] Created: (SOLR-1526) Client Side Tika integration

2009-10-27 Thread Grant Ingersoll (JIRA)
Client Side Tika integration


 Key: SOLR-1526
 URL: https://issues.apache.org/jira/browse/SOLR-1526
 Project: Solr
  Issue Type: New Feature
  Components: clients - java
Reporter: Grant Ingersoll
Priority: Minor


Often times it is cost prohibitive to send full, rich documents over the wire.  
The contrib/extraction library has server side integration with Tika, but it 
would be nice to have a client side implementation as well.  It should support 
both metadata and content or just metadata.




[jira] Updated: (SOLR-1526) Client Side Tika integration

2009-10-27 Thread Grant Ingersoll (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1526?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Grant Ingersoll updated SOLR-1526:
--

Fix Version/s: 1.5

> Client Side Tika integration
> 
>
> Key: SOLR-1526
> URL: https://issues.apache.org/jira/browse/SOLR-1526
> Project: Solr
>  Issue Type: New Feature
>  Components: clients - java
>Reporter: Grant Ingersoll
>Priority: Minor
> Fix For: 1.5
>
>
> Often times it is cost prohibitive to send full, rich documents over the 
> wire.  The contrib/extraction library has server side integration with Tika, 
> but it would be nice to have a client side implementation as well.  It should 
> support both metadata and content or just metadata.




Re: [VOTE] Release Solr 1.4.0

2009-10-27 Thread Grant Ingersoll
I'm feeling emptiness, too.  Looks like I forgot the public_html part  
when copying.  All is restored now.



On Oct 26, 2009, at 10:14 PM, Yonik Seeley wrote:

On Mon, Oct 26, 2009 at 9:58 PM, Grant Ingersoll  
 wrote:

OK, take two is up in the same place.  Please vote.


I'm seeing emptiness at
http://people.apache.org/~gsingers/solr/1.4.0/

-Yonik
http://www.lucidimagination.com



On Oct 26, 2009, at 6:15 PM, Grant Ingersoll wrote:


Tis the season for releases...

Please vote on releasing the Solr 1.4.0 artifacts located at
http://people.apache.org/~gsingers/solr/1.4.0/  (note, solr.tar and
solr-maven.tar are not artifacts to be released)

CHANGES are spelled out at
https://svn.apache.org/repos/asf/lucene/solr/branches/branch-1.4/CHANGES.txt

Thanks,
Grant








[jira] Created: (SOLR-1525) allow DIH to refer to core properties

2009-10-27 Thread Noble Paul (JIRA)
allow DIH to refer to core properties
-

 Key: SOLR-1525
 URL: https://issues.apache.org/jira/browse/SOLR-1525
 Project: Solr
  Issue Type: Improvement
  Components: contrib - DataImportHandler
Reporter: Noble Paul
Priority: Minor
 Fix For: 1.4


DIH dataConfig cannot directly use the core properties. Currently this is the only way: 
http://wiki.apache.org/solr/DataImportHandlerFaq#Is_it_possible_to_use_core_properties_inside_data-config_xml.3F






[jira] Updated: (SOLR-1525) allow DIH to refer to core properties

2009-10-27 Thread Noble Paul (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1525?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Noble Paul updated SOLR-1525:
-

Fix Version/s: (was: 1.4)
   1.5
 Assignee: Noble Paul

> allow DIH to refer to core properties
> -
>
> Key: SOLR-1525
> URL: https://issues.apache.org/jira/browse/SOLR-1525
> Project: Solr
>  Issue Type: Improvement
>  Components: contrib - DataImportHandler
>Reporter: Noble Paul
>Assignee: Noble Paul
>Priority: Minor
> Fix For: 1.5
>
>
> DIH dataConfig cannot directly use the core properties. Currently this is the only way: 
> http://wiki.apache.org/solr/DataImportHandlerFaq#Is_it_possible_to_use_core_properties_inside_data-config_xml.3F




Re: Dinamic field name with Data import handler

2009-10-27 Thread Noble Paul നോബിള്‍ नोब्ळ्
On Thu, Oct 22, 2009 at 7:51 PM, Renata Mota
 wrote:
> Hi,
>
>
>
> I’m trying to give a field a dynamic name with the data import handler, but I
> can’t get it to work.
>
>
>
> Example:
>
>
>
> 
>
> 

Hey, this is supposed to work. Is it because there is a space in the
name attribute?

>
> 
>
>
>
> It’s possible to do something like this?
>
>
>
>
>
> Thanks,
>
>
>
>
>
> Renata Gonçalves Mota
>   renata.m...@accurate.com.br
> Tel.: 55 11 3522-7723 R.3018
>
>
>
>



-- 
-
Noble Paul | Principal Engineer| AOL | http://aol.com


Build failed in Hudson: Solr-trunk #968

2009-10-27 Thread Apache Hudson Server
See 

Changes:

[gsingers] 2.9.1 references

[gsingers] Update/remove 1.4-dev references.

[gsingers] projected release date.

[yonik] upgrade to Lucene 2.9.1 final RC2 (r829889 on 2.9 branch)

[yonik] tutorial - prepend display URLs with ... when significant parameters have been left out

[yonik] add warning about indexDefaults, let mainIndex.maxFieldLength inherit

[yonik] fix tutorial spelling mistake

--
[...truncated 2201 lines...]
[junit] Running org.apache.solr.analysis.TestSynonymMap
[junit] Tests run: 5, Failures: 0, Errors: 0, Time elapsed: 3.784 sec
[junit] Running org.apache.solr.analysis.TestTrimFilter
[junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 1.287 sec
[junit] Running org.apache.solr.analysis.TestWordDelimiterFilter
[junit] Tests run: 14, Failures: 0, Errors: 0, Time elapsed: 26.524 sec
[junit] Running org.apache.solr.client.solrj.SolrExceptionTest
[junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 0.622 sec
[junit] Running org.apache.solr.client.solrj.SolrQueryTest
[junit] Tests run: 5, Failures: 0, Errors: 0, Time elapsed: 0.383 sec
[junit] Running org.apache.solr.client.solrj.TestBatchUpdate
[junit] Tests run: 3, Failures: 0, Errors: 0, Time elapsed: 26.487 sec
[junit] Running org.apache.solr.client.solrj.TestLBHttpSolrServer
[junit] Tests run: 2, Failures: 1, Errors: 0, Time elapsed: 14.833 sec
[junit] Test org.apache.solr.client.solrj.TestLBHttpSolrServer FAILED
[junit] Running org.apache.solr.client.solrj.beans.TestDocumentObjectBinder
[junit] Tests run: 4, Failures: 0, Errors: 0, Time elapsed: 0.648 sec
[junit] Running org.apache.solr.client.solrj.embedded.JettyWebappTest
[junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 14.067 sec
[junit] Running org.apache.solr.client.solrj.embedded.LargeVolumeBinaryJettyTest
[junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 9.992 sec
[junit] Running org.apache.solr.client.solrj.embedded.LargeVolumeEmbeddedTest
[junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 10.698 sec
[junit] Running org.apache.solr.client.solrj.embedded.LargeVolumeJettyTest
[junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 10.497 sec
[junit] Running org.apache.solr.client.solrj.embedded.MergeIndexesEmbeddedTest
[junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 4.481 sec
[junit] Running org.apache.solr.client.solrj.embedded.MultiCoreEmbeddedTest
[junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 5.425 sec
[junit] Running org.apache.solr.client.solrj.embedded.MultiCoreExampleJettyTest
[junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 5.559 sec
[junit] Running org.apache.solr.client.solrj.embedded.SolrExampleEmbeddedTest
[junit] Tests run: 9, Failures: 0, Errors: 0, Time elapsed: 21.528 sec
[junit] Running org.apache.solr.client.solrj.embedded.SolrExampleJettyTest
[junit] Tests run: 10, Failures: 0, Errors: 0, Time elapsed: 38.837 sec
[junit] Running org.apache.solr.client.solrj.embedded.SolrExampleStreamingTest
[junit] Tests run: 9, Failures: 0, Errors: 0, Time elapsed: 48.554 sec
[junit] Running org.apache.solr.client.solrj.embedded.TestSolrProperties
[junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 7.925 sec
[junit] Running org.apache.solr.client.solrj.request.TestUpdateRequestCodec
[junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 0.978 sec
[junit] Running org.apache.solr.client.solrj.response.AnlysisResponseBaseTest
[junit] Tests run: 2, Failures: 0, Errors: 0, Time elapsed: 0.407 sec
[junit] Running org.apache.solr.client.solrj.response.DocumentAnalysisResponseTest
[junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 0.374 sec
[junit] Running org.apache.solr.client.solrj.response.FieldAnalysisResponseTest
[junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 0.417 sec
[junit] Running org.apache.solr.client.solrj.response.QueryResponseTest
[junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 1.178 sec
[junit] Running org.apache.solr.client.solrj.response.TestSpellCheckResponse
[junit] Tests run: 2, Failures: 0, Errors: 0, Time elapsed: 15.101 sec
[junit] Running org.apache.solr.client.solrj.util.ClientUtilsTest
[junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 0.391 sec
[junit] Running org.apache.solr.common.SolrDocumentTest
[junit] Tests run: 5, Failures: 0, Errors: 0, Time elapsed: 0.383 sec
[junit] Running org.apache.solr.common.params.ModifiableSolrParamsTest
[junit] Tests run: 5, Failures: 0, Errors: 0, Time elapsed: 0.656 sec
[junit] Running org.apache.solr.common.params.SolrParamTest
[junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 0.825 sec
[junit] Running org.apach

[jira] Created: (SOLR-1524) proposal on distributed search

2009-10-27 Thread johnson.hong (JIRA)
proposal on distributed search
--

 Key: SOLR-1524
 URL: https://issues.apache.org/jira/browse/SOLR-1524
 Project: Solr
  Issue Type: Sub-task
  Components: clients - java
Reporter: johnson.hong


Hi all,
Some days ago I asked why distributed search gets slower as the start value 
keeps increasing.
One reply, by Shalin Shekhar Mangar, was "distributed search fetches 
start+rows documents from each shard in order to correctly merge the results".
After this I read the source code and found that a query across distributed 
search is split into one query per shard (nshards queries in total).
Each shard query is executed as follows:
1. Get the ids of the matched documents into a DocList.  // takes little time
2. Get all documents by the ids found in step 1.  // takes little time
3. Write all the documents found to a binary string.
4. Parse the binary string back into a SolrDocumentList.  // step 4 takes 
almost all of the time
Based on the above, I propose that steps 3 and 4, and even step 2, are not 
necessary: the ids found are enough to merge the results, aren't they?
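
The idea of merging on ids alone can be sketched as follows. This is only an illustration under stated assumptions, not Solr's actual merge code: it assumes each shard returns, for its top start+rows hits, the document id together with a score (an id by itself does not say how hits from different shards should be ordered), and the class and method names ({{ShardHit}}, {{IdOnlyMergeSketch.mergePage}}) are hypothetical. Full documents would then be fetched only for the ids of the final page.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical lightweight hit: only the id and the score travel from shard to coordinator.
final class ShardHit {
  final String id;
  final float score;
  ShardHit(String id, float score) { this.id = id; this.score = score; }
}

final class IdOnlyMergeSketch {

  // Merges the per-shard hit lists (each holding at most start+rows hits) and
  // returns the ids of the requested page, ordered by descending score.
  // Ties are broken by id so the result is deterministic.
  static List<String> mergePage(List<List<ShardHit>> shardHits, int start, int rows) {
    List<ShardHit> all = new ArrayList<>();
    for (List<ShardHit> hits : shardHits) {
      all.addAll(hits);
    }
    all.sort((a, b) -> {
      int byScore = Float.compare(b.score, a.score);
      return byScore != 0 ? byScore : a.id.compareTo(b.id);
    });

    int from = Math.min(start, all.size());
    int to = Math.min(start + rows, all.size());
    List<String> pageIds = new ArrayList<>();
    for (ShardHit hit : all.subList(from, to)) {
      pageIds.add(hit.id);
    }
    return pageIds;
  }

  public static void main(String[] args) {
    List<ShardHit> shard1 = new ArrayList<>();
    shard1.add(new ShardHit("a", 0.9f));
    shard1.add(new ShardHit("c", 0.5f));
    List<ShardHit> shard2 = new ArrayList<>();
    shard2.add(new ShardHit("b", 0.8f));
    shard2.add(new ShardHit("d", 0.4f));

    List<List<ShardHit>> shards = new ArrayList<>();
    shards.add(shard1);
    shards.add(shard2);

    // Page with start=1, rows=2 over the merged order a, b, c, d.
    System.out.println(mergePage(shards, 1, 2)); // prints [b, c]
  }
}
```

With this shape, the per-shard payload grows with start+rows small id/score pairs rather than start+rows serialized documents, which is the cost the proposal aims to avoid.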
