[jira] Updated: (SOLR-1352) DIH: MultiThreaded

2009-11-13 Thread Noble Paul (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1352?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Noble Paul updated SOLR-1352:
-

Attachment: SOLR-1352.patch

First cut: an ugly patch. A lot of work is left before it can go in.

> DIH: MultiThreaded
> --
>
> Key: SOLR-1352
> URL: https://issues.apache.org/jira/browse/SOLR-1352
> Project: Solr
>  Issue Type: Improvement
>  Components: contrib - DataImportHandler
>Reporter: Noble Paul
>Assignee: Noble Paul
> Fix For: 1.5
>
> Attachments: SOLR-1352.patch
>
>
> It has been a long-pending request to make DIH multithreaded. Now that we 
> have implemented most of the features, the next best thing we can aim for is 
> performance. DIH should be able to take advantage of multiple cores in a box. 
> I expect the configuration to be as follows:
> {code:xml}
> 
> 
> 
> {code}
> At the entity where numThreads is specified, it should fork into multiple 
> threads. If numThreads<2, it executes without forking. In debug mode it 
> automatically becomes single-threaded.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (SOLR-1352) DIH: MultiThreaded

2009-11-13 Thread Noble Paul (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1352?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Noble Paul updated SOLR-1352:
-

Description: 
It has been a long-pending request to make DIH multithreaded. Now that we have 
implemented most of the features, the next best thing we can aim for is 
performance. DIH should be able to take advantage of multiple cores in a box. 
I expect the configuration to be as follows:

{code:xml}



{code}

At the entity where threads is specified, it should fork into multiple 
threads. If threads<2, it executes without forking. In debug mode it 
automatically becomes single-threaded.

  was:
It has been a long-pending request to make DIH multithreaded. Now that we have 
implemented most of the features, the next best thing we can aim for is 
performance. DIH should be able to take advantage of multiple cores in a box. 
I expect the configuration to be as follows:

{code:xml}



{code}

At the entity where numThreads is specified, it should fork into multiple 
threads. If numThreads<2, it executes without forking. In debug mode it 
automatically becomes single-threaded.


'numThreads' becomes 'threads'

> DIH: MultiThreaded
> --
>
> Key: SOLR-1352
> URL: https://issues.apache.org/jira/browse/SOLR-1352
> Project: Solr
>  Issue Type: Improvement
>  Components: contrib - DataImportHandler
>Reporter: Noble Paul
>Assignee: Noble Paul
> Fix For: 1.5
>
> Attachments: SOLR-1352.patch
>
>
> It has been a long-pending request to make DIH multithreaded. Now that we 
> have implemented most of the features, the next best thing we can aim for is 
> performance. DIH should be able to take advantage of multiple cores in a box. 
> I expect the configuration to be as follows:
> {code:xml}
> 
> 
> 
> {code}
> At the entity where threads is specified, it should fork into multiple 
> threads. If threads<2, it executes without forking. In debug mode it 
> automatically becomes single-threaded.
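
The forking rule in the description can be sketched in plain Java. This is only an illustration under stated assumptions: `EntityForkSketch`, `processRows` and the row handler are hypothetical names and are not taken from DIH or from the attached patch. Rows are handled inline when `threads` is below 2 or debug mode is on, and on a fixed-size pool otherwise.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

// Hypothetical illustration only; none of these names come from DIH or the patch.
class EntityForkSketch {

  /** Applies rowHandler to every row, forking only when threads >= 2 and debug is off. */
  static <R> List<R> processRows(List<String> rows, int threads, boolean debug,
                                 Function<String, R> rowHandler) {
    List<R> results = new ArrayList<>();
    if (threads < 2 || debug) {
      // Single-threaded path: run inline on the caller's thread, no pool is created.
      for (String row : rows) {
        results.add(rowHandler.apply(row));
      }
      return results;
    }
    ExecutorService pool = Executors.newFixedThreadPool(threads);
    try {
      List<Future<R>> futures = new ArrayList<>();
      for (String row : rows) {
        Callable<R> task = () -> rowHandler.apply(row);
        futures.add(pool.submit(task));
      }
      // Collect in submission order so the output order matches the input order.
      for (Future<R> f : futures) {
        results.add(f.get());
      }
      return results;
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException(e);
    } catch (ExecutionException e) {
      throw new IllegalStateException(e.getCause());
    } finally {
      pool.shutdown();
    }
  }

  public static void main(String[] args) {
    List<String> rows = Arrays.asList("a", "b", "c");
    System.out.println(processRows(rows, 4, false, s -> s.toUpperCase()));
  }
}
```

Collecting the futures in submission order keeps the result order deterministic, which makes the single-threaded and forked paths easy to compare.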




[jira] Assigned: (SOLR-1561) Import Lucene 2.9.1 Geospatial JAR

2009-11-13 Thread Grant Ingersoll (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1561?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Grant Ingersoll reassigned SOLR-1561:
-

Assignee: Grant Ingersoll

> Import Lucene 2.9.1 Geospatial JAR
> --
>
> Key: SOLR-1561
> URL: https://issues.apache.org/jira/browse/SOLR-1561
> Project: Solr
>  Issue Type: New Feature
>Reporter: Grant Ingersoll
>Assignee: Grant Ingersoll
>Priority: Minor
>
> Bring in the spatial contrib jar so that we can use it's utilities, etc. 
> where appropriate.




[jira] Created: (SOLR-1561) Import Lucene 2.9.1 Geospatial JAR

2009-11-13 Thread Grant Ingersoll (JIRA)
Import Lucene 2.9.1 Geospatial JAR
--

 Key: SOLR-1561
 URL: https://issues.apache.org/jira/browse/SOLR-1561
 Project: Solr
  Issue Type: New Feature
Reporter: Grant Ingersoll
Priority: Minor


Bring in the spatial contrib jar so that we can use it's utilities, etc. where 
appropriate.




[jira] Updated: (SOLR-1561) Import Lucene 2.9.1 Geospatial JAR

2009-11-13 Thread Grant Ingersoll (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1561?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Grant Ingersoll updated SOLR-1561:
--

Fix Version/s: 1.5

> Import Lucene 2.9.1 Geospatial JAR
> --
>
> Key: SOLR-1561
> URL: https://issues.apache.org/jira/browse/SOLR-1561
> Project: Solr
>  Issue Type: New Feature
>Reporter: Grant Ingersoll
>Assignee: Grant Ingersoll
>Priority: Minor
> Fix For: 1.5
>
>
> Bring in the spatial contrib jar so that we can use it's utilities, etc. 
> where appropriate.




[jira] Work started: (SOLR-773) Incorporate Local Lucene/Solr

2009-11-13 Thread Grant Ingersoll (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-773?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on SOLR-773 started by Grant Ingersoll.

> Incorporate Local Lucene/Solr
> -
>
> Key: SOLR-773
> URL: https://issues.apache.org/jira/browse/SOLR-773
> Project: Solr
>  Issue Type: New Feature
>Reporter: Grant Ingersoll
>Assignee: Grant Ingersoll
>Priority: Minor
> Fix For: 1.5
>
> Attachments: exampleSpatial.zip, lucene-spatial-2.9-dev.jar, 
> lucene.tar.gz, SOLR-773-local-lucene.patch, SOLR-773-local-lucene.patch, 
> SOLR-773-local-lucene.patch, SOLR-773-local-lucene.patch, 
> SOLR-773-local-lucene.patch, SOLR-773-spatial_solr.patch, SOLR-773.patch, 
> SOLR-773.patch, solrGeoQuery.tar, spatial-solr.tar.gz
>
>
> Local Lucene has been donated to the Lucene project.  It has some Solr 
> components, but we should evaluate how best to incorporate it into Solr.
> See http://lucene.markmail.org/message/orzro22sqdj3wows?q=LocalLucene




[jira] Resolved: (SOLR-1561) Import Lucene 2.9.1 Geospatial JAR

2009-11-13 Thread Grant Ingersoll (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1561?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Grant Ingersoll resolved SOLR-1561.
---

Resolution: Fixed

> Import Lucene 2.9.1 Geospatial JAR
> --
>
> Key: SOLR-1561
> URL: https://issues.apache.org/jira/browse/SOLR-1561
> Project: Solr
>  Issue Type: New Feature
>Reporter: Grant Ingersoll
>Assignee: Grant Ingersoll
>Priority: Minor
> Fix For: 1.5
>
>
> Bring in the spatial contrib jar so that we can use it's utilities, etc. 
> where appropriate.




[jira] Commented: (SOLR-773) Incorporate Local Lucene/Solr

2009-11-13 Thread Gijs Kunze (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-773?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12777549#action_12777549
 ] 

Gijs Kunze commented on SOLR-773:
-

I've written a Solr plugin which uses a field holding the computed Hilbert 
space-filling curve value to cluster the resulting documents so they can be 
efficiently placed on a Google Maps control. Given a precision and a bounding 
box (southwest lat/lng and northeast lat/lng), it returns a group of clusters, 
each with an exact lat/lng location, a bounding box for all the documents in 
the cluster, and the count of documents in that cluster. Depending on settings 
given to the application (number of results in the docset and/or size of the 
requested bounding box), it will instead return the list of documents, so that 
when you're zoomed in far enough the clusters turn into actual distinct 
documents.

My implementation is very specific to our website and is not generally 
applicable:
 - The calculation of the Hilbert space-filling curve value is done by our 
index script
 - Several field names are hardcoded
 - It uses a hardcoded precision for the Hilbert value (30 bits)
 - It still uses highly inefficient methods for some actions (it stores the 
value in a sint field instead of a trie int, as I was waiting for Solr 1.4 to 
be released before continuing work on the plugin, but now I'll have to 
find/make the time)

I think LocalSolr would really benefit from something like this, as I think 
that when you're storing geographic data, displaying it on a map (whether 
Google Maps, Bing Maps, open streetview or whatever) is something a lot of 
people will want to do (and I love full faceted browsing on a map).

My implementation can be seen running on: 
http://www.mysecondhome.co.uk/search.html?view=map (It's not perfect, there are 
small bugs but in general it works fast enough on our dataset)
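
For readers unfamiliar with the technique, here is a minimal sketch of computing a Hilbert curve value for a lat/lng pair. It is not the plugin's code: `HilbertSketch` and its methods are hypothetical, the curve mapping is the widely published iterative x/y-to-distance algorithm, and the split of the 30-bit value into 15 bits per axis is an assumption.

```java
// Hypothetical helper; not the plugin's actual code.
class HilbertSketch {

  /** Maps cell (x, y) of an n-by-n grid (n a power of two) to its position along the Hilbert curve. */
  static int xyToIndex(int n, int x, int y) {
    int d = 0;
    for (int s = n / 2; s > 0; s /= 2) {
      int rx = (x & s) > 0 ? 1 : 0;
      int ry = (y & s) > 0 ? 1 : 0;
      d += s * s * ((3 * rx) ^ ry);
      // Rotate/flip the quadrant so the next level sees the curve in its base orientation.
      if (ry == 0) {
        if (rx == 1) {
          x = n - 1 - x;
          y = n - 1 - y;
        }
        int t = x;
        x = y;
        y = t;
      }
    }
    return d;
  }

  /** Quantizes lat/lng to 15 bits per axis (an assumed split), giving a 30-bit curve value. */
  static int latLngToIndex(double lat, double lng) {
    int n = 1 << 15;
    int x = (int) Math.min(n - 1, Math.floor((lng + 180.0) / 360.0 * n));
    int y = (int) Math.min(n - 1, Math.floor((lat + 90.0) / 180.0 * n));
    return xyToIndex(n, x, y);
  }

  public static void main(String[] args) {
    System.out.println(latLngToIndex(51.5, -0.12));
  }
}
```

Cells that are close on the curve are close on the map, so a range of curve values roughly corresponds to a compact map region, which is what makes the value usable as a clustering key.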

> Incorporate Local Lucene/Solr
> -
>
> Key: SOLR-773
> URL: https://issues.apache.org/jira/browse/SOLR-773
> Project: Solr
>  Issue Type: New Feature
>Reporter: Grant Ingersoll
>Assignee: Grant Ingersoll
>Priority: Minor
> Fix For: 1.5
>
> Attachments: exampleSpatial.zip, lucene-spatial-2.9-dev.jar, 
> lucene.tar.gz, SOLR-773-local-lucene.patch, SOLR-773-local-lucene.patch, 
> SOLR-773-local-lucene.patch, SOLR-773-local-lucene.patch, 
> SOLR-773-local-lucene.patch, SOLR-773-spatial_solr.patch, SOLR-773.patch, 
> SOLR-773.patch, solrGeoQuery.tar, spatial-solr.tar.gz
>
>
> Local Lucene has been donated to the Lucene project.  It has some Solr 
> components, but we should evaluate how best to incorporate it into Solr.
> See http://lucene.markmail.org/message/orzro22sqdj3wows?q=LocalLucene




[jira] Commented: (SOLR-236) Field collapsing

2009-11-13 Thread Thomas Woodard (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12777582#action_12777582
 ] 

Thomas Woodard commented on SOLR-236:
-

I'm trying to get field collapsing to work against the 1.4.0 release. I applied 
the latest patch, moved the file, did a clean build, and set up a config based 
on the example. If I run a search without collapsing everything is fine, but if 
it actually tries to collapse, I get the following error:

java.lang.NoSuchMethodError: org.apache.solr.search.SolrIndexSearcher.getDocSet(Lorg/apache/lucene/search/Query;Lorg/apache/solr/search/DocSet;Lorg/apache/solr/search/DocSetAwareCollector;)Lorg/apache/solr/search/DocSet;
    at org.apache.solr.search.fieldcollapse.NonAdjacentDocumentCollapser.doQuery(NonAdjacentDocumentCollapser.java:60)
    at org.apache.solr.search.fieldcollapse.AbstractDocumentCollapser.collapse(AbstractDocumentCollapser.java:168)
    at org.apache.solr.handler.component.CollapseComponent.doProcess(CollapseComponent.java:160)
    at org.apache.solr.handler.component.CollapseComponent.process(CollapseComponent.java:121)
    at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:195)
    at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
    at org.apache.solr.core.SolrCore.execute(SolrCore.java:1316)
    at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:338)
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:241)

The tricky part is that the method is there in the source and I wrote a little 
test JSP that can find it just fine. That implies a class loader issue of some 
sort, but I'm not seeing it. Any help would be greatly appreciated.
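
One way to narrow down a suspected class loader problem like this is to print where the running JVM actually loaded a class from. The sketch below uses only standard JDK reflection; `ClassOriginSketch` and its method names are hypothetical, and whether a stale jar is really the cause here is an assumption, not something the report confirms.

```java
import java.lang.reflect.Method;
import java.security.CodeSource;

// Hypothetical diagnostic helper; not part of Solr.
class ClassOriginSketch {

  /** Describes where the JVM loaded the class from, or notes that no code source is recorded. */
  static String describeOrigin(Class<?> c) {
    CodeSource src = c.getProtectionDomain().getCodeSource();
    String location = (src == null || src.getLocation() == null)
        ? "(no code source recorded, e.g. a JDK class)"
        : src.getLocation().toString();
    return c.getName() + " <- " + location + " via " + c.getClassLoader();
  }

  /** True if the class exposes a public method with the given name, whatever its parameters. */
  static boolean hasPublicMethod(Class<?> c, String name) {
    for (Method m : c.getMethods()) {
      if (m.getName().equals(name)) {
        return true;
      }
    }
    return false;
  }

  public static void main(String[] args) {
    System.out.println(describeOrigin(ClassOriginSketch.class));
    System.out.println(hasPublicMethod(String.class, "length"));
  }
}
```

Called from inside the web application with `Class.forName("org.apache.solr.search.SolrIndexSearcher")`, the reported location would show whether the class comes from the freshly built war or from an older jar elsewhere on the classpath.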

> Field collapsing
> 
>
> Key: SOLR-236
> URL: https://issues.apache.org/jira/browse/SOLR-236
> Project: Solr
>  Issue Type: New Feature
>  Components: search
>Affects Versions: 1.3
>Reporter: Emmanuel Keller
> Fix For: 1.5
>
> Attachments: collapsing-patch-to-1.3.0-dieter.patch, 
> collapsing-patch-to-1.3.0-ivan.patch, collapsing-patch-to-1.3.0-ivan_2.patch, 
> collapsing-patch-to-1.3.0-ivan_3.patch, field-collapse-3.patch, 
> field-collapse-4-with-solrj.patch, field-collapse-5.patch, 
> field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, 
> field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, 
> field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, 
> field-collapse-solr-236-2.patch, field-collapse-solr-236.patch, 
> field-collapsing-extended-592129.patch, field_collapsing_1.1.0.patch, 
> field_collapsing_1.3.patch, field_collapsing_dsteigerwald.diff, 
> field_collapsing_dsteigerwald.diff, field_collapsing_dsteigerwald.diff, 
> quasidistributed.additional.patch, SOLR-236-FieldCollapsing.patch, 
> SOLR-236-FieldCollapsing.patch, SOLR-236-FieldCollapsing.patch, 
> solr-236.patch, SOLR-236_collapsing.patch, SOLR-236_collapsing.patch
>
>
> This patch includes a new feature called "Field collapsing".
> "Used in order to collapse a group of results with similar value for a given 
> field to a single entry in the result set. Site collapsing is a special case 
> of this, where all results for a given web site is collapsed into one or two 
> entries in the result set, typically with an associated "more documents from 
> this site" link. See also Duplicate detection."
> http://www.fastsearch.com/glossary.aspx?m=48&amid=299
> The implementation adds 3 new query parameters (SolrParams):
> "collapse.field" to choose the field used to group results
> "collapse.type" normal (default value) or adjacent
> "collapse.max" to select how many continuous results are allowed before 
> collapsing
> TODO (in progress):
> - More documentation (on source code)
> - Test cases
> Two patches:
> - "field_collapsing.patch" for current development version
> - "field_collapsing_1.1.0.patch" for Solr-1.1.0
> P.S.: Feedback and misspelling correction are welcome ;-)




Functions, floats and doubles

2009-11-13 Thread Grant Ingersoll
Implementing my first function (distance stuff) and noticed that functions seem 
to have a float bent to them.  Not even sure what would be involved, but there 
are cases for distance that I could see wanting double precision.  Thoughts?

-Grant

Re: Functions, floats and doubles

2009-11-13 Thread Yonik Seeley
On Fri, Nov 13, 2009 at 12:52 PM, Grant Ingersoll  wrote:
> Implementing my first function (distance stuff) and noticed that functions 
> seem to have a float bent to them.  Not even sure what would be involved, but 
> there are cases for distance that I could see wanting double precision.  
> Thoughts?


It's an issue in general.

But for something like gdist(point_a,point_b), the internal
calculations can be done in double precision and if the result is cast
to a float at the end, it should be good enough for most uses, right?

-Yonik
http://www.lucidimagination.com
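
The approach described above (compute in double, narrow to float only for the final result) can be sketched with the haversine formula. This is an illustration, not Solr's implementation: the class name, the method signature and the Earth radius constant are assumptions, and `gdist` is borrowed from the message only as a name.

```java
// Hypothetical sketch; not the function being developed for Solr.
class GreatCircleSketch {

  // Mean Earth radius in kilometres; an assumed constant, not taken from the thread.
  static final double EARTH_RADIUS_KM = 6371.0088;

  /** Haversine distance in km, computed in double precision and narrowed to float on return. */
  static float gdist(double lat1, double lon1, double lat2, double lon2) {
    double dLat = Math.toRadians(lat2 - lat1);
    double dLon = Math.toRadians(lon2 - lon1);
    double a = Math.sin(dLat / 2) * Math.sin(dLat / 2)
        + Math.cos(Math.toRadians(lat1)) * Math.cos(Math.toRadians(lat2))
        * Math.sin(dLon / 2) * Math.sin(dLon / 2);
    // Clamp guards against a rounding slightly above 1 for near-antipodal points.
    double c = 2 * Math.atan2(Math.sqrt(a), Math.sqrt(Math.max(0.0, 1 - a)));
    return (float) (EARTH_RADIUS_KM * c);
  }

  public static void main(String[] args) {
    // New York to Los Angeles, approximate coordinates.
    System.out.println(gdist(40.7128, -74.0060, 34.0522, -118.2437));
  }
}
```

Only the last cast loses precision, so the error in the returned value is bounded by float rounding of the final distance rather than accumulated through the trigonometry.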


Re: Functions, floats and doubles

2009-11-13 Thread Walter Underwood
Float is almost never good enough. The loss of precision is horrific.

wunder

On Nov 13, 2009, at 9:58 AM, Yonik Seeley wrote:

> On Fri, Nov 13, 2009 at 12:52 PM, Grant Ingersoll  wrote:
>> Implementing my first function (distance stuff) and noticed that functions 
>> seem to have a float bent to them.  Not even sure what would be involved, 
>> but there are cases for distance that I could see wanting double precision.  
>> Thoughts?
> 
> 
> It's an issue in general.
> 
> But for something like gdist(point_a,point_b), the internal
> calculations can be done in double precision and if the result is cast
> to a float at the end, it should be good enough for most uses, right?
> 
> -Yonik
> http://www.lucidimagination.com
> 



[jira] Updated: (SOLR-1561) Import Lucene 2.9.1 Geospatial JAR

2009-11-13 Thread Erik Hatcher (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1561?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erik Hatcher updated SOLR-1561:
---

Description: Bring in the spatial contrib jar so that we can use its 
utilities, etc. where appropriate.  (was: Bring in the spatial contrib jar so 
that we can use it's utilities, etc. where appropriate.)

grammar policing

> Import Lucene 2.9.1 Geospatial JAR
> --
>
> Key: SOLR-1561
> URL: https://issues.apache.org/jira/browse/SOLR-1561
> Project: Solr
>  Issue Type: New Feature
>Reporter: Grant Ingersoll
>Assignee: Grant Ingersoll
>Priority: Minor
> Fix For: 1.5
>
>
> Bring in the spatial contrib jar so that we can use its utilities, etc. where 
> appropriate.




Re: Functions, floats and doubles

2009-11-13 Thread Grant Ingersoll

On Nov 13, 2009, at 12:58 PM, Yonik Seeley wrote:

> On Fri, Nov 13, 2009 at 12:52 PM, Grant Ingersoll  wrote:
>> Implementing my first function (distance stuff) and noticed that functions 
>> seem to have a float bent to them.  Not even sure what would be involved, 
>> but there are cases for distance that I could see wanting double precision.  
>> Thoughts?
> 
> 
> It's an issue in general.

Yeah, in the end, Lucene Scorer returns a float...  However, if we allow for 
pseudo fields and other capabilities, it would be nice to have doubles.

> 
> But for something like gdist(point_a,point_b), the internal
> calculations can be done in double precision and if the result is cast
> to a float at the end, it should be good enough for most uses, right?

This is what I am doing for the specific case I'm working on, but I agree with 
Walter here.  As Solr starts to evolve to power apps where you want to do 
complex functions based on the results of queries, the loss of precision can be 
quite bad.

-Grant



Re: Functions, floats and doubles

2009-11-13 Thread Yonik Seeley
On Fri, Nov 13, 2009 at 1:01 PM, Walter Underwood  wrote:
> Float is almost never good enough. The loss of precision is horrific.

Are you saying it's not good enough for this case (the final answer of
a relative distance calculation)?
7 digits of precision is enough to represent a distance across the US
down to the meter... and points closer together would have higher
precision of course.

For storage of the points themselves, 32 bit floats may also often be
enough (~2.4 meter resolution at the equator).  Allowing doubles as an
option would be nice too - but I expect that doubling the fieldcache
may not be worth it for many.
Actually, a 32 bit fixed point representation would have a lot more
accuracy for this (256 times the resolution at the cost of on-the-fly
conversion to a double for calculations).

-Yonik
http://www.lucidimagination.com
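
The figures above can be reproduced with simple arithmetic. One reading, assumed here rather than stated in the message, is that the 24 significant bits of a float divide the equator into 2^24 steps, while a 32-bit fixed-point value divides it into 2^32 steps, which is 2^8 = 256 times finer. The class name, the circumference constant and the encoding scheme below are all illustrative assumptions.

```java
// Hypothetical sketch of the resolution arithmetic and a 32-bit fixed-point longitude.
class GeoResolutionSketch {

  // Equatorial circumference in metres; an assumed round figure for illustration.
  static final double EQUATOR_M = 40_075_000.0;

  // 2^32, the number of distinct values of a 32-bit integer.
  static final double STEPS_32 = 4294967296.0;

  /** Metres per step when the full circle is divided into 2^bits equal steps. */
  static double metresPerStep(int bits) {
    return EQUATOR_M / Math.pow(2, bits);
  }

  /** Encodes a longitude in [-180, 180) as a 32-bit fixed-point value. */
  static int encodeLon(double lon) {
    return (int) Math.floor(lon / 360.0 * STEPS_32);
  }

  /** Decodes back to degrees; this is the on-the-fly conversion to double. */
  static double decodeLon(int fixed) {
    return fixed * (360.0 / STEPS_32);
  }

  public static void main(String[] args) {
    System.out.println("24-bit step (m): " + metresPerStep(24));
    System.out.println("32-bit step (m): " + metresPerStep(32));
    System.out.println("round trip: " + decodeLon(encodeLon(-73.9857)));
  }
}
```

With these assumptions the 24-bit step comes to roughly 2.39 m and the 32-bit step to roughly 9 mm, in line with the "~2.4 meter" and "256 times" figures quoted above.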


Re: Functions, floats and doubles

2009-11-13 Thread Yonik Seeley
On Fri, Nov 13, 2009 at 1:31 PM, Grant Ingersoll  wrote:
> Yeah, in the end, Lucene Scorer returns a float...  However, if we allow for 
> pseudo fields and other capabilities, it would be nice to have doubles.

We have doubles already...  It's just that our general purpose
functions (like sum) don't use them.
geo functions could certainly use them.

>> But for something like gdist(point_a,point_b), the internal
>> calculations can be done in double precision and if the result is cast
>> to a float at the end, it should be good enough for most uses, right?
>
> This is what I am doing for the specific case I'm working on, but I agree 
> with Walter here.  As Solr starts to evolve to power apps where you want to 
> do complex functions based on the results of queries, the loss of precision 
> can be quite bad.

I agree with you all that eventually we want generic double precision support.
What I don't understand is if anyone thinks it's a blocker for geo, and why.

-Yonik
http://www.lucidimagination.com


Re: Functions, floats and doubles

2009-11-13 Thread Walter Underwood
Float is often OK until you try to use it for further calculation. Maybe it is 
good enough for printing out a distance, but maybe not for further use.

wunder
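
A small, self-contained illustration of the "further calculation" problem, using only standard Java arithmetic: a float accumulator stops growing once it reaches 2^24, because the next integer is no longer representable. The class and method names are hypothetical and the example is unrelated to any Solr code.

```java
// Hypothetical demonstration of float accumulation error.
class FloatAccumulationSketch {

  /** Adds 1.0 to a float accumulator the given number of times. */
  static float sumFloat(int steps) {
    float total = 0f;
    for (int i = 0; i < steps; i++) {
      total += 1f;
    }
    return total;
  }

  /** Same loop with a double accumulator. */
  static double sumDouble(int steps) {
    double total = 0.0;
    for (int i = 0; i < steps; i++) {
      total += 1.0;
    }
    return total;
  }

  public static void main(String[] args) {
    int steps = 20_000_000;
    System.out.println("float:  " + sumFloat(steps));   // stalls at 2^24 = 16777216
    System.out.println("double: " + sumDouble(steps));
  }
}
```

A single float result can be accurate to 7 digits and still be a poor input to a long chain of further operations, which is the distinction drawn above.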

On Nov 13, 2009, at 10:32 AM, Yonik Seeley wrote:

> On Fri, Nov 13, 2009 at 1:01 PM, Walter Underwood  
> wrote:
>> Float is almost never good enough. The loss of precision is horrific.
> 
> Are you saying it's not good enough for this case (the final answer of
> a relative distance calculation)?
> 7 digits of precision is enough to represent a distance across the US
> down to the meter... and points closer together would have higher
> precision of course.
> 
> For storage of the points themselves, 32 bit floats may also often be
> enough (~2.4 meter resolution at the equator).  Allowing doubles as an
> option would be nice too - but I expect that doubling the fieldcache
> may not be worth it for many.
> Actually, a 32 bit fixed point representation would have a lot more
> accuracy for this (256 times the resolution at the cost of on-the-fly
> conversion to a double for calculations).
> 
> -Yonik
> http://www.lucidimagination.com
> 



Re: Functions, floats and doubles

2009-11-13 Thread Grant Ingersoll
Vincenty's formula for the distance between two points on an ellipsoid can be 
accurate to within 0.5 mm.  I'm not using it yet and am not planning on 
implementing it, so this is not a huge issue right now.

Still, I think we should put it on our roadmap. 


On Nov 13, 2009, at 1:32 PM, Yonik Seeley wrote:

> On Fri, Nov 13, 2009 at 1:01 PM, Walter Underwood  
> wrote:
>> Float is almost never good enough. The loss of precision is horrific.
> 
> Are you saying it's not good enough for this case (the final answer of
> a relative distance calculation)?
> 7 digits of precision is enough to represent a distance across the US
> down to the meter... and points closer together would have higher
> precision of course.
> 
> For storage of the points themselves, 32 bit floats may also often be
> enough (~2.4 meter resolution at the equator).  Allowing doubles as an
> option would be nice too - but I expect that doubling the fieldcache
> may not be worth it for many.
> Actually, a 32 bit fixed point representation would have a lot more
> accuracy for this (256 times the resolution at the cost of on-the-fly
> conversion to a double for calculations).
> 
> -Yonik
> http://www.lucidimagination.com




Re: Functions, floats and doubles

2009-11-13 Thread Grant Ingersoll

On Nov 13, 2009, at 1:35 PM, Yonik Seeley wrote:

> On Fri, Nov 13, 2009 at 1:31 PM, Grant Ingersoll  wrote:
>> Yeah, in the end, Lucene Scorer returns a float...  However, if we allow for 
>> pseudo fields and other capabilities, it would be nice to have doubles.
> 
> We have doubles already...  It's just that our general purpose
> functions (like sum) don't use them.
> geo functions could certainly use them.

Yep, I have a patch for the QueryParsing, etc. to allow me to get doubles from 
that.  It should suffice for now.


> 
>>> But for something like gdist(point_a,point_b), the internal
>>> calculations can be done in double precision and if the result is cast
>>> to a float at the end, it should be good enough for most uses, right?
>> 
>> This is what I am doing for the specific case I'm working on, but I agree 
>> with Walter here.  As Solr starts to evolve to power apps where you want to 
>> do complex functions based on the results of queries, the loss of precision 
>> can be quite bad.
> 
> I agree with you all that eventually we want generic double precision support.
> What I don't understand is if anyone thinks it's a blocker for geo, and why.

Definitely not a blocker.  I'll put up a patch on 
https://issues.apache.org/jira/browse/SOLR-1302 and people can kick it around.

Lucene MMAP Usage with Solr

2009-11-13 Thread ST ST
Folks,

I am trying to get Lucene MMAP to work in solr.

I am assuming that when I configure MMAP the entire index will be loaded
into RAM.
Is that the right assumption?

I have tried the following ways for using MMAP:

Option 1. Using the JVM system property below for MMAP configuration

-Dorg.apache.lucene.FSDirectory.class=org.apache.lucene.store.MMapDirectory

   With this config, when I start solr with a 30G index, I expected that the
RAM usage should go up, but it did not.

Option 2. By Code Change
I made the following code change :

   Changed org.apache.solr.core.StandardDirectoryFactory to use MMAP instead
of FSDirectory.
   Code snippet pasted below.


Could you help me understand whether these are the right ways to use MMAP?

Thanks much
/ST.

Code snippet for Option 2:

package org.apache.solr.core;
/**
 * Licensed to the Apache Software Foundation (ASF) under one or more
 * contributor license agreements.  See the NOTICE file distributed with
 * this work for additional information regarding copyright ownership.
 * The ASF licenses this file to You under the Apache License, Version 2.0
 * (the "License"); you may not use this file except in compliance with
 * the License.  You may obtain a copy of the License at
 *
 *     http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */

import java.io.File;
import java.io.IOException;

import org.apache.lucene.store.Directory;
import org.apache.lucene.store.MMapDirectory;

/**
 * Directory provider which mimics original Solr FSDirectory based behavior.
 *
 */
public class StandardDirectoryFactory extends DirectoryFactory {

  public Directory open(String path) throws IOException {
    return MMapDirectory.open(new File(path));
  }
}
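
On the assumption that MMAP loads the whole index into RAM: memory-mapping a file generally reserves address space and lets the operating system page data in as it is touched, so resident memory need not jump at startup. The sketch below shows this with plain `java.nio` rather than Lucene; `MmapSketch` is a hypothetical name, and the temporary file merely stands in for an index file.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Illustration with plain java.nio; not Lucene or Solr code.
class MmapSketch {

  /** Maps the whole file read-only. Mapping alone does not read the file into memory. */
  static MappedByteBuffer map(Path file) {
    try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
      // The mapping stays valid after the channel is closed.
      return ch.map(FileChannel.MapMode.READ_ONLY, 0, ch.size());
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  /** Writes data to a temporary file so the sketch can run anywhere. */
  static Path writeTemp(byte[] data) {
    try {
      Path p = Files.createTempFile("mmap-sketch", ".bin");
      p.toFile().deleteOnExit();
      return Files.write(p, data);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    MappedByteBuffer buf = map(writeTemp(new byte[1 << 20]));
    System.out.println("resident before load(): " + buf.isLoaded());
    buf.load(); // asks the OS to page the mapped region in; best effort only
    System.out.println("resident after load():  " + buf.isLoaded());
  }
}
```

`isLoaded()` is documented as a hint rather than a guarantee, so the printed values can vary by platform; the point is only that residency is decided by the OS page cache, not by the act of mapping.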


Re: Functions, floats and doubles

2009-11-13 Thread Mattmann, Chris A (388J)
On 11/13/09 11:38 AM, "Grant Ingersoll"  wrote:

> Vincenty's formula for the distance between two points on an ellipsoid can be
> accurate to within 0.5 mm.  I'm not using it yet and am not planning on
> implementing it, so this is not a huge issue right now.
> 
> Still, I think we should put it on our roadmap.

+1

Cheers,
Chris

> 
> 
> On Nov 13, 2009, at 1:32 PM, Yonik Seeley wrote:
> 
>> On Fri, Nov 13, 2009 at 1:01 PM, Walter Underwood 
>> wrote:
>>> Float is almost never good enough. The loss of precision is horrific.
>> 
>> Are you saying it's not good enough for this case (the final answer of
>> a relative distance calculation)?
>> 7 digits of precision is enough to represent a distance across the US
>> down to the meter... and points closer together would have higher
>> precision of course.
>> 
>> For storage of the points themselves, 32 bit floats may also often be
>> enough (~2.4 meter resolution at the equator).  Allowing doubles as an
>> option would be nice too - but I expect that doubling the fieldcache
>> may not be worth it for many.
>> Actually, a 32 bit fixed point representation would have a lot more
>> accuracy for this (256 times the resolution at the cost of on-the-fly
>> conversion to a double for calculations).
>> 
>> -Yonik
>> http://www.lucidimagination.com
> 
> 
> 

++
Chris Mattmann, Ph.D.
Senior Computer Scientist
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 171-266B, Mailstop: 171-246
Email: chris.mattm...@jpl.nasa.gov
WWW:   http://sunset.usc.edu/~mattmann/
++
Adjunct Assistant Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
++





[jira] Created: (SOLR-1562) Functions should better support double values

2009-11-13 Thread Grant Ingersoll (JIRA)
Functions should better support double values
-

 Key: SOLR-1562
 URL: https://issues.apache.org/jira/browse/SOLR-1562
 Project: Solr
  Issue Type: Improvement
Reporter: Grant Ingersoll
Priority: Minor


See 
http://www.lucidimagination.com/search/document/da1c5342683e2235/functions_floats_and_doubles#fd5c9c99ddf4d6a0

Some functions may need more precision than float.




Re: Functions, floats and doubles

2009-11-13 Thread Grant Ingersoll

On Nov 13, 2009, at 1:48 PM, Mattmann, Chris A (388J) wrote:

>> 
>> Still, I think we should put it on our roadmap.

SOLR-1562.


Re: union functionality?

2009-11-13 Thread Chris Hostetter

http://people.apache.org/~hossman/#solr-dev
Please Use "solr-u...@lucene" Not "solr-...@lucene"

Your question is better suited for the solr-u...@lucene mailing list ...
not the solr-...@lucene list.  solr-dev is for discussing development of
the internals of the Solr application ... it is *not* the appropriate
place to ask questions about how to use Solr (or write Solr plugins) 
when developing your own applications.  Please resend your message to
the solr-user mailing list, where you are likely to get more/better
responses since that list also has a larger number of subscribers.



: Date: Fri, 6 Nov 2009 13:03:03 +0800
: From: Tang Minzhi 
: Reply-To: solr-dev@lucene.apache.org
: To: "solr-dev@lucene.apache.org" 
: Subject: union functionality?
: 
: Hey everyone,
: 
: I'm a newbie to Solr.
: 
: I have a question below:
: Assume that I have the fields 'item_name' and 'category_name', like
: item_name  | category_name
: apple      | fruit
: orange     | fruit
: rice       | grain
: corn       | grain
: strawberry | fruit
: beef       | meat
: 
: 
: I want to fetch the result set of "top 2 items of each category I specified 
: (might be 2 of them, 'fruit' and 'grain')" in one request, instead of fetching 
: them through several requests. Is it possible with Solr?
: 
: It seems that in the MySQL database there's union functionality which can 
: combine several queries together, so how about Solr?
: 
: Thanks.
: 
: Best regards,
: 
: Minzhi Tang
: Skype: minzhitang
: MSN: tmin...@hotmail.com



-Hoss



[jira] Commented: (SOLR-1302) Fun with Distances - Add Distance functions for a variety of things

2009-11-13 Thread Chris A. Mattmann (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12777648#action_12777648
 ] 

Chris A. Mattmann commented on SOLR-1302:
-

Hey Grant,

I'd love to help on this issue -- do you have a preference regarding which one 
of the above 4 that you'd like to work on? All of them? 2 of them?

Let me know -- I'd be happy to write 1 or 2 of them...

Cheers,
Chris


> Fun with Distances - Add Distance functions for a variety of things
> ---
>
> Key: SOLR-1302
> URL: https://issues.apache.org/jira/browse/SOLR-1302
> Project: Solr
>  Issue Type: New Feature
>Reporter: Grant Ingersoll
>Assignee: Grant Ingersoll
>Priority: Minor
> Fix For: 1.5
>
>
> There are many distance functions that are useful to have:
> 1. Great Circle (lat/lon) and other geo distances
> 2. Euclidean (Vector)
> 3. Manhattan (Vector)
> 4. Cosine (Vector)
> For the vector ones, the idea is that the fields on a document can be used to 
> determine the vector.  




[jira] Commented: (SOLR-236) Field collapsing

2009-11-13 Thread Martijn van Groningen (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12777649#action_12777649
 ] 

Martijn van Groningen commented on SOLR-236:


Thomas, the method that cannot be found ( SolrIndexSearcher.getDocSet(...) ) is 
a method that is part of the patch. So if the patch was successfully applied then 
this should not happen. 
When I released the latest patch I only tested against the solr trunk, but I 
have tried the following to verify that the patch works with the 1.4.0 release:
* Downloaded 1.4.0 release from Solr site
* Applied the patch
* Executed: ant clean dist example
* In the example config (example/solr/conf/solrconfig.xml) I added the 
following line under the standard request handler:
{code:xml}{code}
* Started the Jetty with Solr with the following command: java -jar start.jar
* Added example data to Solr with the following command in the exampledocs dir: 
./post.sh *.xml
* Browsed to the following URL: 
http://localhost:8983/solr/select/?q=*:*&collapse.field=inStock and saw that 
the result was collapsed on the inStock field.

It seems that everything is running fine. Can you tell me something about how you 
deployed Solr on your machine?
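
For readers who have not used the feature being verified in the steps above, here is a minimal Java sketch of what collapsing on a field means conceptually. It is not the code from the SOLR-236 patch: the `Doc` class, the `collapseAdjacent` method and its `max` argument are hypothetical names chosen for illustration, loosely modelled on the `collapse.type=adjacent` and `collapse.max` parameters described in the issue.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;

public class CollapseSketch {
    /** Hypothetical stand-in for a search hit: an id plus the value of the collapse field. */
    public static final class Doc {
        public final String id;
        public final String fieldValue;

        public Doc(String id, String fieldValue) {
            this.id = id;
            this.fieldValue = fieldValue;
        }

        @Override
        public String toString() {
            return id + "(" + fieldValue + ")";
        }
    }

    /**
     * Keeps at most max consecutive hits that share the same field value.
     * Further adjacent hits with that value are dropped (collapsed).
     */
    public static List<Doc> collapseAdjacent(List<Doc> ranked, int max) {
        List<Doc> out = new ArrayList<>();
        String current = null;
        int run = 0;          // length of the current run of equal field values
        boolean first = true; // distinguishes "no previous hit" from a null field value
        for (Doc d : ranked) {
            if (!first && Objects.equals(d.fieldValue, current)) {
                run++;
            } else {
                current = d.fieldValue;
                run = 1;
                first = false;
            }
            if (run <= max) {
                out.add(d);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<Doc> hits = List.of(
            new Doc("a", "true"), new Doc("b", "true"),
            new Doc("c", "false"), new Doc("d", "true"));
        System.out.println(collapseAdjacent(hits, 1));
    }
}
```

With `max = 1` the second of the two adjacent `true` hits is dropped, while the later `true` hit survives because it is not adjacent to the first run. A non-adjacent ("normal") collapse would instead track counts per distinct value across the whole list.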

> Field collapsing
> 
>
> Key: SOLR-236
> URL: https://issues.apache.org/jira/browse/SOLR-236
> Project: Solr
>  Issue Type: New Feature
>  Components: search
>Affects Versions: 1.3
>Reporter: Emmanuel Keller
> Fix For: 1.5
>
> Attachments: collapsing-patch-to-1.3.0-dieter.patch, 
> collapsing-patch-to-1.3.0-ivan.patch, collapsing-patch-to-1.3.0-ivan_2.patch, 
> collapsing-patch-to-1.3.0-ivan_3.patch, field-collapse-3.patch, 
> field-collapse-4-with-solrj.patch, field-collapse-5.patch, 
> field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, 
> field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, 
> field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, 
> field-collapse-solr-236-2.patch, field-collapse-solr-236.patch, 
> field-collapsing-extended-592129.patch, field_collapsing_1.1.0.patch, 
> field_collapsing_1.3.patch, field_collapsing_dsteigerwald.diff, 
> field_collapsing_dsteigerwald.diff, field_collapsing_dsteigerwald.diff, 
> quasidistributed.additional.patch, SOLR-236-FieldCollapsing.patch, 
> SOLR-236-FieldCollapsing.patch, SOLR-236-FieldCollapsing.patch, 
> solr-236.patch, SOLR-236_collapsing.patch, SOLR-236_collapsing.patch
>
>
> This patch include a new feature called "Field collapsing".
> "Used in order to collapse a group of results with similar value for a given 
> field to a single entry in the result set. Site collapsing is a special case 
> of this, where all results for a given web site is collapsed into one or two 
> entries in the result set, typically with an associated "more documents from 
> this site" link. See also Duplicate detection."
> http://www.fastsearch.com/glossary.aspx?m=48&amid=299
> The implementation add 3 new query parameters (SolrParams):
> "collapse.field" to choose the field used to group results
> "collapse.type" normal (default value) or adjacent
> "collapse.max" to select how many continuous results are allowed before 
> collapsing
> TODO (in progress):
> - More documentation (on source code)
> - Test cases
> Two patches:
> - "field_collapsing.patch" for current development version
> - "field_collapsing_1.1.0.patch" for Solr-1.1.0
> P.S.: Feedback and misspelling correction are welcome ;-)




Re: Solr 1.4.0 vs 1.4.1

2009-11-13 Thread Chris Hostetter

: So which would be appropriate?  My concern is that the 1.4.1 has a larger
: file size so don't want to miss some 1.4 feature or bug fix that happened

I don't understand this question ... what do you mean by "1.4.1 has a 
larger file size" ?

building 1.4.0 from the source in the official release should produce the 
same jar as what is included in that release -- the only distinction is 
the name of the artifacts produced if you don't set the version params on 
the command line (i suppose if you build without specifying the version 
the artifacts will be slightly bigger because they contain metadata that 
includes the (longer) "-dev" version, and the names of the jars included 
in the war will have that as well) but the code is still the same.


-Hoss



Re: [Solr Wiki] Trivial Update of "UpdateCSV" by kristopolous

2009-11-13 Thread Chris Hostetter

: The "UpdateCSV" page has been changed by kristopolous.
: The comment on this change is: grammar.
: http://wiki.apache.org/solr/UpdateCSV?action=diff&rev1=13&rev2=14

Did someone revert this change? ... i was going to (since the old version 
was actually correct) but the wiki says the current version is still "13" 
-- even though that diff URL still works, as does this one requesting to 
explicitly display version #14...

  http://wiki.apache.org/solr/UpdateCSV?action=recall&rev=14

...if someone did revert it, then why didn't an email get sent out? and if 
no one reverted it (yet), then WTF happened to the edit?

: - Keep and index empty (zero length) field values.  This may be specified 
globally, or on a per-field basis.  The default is {{{keepEmpty=false}}}.
: + Keep an index empty (zero length) field values.  This may be specified 
globally, or on a per-field basis.  The default is {{{keepEmpty=false}}}.



-Hoss



Re: [Solr Wiki] Trivial Update of "UpdateCSV" by kristopolous

2009-11-13 Thread Yonik Seeley
On Fri, Nov 13, 2009 at 3:55 PM, Chris Hostetter
 wrote:
>
> : The "UpdateCSV" page has been changed by kristopolous.
> : The comment on this change is: grammar.
> : http://wiki.apache.org/solr/UpdateCSV?action=diff&rev1=13&rev2=14
>
> Did someone revert this change? ... i was going to (since the old version
> was actually correct) but the wiki says the current version is still "13"
> -- even though that diff URL still works, as does this one requesting to
> explicitly display version #14...
>
>  http://wiki.apache.org/solr/UpdateCSV?action=recall&rev=14
>
> ...if someone did revert it, then why didn't an email get sent out? and if
> no one reverted it (yet), then WTF happened to the edit?

Good question... I tried to revert it too when I first saw it.
The live page actually showed the new edited version, but every time I
tried to edit it, I got the old text.  Eventually, the live page
showed the old version.  The history never showed the new edit AFAIK.
Was going to bring it up, but got sidetracked by something else.

-Yonik
http://www.lucidimagination.com


[jira] Commented: (SOLR-236) Field collapsing

2009-11-13 Thread Thomas Woodard (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12777659#action_12777659
 ] 

Thomas Woodard commented on SOLR-236:
-

I tried the build again, and you are right, it does work fine with the default 
search handler. I had been trying to get it working with our search handler, 
which is dismax. That still doesn't work. Here is the handler configuration, 
which works fine until collapsing is added.

{code:xml}


dismax
name^3 description^2 long_description^2 
search_stars^1 search_directors^1 product_id^0.1
0.1
true
stars
directors
keywords
studio
1


{code}

> Field collapsing
> 
>
> Key: SOLR-236
> URL: https://issues.apache.org/jira/browse/SOLR-236
> Project: Solr
>  Issue Type: New Feature
>  Components: search
>Affects Versions: 1.3
>Reporter: Emmanuel Keller
> Fix For: 1.5
>
> Attachments: collapsing-patch-to-1.3.0-dieter.patch, 
> collapsing-patch-to-1.3.0-ivan.patch, collapsing-patch-to-1.3.0-ivan_2.patch, 
> collapsing-patch-to-1.3.0-ivan_3.patch, field-collapse-3.patch, 
> field-collapse-4-with-solrj.patch, field-collapse-5.patch, 
> field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, 
> field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, 
> field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, 
> field-collapse-solr-236-2.patch, field-collapse-solr-236.patch, 
> field-collapsing-extended-592129.patch, field_collapsing_1.1.0.patch, 
> field_collapsing_1.3.patch, field_collapsing_dsteigerwald.diff, 
> field_collapsing_dsteigerwald.diff, field_collapsing_dsteigerwald.diff, 
> quasidistributed.additional.patch, SOLR-236-FieldCollapsing.patch, 
> SOLR-236-FieldCollapsing.patch, SOLR-236-FieldCollapsing.patch, 
> solr-236.patch, SOLR-236_collapsing.patch, SOLR-236_collapsing.patch
>
>
> This patch include a new feature called "Field collapsing".
> "Used in order to collapse a group of results with similar value for a given 
> field to a single entry in the result set. Site collapsing is a special case 
> of this, where all results for a given web site is collapsed into one or two 
> entries in the result set, typically with an associated "more documents from 
> this site" link. See also Duplicate detection."
> http://www.fastsearch.com/glossary.aspx?m=48&amid=299
> The implementation add 3 new query parameters (SolrParams):
> "collapse.field" to choose the field used to group results
> "collapse.type" normal (default value) or adjacent
> "collapse.max" to select how many continuous results are allowed before 
> collapsing
> TODO (in progress):
> - More documentation (on source code)
> - Test cases
> Two patches:
> - "field_collapsing.patch" for current development version
> - "field_collapsing_1.1.0.patch" for Solr-1.1.0
> P.S.: Feedback and misspelling correction are welcome ;-)




[jira] Issue Comment Edited: (SOLR-236) Field collapsing

2009-11-13 Thread Thomas Woodard (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12777659#action_12777659
 ] 

Thomas Woodard edited comment on SOLR-236 at 11/13/09 9:10 PM:
---

I tried the build again, and you are right, it does work fine with the default 
search handler. I had been trying to get it working with our search handler, 
which is dismax. That still doesn't work. Here is the handler configuration, 
which works fine until collapsing is added.

{code:xml}


dismax
name^3 description^2 long_description^2 
search_stars^1 search_directors^1 product_id^0.1
0.1
true
stars
directors
keywords
studio
1


{code}

Edit: The search fails even if you don't pass a collapse field.

  was (Author: gtfoomw):
I tried the build again, and you are right, it does work fine with the 
default search handler. I had been trying to get it working with our search 
handler, which is dismax. That still doesn't work. Here is the handler 
configuration, which works fine until collapsing is added.

{code:xml}


dismax
name^3 description^2 long_description^2 
search_stars^1 search_directors^1 product_id^0.1
0.1
true
stars
directors
keywords
studio
1


{code}
  
> Field collapsing
> 
>
> Key: SOLR-236
> URL: https://issues.apache.org/jira/browse/SOLR-236
> Project: Solr
>  Issue Type: New Feature
>  Components: search
>Affects Versions: 1.3
>Reporter: Emmanuel Keller
> Fix For: 1.5
>
> Attachments: collapsing-patch-to-1.3.0-dieter.patch, 
> collapsing-patch-to-1.3.0-ivan.patch, collapsing-patch-to-1.3.0-ivan_2.patch, 
> collapsing-patch-to-1.3.0-ivan_3.patch, field-collapse-3.patch, 
> field-collapse-4-with-solrj.patch, field-collapse-5.patch, 
> field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, 
> field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, 
> field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, 
> field-collapse-solr-236-2.patch, field-collapse-solr-236.patch, 
> field-collapsing-extended-592129.patch, field_collapsing_1.1.0.patch, 
> field_collapsing_1.3.patch, field_collapsing_dsteigerwald.diff, 
> field_collapsing_dsteigerwald.diff, field_collapsing_dsteigerwald.diff, 
> quasidistributed.additional.patch, SOLR-236-FieldCollapsing.patch, 
> SOLR-236-FieldCollapsing.patch, SOLR-236-FieldCollapsing.patch, 
> solr-236.patch, SOLR-236_collapsing.patch, SOLR-236_collapsing.patch
>
>
> This patch include a new feature called "Field collapsing".
> "Used in order to collapse a group of results with similar value for a given 
> field to a single entry in the result set. Site collapsing is a special case 
> of this, where all results for a given web site is collapsed into one or two 
> entries in the result set, typically with an associated "more documents from 
> this site" link. See also Duplicate detection."
> http://www.fastsearch.com/glossary.aspx?m=48&amid=299
> The implementation add 3 new query parameters (SolrParams):
> "collapse.field" to choose the field used to group results
> "collapse.type" normal (default value) or adjacent
> "collapse.max" to select how many continuous results are allowed before 
> collapsing
> TODO (in progress):
> - More documentation (on source code)
> - Test cases
> Two patches:
> - "field_collapsing.patch" for current development version
> - "field_collapsing_1.1.0.patch" for Solr-1.1.0
> P.S.: Feedback and misspelling correction are welcome ;-)




Re: [Solr Wiki] Trivial Update of "UpdateCSV" by kristopolous

2009-11-13 Thread Chris Hostetter

: Good question... I tried to revert it too when I first saw it.
: The live page actually showed the new edited version, but every time I
: tried to edit it, I got the old text.  Eventually, the live page
: showed the old version.  The history never showed the new edit AFAIK.
: Was going to bring it up, but got sidetracked by something else.

i don't know WTF is going on ... i just verified (using Solr1.5) that 
reverting not only still triggers email notifications, but reverting still 
leaves a history of the revert in the log (it doesn't roll back the 
version number, it rolls forward)

sigh.


-Hoss



[jira] Commented: (SOLR-1432) FunctionQueries aren't correctly weighted

2009-11-13 Thread Hoss Man (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1432?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12777674#action_12777674
 ] 

Hoss Man commented on SOLR-1432:


bq. I wish this issue would have been called out in the CHANGES file for the 
1.4.0 release

You're right ... it was a pretty big oversight on our part that it wasn't 
mentioned anywhere (let alone specifically called out in the "Upgrading" section).

retroactively editing CHANGES.txt isn't really feasible, but i've added it to 
the Solr1.4 wiki page to try and increase the visibility a bit...

http://wiki.apache.org/solr/Solr1.4

> FunctionQueries aren't correctly weighted
> -
>
> Key: SOLR-1432
> URL: https://issues.apache.org/jira/browse/SOLR-1432
> Project: Solr
>  Issue Type: Bug
>Affects Versions: 1.4
>Reporter: Yonik Seeley
>Assignee: Yonik Seeley
> Fix For: 1.4
>
> Attachments: SOLR-1432.patch, SOLR-1432.patch
>
>
> Nested queries in function queries aren't weighted correctly with the proper 
> Searcher, and this is now even more serious with per-segment searching in 
> Lucene/Solr.




[jira] Commented: (SOLR-1302) Fun with Distances - Add Distance functions for a variety of things

2009-11-13 Thread Grant Ingersoll (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12777676#action_12777676
 ] 

Grant Ingersoll commented on SOLR-1302:
---

I've got Haversine implemented (great circle), and RadianFunction (convert a 
val to radians).  I'll put up a patch shortly.
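
As a rough illustration of the great circle calculation mentioned above, here is a self-contained Java sketch of the haversine formula. It is not the code from the SOLR-1302 patch; the class name, the method names and the Earth radius constant are assumptions made for this example.

```java
public class HaversineSketch {
    /** Mean Earth radius in kilometres (a common approximation, assumed here). */
    public static final double EARTH_RADIUS_KM = 6371.0;

    /** Converts degrees to radians. */
    public static double toRadians(double degrees) {
        return degrees * Math.PI / 180.0;
    }

    /**
     * Great circle distance between two lat/lon points given in degrees,
     * on a sphere of the given radius. The result is in the units of radius.
     */
    public static double haversine(double lat1Deg, double lon1Deg,
                                   double lat2Deg, double lon2Deg,
                                   double radius) {
        double lat1 = toRadians(lat1Deg);
        double lat2 = toRadians(lat2Deg);
        double dLat = lat2 - lat1;
        double dLon = toRadians(lon2Deg - lon1Deg);
        double sinLat = Math.sin(dLat / 2);
        double sinLon = Math.sin(dLon / 2);
        double a = sinLat * sinLat + Math.cos(lat1) * Math.cos(lat2) * sinLon * sinLon;
        // Clamp to 1.0 so rounding error cannot push asin out of its domain.
        return 2 * radius * Math.asin(Math.min(1.0, Math.sqrt(a)));
    }

    public static void main(String[] args) {
        // A quarter of the equator on a unit sphere.
        System.out.println(haversine(0, 0, 0, 90, 1.0));
    }
}
```

Passing the radius as an argument keeps the function usable for unit spheres as well as for kilometres or miles.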

> Fun with Distances - Add Distance functions for a variety of things
> ---
>
> Key: SOLR-1302
> URL: https://issues.apache.org/jira/browse/SOLR-1302
> Project: Solr
>  Issue Type: New Feature
>Reporter: Grant Ingersoll
>Assignee: Grant Ingersoll
>Priority: Minor
> Fix For: 1.5
>
>
> There are many distance functions that are useful to have:
> 1. Great Circle (lat/lon) and other geo distances
> 2. Euclidean (Vector)
> 3. Manhattan (Vector)
> 4. Cosine (Vector)
> For the vector ones, the idea is that the fields on a document can be used to 
> determine the vector.  




Re: Solr 1.4.0 vs 1.4.1

2009-11-13 Thread Jeff Newburn
That is fine; we are using 1.4.0.  When I compiled, the 1.4.1 build was 2k larger in
size.
-- 
Jeff Newburn
Software Engineer, Zappos.com
jnewb...@zappos.com - 702-943-7562


> From: Chris Hostetter 
> Reply-To: 
> Date: Fri, 13 Nov 2009 12:47:34 -0800 (PST)
> To: 
> Subject: Re: Solr 1.4.0 vs 1.4.1
> 
> 
> : So which would be appropriate?  My concern is that the 1.4.1 has a larger
> : file size so don't want to miss some 1.4 feature or bug fix that happened
> 
> I don't understand this question ... what do you mean by "1.4.1 has a
> larger file size" ?
> 
> building 1.4.0 from the source in the official release should produce the
> same jar as what is included in that release -- the only distinction is
> the name of the artifacts produced if you don't set the version params on
> the command line (i suppose if you build without specifying the version
> the artifacts will be slightly bigger because they contain metadata that
> includes the (longer) "-dev" version, and the names of the jars included
> in the war will have that as well) but the code is still the same.
> 
> 
> -Hoss
> 



[jira] Updated: (SOLR-1302) Fun with Distances - Add Distance functions for a variety of things

2009-11-13 Thread Grant Ingersoll (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1302?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Grant Ingersoll updated SOLR-1302:
--

Attachment: SOLR-1302.patch

Haversine implementation, RadianFunction and DegreeFunction.  Also small 
refactorings in other places to better support doubles to avoid losing 
precision for as long as possible.

Next up:  Euclidean and SquaredEuclidean
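
For reference, the two functions named as next up can be sketched in a few lines of Java. This is an illustration only and not taken from the patch; the class and method names are hypothetical, and the vectors are plain `double[]` arrays rather than values read from document fields.

```java
public class EuclideanSketch {
    /** Sum of squared component differences; avoids the square root. */
    public static double squaredEuclidean(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("vectors must have the same length");
        }
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return sum;
    }

    /** Straight-line distance between two vectors. */
    public static double euclidean(double[] a, double[] b) {
        return Math.sqrt(squaredEuclidean(a, b));
    }

    public static void main(String[] args) {
        double[] p = {0.0, 0.0};
        double[] q = {3.0, 4.0};
        System.out.println(squaredEuclidean(p, q) + " " + euclidean(p, q));
    }
}
```

Working in doubles throughout matches the precision concern raised above, and the squared form is sufficient when results only need to be ordered by distance.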

> Fun with Distances - Add Distance functions for a variety of things
> ---
>
> Key: SOLR-1302
> URL: https://issues.apache.org/jira/browse/SOLR-1302
> Project: Solr
>  Issue Type: New Feature
>Reporter: Grant Ingersoll
>Assignee: Grant Ingersoll
>Priority: Minor
> Fix For: 1.5
>
> Attachments: SOLR-1302.patch
>
>
> There are many distance functions that are useful to have:
> 1. Great Circle (lat/lon) and other geo distances
> 2. Euclidean (Vector)
> 3. Manhattan (Vector)
> 4. Cosine (Vector)
> For the vector ones, the idea is that the fields on a document can be used to 
> determine the vector.  




[jira] Commented: (SOLR-1302) Fun with Distances - Add Distance functions for a variety of things

2009-11-13 Thread Grant Ingersoll (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1227#action_1227
 ] 

Grant Ingersoll commented on SOLR-1302:
---

Note, we may want to mod this to work more with a Field Type that represents a 
Point.  This depends on SOLR-1131.

> Fun with Distances - Add Distance functions for a variety of things
> ---
>
> Key: SOLR-1302
> URL: https://issues.apache.org/jira/browse/SOLR-1302
> Project: Solr
>  Issue Type: New Feature
>Reporter: Grant Ingersoll
>Assignee: Grant Ingersoll
>Priority: Minor
> Fix For: 1.5
>
> Attachments: SOLR-1302.patch
>
>
> There are many distance functions that are useful to have:
> 1. Great Circle (lat/lon) and other geo distances
> 2. Euclidean (Vector)
> 3. Manhattan (Vector)
> 4. Cosine (Vector)
> For the vector ones, the idea is that the fields on a document can be used to 
> determine the vector.  




[jira] Created: (SOLR-1563) NPE's occur in various situations related to FIeldable.stringValue when upgrading from 1.3 to 1.4

2009-11-13 Thread Hoss Man (JIRA)
NPE's occur in various situations related to FIeldable.stringValue when 
upgrading from 1.3 to 1.4
-

 Key: SOLR-1563
 URL: https://issues.apache.org/jira/browse/SOLR-1563
 Project: Solr
  Issue Type: Bug
Affects Versions: 1.4
Reporter: Hoss Man
Priority: Critical


Multiple reports of NPEs when using Solr 1.4 - so far these all seem to relate 
to getting a null returned by Fieldable.stringValue when it isn't expected or 
accounted for.  Thread where this was initially discussed...

http://old.nabble.com/NPE-when-trying-to-view-a-specific-document-via-Luke-to26330237.html




[jira] Commented: (SOLR-1563) NPE's occur in various situations related to FIeldable.stringValue when upgrading from 1.3 to 1.4

2009-11-13 Thread Hoss Man (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1285#action_1285
 ] 

Hoss Man commented on SOLR-1563:



First anecdotal NPE stack trace reported by Jake Brownell when using 
LukeRequestHandler to look at a specific document id when using 1.4.  No 
details about schema provided...

{noformat}
/admin/luke?id=1

java.lang.NullPointerException
    at org.apache.lucene.index.TermBuffer.set(TermBuffer.java:95)
    at org.apache.lucene.index.SegmentTermEnum.scanTo(SegmentTermEnum.java:158)
    at org.apache.lucene.index.TermInfosReader.get(TermInfosReader.java:232)
    at org.apache.lucene.index.TermInfosReader.get(TermInfosReader.java:179)
    at org.apache.lucene.index.SegmentReader.docFreq(SegmentReader.java:975)
    at org.apache.lucene.index.DirectoryReader.docFreq(DirectoryReader.java:627)
    at org.apache.solr.search.SolrIndexReader.docFreq(SolrIndexReader.java:308)
    at org.apache.solr.handler.admin.LukeRequestHandler.getDocumentFieldsInfo(LukeRequestHandler.java:248)
    at org.apache.solr.handler.admin.LukeRequestHandler.handleRequestBody(LukeRequestHandler.java:124)
    at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
    at org.apache.solr.core.SolrCore.execute(SolrCore.java:1316)
{noformat}

> NPE's occur in various situations related to FIeldable.stringValue when 
> upgrading from 1.3 to 1.4
> -
>
> Key: SOLR-1563
> URL: https://issues.apache.org/jira/browse/SOLR-1563
> Project: Solr
>  Issue Type: Bug
>Affects Versions: 1.4
>Reporter: Hoss Man
>Priority: Critical
>
> Multiple reports of NPEs when using Solr 1.4 - so far these all seem to 
> relate to getting a null returned by Fieldable.stringValue when it isn't 
> expected or accounted for.  Thread where this was initially discussed...
> http://old.nabble.com/NPE-when-trying-to-view-a-specific-document-via-Luke-to26330237.html




[jira] Commented: (SOLR-1563) NPE's occur in various situations related to FIeldable.stringValue when upgrading from 1.3 to 1.4

2009-11-13 Thread Hoss Man (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1286#action_1286
 ] 

Hoss Man commented on SOLR-1563:


Second anecdotal exception reported by Solr Trey, also while using luke to look 
at a specific document id.  no details about schema...

{noformat}
HTTP Status 500 - null java.lang.NullPointerException
    at org.apache.lucene.index.Term.compareTo(Term.java:119)
    at org.apache.lucene.index.TermInfosReader.getIndexOffset(TermInfosReader.java:160)
    at org.apache.lucene.index.TermInfosReader.get(TermInfosReader.java:231)
    at org.apache.lucene.index.TermInfosReader.get(TermInfosReader.java:179)
    at org.apache.lucene.index.SegmentReader.docFreq(SegmentReader.java:975)
    at org.apache.lucene.index.DirectoryReader.docFreq(DirectoryReader.java:627)
    at org.apache.solr.search.SolrIndexReader.docFreq(SolrIndexReader.java:308)
    at org.apache.solr.handler.admin.LukeRequestHandler.getDocumentFieldsInfo(LukeRequestHandler.java:248)
    at org.apache.solr.handler.admin.LukeRequestHandler.handleRequestBody(LukeRequestHandler.java:124)
    at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
    at org.apache.solr.core.SolrCore.execute(SolrCore.java:1316)
    at
{noformat}

> NPE's occur in various situations related to FIeldable.stringValue when 
> upgrading from 1.3 to 1.4
> -
>
> Key: SOLR-1563
> URL: https://issues.apache.org/jira/browse/SOLR-1563
> Project: Solr
>  Issue Type: Bug
>Affects Versions: 1.4
>Reporter: Hoss Man
>Priority: Critical
>
> Multiple reports of NPEs when using Solr 1.4 - so far these all seem to 
> relate to getting a null returned by Fieldable.stringValue when it isn't 
> expected or accounted for.  Thread where this was initially discussed...
> http://old.nabble.com/NPE-when-trying-to-view-a-specific-document-via-Luke-to26330237.html




[jira] Commented: (SOLR-1563) NPE's occur in various situations related to FIeldable.stringValue when upgrading from 1.3 to 1.4

2009-11-13 Thread Hoss Man (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1288#action_1288
 ] 

Hoss Man commented on SOLR-1563:


The following stack trace was generated using Solr 1.4, and the "example" 
configs provided with Solr 1.4.

Steps to reproduce...
# cd example && java -jar start.jar
# cd example/exampledocs && java -jar post.jar *.xml
# open in browser: http://localhost:8983/solr/select/?q=SP2514N (this works 
fine)
# open in browser: http://localhost:8983/solr/admin/luke?id=SP2514N (this fails 
with stack trace below)

{noformat}
java.lang.NullPointerException
    at org.apache.lucene.index.TermBuffer.set(TermBuffer.java:95)
    at org.apache.lucene.index.SegmentTermEnum.scanTo(SegmentTermEnum.java:158)
    at org.apache.lucene.index.TermInfosReader.get(TermInfosReader.java:232)
    at org.apache.lucene.index.TermInfosReader.get(TermInfosReader.java:179)
    at org.apache.lucene.index.SegmentReader.docFreq(SegmentReader.java:975)
    at org.apache.lucene.index.DirectoryReader.docFreq(DirectoryReader.java:627)
    at org.apache.solr.search.SolrIndexReader.docFreq(SolrIndexReader.java:308)
    at org.apache.solr.handler.admin.LukeRequestHandler.getDocumentFieldsInfo(LukeRequestHandler.java:248)
    at org.apache.solr.handler.admin.LukeRequestHandler.handleRequestBody(LukeRequestHandler.java:124)
    at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
    at org.apache.solr.core.SolrCore.execute(SolrCore.java:1316)
{noformat}
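
The two browser checks in the steps to reproduce could also be scripted. The following Java sketch is one way to do that, assuming Java 11 or later for `java.net.http`; the class and helper names are hypothetical, and the base URL is the one from the steps above.

```java
import java.io.IOException;
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;

public class LukeReproCheck {
    /** Builds the select URL for a query term. */
    public static String selectUrl(String base, String q) {
        return base + "/select/?q=" + URLEncoder.encode(q, StandardCharsets.UTF_8);
    }

    /** Builds the Luke handler URL for a document id. */
    public static String lukeUrl(String base, String id) {
        return base + "/admin/luke?id=" + URLEncoder.encode(id, StandardCharsets.UTF_8);
    }

    /** Sends a GET request and returns the HTTP status code, discarding the body. */
    public static int status(HttpClient client, String url)
            throws IOException, InterruptedException {
        HttpRequest request = HttpRequest.newBuilder(URI.create(url)).GET().build();
        return client.send(request, HttpResponse.BodyHandlers.discarding()).statusCode();
    }

    public static void main(String[] args) throws InterruptedException {
        String base = "http://localhost:8983/solr";
        HttpClient client = HttpClient.newHttpClient();
        String[] urls = { selectUrl(base, "SP2514N"), lukeUrl(base, "SP2514N") };
        for (String url : urls) {
            try {
                System.out.println(status(client, url) + " " + url);
            } catch (IOException e) {
                // Most likely the example server is not running.
                System.out.println("request failed: " + url);
            }
        }
    }
}
```

Against the setup described above, the select URL would be expected to return a success status and the Luke URL an error status, but the sketch only prints whatever the server returns.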

> NPE's occur in various situations related to FIeldable.stringValue when 
> upgrading from 1.3 to 1.4
> -
>
> Key: SOLR-1563
> URL: https://issues.apache.org/jira/browse/SOLR-1563
> Project: Solr
>  Issue Type: Bug
>Affects Versions: 1.4
>Reporter: Hoss Man
>Priority: Critical
>
> Multiple reports of NPEs when using Solr 1.4 - so far these all seem to 
> relate to getting a null returned by Fieldable.stringValue when it isn't 
> expected or accounted for.  Thread where this was initially discussed...
> http://old.nabble.com/NPE-when-trying-to-view-a-specific-document-via-Luke-to26330237.html




[jira] Commented: (SOLR-1563) NPE's occur in various situations related to FIeldable.stringValue when upgrading from 1.3 to 1.4

2009-11-13 Thread Hoss Man (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1290#action_1290
 ] 

Hoss Man commented on SOLR-1563:


These stack traces demonstrate similar NPEs when attempting to 
upgrade from 1.3 to 1.4 using the example schema provided in 1.3...

steps to reproduce...
* using 1.3...
*# cd example && java -jar start.jar
*# cd example/exampledocs && java -jar post.jar *.xml
*# note these urls work and return meaningful data...
*#* http://localhost:8983/solr/select/?q=SP2514N
*#* http://localhost:8983/solr/admin/luke?id=SP2514N
*# shutdown the 1.3 port
* using solr 1.4, point it at your 1.3 example solr home dir...
*# cd example && java -Dsolr.solr.home=../../1.3_solr/example/solr/ -jar 
start.jar
*# note that solr starts cleanly with no errors
*# verify that this URL displays the solr home you expect, in the 1.3 example 
directory...
*#* http://localhost:8983/solr/admin/
*# note that these URLs fail with the errors listed below...
*#* http://localhost:8983/solr/select/?q=SP2514N
*#* http://localhost:8983/solr/admin/luke?id=SP2514N

{noformat}
http://localhost:8983/solr/select/?q=SP2514N

java.lang.NullPointerException
    at org.apache.solr.schema.SortableIntField.write(SortableIntField.java:72)
    at org.apache.solr.schema.SchemaField.write(SchemaField.java:108)
    at org.apache.solr.request.XMLWriter.writeDoc(XMLWriter.java:311)
    at org.apache.solr.request.XMLWriter$3.writeDocs(XMLWriter.java:483)
    at org.apache.solr.request.XMLWriter.writeDocuments(XMLWriter.java:420)
    at org.apache.solr.request.XMLWriter.writeDocList(XMLWriter.java:457)
    at org.apache.solr.request.XMLWriter.writeVal(XMLWriter.java:520)
    at org.apache.solr.request.XMLWriter.writeResponse(XMLWriter.java:130)
    at org.apache.solr.request.XMLResponseWriter.write(XMLResponseWriter.java:34)
    at org.apache.solr.servlet.SolrDispatchFilter.writeResponse(SolrDispatchFilter.java:325)
{noformat}

{noformat}
http://localhost:8983/solr/admin/luke?id=SP2514N

java.lang.NullPointerException
    at org.apache.solr.util.NumberUtils.SortableStr2int(NumberUtils.java:127)
    at org.apache.solr.util.NumberUtils.SortableStr2float(NumberUtils.java:83)
    at org.apache.solr.util.NumberUtils.SortableStr2floatStr(NumberUtils.java:89)
    at org.apache.solr.schema.SortableFloatField.indexedToReadable(SortableFloatField.java:62)
    at org.apache.solr.schema.SortableFloatField.toExternal(SortableFloatField.java:53)
    at org.apache.solr.handler.admin.LukeRequestHandler.getDocumentFieldsInfo(LukeRequestHandler.java:245)
    at org.apache.solr.handler.admin.LukeRequestHandler.handleRequestBody(LukeRequestHandler.java:124)
    at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
    at org.apache.solr.core.SolrCore.execute(SolrCore.java:1316)
{noformat}





> NPE's occur in various situations related to FIeldable.stringValue when 
> upgrading from 1.3 to 1.4
> -
>
> Key: SOLR-1563
> URL: https://issues.apache.org/jira/browse/SOLR-1563
> Project: Solr
>  Issue Type: Bug
>Affects Versions: 1.4
>Reporter: Hoss Man
>Priority: Critical
>
> Multiple reports of NPEs when using Solr 1.4 - so far these all seem to 
> relate to getting a null returned by Fieldable.stringValue when it isn't 
> expected or accounted for.  Thread where this was initially discussed...
> http://old.nabble.com/NPE-when-trying-to-view-a-specific-document-via-Luke-to26330237.html




[jira] Commented: (SOLR-1563) NPE's occur in various situations related to FIeldable.stringValue when upgrading from 1.3 to 1.4

2009-11-13 Thread Hoss Man (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1292#action_1292
 ] 

Hoss Man commented on SOLR-1563:


In all of the stack traces listed above, the NPE seems to be originating from 
an attempt to dereference a string value that was returned by 
Fieldable.stringValue higher up in the call stack.

Given the number of different code paths involved (some of which haven't 
changed from Solr 1.3 to Solr 1.4) I can't help but think something changed 
deep in lucene to cause null to be returned in cases where it wasn't returned 
in older versions of Lucene.  So far i haven't been able to figure out where.
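
To make the failure mode concrete, here is a small Java sketch of the defensive pattern a caller could use when a field's string value may be null. It is not a fix for this issue, whose root cause is still unknown at this point in the thread; `FieldLike` is a hypothetical stand-in for Lucene's Fieldable, reduced to the one method discussed here, and `readableOr` is an invented helper name.

```java
public class NullSafeFieldValue {
    /** Hypothetical stand-in for Lucene's Fieldable; only stringValue() is modelled. */
    public interface FieldLike {
        String stringValue();
    }

    /**
     * Returns the field's string value, or the fallback when the field
     * itself or its string value is null.
     */
    public static String readableOr(FieldLike field, String fallback) {
        if (field == null) {
            return fallback;
        }
        String value = field.stringValue();
        return value == null ? fallback : value;
    }

    public static void main(String[] args) {
        FieldLike present = () -> "SP2514N";
        FieldLike missing = () -> null;
        System.out.println(readableOr(present, "(none)"));
        System.out.println(readableOr(missing, "(none)"));
    }
}
```

A guard like this only hides the symptom at one call site; it does not explain why null is being returned where older versions returned a value.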

> NPE's occur in various situations related to FIeldable.stringValue when 
> upgrading from 1.3 to 1.4
> -
>
> Key: SOLR-1563
> URL: https://issues.apache.org/jira/browse/SOLR-1563
> Project: Solr
>  Issue Type: Bug
>Affects Versions: 1.4
>Reporter: Hoss Man
>Priority: Critical
>
> Multiple reports of NPEs when using Solr 1.4 - so far these all seem to 
> relate to getting a null returned by Fieldable.stringValue when it isn't 
> expected or accounted for.  Thread where this was initially discussed...
> http://old.nabble.com/NPE-when-trying-to-view-a-specific-document-via-Luke-to26330237.html
