[jira] Updated: (LUCENE-2657) Replace Maven POM templates with full POMs, and change documentation accordingly

2011-01-03 Thread Steven Rowe (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2657?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steven Rowe updated LUCENE-2657: Attachment: LUCENE-2657.patch All tests pass again with this patch. Solr test resource structural

[jira] Commented: (LUCENE-2214) Remove deprecated StemExclusionSet setters in contrib/analyzers

2011-01-03 Thread Simon Willnauer (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12976688#action_12976688 ] Simon Willnauer commented on LUCENE-2214: - This seems to be invalid since

[jira] Resolved: (LUCENE-2612) Add fetch-javacc task to common-build.xml

2011-01-03 Thread Simon Willnauer (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2612?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Simon Willnauer resolved LUCENE-2612. - Resolution: Not A Problem doesn't seem to be worth it... Add fetch-javacc task to

[jira] Commented: (SOLR-1942) Ability to select codec per field

2011-01-03 Thread Simon Willnauer (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-1942?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12976691#action_12976691 ] Simon Willnauer commented on SOLR-1942: --- bq. updated to trunk - if somebody has time a

[jira] Resolved: (SOLR-2031) QueryComponent's default query parser should be configurable from solrconfig.xml

2011-01-03 Thread Simon Willnauer (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-2031?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Simon Willnauer resolved SOLR-2031. --- Resolution: Not A Problem after all this doesn't seem to be really needed QueryComponent's

[jira] Resolved: (LUCENE-1747) Contrib/Spatial needs code cleanup before release

2011-01-03 Thread Simon Willnauer (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1747?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Simon Willnauer resolved LUCENE-1747. - Resolution: Won't Fix I think this is outdated and spatial will rather go away than

[jira] Created: (LUCENE-2843) Add variable-gap terms index impl.

2011-01-03 Thread Michael McCandless (JIRA)
Add variable-gap terms index impl. -- Key: LUCENE-2843 URL: https://issues.apache.org/jira/browse/LUCENE-2843 Project: Lucene - Java Issue Type: Improvement Components: Index Reporter:

[jira] Commented: (LUCENE-2101) Default Stopwords should use specific Version in CharArraySet constructor

2011-01-03 Thread Simon Willnauer (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12976696#action_12976696 ] Simon Willnauer commented on LUCENE-2101: - I think we can simply move that to

[jira] Updated: (LUCENE-2843) Add variable-gap terms index impl.

2011-01-03 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael McCandless updated LUCENE-2843: --- Attachment: LUCENE-2843.patch Attached patch. Still some nocommits but I think

[jira] Commented: (LUCENE-2836) FieldCache rewrite method for MultiTermQueries

2011-01-03 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12976707#action_12976707 ] Michael McCandless commented on LUCENE-2836: This is a great speedup for the

[jira] Commented: (LUCENE-2836) FieldCache rewrite method for MultiTermQueries

2011-01-03 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12976729#action_12976729 ] Robert Muir commented on LUCENE-2836: - OK, I'll work on getting it into contrib. I

[jira] Commented: (LUCENE-1812) Static index pruning by in-document term frequency (Carmel pruning)

2011-01-03 Thread Andrzej Bialecki (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12976752#action_12976752 ] Andrzej Bialecki commented on LUCENE-1812: --- Doron, feel free to work on this -

Re: [jira] Commented: (SOLR-2218) Performance of start= and rows= parameters are exponentially slow with large data sets

2011-01-03 Thread Yonik Seeley
On Thu, Nov 11, 2010 at 3:22 PM, Jan Høydahl / Cominvent jan@cominvent.com wrote: The problem with large start is probably worse when sharding is involved. Anyone know how the shard component goes about fetching start=100&rows=10 from say 10 shards? Does it have to merge sorted lists

[jira] Updated: (SOLR-2129) Provide a Solr module for dynamic metadata extraction/indexing with Apache UIMA

2011-01-03 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-2129?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir updated SOLR-2129: -- Attachment: SOLR-2129.patch patch synced to trunk. I also adjusted some minor things: doesn't rely on

[jira] Created: (LUCENE-2844) benchmark geospatial performance based on geonames.org

2011-01-03 Thread David Smiley (JIRA)
benchmark geospatial performance based on geonames.org -- Key: LUCENE-2844 URL: https://issues.apache.org/jira/browse/LUCENE-2844 Project: Lucene - Java Issue Type: New Feature

[jira] Updated: (LUCENE-2844) benchmark geospatial performance based on geonames.org

2011-01-03 Thread David Smiley (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2844?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Smiley updated LUCENE-2844: - Attachment: benchmark-geo.patch benchmark geospatial performance based on geonames.org

Re: Geospatial search in Lucene/Solr

2011-01-03 Thread David Smiley (@MITRE.org)
As a follow-up to this thread, I've contributed my geospatial benchmark performance code here: https://issues.apache.org/jira/browse/LUCENE-2844 benchmark geospatial performance based on geonames.org - Author: https://www.packtpub.com/solr-1-4-enterprise-search-server/book

[jira] Commented: (SOLR-2155) Geospatial search using geohash prefixes

2011-01-03 Thread David Smiley (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-2155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12976820#action_12976820 ] David Smiley commented on SOLR-2155: For evaluating the performance of geospatial

[jira] Commented: (LUCENE-2844) benchmark geospatial performance based on geonames.org

2011-01-03 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12976823#action_12976823 ] Robert Muir commented on LUCENE-2844: - David, I'll first create an issue to propose

[jira] Created: (LUCENE-2845) move contrib/benchmark to modules/benchmark

2011-01-03 Thread Robert Muir (JIRA)
move contrib/benchmark to modules/benchmark --- Key: LUCENE-2845 URL: https://issues.apache.org/jira/browse/LUCENE-2845 Project: Lucene - Java Issue Type: Task Components: Build

[jira] Updated: (LUCENE-2845) move contrib/benchmark to modules/benchmark

2011-01-03 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2845?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir updated LUCENE-2845: Attachment: LUCENE-2845.patch patch, apply after doing 'svn move lucene/contrib/benchmark

Re: Testing our code for SolrCloud

2011-01-03 Thread Soheb Mahmood
Hello Mark! Apologies for the late reply! Do you mind creating a JIRA issue and attaching a patch? That is usually the best way to go about these discussions. We have done so here: https://issues.apache.org/jira/browse/SOLR-2287. Unfortunately, our test cases are incomplete at the moment, but

[jira] Commented: (SOLR-2129) Provide a Solr module for dynamic metadata extraction/indexing with Apache UIMA

2011-01-03 Thread Mark Miller (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-2129?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12976862#action_12976862 ] Mark Miller commented on SOLR-2129: --- bq. I have no problem committing this to contrib so

FYI: Javadoc update needed re: omitTf

2011-01-03 Thread Mark Miller
/** Expert: * * If set, omit term freq, positions and payloads from * postings for this field. * * <p><b>NOTE</b>: While this option reduces storage space * required in the index, it also means any query * requiring positional information, such as {@link * PhraseQuery} or {@link

Re: FYI: Javadoc update needed re: omitTf

2011-01-03 Thread Simon Willnauer
On Mon, Jan 3, 2011 at 7:49 PM, Mark Miller markrmil...@gmail.com wrote: /** Expert: * * If set, omit term freq, positions and payloads from * postings for this field. * * <p><b>NOTE</b>: While this option reduces storage space * required in the index, it also means any query * requiring

Re: FYI: Javadoc update needed re: omitTf

2011-01-03 Thread Robert Muir
On Mon, Jan 3, 2011 at 1:49 PM, Mark Miller markrmil...@gmail.com wrote: Perhaps should say, *may* silently fail? SpanTermQuery will explicitly throw an exception. Does PhraseQuery still silently fail these days? not in trunk, it's loud too.

[jira] Commented: (LUCENE-2845) move contrib/benchmark to modules/benchmark

2011-01-03 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12976886#action_12976886 ] Michael McCandless commented on LUCENE-2845: +1 move contrib/benchmark to

Fwd: [Solr Wiki] Update of FrontPage by DavidSmiley

2011-01-03 Thread Grant Ingersoll
Kind of a nit-pick, but I don't think this needs to be limited to just geographical search. We actually have clients who use the spatial filtering in non-lat/lon uses (and it was designed with such in mind, hence the support for n-dimensional distance calculations). Perhaps we should leave it

[jira] Commented: (LUCENE-2843) Add variable-gap terms index impl.

2011-01-03 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2843?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12976893#action_12976893 ] Michael McCandless commented on LUCENE-2843: bq. Just curious, how would the

Re: Geospatial search in Lucene/Solr

2011-01-03 Thread Grant Ingersoll
On Dec 28, 2010, at 1:02 PM, Robert Muir wrote: On Tue, Dec 28, 2010 at 11:59 AM, Smiley, David W. dsmi...@mitre.org wrote: Thanks for letting me know about this Rob. I think geonames is much simpler (and much less data) to work with than wikipedia. It's plain tab-delimited and I like

Re: FYI: Javadoc update needed re: omitTf

2011-01-03 Thread Yonik Seeley
On Mon, Jan 3, 2011 at 2:03 PM, Simon Willnauer simon.willna...@googlemail.com wrote: While we are on it, would it make sense to move omitTfAP into the Index enum. It always felt odd that you can omit norms using the enum but use a setter to omit TF Pos. I think the attempted move to type

[jira] Created: (LUCENE-2846) omitTF is viral, but omitNorms is anti-viral.

2011-01-03 Thread Robert Muir (JIRA)
omitTF is viral, but omitNorms is anti-viral. - Key: LUCENE-2846 URL: https://issues.apache.org/jira/browse/LUCENE-2846 Project: Lucene - Java Issue Type: Improvement Reporter: Robert

[jira] Commented: (SOLR-1782) stats.facet assumes FieldCache.StringIndex - fails horribly on multivalued fields

2011-01-03 Thread Johannes Goll (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-1782?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12976902#action_12976902 ] Johannes Goll commented on SOLR-1782: - Wojtek and Hoss Man - are you planning to release

Re: strange problem of PForDelta decoder

2011-01-03 Thread Michael McCandless
Here's the paper: http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.156.8091 I haven't read it yet... In general I don't like tying concurrency w/in a single search to index segments; I'd rather they be (relatively?) independent. EG an optimized index would then force single thread

[jira] Commented: (LUCENE-2840) Multi-Threading in IndexSearcher (after removal of MultiSearcher and ParallelMultiSearcher)

2011-01-03 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2840?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12976928#action_12976928 ] Michael McCandless commented on LUCENE-2840: bq. Using fewer threads

Re: FYI: Javadoc update needed re: omitTf

2011-01-03 Thread Simon Willnauer
On Mon, Jan 3, 2011 at 8:26 PM, Yonik Seeley yo...@lucidimagination.com wrote: On Mon, Jan 3, 2011 at 2:03 PM, Simon Willnauer simon.willna...@googlemail.com wrote: While we are on it, would it make sense to move omitTfAP into the Index enum. It always felt odd that you can omit norms using

[jira] Commented: (LUCENE-2846) omitTF is viral, but omitNorms is anti-viral.

2011-01-03 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12976947#action_12976947 ] Michael McCandless commented on LUCENE-2846: +1 for omitNorms to be viral.

[jira] Updated: (LUCENE-2846) omitTF is viral, but omitNorms is anti-viral.

2011-01-03 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2846?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir updated LUCENE-2846: Fix Version/s: 4.0 omitTF is viral, but omitNorms is anti-viral.

Lucene-Solr-tests-only-trunk - Build # 3350 - Still Failing

2011-01-03 Thread Apache Hudson Server
Build: https://hudson.apache.org/hudson/job/Lucene-Solr-tests-only-trunk/3350/ All tests passed Build Log (for compile errors): [...truncated 6586 lines...] clover.setup: clover.info: [echo] [echo] Clover not found. Code coverage reports disabled. [echo] clover:

[jira] Updated: (SOLR-2116) TikaEntityProcessor does not find parser by default

2011-01-03 Thread Martijn van Groningen (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-2116?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Martijn van Groningen updated SOLR-2116: Attachment: SOLR-2116.patch I've encountered the same issue on my Solr setup. After

[jira] Issue Comment Edited: (SOLR-2116) TikaEntityProcessor does not find parser by default

2011-01-03 Thread Martijn van Groningen (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-2116?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12976986#action_12976986 ] Martijn van Groningen edited comment on SOLR-2116 at 1/3/11 5:23 PM:

[jira] Commented: (SOLR-2116) TikaEntityProcessor does not find parser by default

2011-01-03 Thread Lance Norskog (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-2116?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12977067#action_12977067 ] Lance Norskog commented on SOLR-2116: - Great! I'll try it out on 3.x and trunk.

[jira] Commented: (SOLR-2116) TikaEntityProcessor does not find parser by default

2011-01-03 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-2116?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12977072#action_12977072 ] Chris A. Mattmann commented on SOLR-2116: - Hey Lance, bq. Speaking of Tika, have

Re: Geospatial search in Lucene/Solr

2011-01-03 Thread Lance Norskog
Great! I would suggest a new /modules for gis. It is worthwhile to have a /modules/gis/geonames for large-scale tests/demos/benchmarks, with ant scripts to download datasets and run the tests. About demos: there is a lot of GEO code out there: libraries (http://www.openmap.org/), data (geonames,

Re: strange problem of PForDelta decoder

2011-01-03 Thread Li Li
I agree with you that we should not tie concurrency w/in a single search to index segments. That solution is just a hack. Will Lucene 4 support multithreaded search for a single query? I haven't found any patch about this. 2011/1/4 Michael McCandless luc...@mikemccandless.com: Here's the paper: