Re: Moving towards Lucene 4.0

2011-05-19 Thread Simon Willnauer
On Thu, May 19, 2011 at 7:44 PM, Chris Hostetter
 wrote:
>
> : I think we should focus on everything that's *infrastructure* in 4.0, so
> : that we can develop additional features in subsequent 4.x releases. If we
> : end up releasing 4.0 just to discover many things will need to wait to 5.0,
> : it'll be a big loss.
>
> the catch with that approach (i'm speaking generally here, not with any of
> these particular lucene examples in mind) is that it's hard to know that
> the infrastructure really makes sense until you've built a bunch of stuff
> on it -- i think Josh Bloch has a paper where he says that you shouldn't
> publish an API abstraction until you've built at least 3 *real*
> (ie: not just toy or example) implementations of that API.

yeah big +1 - everybody should watch that tech talk... (
http://www.youtube.com/watch?v=aAb7hSCtvGw )
>
> it would be really easy to say "the infrastructure for X, Y, and Z is all
> in 4.0, features that leverage this infra will start coming in 4.1" and
> then discover on the way to 4.1 that we botched the APIs.
>
> what does this mean concretely for the specific "big ticket" changes that
> we've got on trunk? ... i dunno, just my word of caution.
>
> : > we just started the discussion about Lucene 3.2 and releasing more
> : > often. Yet, I think we should also start planning for Lucene 4.0 soon.
> : > We have tons of stuff in trunk that people want to have and we can't
> : > just keep on talking about it - we need to push this out to our users.
>
> I agree, but i think the other approach we should take is to be more
> aggressive about reviewing things that would be good candidates for
> backporting.
>
> If we feel like some feature has a well defined API on trunk, and it's got
> good tests, and people have been using it and filing bugs and helping to
> make it better then we should consider it a candidate for backporting --
> if the merge itself looks like it would be a huge pain in the ass we don't
> *have* to backport, but we should at least look.

I agree, we should backport what we can, but we have to strike some balance
between the amount of work and the benefit. I mean, one big thing we could
port is DWPT; almost all the other features rely on the new flex API. So I am
not sure there is much else -- well, DocValues could actually be easy.

I still want to stress that we should not wait too long for 4.0!

simon
>
> That may not help for any of the "big ticket" infra changes discussed in
> this thread (where we know it really needs to wait for a major release)
> but it would definitely help with the "get features out to users faster"
> issue.
>
>
>
> -Hoss
>

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Issue Comment Edited] (SOLR-2371) Add a min() function query, upgrade max() function query to take two value sources

2011-05-19 Thread Bill Bell (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13036457#comment-13036457
 ] 

Bill Bell edited comment on SOLR-2371 at 5/20/11 6:02 AM:
--

Can we add an alpha-sort min and max for non-numerical multiValued fields?

The use case is sorting As before Bs in the field.

If I have the following in the field:
aa


I add sort=min(field) asc

The sort should then use the "aa" value.

Thanks







  was (Author: billnbell):
Can we add an alpha-sort min and max for non-numerical multiValued fields?

The use case is sorting As before Bs in the field.

I do a fq=field:a* and then add sort=min(field) asc

Thanks






  
> Add a min() function query, upgrade max() function query to take two value 
> sources
> --
>
> Key: SOLR-2371
> URL: https://issues.apache.org/jira/browse/SOLR-2371
> Project: Solr
>  Issue Type: New Feature
>Reporter: Grant Ingersoll
>Priority: Trivial
> Fix For: 4.0
>
> Attachments: SOLR-2371.patch
>
>
> There doesn't appear to be a min() function.  Also, max() only allows a value 
> source and a constant b/c it is from before we had more flexible parsing.
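The two-argument min() being requested reduces, per document, to taking the smaller of two value sources. A minimal self-contained sketch of that computation in plain Java -- this is not Solr's actual ValueSource/FunctionValues API, just the per-document arithmetic it would perform:

```java
import java.util.function.IntToDoubleFunction;

public class MinFunctionSketch {
    // A "value source" is modeled here as doc -> value; Solr's real API wraps
    // this in ValueSource/FunctionValues (with searcher weighting), but the
    // per-document arithmetic of a two-argument min() is the same.
    static IntToDoubleFunction min(IntToDoubleFunction a, IntToDoubleFunction b) {
        return doc -> Math.min(a.applyAsDouble(doc), b.applyAsDouble(doc));
    }

    public static void main(String[] args) {
        // Hypothetical per-document values for two fields.
        IntToDoubleFunction popularity = doc -> new double[]{3, 9, 4}[doc];
        IntToDoubleFunction price      = doc -> new double[]{7, 2, 4}[doc];
        IntToDoubleFunction m = min(popularity, price);
        for (int doc = 0; doc < 3; doc++) {
            System.out.println(m.applyAsDouble(doc)); // 3.0, 2.0, 4.0
        }
    }
}
```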

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (SOLR-2155) Geospatial search using geohash prefixes

2011-05-19 Thread Bill Bell (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13036689#comment-13036689
 ] 

Bill Bell commented on SOLR-2155:
-

Why are we abandoning this? I thought it was a good enhancement. I need this 
feature to be committed so that I can do multiple points per row.

Could we mark it as experimental?


> Geospatial search using geohash prefixes
> 
>
> Key: SOLR-2155
> URL: https://issues.apache.org/jira/browse/SOLR-2155
> Project: Solr
>  Issue Type: Improvement
>Reporter: David Smiley
>Assignee: Grant Ingersoll
> Attachments: GeoHashPrefixFilter.patch, GeoHashPrefixFilter.patch, 
> GeoHashPrefixFilter.patch, 
> SOLR-2155_GeoHashPrefixFilter_with_sorting_no_poly.patch, SOLR.2155.p3.patch, 
> SOLR.2155.p3tests.patch
>
>
> There currently isn't a solution in Solr for doing geospatial filtering on 
> documents that have a variable number of points.  This scenario occurs when 
> there is location extraction (i.e. via a "gazetteer") occurring on free text.  
> None, one, or many geospatial locations might be extracted from any given 
> document and users want to limit their search results to those occurring in a 
> user-specified area.
> I've implemented this by furthering the GeoHash based work in Lucene/Solr 
> with a geohash prefix based filter.  A geohash refers to a lat-lon box on the 
> earth.  Each successive character added further subdivides the box into a 4x8 
> (or 8x4 depending on the even/odd length of the geohash) grid.  The first 
> step in this scheme is figuring out which geohash grid squares cover the 
> user's search query.  I've added various extra methods to GeoHashUtils (and 
> added tests) to assist in this purpose.  The next step is an actual Lucene 
> Filter, GeoHashPrefixFilter, that uses these geohash prefixes in 
> TermsEnum.seek() to skip to relevant grid squares in the index.  Once a 
> matching geohash grid is found, the points therein are compared against the 
> user's query to see if it matches.  I created an abstraction GeoShape 
> extended by subclasses named PointDistance... and CartesianBox to support 
> different queried shapes so that the filter need not care about these details.
> This work was presented at LuceneRevolution in Boston on October 8th.
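To make the prefix idea above concrete, here is a minimal self-contained geohash encoder (a sketch, not Lucene's GeoHashUtils): each base-32 character contributes five alternating longitude/latitude bisections, which is exactly why a shared prefix implies a shared lat-lon box and why TermsEnum.seek() on prefixes can skip to relevant grid squares.

```java
public class GeohashSketch {
    private static final String BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz";

    // Encode a lat/lon point to a geohash of the given length. Each character
    // packs 5 bits; bits alternate between halving the longitude range and
    // halving the latitude range, so longer hashes nest inside shorter ones.
    public static String encode(double lat, double lon, int precision) {
        double latMin = -90, latMax = 90, lonMin = -180, lonMax = 180;
        StringBuilder sb = new StringBuilder();
        boolean evenBit = true; // even bits refine longitude, odd bits latitude
        int bit = 0, ch = 0;
        while (sb.length() < precision) {
            if (evenBit) {
                double mid = (lonMin + lonMax) / 2;
                if (lon >= mid) { ch = (ch << 1) | 1; lonMin = mid; }
                else            { ch = ch << 1;       lonMax = mid; }
            } else {
                double mid = (latMin + latMax) / 2;
                if (lat >= mid) { ch = (ch << 1) | 1; latMin = mid; }
                else            { ch = ch << 1;       latMax = mid; }
            }
            evenBit = !evenBit;
            if (++bit == 5) { sb.append(BASE32.charAt(ch)); bit = 0; ch = 0; }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Well-known reference point; note the 5-char hash is a prefix of the 11-char one.
        System.out.println(encode(57.64911, 10.40744, 11)); // u4pruydqqvj
        System.out.println(encode(57.64911, 10.40744, 5));  // u4pru
    }
}
```

The prefix-nesting property shown in main is what the filter exploits: all points inside a grid square share that square's geohash as a term prefix.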




[JENKINS] Lucene-Solr-tests-only-docvalues-branch - Build # 1148 - Still Failing

2011-05-19 Thread Apache Jenkins Server
Build: 
https://builds.apache.org/hudson/job/Lucene-Solr-tests-only-docvalues-branch/1148/

No tests ran.

Build Log (for compile errors):
[...truncated 27 lines...]
+ TEST_LINE_DOCS_FILE=/home/hudson/lucene-data/enwiki.random.lines.txt.gz
+ TEST_JVM_ARGS='-XX:+HeapDumpOnOutOfMemoryError 
-XX:HeapDumpPath=/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-docvalues-branch/heapdumps/'
+ set +x
Checking for files containing nocommit:
./lucene/src/java/org/apache/lucene/index/values/DocValues.java
+ mkdir -p 
/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-docvalues-branch/heapdumps
+ rm -rf 
/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-docvalues-branch/heapdumps/README.txt
+ echo 'This directory contains heap dumps that may be generated by test runs 
when OOM occurred.'
+ cd 
/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-docvalues-branch/checkout
+ JAVA_HOME=/home/hudson/tools/java/latest1.5 
/home/hudson/tools/ant/latest1.7/bin/ant clean
Buildfile: build.xml

clean:

clean:
   [delete] Deleting directory 
/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-docvalues-branch/checkout/lucene/build

clean:

clean:
 [echo] Building analyzers-common...

clean:
 [echo] Building analyzers-icu...

clean:
 [echo] Building analyzers-phonetic...

clean:
 [echo] Building analyzers-smartcn...

clean:
 [echo] Building analyzers-stempel...

clean:
 [echo] Building benchmark...

clean:
 [echo] Building grouping...

clean:

clean-contrib:

clean:

clean:

clean:

clean:

clean:

clean:

BUILD SUCCESSFUL
Total time: 1 second
+ cd 
/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-docvalues-branch/checkout/lucene
+ JAVA_HOME=/home/hudson/tools/java/latest1.5 
/home/hudson/tools/ant/latest1.7/bin/ant compile compile-test build-contrib
Buildfile: build.xml

jflex-uptodate-check:

jflex-notice:

javacc-uptodate-check:

javacc-notice:

init:

clover.setup:

clover.info:
 [echo] 
 [echo]   Clover not found. Code coverage reports disabled.
 [echo] 

clover:

common.compile-core:
[mkdir] Created dir: 
/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-docvalues-branch/checkout/lucene/build/classes/java
[javac] Compiling 536 source files to 
/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-docvalues-branch/checkout/lucene/build/classes/java
[javac] 
/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-docvalues-branch/checkout/lucene/src/java/org/apache/lucene/util/Version.java:80:
 warning: [dep-ann] deprecated name isnt annotated with @Deprecated
[javac]   public boolean onOrAfter(Version other) {
[javac]  ^
[javac] 
/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-docvalues-branch/checkout/lucene/src/java/org/apache/lucene/index/PerFieldCodecWrapper.java:309:
 cannot find symbol
[javac] symbol  : constructor IOException(java.io.IOException)
[javac] location: class java.io.IOException
[javac] err = new IOException(ioe);
[javac]   ^
[javac] 
/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-docvalues-branch/checkout/lucene/src/java/org/apache/lucene/queryParser/CharStream.java:34:
 warning: [dep-ann] deprecated name isnt annotated with @Deprecated
[javac]   int getColumn();
[javac]   ^
[javac] 
/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-docvalues-branch/checkout/lucene/src/java/org/apache/lucene/queryParser/CharStream.java:41:
 warning: [dep-ann] deprecated name isnt annotated with @Deprecated
[javac]   int getLine();
[javac]   ^
[javac] Note: Some input files use or override a deprecated API.
[javac] Note: Recompile with -Xlint:deprecation for details.
[javac] 1 error
[...truncated 11 lines...]






Re: svn commit: r1125127 - /lucene/dev/trunk/lucene/CHANGES.txt

2011-05-19 Thread Doron Cohen
Thanks for catching/fixing this Robert!

I missed that for LUCENE-3068 - anything that gets back-ported to 3.x should
be in the 3.x-unreleased section; that way the upper part summarizes the
differences between trunk and 3.x. Neat!

Doron

On Fri, May 20, 2011 at 12:32 AM,  wrote:

> Author: rmuir
> Date: Thu May 19 21:32:15 2011
> New Revision: 1125127
>
> URL: http://svn.apache.org/viewvc?rev=1125127&view=rev
> Log:
> CHANGES cleanup
>
> Modified:
>lucene/dev/trunk/lucene/CHANGES.txt
>
> Modified: lucene/dev/trunk/lucene/CHANGES.txt
> URL:
> http://svn.apache.org/viewvc/lucene/dev/trunk/lucene/CHANGES.txt?rev=1125127&r1=1125126&r2=1125127&view=diff
>
> ==
> --- lucene/dev/trunk/lucene/CHANGES.txt (original)
> +++ lucene/dev/trunk/lucene/CHANGES.txt Thu May 19 21:32:15 2011
> @@ -444,27 +444,6 @@ Bug fixes
>   with more document deletions is requested before a reader with fewer
>   deletions, provided they share some segments. (yonik)
>
> -* LUCENE-2936: PhraseQuery score explanations were not correctly
> -  identifying matches vs non-matches.  (hossman)
> -
> -* LUCENE-2996: addIndexes(IndexReader) did not flush before adding the new
> -  indexes, causing existing deletions to be applied on the incoming
> indexes as
> -  well. (Shai Erera, Mike McCandless)
> -
> -* LUCENE-3068: sloppy phrase query failed to match valid documents when
> multiple
> -  query terms had same position in the query. (Doron Cohen)
> -
> -Test Cases
> -
> -* LUCENE-3002: added 'tests.iter.min' to control 'tests.iter' by allowing
> to
> -  stop iterating if at least 'tests.iter.min' ran and a failure occured.
> -  (Shai Erera, Chris Hostetter)
> -
> -Build
> -
> -* LUCENE-3006: Building javadocs will fail on warnings by default.
> -  Override with -Dfailonjavadocwarning=false (sarowe, gsingers)
> -
>  === Lucene 3.x (not yet released)
> ===
>
>  Changes in backwards compatibility policy
> @@ -564,9 +543,17 @@ Bug fixes
>   PhraseQuery as term with lower doc freq will also have less positions.
>   (Uwe Schindler, Robert Muir, Otis Gospodnetic)
>
> +* LUCENE-3068: sloppy phrase query failed to match valid documents when
> multiple
> +  query terms had same position in the query. (Doron Cohen)
> +
>  * LUCENE-3012: Lucene writes the header now for separate norm files
> (*.sNNN)
>   (Robert Muir)
>
> +Build
> +
> +* LUCENE-3006: Building javadocs will fail on warnings by default.
> +  Override with -Dfailonjavadocwarning=false (sarowe, gsingers)
> +
>  Test Cases
>
>  * LUCENE-3002: added 'tests.iter.min' to control 'tests.iter' by allowing
> to
>
>
>


[jira] [Commented] (LUCENE-2883) Consolidate Solr & Lucene FunctionQuery into modules

2011-05-19 Thread Chris Male (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2883?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13036641#comment-13036641
 ] 

Chris Male commented on LUCENE-2883:


{quote}
Is this desirable? IndexSearcher is pretty thin, I know, but is it cheap enough 
to create that the overhead is nominal? If it's faster than passing down the 
IndexSearcher, then maybe it's a good idea for anybody wanting an IndexSearcher 
to do this.
{quote}

Actually, one flaw in this is that SolrIndexSearcher has overridden some 
functionality in IndexSearcher, such as the SimilarityProvider, which is used 
by NormValueSource.

> Consolidate Solr  & Lucene FunctionQuery into modules
> -
>
> Key: LUCENE-2883
> URL: https://issues.apache.org/jira/browse/LUCENE-2883
> Project: Lucene - Java
>  Issue Type: Task
>  Components: core/search
>Affects Versions: 4.0
>Reporter: Simon Willnauer
>  Labels: gsoc2011, lucene-gsoc-11, mentor
> Fix For: 4.0
>
> Attachments: LUCENE-2883.patch
>
>
> Spin-off from the [dev list | 
> http://www.mail-archive.com/dev@lucene.apache.org/msg13261.html]  




[jira] [Commented] (LUCENE-2883) Consolidate Solr & Lucene FunctionQuery into modules

2011-05-19 Thread Chris Male (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2883?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13036637#comment-13036637
 ] 

Chris Male commented on LUCENE-2883:


I've opened SOLR-2533 to look at ways to standardise the ValueSource sort 
weighting API.

> Consolidate Solr  & Lucene FunctionQuery into modules
> -
>
> Key: LUCENE-2883
> URL: https://issues.apache.org/jira/browse/LUCENE-2883
> Project: Lucene - Java
>  Issue Type: Task
>  Components: core/search
>Affects Versions: 4.0
>Reporter: Simon Willnauer
>  Labels: gsoc2011, lucene-gsoc-11, mentor
> Fix For: 4.0
>
> Attachments: LUCENE-2883.patch
>
>
> Spin-off from the [dev list | 
> http://www.mail-archive.com/dev@lucene.apache.org/msg13261.html]  




[jira] [Created] (SOLR-2533) Improve API of ValueSource & FunctionQuery SortField weighting

2011-05-19 Thread Chris Male (JIRA)
Improve API of ValueSource & FunctionQuery SortField weighting
--

 Key: SOLR-2533
 URL: https://issues.apache.org/jira/browse/SOLR-2533
 Project: Solr
  Issue Type: Improvement
  Components: search
Reporter: Chris Male


Started from LUCENE-2883: Support for sorting by ValueSource and 
FunctionQueries is done through ValueSource#getSort and the 
ValueSourceSortField.  In order to support ValueSources containing other 
Queries, it's necessary to allow those Queries to be weighted by an 
IndexSearcher.  Currently this is handled by having ValueSourceSortField 
implement SolrSortField.  In Solr's SolrIndexSearcher, SortFields implementing 
SolrSortField are then weighted before the Sort is used.

Sorting by FunctionQuery and ValueSource is invaluable and will become 
available to all Lucene users through LUCENE-2883.  But to do so, we need 
to decouple this functionality from Solr and make it more standard.

Any and all thoughts about how to do this are appreciated.
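As a rough illustration of the pattern described above (stand-in names only, not Solr's real SolrSortField/SolrIndexSearcher classes): before building the Sort, the searcher walks the sort fields and swaps any "weightable" one for a copy bound to itself.

```java
import java.util.ArrayList;
import java.util.List;

public class SortWeightingSketch {
    // Stand-in for Lucene's SortField.
    static class SortFieldLike {
        final String name;
        SortFieldLike(String name) { this.name = name; }
    }

    // Stand-in for a ValueSourceSortField implementing SolrSortField:
    // a sort field that must see the searcher before it can be used.
    static class FunctionSortField extends SortFieldLike {
        FunctionSortField(String fn) { super(fn); }
        SortFieldLike weight(String searcherId) {
            // The real code returns a new SortField whose context holds the
            // IndexSearcher, so nested Queries can be weighted against it.
            return new SortFieldLike(name + "@" + searcherId);
        }
    }

    // What SolrIndexSearcher does before the Sort is used.
    static List<SortFieldLike> weightSort(List<SortFieldLike> fields, String searcherId) {
        List<SortFieldLike> out = new ArrayList<>();
        for (SortFieldLike f : fields) {
            out.add(f instanceof FunctionSortField
                    ? ((FunctionSortField) f).weight(searcherId) : f);
        }
        return out;
    }

    public static void main(String[] args) {
        List<SortFieldLike> in = new ArrayList<>();
        in.add(new SortFieldLike("score"));
        in.add(new FunctionSortField("min(a,b)"));
        for (SortFieldLike f : weightSort(in, "searcher1")) System.out.println(f.name);
    }
}
```

The coupling the issue wants to remove is exactly the instanceof check: only a searcher that knows about the marker interface performs the weighting step.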




[jira] [Commented] (LUCENE-2883) Consolidate Solr & Lucene FunctionQuery into modules

2011-05-19 Thread Chris Male (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2883?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13036633#comment-13036633
 ] 

Chris Male commented on LUCENE-2883:


Hey Yonik,

It's super to hear from you on this; it'll be a real help.

{quote}
Regarding weighting - function queries can contain normal queries, so anywhere 
a function query is used, it must be weighted first.
{quote}

Yup, I've come to understand that.  So the challenge is how to do this when a 
FunctionQuery is used for sorting rather than as a Query.  I'm going to open an 
issue to see if we can address this better, maybe by extending SortField or something.

{quote}
Sort instances are like Query instances, and for many reasons should not be 
bound to any particular searcher.
{quote}

Yeah that is true.  But ValueSource#getSort actually returns a SortField.  Does 
the same apply to SortField instances?  Also, ValueSource#weight(IndexSearcher) 
returns a new SortField as well, with a context containing the IndexSearcher.  
Consequently the new SortField is bound to that particular searcher.

> Consolidate Solr  & Lucene FunctionQuery into modules
> -
>
> Key: LUCENE-2883
> URL: https://issues.apache.org/jira/browse/LUCENE-2883
> Project: Lucene - Java
>  Issue Type: Task
>  Components: core/search
>Affects Versions: 4.0
>Reporter: Simon Willnauer
>  Labels: gsoc2011, lucene-gsoc-11, mentor
> Fix For: 4.0
>
> Attachments: LUCENE-2883.patch
>
>
> Spin-off from the [dev list | 
> http://www.mail-archive.com/dev@lucene.apache.org/msg13261.html]  




[jira] [Commented] (LUCENE-2883) Consolidate Solr & Lucene FunctionQuery into modules

2011-05-19 Thread Chris Male (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2883?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13036631#comment-13036631
 ] 

Chris Male commented on LUCENE-2883:


{quote}
So the goal here is to make the top-level searcher (IR) visible to the
FQ's getValues? I think this pre-dated the cutover to
AtomicReaderContext, which now provides the top reader? Maybe this
isn't needed anymore...?
{quote}

It's very intentional (as Yonik has pointed out).  It allows any Queries in the 
ValueSources to be weighted.

{quote}
Though QueryValueSource needs a searcher (but, seems to make one, from
the top reader, if it wasn't provided one).
{quote}

Is this desirable? IndexSearcher is pretty thin, I know, but is it cheap enough 
to create that the overhead is nominal? If it's faster than passing down the 
IndexSearcher, then maybe it's a good idea for anybody wanting an IndexSearcher 
to do this.

{quote}
Good question... maybe we can do this on a branch?
{quote}

Absolutely, can you create one?


> Consolidate Solr  & Lucene FunctionQuery into modules
> -
>
> Key: LUCENE-2883
> URL: https://issues.apache.org/jira/browse/LUCENE-2883
> Project: Lucene - Java
>  Issue Type: Task
>  Components: core/search
>Affects Versions: 4.0
>Reporter: Simon Willnauer
>  Labels: gsoc2011, lucene-gsoc-11, mentor
> Fix For: 4.0
>
> Attachments: LUCENE-2883.patch
>
>
> Spin-off from the [dev list | 
> http://www.mail-archive.com/dev@lucene.apache.org/msg13261.html]  




[JENKINS] Lucene-Solr-tests-only-docvalues-branch - Build # 1147 - Still Failing

2011-05-19 Thread Apache Jenkins Server
Build: 
https://builds.apache.org/hudson/job/Lucene-Solr-tests-only-docvalues-branch/1147/

No tests ran.

Build Log (for compile errors):
[...truncated 28 lines...]
+ TEST_LINE_DOCS_FILE=/home/hudson/lucene-data/enwiki.random.lines.txt.gz
+ TEST_JVM_ARGS='-XX:+HeapDumpOnOutOfMemoryError 
-XX:HeapDumpPath=/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-docvalues-branch/heapdumps/'
+ set +x
Checking for files containing nocommit:
./lucene/src/java/org/apache/lucene/index/values/DocValues.java
+ mkdir -p 
/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-docvalues-branch/heapdumps
+ rm -rf 
/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-docvalues-branch/heapdumps/README.txt
+ echo 'This directory contains heap dumps that may be generated by test runs 
when OOM occurred.'
+ cd 
/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-docvalues-branch/checkout
+ JAVA_HOME=/home/hudson/tools/java/latest1.5 
/home/hudson/tools/ant/latest1.7/bin/ant clean
Buildfile: build.xml

clean:

clean:
   [delete] Deleting directory 
/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-docvalues-branch/checkout/lucene/build

clean:

clean:
 [echo] Building analyzers-common...

clean:
 [echo] Building analyzers-icu...

clean:
 [echo] Building analyzers-phonetic...

clean:
 [echo] Building analyzers-smartcn...

clean:
 [echo] Building analyzers-stempel...

clean:
 [echo] Building benchmark...

clean:
 [echo] Building grouping...

clean:

clean-contrib:

clean:

clean:

clean:

clean:

clean:

clean:

BUILD SUCCESSFUL
Total time: 3 seconds
+ cd 
/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-docvalues-branch/checkout/lucene
+ JAVA_HOME=/home/hudson/tools/java/latest1.5 
/home/hudson/tools/ant/latest1.7/bin/ant compile compile-test build-contrib
Buildfile: build.xml

jflex-uptodate-check:

jflex-notice:

javacc-uptodate-check:

javacc-notice:

init:

clover.setup:

clover.info:
 [echo] 
 [echo]   Clover not found. Code coverage reports disabled.
 [echo] 

clover:

common.compile-core:
[mkdir] Created dir: 
/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-docvalues-branch/checkout/lucene/build/classes/java
[javac] Compiling 536 source files to 
/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-docvalues-branch/checkout/lucene/build/classes/java
[javac] 
/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-docvalues-branch/checkout/lucene/src/java/org/apache/lucene/util/Version.java:80:
 warning: [dep-ann] deprecated name isnt annotated with @Deprecated
[javac]   public boolean onOrAfter(Version other) {
[javac]  ^
[javac] 
/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-docvalues-branch/checkout/lucene/src/java/org/apache/lucene/index/PerFieldCodecWrapper.java:309:
 cannot find symbol
[javac] symbol  : constructor IOException(java.io.IOException)
[javac] location: class java.io.IOException
[javac] err = new IOException(ioe);
[javac]   ^
[javac] 
/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-docvalues-branch/checkout/lucene/src/java/org/apache/lucene/queryParser/CharStream.java:34:
 warning: [dep-ann] deprecated name isnt annotated with @Deprecated
[javac]   int getColumn();
[javac]   ^
[javac] 
/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-docvalues-branch/checkout/lucene/src/java/org/apache/lucene/queryParser/CharStream.java:41:
 warning: [dep-ann] deprecated name isnt annotated with @Deprecated
[javac]   int getLine();
[javac]   ^
[javac] Note: Some input files use or override a deprecated API.
[javac] Note: Recompile with -Xlint:deprecation for details.
[javac] 1 error
[...truncated 11 lines...]






[jira] [Created] (SOLR-2532) add hook to allow explicit calls to IndexWriter.deleteUnusedFiles()

2011-05-19 Thread Hoss Man (JIRA)
add hook to allow explicit calls to IndexWriter.deleteUnusedFiles()
---

 Key: SOLR-2532
 URL: https://issues.apache.org/jira/browse/SOLR-2532
 Project: Solr
  Issue Type: New Feature
Reporter: Hoss Man


It would be nice if we gave Solr Windows users some way to deal with the 
"unused files" problem explicitly.

This could either come in the form of a new automatic action in a background 
thread (when Solr knows it's opened a new reader, it can tell IndexWriter to 
explicitly clean up the old files) or perhaps as an explicit action (maybe an 
Event Listener that could be configured postCommit or newSearcher?)
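A sketch of the "explicit action" option, with stand-in types rather than Solr's actual event-listener API: once a new searcher is registered, the old reader's commit files are unreferenced, so the listener asks the writer to delete them (Lucene's real IndexWriter does expose deleteUnusedFiles()).

```java
public class DeleteUnusedFilesListener {
    // Stand-in for the relevant bit of Lucene's IndexWriter.
    interface WriterHandle { void deleteUnusedFiles(); }

    private final WriterHandle writer;
    private int invocations = 0;

    public DeleteUnusedFilesListener(WriterHandle writer) { this.writer = writer; }

    // Would be wired to Solr's newSearcher (or postCommit) event.
    public void newSearcher() {
        // The previous reader no longer pins the old commit's files, so the
        // writer can now remove them -- freeing Windows file locks promptly.
        writer.deleteUnusedFiles();
        invocations++;
    }

    public int invocations() { return invocations; }

    public static void main(String[] args) {
        int[] deleted = {0};
        DeleteUnusedFilesListener l = new DeleteUnusedFilesListener(() -> deleted[0]++);
        l.newSearcher();
        l.newSearcher();
        System.out.println(deleted[0]); // 2
    }
}
```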






[jira] [Resolved] (SOLR-2528) remove HtmlEncoder from example solrconfig.xml (or set it to default=false)

2011-05-19 Thread Koji Sekiguchi (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-2528?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Koji Sekiguchi resolved SOLR-2528.
--

Resolution: Fixed

trunk: Committed revision 1125150, 1125156(move change log entry).
3x: Committed revision 1125158.
3.1: Committed revision 1125161.

> remove HtmlEncoder from example solrconfig.xml (or set it to default=false)
> ---
>
> Key: SOLR-2528
> URL: https://issues.apache.org/jira/browse/SOLR-2528
> Project: Solr
>  Issue Type: Bug
>  Components: highlighter
>Affects Versions: 3.1
>Reporter: Koji Sekiguchi
>Assignee: Koji Sekiguchi
>Priority: Trivial
> Fix For: 3.1.1, 3.2, 4.0
>
> Attachments: SOLR-2528.patch
>
>
> After 3.1 was released, highlight snippets that include non-ASCII characters 
> are encoded to character references by HtmlEncoder if it is set in 
> solrconfig.xml. Because the Solr example config has it, quite a few users 
> were confused by the output.




[jira] [Commented] (LUCENE-3120) span query matches too many docs when two query terms are the same unless inOrder=true

2011-05-19 Thread Hoss Man (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13036540#comment-13036540
 ] 

Hoss Man commented on LUCENE-3120:
--

What we might want to consider is a new option on SpanNearQuery that would 
mandate that the spans not overlap.

Paul Elschot described the general form of this idea once as a numeric option 
to specify a minimum distance between the subspans (so the default, as 
implemented today, for inOrder==true would be minPositionDistance=1; and the 
default for inOrder==false would be minPositionDistance=0)
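A sketch of how such an option could unify the two cases (hypothetical, not an existing Lucene API): treat today's inOrder==true as minPositionDistance=1 and inOrder==false as minPositionDistance=0, so requesting a distance of 1 with any-order would reject the overlapping same-position match from LUCENE-3120.

```java
public class SpanDistanceSketch {
    // pos1/pos2 are the start positions of two matched subspans.
    // Hypothetical generalization from the comment above: require the
    // positional gap between the subspans to be at least minPositionDistance.
    static boolean accept(int pos1, int pos2, boolean inOrder, int minPositionDistance) {
        if (inOrder) {
            return pos2 - pos1 >= minPositionDistance;        // today's behavior: effectively 1
        }
        return Math.abs(pos2 - pos1) >= minPositionDistance;  // today's behavior: effectively 0
    }

    public static void main(String[] args) {
        // Two subspans landing on the same term position (the LUCENE-3120 case):
        System.out.println(accept(3, 3, true, 1));  // false: in-order already rejects it
        System.out.println(accept(3, 3, false, 0)); // true:  any-order matches today
        System.out.println(accept(3, 3, false, 1)); // false: the proposed option rejects it
    }
}
```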




> span query matches too many docs when two query terms are the same unless 
> inOrder=true
> --
>
> Key: LUCENE-3120
> URL: https://issues.apache.org/jira/browse/LUCENE-3120
> Project: Lucene - Java
>  Issue Type: Bug
>  Components: core/search
>Reporter: Doron Cohen
>Priority: Minor
> Fix For: 3.2, 4.0
>
> Attachments: LUCENE-3120.patch, LUCENE-3120.patch
>
>
> spinoff of user list discussion - [SpanNearQuery - inOrder 
> parameter|http://markmail.org/message/i4cstlwgjmlcfwlc].
> With 3 documents:
> *  "a b x c d"
> *  "a b b d"
> *  "a b x b y d"
> Here are a few queries (the number in parenthesis indicates expected #hits):
> These ones work *as expected*:
> * (1)  in-order, slop=0, "b", "x", "b"
> * (1)  in-order, slop=0, "b", "b"
> * (2)  in-order, slop=1, "b", "b"
> These ones match *too many* hits:
> * (1)  any-order, slop=0, "b", "x", "b"
> * (1)  any-order, slop=1, "b", "x", "b"
> * (1)  any-order, slop=2, "b", "x", "b"
> * (1)  any-order, slop=3, "b", "x", "b"
> These ones match *too many* hits as well:
> * (1)  any-order, slop=0, "b", "b"
> * (2)  any-order, slop=1, "b", "b"
> Each of the above passes when using a phrase query (applying the slop, no 
> in-order indication in phrase query).
> This seems related to a known overlapping spans issue - [non-overlapping Span 
> queries|http://markmail.org/message/7jxn5eysjagjwlon] - as indicated by Hoss, 
> so we might decide to close this bug after all, but I would like to at least 
> have the junit that exposes the behavior in JIRA.




[jira] [Commented] (LUCENE-3120) span query matches too many docs when two query terms are the same unless inOrder=true

2011-05-19 Thread Hoss Man (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13036538#comment-13036538
 ] 

Hoss Man commented on LUCENE-3120:
--

comment i made on the mailing list regarding this topic...

{quote}
the crux of the issue (as i recall) is that there is really no conceptual 
reason why a query for "'john' near 'john', in any order, with slop of Z" 
shouldn't match a doc that contains only one instance of "john" ... the first 
SpanTermQuery says "i found a match at position X", the second SpanTermQuery 
says "i found a match at position Y", and the SpanNearQuery says "the difference 
between X and Y is less than Z", therefore i have a match. (The SpanNearQuery 
can't fail just because X and Y are the same -- they might be two distinct term 
instances, with different payloads perhaps, that just happen to have the same 
position.)

However: the inOrder==true case works because the SpanNearQuery enforces that "X 
must be less than Y", so the same term can't ever match twice. 
{quote}

> span query matches too many docs when two query terms are the same unless 
> inOrder=true
> --
>
> Key: LUCENE-3120
> URL: https://issues.apache.org/jira/browse/LUCENE-3120
> Project: Lucene - Java
>  Issue Type: Bug
>  Components: core/search
>Reporter: Doron Cohen
>Priority: Minor
> Fix For: 3.2, 4.0
>
> Attachments: LUCENE-3120.patch, LUCENE-3120.patch
>
>
> spinoff of user list discussion - [SpanNearQuery - inOrder 
> parameter|http://markmail.org/message/i4cstlwgjmlcfwlc].
> With 3 documents:
> *  "a b x c d"
> *  "a b b d"
> *  "a b x b y d"
> Here are a few queries (the number in parenthesis indicates expected #hits):
> These ones work *as expected*:
> * (1)  in-order, slop=0, "b", "x", "b"
> * (1)  in-order, slop=0, "b", "b"
> * (2)  in-order, slop=1, "b", "b"
> These ones match *too many* hits:
> * (1)  any-order, slop=0, "b", "x", "b"
> * (1)  any-order, slop=1, "b", "x", "b"
> * (1)  any-order, slop=2, "b", "x", "b"
> * (1)  any-order, slop=3, "b", "x", "b"
> These ones match *too many* hits as well:
> * (1)  any-order, slop=0, "b", "b"
> * (2)  any-order, slop=1, "b", "b"
> Each of the above passes when using a phrase query (applying the slop, no 
> in-order indication in phrase query).
> This seems related to a known overlapping spans issue - [non-overlapping Span 
> queries|http://markmail.org/message/7jxn5eysjagjwlon] - as indicated by Hoss, 
> so we might decide to close this bug after all, but I would like to at least 
> have the junit that exposes the behavior in JIRA.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: SpanNearQuery - inOrder parameter

2011-05-19 Thread Chris Hostetter
: 
: I attach a junit test which shows strange behaviour of the inOrder
: parameter on the SpanNearQuery constructor, using Lucene 2.9.4.

Déjà vu? ... didn't we just talk about this on java-user recently?

Here was my comment...

http://markmail.org/search/?q=SpanNearQuery+inOrder#query:SpanNearQuery%20inOrder+page:1+mid:zhhlhojw55bf43r6+state:results

...i guess this is actually the same thread now on the dev list?

As i said before: i don't really see how this is a bug.  when you specify 
inOrder=true, you are requiring that the positions of the subspans must 
be "in order", so position_span_1 < position_span_2.  when inOrder=false 
there is nothing stopping position_span_1 == position_span_2 ... the fact 
that span_1 == span_2 doesn't matter.


-Hoss

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: SolrCloud Distributed Indexing

2011-05-19 Thread Yonik Seeley
2011/5/19 Yury Kats :
> I'm curious to know whether Distributed Indexing is on the roadmap for 4.0.
> Looking at JIRA and past dev list discussions, there seem to have been
> efforts around DI in the past, but no recent activity.
>
> Is this being worked on, postponed or deemed not important enough all 
> together?

Yes, some people have started some work on it, and it is very
important, and I plan to start tackling it myself after Lucene
Revolution.

-Yonik
http://www.lucenerevolution.org -- Lucene/Solr User Conference, May
25-26, San Francisco

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-2155) Geospatial search using geohash prefixes

2011-05-19 Thread Lance Norskog (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13036505#comment-13036505
 ] 

Lance Norskog commented on SOLR-2155:
-

Where is "lucene-spatial-playground"?

> Geospatial search using geohash prefixes
> 
>
> Key: SOLR-2155
> URL: https://issues.apache.org/jira/browse/SOLR-2155
> Project: Solr
>  Issue Type: Improvement
>Reporter: David Smiley
>Assignee: Grant Ingersoll
> Attachments: GeoHashPrefixFilter.patch, GeoHashPrefixFilter.patch, 
> GeoHashPrefixFilter.patch, 
> SOLR-2155_GeoHashPrefixFilter_with_sorting_no_poly.patch, SOLR.2155.p3.patch, 
> SOLR.2155.p3tests.patch
>
>
> There currently isn't a solution in Solr for doing geospatial filtering on 
> documents that have a variable number of points.  This scenario occurs when 
> there is location extraction (e.g. via a "gazetteer") occurring on free text.  
> None, one, or many geospatial locations might be extracted from any given 
> document, and users want to limit their search results to those occurring in a 
> user-specified area.
> I've implemented this by furthering the GeoHash based work in Lucene/Solr 
> with a geohash prefix based filter.  A geohash refers to a lat-lon box on the 
> earth.  Each successive character added further subdivides the box into a 4x8 
> (or 8x4 depending on the even/odd length of the geohash) grid.  The first 
> step in this scheme is figuring out which geohash grid squares cover the 
> user's search query.  I've added various extra methods to GeoHashUtils (and 
> added tests) to assist in this purpose.  The next step is an actual Lucene 
> Filter, GeoHashPrefixFilter, that uses these geohash prefixes in 
> TermsEnum.seek() to skip to relevant grid squares in the index.  Once a 
> matching geohash grid is found, the points therein are compared against the 
> user's query to see if it matches.  I created an abstraction GeoShape 
> extended by subclasses named PointDistance... and CartesianBox to support 
> different queried shapes so that the filter need not care about these details.
> This work was presented at LuceneRevolution in Boston on October 8th.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
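The geohash prefix scheme described in SOLR-2155 above can be illustrated with a minimal encoder sketch. This is not the GeoHashUtils code from the patch; it assumes the standard geohash base32 alphabet and longitude-first bit interleaving. The property the filter relies on is that a shorter geohash is a prefix of every longer geohash inside its cell, so seeking the TermsEnum to a prefix skips straight to the relevant grid square:

```python
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"  # standard geohash alphabet

def geohash(lat, lon, length=6):
    """Encode a point; each extra character subdivides the cell 4x8 or 8x4."""
    lat_lo, lat_hi = -90.0, 90.0
    lon_lo, lon_hi = -180.0, 180.0
    code = []
    bits, ch, even = 0, 0, True  # even bit positions refine longitude
    while len(code) < length:
        if even:
            mid = (lon_lo + lon_hi) / 2
            if lon >= mid:
                ch, lon_lo = ch * 2 + 1, mid
            else:
                ch, lon_hi = ch * 2, mid
        else:
            mid = (lat_lo + lat_hi) / 2
            if lat >= mid:
                ch, lat_lo = ch * 2 + 1, mid
            else:
                ch, lat_hi = ch * 2, mid
        even = not even
        bits += 1
        if bits == 5:  # each character encodes 5 bits
            code.append(BASE32[ch])
            bits, ch = 0, 0
    return "".join(code)

print(geohash(57.64911, 10.40744))     # → u4pruy (a commonly cited example point)
print(geohash(57.64911, 10.40744, 4))  # → u4pr: the coarser enclosing cell, a prefix
```

A covering of the user's query shape by such cells, plus per-point verification inside matched cells, is then what GeoHashPrefixFilter performs against the index.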



[jira] [Commented] (LUCENE-2883) Consolidate Solr & Lucene FunctionQuery into modules

2011-05-19 Thread Yonik Seeley (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2883?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13036501#comment-13036501
 ] 

Yonik Seeley commented on LUCENE-2883:
--

Regarding weighting - function queries can contain normal queries, so anywhere 
a function query is used, it must be weighted first.

bq. When ValueSource#getSort is called (which is only in 1 place really), we 
can pass in the IndexSearcher, meaning the SortField can be 'weighted' then and 
there.

Sort instances are like Query instances, and for many reasons should not be 
bound to any particular searcher.

> Consolidate Solr  & Lucene FunctionQuery into modules
> -
>
> Key: LUCENE-2883
> URL: https://issues.apache.org/jira/browse/LUCENE-2883
> Project: Lucene - Java
>  Issue Type: Task
>  Components: core/search
>Affects Versions: 4.0
>Reporter: Simon Willnauer
>  Labels: gsoc2011, lucene-gsoc-11, mentor
> Fix For: 4.0
>
> Attachments: LUCENE-2883.patch
>
>
> Spin-off from the [dev list | 
> http://www.mail-archive.com/dev@lucene.apache.org/msg13261.html]  

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



SolrCloud Distributed Indexing

2011-05-19 Thread Yury Kats
Hi,

I'm curious to know whether Distributed Indexing is on the roadmap for 4.0.
Looking at JIRA and past dev list discussions, there seem to have been
efforts around DI in the past, but no recent activity.

Is this being worked on, postponed or deemed not important enough all together?

Thanks,
Yury

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-3126) IndexWriter.addIndexes can make any incoming segment into CFS if it isn't already

2011-05-19 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13036481#comment-13036481
 ] 

Michael McCandless commented on LUCENE-3126:


Cool idea!

> IndexWriter.addIndexes can make any incoming segment into CFS if it isn't 
> already
> -
>
> Key: LUCENE-3126
> URL: https://issues.apache.org/jira/browse/LUCENE-3126
> Project: Lucene - Java
>  Issue Type: Improvement
>  Components: core/index
>Reporter: Shai Erera
>Priority: Minor
> Fix For: 3.2, 4.0
>
>
> Today, IW.addIndexes(Directory) does not modify the CFS-mode of the incoming 
> segments. However, if IndexWriter's MP wants to create CFS (in general), 
> there's no reason not to turn the incoming non-CFS segments into CFS. We 
> copy them anyway, and if MP is not against CFS, we should create a CFS out of 
> them.
> Will need to use CFW, not sure it's ready for that w/ current API (I'll need 
> to check), but luckily we're allowed to change it (@lucene.internal).
> This should be done, IMO, even if the incoming segment is large (i.e., passes 
> MP.noCFSRatio) b/c like I wrote above, we copy it anyway. However, if you 
> think otherwise, speak up :).
> I'll take a look at this in the next few days.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-3123) TestIndexWriter.testBackgroundOptimize fails with too many open files

2011-05-19 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13036482#comment-13036482
 ] 

Michael McCandless commented on LUCENE-3123:


Thanks for raising it Doron!

> TestIndexWriter.testBackgroundOptimize fails with too many open files
> -
>
> Key: LUCENE-3123
> URL: https://issues.apache.org/jira/browse/LUCENE-3123
> Project: Lucene - Java
>  Issue Type: Bug
>  Components: core/index
> Environment: Linux 2.6.32-31-generic i386/Sun Microsystems Inc. 
> 1.6.0_20 (32-bit)/cpus=1,threads=2
>Reporter: Doron Cohen
> Fix For: 3.2, 4.0
>
>
> Recreate with this line:
> ant test -Dtestcase=TestIndexWriter -Dtestmethod=testBackgroundOptimize 
> -Dtests.seed=-3981504507637360146:51354004663342240
> Might be related to LUCENE-2873 ?

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-2883) Consolidate Solr & Lucene FunctionQuery into modules

2011-05-19 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2883?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13036479#comment-13036479
 ] 

Michael McCandless commented on LUCENE-2883:


{quote}
bq. Hmm good question. This looks to be related to sorting by FQ (SOLR-1297) 
because some FQs need to be weighted. Not sure what to do here yet... which FQs 
in particular require this?

Both all of them and not many of them (complicated). The sorting of FQ
functionality is necessary for all FQs in Solr since the user can sort
by any FQ. However the extension made by the SolrSortField is the ability
to create a 'weighted' SortField by passing in a IndexSearcher and having
it stored in a Map. The Map is then made available to any DocValues when
they create their values.
{quote}

So the goal here is to make the top-level searcher (IR) visible to the
FQ's getValues?  I think this pre-dated the cutover to
AtomicReaderContext, which now provides the top reader?  Maybe this
isn't needed anymore...?

{quote}
This is when the 'not many' comes into effect. Only a few DocValues 
implementations
use the contents of the Map. DocFreqValueSource for example uses the 
IndexSearcher
in the Map. But I suppose there could be many user implementations that do.
{quote}

EG DocFreqValueSource could pull docFreq from the top reader?

Though QueryValueSource needs a searcher (but, seems to make one, from
the top reader, if it wasn't provided one).

{quote}
SolrSortField is currently used in SolrIndexSearcher to 'weight' the Sorts. I 
wonder
whether we can simplify this? When ValueSource#getSort is called (which is only 
in 1
place really), we can pass in the IndexSearcher, meaning the SortField can be 
'weighted'
then and there.

Since SolrSortField is only used in this 1 place currently, we can then revisit 
dropping it?
{quote}

That seems good too?

bq. Do you think its worth opening an issue to address this first?

Yes can you do that, and mark this issue as depending in it?

{quote}
bq. I think apply 90/10 rule here? Start with the easy-to-move queries? We 
don't need initial go to be perfect... progress not perfection.

Could we sort the initial commit out and then I can move them over in batches?
Already have a 108k patch, I'd say moving what we can will push it towards 300k
{quote}

Good question... maybe we can do this on a branch?

{quote}
bq. Do you have a sense of whether Solr's FQs are a superset of Lucene's? Ie, 
is there anything Lucene's FQs can do that Solr's can't?

Solr FQs are hugely more advanced than the ValueSourceQuery based stuff in 
Lucene.
It's not a full 1-to-1 change since the APIs are slightly different, but I'd say 
that we'd want users to use the FQ line of classes. I can't see anything in 
Lucene's VSQs that you couldn't do using FQs.
{quote}

OK then once we finally have the "superset" moved into the module we
should remove Lucene's (deprecate on 3.x).


> Consolidate Solr  & Lucene FunctionQuery into modules
> -
>
> Key: LUCENE-2883
> URL: https://issues.apache.org/jira/browse/LUCENE-2883
> Project: Lucene - Java
>  Issue Type: Task
>  Components: core/search
>Affects Versions: 4.0
>Reporter: Simon Willnauer
>  Labels: gsoc2011, lucene-gsoc-11, mentor
> Fix For: 4.0
>
> Attachments: LUCENE-2883.patch
>
>
> Spin-off from the [dev list | 
> http://www.mail-archive.com/dev@lucene.apache.org/msg13261.html]  

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Created] (LUCENE-3126) IndexWriter.addIndexes can make any incoming segment into CFS if it isn't already

2011-05-19 Thread Shai Erera (JIRA)
IndexWriter.addIndexes can make any incoming segment into CFS if it isn't 
already
-

 Key: LUCENE-3126
 URL: https://issues.apache.org/jira/browse/LUCENE-3126
 Project: Lucene - Java
  Issue Type: Improvement
  Components: core/index
Reporter: Shai Erera
Priority: Minor
 Fix For: 3.2, 4.0


Today, IW.addIndexes(Directory) does not modify the CFS-mode of the incoming 
segments. However, if IndexWriter's MP wants to create CFS (in general), 
there's no reason not to turn the incoming non-CFS segments into CFS. We 
copy them anyway, and if MP is not against CFS, we should create a CFS out of 
them.

Will need to use CFW, not sure it's ready for that w/ current API (I'll need to 
check), but luckily we're allowed to change it (@lucene.internal).

This should be done, IMO, even if the incoming segment is large (i.e., passes 
MP.noCFSRatio) b/c like I wrote above, we copy it anyway. However, if you think 
otherwise, speak up :).

I'll take a look at this in the next few days.
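The proposal boils down to a simple per-segment decision during the copy. A hypothetical sketch (the names, the dict shape, and the return values are illustrative only, not IndexWriter's or MergePolicy's API):

```python
# Sketch of the proposed addIndexes behavior: since incoming segments
# are copied anyway, convert any non-CFS segment to CFS whenever the
# writer's merge policy favors compound files.

def copy_segment(segment, mp_wants_cfs):
    """Decide the storage format for a segment copied in via addIndexes."""
    if mp_wants_cfs and not segment["is_cfs"]:
        return "copy-as-CFS"   # bundle the files (e.g. via CompoundFileWriter)
    return "copy-as-is"        # already CFS, or MP doesn't want CFS

print(copy_segment({"is_cfs": False}, mp_wants_cfs=True))   # copy-as-CFS
print(copy_segment({"is_cfs": True}, mp_wants_cfs=True))    # copy-as-is
print(copy_segment({"is_cfs": False}, mp_wants_cfs=False))  # copy-as-is
```

Note the issue deliberately skips the MP.noCFSRatio size check for copied-in segments, since the copy happens regardless of size.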

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-2500) TestSolrProperties sometimes fails with "no such core: core0"

2011-05-19 Thread Steven Rowe (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-2500?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steven Rowe updated SOLR-2500:
--

Summary: TestSolrProperties sometimes fails with "no such core: core0"  
(was: TestSolrCoreProperties sometimes fails with "no such core: core0")

fixed title

> TestSolrProperties sometimes fails with "no such core: core0"
> -
>
> Key: SOLR-2500
> URL: https://issues.apache.org/jira/browse/SOLR-2500
> Project: Solr
>  Issue Type: Bug
>Affects Versions: 4.0
>Reporter: Robert Muir
> Attachments: SOLR-2500.patch, SOLR-2500.patch, SOLR-2500.patch, 
> solr-after-1st-run.xml, solr-clean.xml
>
>
> [junit] Testsuite: 
> org.apache.solr.client.solrj.embedded.TestSolrProperties
> [junit] Testcase: 
> testProperties(org.apache.solr.client.solrj.embedded.TestSolrProperties): 
> Caused an ERROR
> [junit] No such core: core0
> [junit] org.apache.solr.common.SolrException: No such core: core0
> [junit] at 
> org.apache.solr.client.solrj.embedded.EmbeddedSolrServer.request(EmbeddedSolrServer.java:118)
> [junit] at 
> org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:105)
> [junit] at 
> org.apache.solr.client.solrj.embedded.TestSolrProperties.testProperties(TestSolrProperties.java:128)
> [junit] at 
> org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1260)
> [junit] at 
> org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1189)

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Resolved] (LUCENE-3125) TestDocValuesIndexing.testAddIndexes failures on docvalues branch

2011-05-19 Thread Simon Willnauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-3125?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Simon Willnauer resolved LUCENE-3125.
-

   Resolution: Fixed
Fix Version/s: CSF branch

Committed in revision 1125108. Thanks Selckin - I will make sure I note you in 
the CHANGES.txt for all your testing help!


> TestDocValuesIndexing.testAddIndexes failures on docvalues branch
> -
>
> Key: LUCENE-3125
> URL: https://issues.apache.org/jira/browse/LUCENE-3125
> Project: Lucene - Java
>  Issue Type: Bug
>Reporter: selckin
>Assignee: Simon Willnauer
> Fix For: CSF branch
>
> Attachments: LUCENE-3125.patch
>
>
> doc values branch r1124825, reproducible 
> {code}
> [junit] Testsuite: org.apache.lucene.index.values.TestDocValuesIndexing
> [junit] Tests run: 1, Failures: 0, Errors: 1, Time elapsed: 0.716 sec
> [junit] 
> [junit] - Standard Error -
> [junit] NOTE: reproduce with: ant test -Dtestcase=TestDocValuesIndexing 
> -Dtestmethod=testAddIndexes 
> -Dtests.seed=5939035003978436534:-6429764582682717131
> [junit] NOTE: test params are: codec=RandomCodecProvider: {id=MockRandom, 
> BYTES_VAR_DEREF=MockRandom, INTS=Pulsing(freqCutoff=13)}, locale=da_DK, 
> timezone=Asia/Macao
> [junit] NOTE: all tests run in this JVM:
> [junit] [TestDocValuesIndexing]
> [junit] NOTE: Linux 2.6.37-gentoo amd64/Sun Microsystems Inc. 1.6.0_25 
> (64-bit)/cpus=8,threads=1,free=88582432,total=125632512
> [junit] -  ---
> [junit] Testcase: 
> testAddIndexes(org.apache.lucene.index.values.TestDocValuesIndexing): 
> Caused an ERROR
> [junit] null
> [junit] java.nio.channels.ClosedChannelException
> [junit] at 
> sun.nio.ch.FileChannelImpl.ensureOpen(FileChannelImpl.java:88)
> [junit] at sun.nio.ch.FileChannelImpl.read(FileChannelImpl.java:603)
> [junit] at 
> org.apache.lucene.store.NIOFSDirectory$NIOFSIndexInput.readInternal(NIOFSDirectory.java:161)
> [junit] at 
> org.apache.lucene.store.BufferedIndexInput.refill(BufferedIndexInput.java:222)
> [junit] at 
> org.apache.lucene.store.BufferedIndexInput.readByte(BufferedIndexInput.java:39)
> [junit] at 
> org.apache.lucene.store.DataInput.readInt(DataInput.java:73)
> [junit] at 
> org.apache.lucene.store.BufferedIndexInput.readInt(BufferedIndexInput.java:162)
> [junit] at 
> org.apache.lucene.store.DataInput.readLong(DataInput.java:115)
> [junit] at 
> org.apache.lucene.store.BufferedIndexInput.readLong(BufferedIndexInput.java:175)
> [junit] at 
> org.apache.lucene.store.MockIndexInputWrapper.readLong(MockIndexInputWrapper.java:136)
> [junit] at 
> org.apache.lucene.index.values.PackedIntsImpl$IntsEnumImpl.(PackedIntsImpl.java:263)
> [junit] at 
> org.apache.lucene.index.values.PackedIntsImpl$IntsEnumImpl.(PackedIntsImpl.java:249)
> [junit] at 
> org.apache.lucene.index.values.PackedIntsImpl$IntsReader.getEnum(PackedIntsImpl.java:239)
> [junit] at 
> org.apache.lucene.index.values.DocValues.getEnum(DocValues.java:54)
> [junit] at 
> org.apache.lucene.index.values.TestDocValuesIndexing.getValuesEnum(TestDocValuesIndexing.java:484)
> [junit] at 
> org.apache.lucene.index.values.TestDocValuesIndexing.testAddIndexes(TestDocValuesIndexing.java:202)
> [junit] at 
> org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1304)
> [junit] at 
> org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1233)
> [junit] 
> [junit] 
> [junit] Test org.apache.lucene.index.values.TestDocValuesIndexing FAILED
> {code}
> and
> {code}
> [junit] Testsuite: org.apache.lucene.index.values.TestDocValuesIndexing
> [junit] Tests run: 1, Failures: 0, Errors: 1, Time elapsed: 0.94 sec
> [junit] 
> [junit] - Standard Error -
> [junit] NOTE: reproduce with: ant test -Dtestcase=TestDocValuesIndexing 
> -Dtestmethod=testAddIndexes 
> -Dtests.seed=-3677966427932339626:-4746638811786223564
> [junit] NOTE: test params are: codec=RandomCodecProvider: {id=Standard, 
> BYTES_FIXED_DEREF=MockSep, FLOAT_64=SimpleText}, locale=ca, 
> timezone=Asia/Novosibirsk
> [junit] NOTE: all tests run in this JVM:
> [junit] [TestDocValuesIndexing]
> [junit] NOTE: Linux 2.6.37-gentoo amd64/Sun Microsystems Inc. 1.6.0_25 
> (64-bit)/cpus=8,threads=1,free=88596152,total=125632512
> [junit] -  ---
> [junit] Testcase: 
> testAddIndexes(org.apache.lucene.index.values.TestDocValuesIndexing): 
> Caused an ERROR
> [junit] Bad file descriptor
> [junit] java.io.IOE

[jira] [Updated] (LUCENE-3125) TestDocValuesIndexing.testAddIndexes failures on docvalues branch

2011-05-19 Thread Simon Willnauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-3125?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Simon Willnauer updated LUCENE-3125:


Attachment: LUCENE-3125.patch

good catch selckin, here is a patch

> TestDocValuesIndexing.testAddIndexes failures on docvalues branch
> -
>
> Key: LUCENE-3125
> URL: https://issues.apache.org/jira/browse/LUCENE-3125
> Project: Lucene - Java
>  Issue Type: Bug
>Reporter: selckin
>Assignee: Simon Willnauer
> Attachments: LUCENE-3125.patch
>
>
> doc values branch r1124825, reproducible 
> {code}
> [junit] Testsuite: org.apache.lucene.index.values.TestDocValuesIndexing
> [junit] Tests run: 1, Failures: 0, Errors: 1, Time elapsed: 0.716 sec
> [junit] 
> [junit] - Standard Error -
> [junit] NOTE: reproduce with: ant test -Dtestcase=TestDocValuesIndexing 
> -Dtestmethod=testAddIndexes 
> -Dtests.seed=5939035003978436534:-6429764582682717131
> [junit] NOTE: test params are: codec=RandomCodecProvider: {id=MockRandom, 
> BYTES_VAR_DEREF=MockRandom, INTS=Pulsing(freqCutoff=13)}, locale=da_DK, 
> timezone=Asia/Macao
> [junit] NOTE: all tests run in this JVM:
> [junit] [TestDocValuesIndexing]
> [junit] NOTE: Linux 2.6.37-gentoo amd64/Sun Microsystems Inc. 1.6.0_25 
> (64-bit)/cpus=8,threads=1,free=88582432,total=125632512
> [junit] -  ---
> [junit] Testcase: 
> testAddIndexes(org.apache.lucene.index.values.TestDocValuesIndexing): 
> Caused an ERROR
> [junit] null
> [junit] java.nio.channels.ClosedChannelException
> [junit] at 
> sun.nio.ch.FileChannelImpl.ensureOpen(FileChannelImpl.java:88)
> [junit] at sun.nio.ch.FileChannelImpl.read(FileChannelImpl.java:603)
> [junit] at 
> org.apache.lucene.store.NIOFSDirectory$NIOFSIndexInput.readInternal(NIOFSDirectory.java:161)
> [junit] at 
> org.apache.lucene.store.BufferedIndexInput.refill(BufferedIndexInput.java:222)
> [junit] at 
> org.apache.lucene.store.BufferedIndexInput.readByte(BufferedIndexInput.java:39)
> [junit] at 
> org.apache.lucene.store.DataInput.readInt(DataInput.java:73)
> [junit] at 
> org.apache.lucene.store.BufferedIndexInput.readInt(BufferedIndexInput.java:162)
> [junit] at 
> org.apache.lucene.store.DataInput.readLong(DataInput.java:115)
> [junit] at 
> org.apache.lucene.store.BufferedIndexInput.readLong(BufferedIndexInput.java:175)
> [junit] at 
> org.apache.lucene.store.MockIndexInputWrapper.readLong(MockIndexInputWrapper.java:136)
> [junit] at 
> org.apache.lucene.index.values.PackedIntsImpl$IntsEnumImpl.(PackedIntsImpl.java:263)
> [junit] at 
> org.apache.lucene.index.values.PackedIntsImpl$IntsEnumImpl.(PackedIntsImpl.java:249)
> [junit] at 
> org.apache.lucene.index.values.PackedIntsImpl$IntsReader.getEnum(PackedIntsImpl.java:239)
> [junit] at 
> org.apache.lucene.index.values.DocValues.getEnum(DocValues.java:54)
> [junit] at 
> org.apache.lucene.index.values.TestDocValuesIndexing.getValuesEnum(TestDocValuesIndexing.java:484)
> [junit] at 
> org.apache.lucene.index.values.TestDocValuesIndexing.testAddIndexes(TestDocValuesIndexing.java:202)
> [junit] at 
> org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1304)
> [junit] at 
> org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1233)
> [junit] 
> [junit] 
> [junit] Test org.apache.lucene.index.values.TestDocValuesIndexing FAILED
> {code}
> and
> {code}
> [junit] Testsuite: org.apache.lucene.index.values.TestDocValuesIndexing
> [junit] Tests run: 1, Failures: 0, Errors: 1, Time elapsed: 0.94 sec
> [junit] 
> [junit] - Standard Error -
> [junit] NOTE: reproduce with: ant test -Dtestcase=TestDocValuesIndexing 
> -Dtestmethod=testAddIndexes 
> -Dtests.seed=-3677966427932339626:-4746638811786223564
> [junit] NOTE: test params are: codec=RandomCodecProvider: {id=Standard, 
> BYTES_FIXED_DEREF=MockSep, FLOAT_64=SimpleText}, locale=ca, 
> timezone=Asia/Novosibirsk
> [junit] NOTE: all tests run in this JVM:
> [junit] [TestDocValuesIndexing]
> [junit] NOTE: Linux 2.6.37-gentoo amd64/Sun Microsystems Inc. 1.6.0_25 
> (64-bit)/cpus=8,threads=1,free=88596152,total=125632512
> [junit] -  ---
> [junit] Testcase: 
> testAddIndexes(org.apache.lucene.index.values.TestDocValuesIndexing): 
> Caused an ERROR
> [junit] Bad file descriptor
> [junit] java.io.IOException: Bad file descriptor
> [junit] at java.io.RandomAccessFile.seek(Native Method)
> [junit] at 
> org.apache.lucene.store.

[jira] [Assigned] (LUCENE-3125) TestDocValuesIndexing.testAddIndexes failures on docvalues branch

2011-05-19 Thread Simon Willnauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-3125?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Simon Willnauer reassigned LUCENE-3125:
---

Assignee: Simon Willnauer

> TestDocValuesIndexing.testAddIndexes failures on docvalues branch
> -
>
> Key: LUCENE-3125
> URL: https://issues.apache.org/jira/browse/LUCENE-3125
> Project: Lucene - Java
>  Issue Type: Bug
>Reporter: selckin
>Assignee: Simon Willnauer
>
> doc values branch r1124825, reproducible 
> {code}
> [junit] Testsuite: org.apache.lucene.index.values.TestDocValuesIndexing
> [junit] Tests run: 1, Failures: 0, Errors: 1, Time elapsed: 0.716 sec
> [junit] 
> [junit] - Standard Error -
> [junit] NOTE: reproduce with: ant test -Dtestcase=TestDocValuesIndexing 
> -Dtestmethod=testAddIndexes 
> -Dtests.seed=5939035003978436534:-6429764582682717131
> [junit] NOTE: test params are: codec=RandomCodecProvider: {id=MockRandom, 
> BYTES_VAR_DEREF=MockRandom, INTS=Pulsing(freqCutoff=13)}, locale=da_DK, 
> timezone=Asia/Macao
> [junit] NOTE: all tests run in this JVM:
> [junit] [TestDocValuesIndexing]
> [junit] NOTE: Linux 2.6.37-gentoo amd64/Sun Microsystems Inc. 1.6.0_25 
> (64-bit)/cpus=8,threads=1,free=88582432,total=125632512
> [junit] -  ---
> [junit] Testcase: 
> testAddIndexes(org.apache.lucene.index.values.TestDocValuesIndexing): 
> Caused an ERROR
> [junit] null
> [junit] java.nio.channels.ClosedChannelException
> [junit] at 
> sun.nio.ch.FileChannelImpl.ensureOpen(FileChannelImpl.java:88)
> [junit] at sun.nio.ch.FileChannelImpl.read(FileChannelImpl.java:603)
> [junit] at 
> org.apache.lucene.store.NIOFSDirectory$NIOFSIndexInput.readInternal(NIOFSDirectory.java:161)
> [junit] at 
> org.apache.lucene.store.BufferedIndexInput.refill(BufferedIndexInput.java:222)
> [junit] at 
> org.apache.lucene.store.BufferedIndexInput.readByte(BufferedIndexInput.java:39)
> [junit] at 
> org.apache.lucene.store.DataInput.readInt(DataInput.java:73)
> [junit] at 
> org.apache.lucene.store.BufferedIndexInput.readInt(BufferedIndexInput.java:162)
> [junit] at 
> org.apache.lucene.store.DataInput.readLong(DataInput.java:115)
> [junit] at 
> org.apache.lucene.store.BufferedIndexInput.readLong(BufferedIndexInput.java:175)
> [junit] at 
> org.apache.lucene.store.MockIndexInputWrapper.readLong(MockIndexInputWrapper.java:136)
> [junit] at 
> org.apache.lucene.index.values.PackedIntsImpl$IntsEnumImpl.(PackedIntsImpl.java:263)
> [junit] at 
> org.apache.lucene.index.values.PackedIntsImpl$IntsEnumImpl.(PackedIntsImpl.java:249)
> [junit] at 
> org.apache.lucene.index.values.PackedIntsImpl$IntsReader.getEnum(PackedIntsImpl.java:239)
> [junit] at 
> org.apache.lucene.index.values.DocValues.getEnum(DocValues.java:54)
> [junit] at 
> org.apache.lucene.index.values.TestDocValuesIndexing.getValuesEnum(TestDocValuesIndexing.java:484)
> [junit] at 
> org.apache.lucene.index.values.TestDocValuesIndexing.testAddIndexes(TestDocValuesIndexing.java:202)
> [junit] at 
> org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1304)
> [junit] at 
> org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1233)
> [junit] 
> [junit] 
> [junit] Test org.apache.lucene.index.values.TestDocValuesIndexing FAILED
> {code}
> and
> {code}
> [junit] Testsuite: org.apache.lucene.index.values.TestDocValuesIndexing
> [junit] Tests run: 1, Failures: 0, Errors: 1, Time elapsed: 0.94 sec
> [junit] 
> [junit] - Standard Error -
> [junit] NOTE: reproduce with: ant test -Dtestcase=TestDocValuesIndexing 
> -Dtestmethod=testAddIndexes 
> -Dtests.seed=-3677966427932339626:-4746638811786223564
> [junit] NOTE: test params are: codec=RandomCodecProvider: {id=Standard, 
> BYTES_FIXED_DEREF=MockSep, FLOAT_64=SimpleText}, locale=ca, 
> timezone=Asia/Novosibirsk
> [junit] NOTE: all tests run in this JVM:
> [junit] [TestDocValuesIndexing]
> [junit] NOTE: Linux 2.6.37-gentoo amd64/Sun Microsystems Inc. 1.6.0_25 
> (64-bit)/cpus=8,threads=1,free=88596152,total=125632512
> [junit] -  ---
> [junit] Testcase: 
> testAddIndexes(org.apache.lucene.index.values.TestDocValuesIndexing): 
> Caused an ERROR
> [junit] Bad file descriptor
> [junit] java.io.IOException: Bad file descriptor
> [junit] at java.io.RandomAccessFile.seek(Native Method)
> [junit] at 
> org.apache.lucene.store.SimpleFSDirectory$SimpleFSIndexInput.readInternal(SimpleFSDirectory.java:101)
> 

[jira] [Commented] (SOLR-2500) TestSolrCoreProperties sometimes fails with "no such core: core0"

2011-05-19 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2500?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13036462#comment-13036462
 ] 

Robert Muir commented on SOLR-2500:
---

+1 to commit! Thanks Doron!

> TestSolrCoreProperties sometimes fails with "no such core: core0"
> -
>
> Key: SOLR-2500
> URL: https://issues.apache.org/jira/browse/SOLR-2500
> Project: Solr
>  Issue Type: Bug
>Affects Versions: 4.0
>Reporter: Robert Muir
> Attachments: SOLR-2500.patch, SOLR-2500.patch, SOLR-2500.patch, 
> solr-after-1st-run.xml, solr-clean.xml
>
>
> [junit] Testsuite: 
> org.apache.solr.client.solrj.embedded.TestSolrProperties
> [junit] Testcase: 
> testProperties(org.apache.solr.client.solrj.embedded.TestSolrProperties): 
> Caused an ERROR
> [junit] No such core: core0
> [junit] org.apache.solr.common.SolrException: No such core: core0
> [junit] at 
> org.apache.solr.client.solrj.embedded.EmbeddedSolrServer.request(EmbeddedSolrServer.java:118)
> [junit] at 
> org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:105)
> [junit] at 
> org.apache.solr.client.solrj.embedded.TestSolrProperties.testProperties(TestSolrProperties.java:128)
> [junit] at 
> org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1260)
> [junit] at 
> org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1189)

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-2371) Add a min() function query, upgrade max() function query to take two value sources

2011-05-19 Thread Bill Bell (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13036457#comment-13036457
 ] 

Bill Bell commented on SOLR-2371:
-

Can we add an alphabetical min and max sort for non-numeric multiValued fields?

The use case is sorting so that values starting with "a" come before those starting with "b" in the field.

I do fq=field:a*, then add sort=min(field) asc.

Thanks







> Add a min() function query, upgrade max() function query to take two value 
> sources
> --
>
> Key: SOLR-2371
> URL: https://issues.apache.org/jira/browse/SOLR-2371
> Project: Solr
>  Issue Type: New Feature
>Reporter: Grant Ingersoll
>Priority: Trivial
> Fix For: 4.0
>
> Attachments: SOLR-2371.patch
>
>
> There doesn't appear to be a min() function.  Also, max() only allows a value 
> source and a constant b/c it is from before we had more flexible parsing.




[jira] [Commented] (SOLR-2500) TestSolrCoreProperties sometimes fails with "no such core: core0"

2011-05-19 Thread Steven Rowe (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2500?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13036454#comment-13036454
 ] 

Steven Rowe commented on SOLR-2500:
---

I ran the TestSolrProperties test case alone, without the patch, from both 
Maven and IntelliJ, and I get the same behavior: it passes once, then fails on 
every successive attempt, unless "clean" is performed first.

With the patch applied, the test passes under both IntelliJ and Maven, 
regardless of whether "clean" is performed first.

> TestSolrCoreProperties sometimes fails with "no such core: core0"
> -
>
> Key: SOLR-2500
> URL: https://issues.apache.org/jira/browse/SOLR-2500
> Project: Solr
>  Issue Type: Bug
>Affects Versions: 4.0
>Reporter: Robert Muir
> Attachments: SOLR-2500.patch, SOLR-2500.patch, SOLR-2500.patch, 
> solr-after-1st-run.xml, solr-clean.xml
>
>
> [junit] Testsuite: 
> org.apache.solr.client.solrj.embedded.TestSolrProperties
> [junit] Testcase: 
> testProperties(org.apache.solr.client.solrj.embedded.TestSolrProperties): 
> Caused an ERROR
> [junit] No such core: core0
> [junit] org.apache.solr.common.SolrException: No such core: core0
> [junit] at 
> org.apache.solr.client.solrj.embedded.EmbeddedSolrServer.request(EmbeddedSolrServer.java:118)
> [junit] at 
> org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:105)
> [junit] at 
> org.apache.solr.client.solrj.embedded.TestSolrProperties.testProperties(TestSolrProperties.java:128)
> [junit] at 
> org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1260)
> [junit] at 
> org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1189)




[jira] [Commented] (LUCENE-3104) Hook up Automated Patch Checking for Lucene/Solr

2011-05-19 Thread Grant Ingersoll (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13036441#comment-13036441
 ] 

Grant Ingersoll commented on LUCENE-3104:
-

Here's an example of running just the test-patch.sh:
{quote}
./test-patch.sh DEV ../../../patches/LUCENE-3120.patch 
/path/trunk-clean/build/patches /usr/bin/svn /usr/bin/grep /usr/bin/patch 
/path/lucene/dev/trunk-clean
{quote}

Note: the directory where you are applying the patch (trunk-clean) must be 
free of all local modifications.

> Hook up Automated Patch Checking for Lucene/Solr
> 
>
> Key: LUCENE-3104
> URL: https://issues.apache.org/jira/browse/LUCENE-3104
> Project: Lucene - Java
>  Issue Type: Task
>Reporter: Grant Ingersoll
> Attachments: LUCENE-3104.patch
>
>
> It would be really great if we could get feedback to contributors sooner on 
> basic things (tests exist, patch applies cleanly, etc.).
> From Nigel Daley on builds@a.o
> {quote}
> I revamped the precommit testing in the fall so that it doesn't use Jira 
> email anymore to trigger a build.  The process is controlled by
> https://builds.apache.org/hudson/job/PreCommit-Admin/
> which has some documentation up at the top of the job.  You can look at the 
> config of the job (do you have access?) to see what it's doing.  Any project 
> could use this same admin job -- you just need to ask me to add the project 
> to the Jira filter used by the admin job 
> (https://issues.apache.org/jira/sr/jira.issueviews:searchrequest-xml/12313474/SearchRequest-12313474.xml?tempMax=100
>  ) once you have the downstream job(s) setup for your specific project.  For 
> Hadoop we have 3 downstream builds configured which also have some 
> documentation:
> https://builds.apache.org/hudson/job/PreCommit-HADOOP-Build/
> https://builds.apache.org/hudson/job/PreCommit-MAPREDUCE-Build/
> https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/
> {quote}




[JENKINS] Lucene-Solr-tests-only-docvalues-branch - Build # 1146 - Still Failing

2011-05-19 Thread Apache Jenkins Server
Build: 
https://builds.apache.org/hudson/job/Lucene-Solr-tests-only-docvalues-branch/1146/

No tests ran.

Build Log (for compile errors):
[...truncated 27 lines...]
+ TEST_LINE_DOCS_FILE=/home/hudson/lucene-data/enwiki.random.lines.txt.gz
+ TEST_JVM_ARGS='-XX:+HeapDumpOnOutOfMemoryError 
-XX:HeapDumpPath=/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-docvalues-branch/heapdumps/'
+ set +x
Checking for files containing nocommit:
./lucene/src/java/org/apache/lucene/index/values/DocValues.java
+ mkdir -p 
/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-docvalues-branch/heapdumps
+ rm -rf 
/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-docvalues-branch/heapdumps/README.txt
+ echo 'This directory contains heap dumps that may be generated by test runs 
when OOM occurred.'
+ cd 
/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-docvalues-branch/checkout
+ JAVA_HOME=/home/hudson/tools/java/latest1.5 
/home/hudson/tools/ant/latest1.7/bin/ant clean
Buildfile: build.xml

clean:

clean:
   [delete] Deleting directory 
/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-docvalues-branch/checkout/lucene/build

clean:

clean:
 [echo] Building analyzers-common...

clean:
 [echo] Building analyzers-icu...

clean:
 [echo] Building analyzers-phonetic...

clean:
 [echo] Building analyzers-smartcn...

clean:
 [echo] Building analyzers-stempel...

clean:
 [echo] Building benchmark...

clean:
 [echo] Building grouping...

clean:

clean-contrib:

clean:

clean:

clean:

clean:

clean:

clean:

BUILD SUCCESSFUL
Total time: 1 second
+ cd 
/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-docvalues-branch/checkout/lucene
+ JAVA_HOME=/home/hudson/tools/java/latest1.5 
/home/hudson/tools/ant/latest1.7/bin/ant compile compile-test build-contrib
Buildfile: build.xml

jflex-uptodate-check:

jflex-notice:

javacc-uptodate-check:

javacc-notice:

init:

clover.setup:

clover.info:
 [echo] 
 [echo]   Clover not found. Code coverage reports disabled.
 [echo] 

clover:

common.compile-core:
[mkdir] Created dir: 
/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-docvalues-branch/checkout/lucene/build/classes/java
[javac] Compiling 536 source files to 
/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-docvalues-branch/checkout/lucene/build/classes/java
[javac] 
/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-docvalues-branch/checkout/lucene/src/java/org/apache/lucene/util/Version.java:80:
 warning: [dep-ann] deprecated name is not annotated with @Deprecated
[javac]   public boolean onOrAfter(Version other) {
[javac]  ^
[javac] 
/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-docvalues-branch/checkout/lucene/src/java/org/apache/lucene/index/PerFieldCodecWrapper.java:309:
 cannot find symbol
[javac] symbol  : constructor IOException(java.lang.Exception)
[javac] location: class java.io.IOException
[javac] err = new IOException(ioe);
[javac]   ^
[javac] 
/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-docvalues-branch/checkout/lucene/src/java/org/apache/lucene/queryParser/CharStream.java:34:
 warning: [dep-ann] deprecated name is not annotated with @Deprecated
[javac]   int getColumn();
[javac]   ^
[javac] 
/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-docvalues-branch/checkout/lucene/src/java/org/apache/lucene/queryParser/CharStream.java:41:
 warning: [dep-ann] deprecated name is not annotated with @Deprecated
[javac]   int getLine();
[javac]   ^
[javac] Note: Some input files use or override a deprecated API.
[javac] Note: Recompile with -Xlint:deprecation for details.
[javac] 1 error
[...truncated 11 lines...]
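
The "cannot find symbol: constructor IOException(java.lang.Exception)" error above is a Java-version problem: the build runs with JAVA_HOME pointing at latest1.5, and the IOException(Throwable) constructor only exists as of Java 6. A minimal 1.5-compatible sketch (wrap() is a hypothetical helper, not code from the branch) chains the cause with initCause() instead:

```java
import java.io.IOException;

// Java 5 lacks IOException(Throwable) (it was added in Java 6), so attach
// the cause via initCause(); this compiles on 1.5 and preserves the trace.
public class WrapIoeSketch {

    static IOException wrap(Exception cause) {
        IOException err = new IOException(cause.toString());
        err.initCause(cause); // keeps the original stack trace reachable
        return err;
    }

    public static void main(String[] args) {
        IOException wrapped = wrap(new IOException("disk full"));
        System.out.println(wrapped.getCause().getMessage());
    }
}
```

The failing line `err = new IOException(ioe);` in PerFieldCodecWrapper would need this pattern (or the build would need a Java 6 JVM) to compile on the 1.5 build machine.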






[jira] [Updated] (LUCENE-3120) span query matches too many docs when two query terms are the same unless inOrder=true

2011-05-19 Thread Doron Cohen (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-3120?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Doron Cohen updated LUCENE-3120:


Attachment: LUCENE-3120.patch

Updated patch, with the test fixed so as not to depend on the analysis module.

> span query matches too many docs when two query terms are the same unless 
> inOrder=true
> --
>
> Key: LUCENE-3120
> URL: https://issues.apache.org/jira/browse/LUCENE-3120
> Project: Lucene - Java
>  Issue Type: Bug
>  Components: core/search
>Reporter: Doron Cohen
>Priority: Minor
> Fix For: 3.2, 4.0
>
> Attachments: LUCENE-3120.patch, LUCENE-3120.patch
>
>
> spinoff of user list discussion - [SpanNearQuery - inOrder 
> parameter|http://markmail.org/message/i4cstlwgjmlcfwlc].
> With 3 documents:
> *  "a b x c d"
> *  "a b b d"
> *  "a b x b y d"
> Here are a few queries (the number in parenthesis indicates expected #hits):
> These ones work *as expected*:
> * (1)  in-order, slop=0, "b", "x", "b"
> * (1)  in-order, slop=0, "b", "b"
> * (2)  in-order, slop=1, "b", "b"
> These ones match *too many* hits:
> * (1)  any-order, slop=0, "b", "x", "b"
> * (1)  any-order, slop=1, "b", "x", "b"
> * (1)  any-order, slop=2, "b", "x", "b"
> * (1)  any-order, slop=3, "b", "x", "b"
> These ones match *too many* hits as well:
> * (1)  any-order, slop=0, "b", "b"
> * (2)  any-order, slop=1, "b", "b"
> Each of the above passes when using a phrase query (applying the slop, no 
> in-order indication in phrase query).
> This seems related to a known overlapping spans issue - [non-overlapping Span 
> queries|http://markmail.org/message/7jxn5eysjagjwlon] - as indicated by Hoss, 
> so we might decide to close this bug after all, but I would like to at least 
> have the junit that exposes the behavior in JIRA.




[jira] [Updated] (SOLR-2500) TestSolrCoreProperties sometimes fails with "no such core: core0"

2011-05-19 Thread Doron Cohen (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-2500?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Doron Cohen updated SOLR-2500:
--

Attachment: SOLR-2500.patch

Attached patch; the test now passes both in the IDE and on the command line:

* setUp() copies solr.xml to a private file.

* That private file is used as the test's solr.solr.home.

* tearDown() erases that file, though leaving it behind
  should not affect subsequent or repeated test runs.

* The deletion at tearDown() is fixed to look at
  solr.solr.home rather than solr.home.
  (I think this was a bug on top of a bug in this test: it used
  the original file at solr.solr.home, but for cleanup it
  attempted to remove files from just solr.home.)

This debugging took place in pure darkness, so please review carefully...
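
The steps above can be sketched as a stand-alone illustration (class and method names are hypothetical, and java.nio.file is used for brevity; the actual patch targets the JUnit setUp()/tearDown() of TestSolrProperties):

```java
import java.io.IOException;
import java.nio.file.*;

// Illustrative sketch of the SOLR-2500 approach: copy solr.xml into a
// private per-test location so the checked-in original is never mutated,
// then clean up only the private copy at tearDown().
public class PrivateSolrHomeSketch {

    static Path setUp(Path sourceSolrXml, Path privateDir) throws IOException {
        Files.createDirectories(privateDir);
        Path privateCopy = privateDir.resolve("solr.xml");
        // The test would point solr.solr.home at privateDir from here on.
        Files.copy(sourceSolrXml, privateCopy, StandardCopyOption.REPLACE_EXISTING);
        return privateCopy;
    }

    static void tearDown(Path privateCopy) throws IOException {
        // Erase the private copy; the original solr.xml is untouched.
        Files.deleteIfExists(privateCopy);
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempDirectory("solr2500");
        Path original = tmp.resolve("solr.xml");
        Files.write(original, "<solr/>".getBytes("UTF-8"));

        Path copy = setUp(original, tmp.resolve("private"));
        System.out.println(Files.exists(copy));     // private copy created
        tearDown(copy);
        System.out.println(Files.exists(copy));     // private copy gone
        System.out.println(Files.exists(original)); // original untouched
    }
}
```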

> TestSolrCoreProperties sometimes fails with "no such core: core0"
> -
>
> Key: SOLR-2500
> URL: https://issues.apache.org/jira/browse/SOLR-2500
> Project: Solr
>  Issue Type: Bug
>Affects Versions: 4.0
>Reporter: Robert Muir
> Attachments: SOLR-2500.patch, SOLR-2500.patch, SOLR-2500.patch, 
> solr-after-1st-run.xml, solr-clean.xml
>
>
> [junit] Testsuite: 
> org.apache.solr.client.solrj.embedded.TestSolrProperties
> [junit] Testcase: 
> testProperties(org.apache.solr.client.solrj.embedded.TestSolrProperties): 
> Caused an ERROR
> [junit] No such core: core0
> [junit] org.apache.solr.common.SolrException: No such core: core0
> [junit] at 
> org.apache.solr.client.solrj.embedded.EmbeddedSolrServer.request(EmbeddedSolrServer.java:118)
> [junit] at 
> org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:105)
> [junit] at 
> org.apache.solr.client.solrj.embedded.TestSolrProperties.testProperties(TestSolrProperties.java:128)
> [junit] at 
> org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1260)
> [junit] at 
> org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1189)




[jira] [Commented] (LUCENE-3125) TestDocValuesIndexing.testAddIndexes failures on docvalues branch

2011-05-19 Thread selckin (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13036378#comment-13036378
 ] 

selckin commented on LUCENE-3125:
-

{code}
[junit] Testsuite: org.apache.lucene.index.values.TestDocValuesIndexing
[junit] Testcase: 
testAddIndexes(org.apache.lucene.index.values.TestDocValuesIndexing): 
Caused an ERROR
[junit] MMapIndexInput already closed
[junit] org.apache.lucene.store.AlreadyClosedException: MMapIndexInput 
already closed
[junit] at 
org.apache.lucene.store.MMapDirectory$MMapIndexInput.clone(MMapDirectory.java:291)
[junit] at 
org.apache.lucene.store.MockIndexInputWrapper.clone(MockIndexInputWrapper.java:68)
[junit] at 
org.apache.lucene.index.values.Bytes$BytesReaderBase.cloneData(Bytes.java:454)
[junit] at 
org.apache.lucene.index.values.VarSortedBytesImpl$Reader.getEnum(VarSortedBytesImpl.java:234)
[junit] at 
org.apache.lucene.index.values.DocValues.getEnum(DocValues.java:54)
[junit] at 
org.apache.lucene.index.values.TestDocValuesIndexing.getValuesEnum(TestDocValuesIndexing.java:484)
[junit] at 
org.apache.lucene.index.values.TestDocValuesIndexing.testAddIndexes(TestDocValuesIndexing.java:203)
[junit] at 
org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1304)
[junit] at 
org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1233)
[junit] 
[junit] 
[junit] Tests run: 10, Failures: 0, Errors: 1, Time elapsed: 16.858 sec
[junit] 
[junit] - Standard Error -
[junit] NOTE: reproduce with: ant test -Dtestcase=TestDocValuesIndexing 
-Dtestmethod=testAddIndexes -Dtests.seed=2717387190169859484:6990694723425578308
[junit] NOTE: test params are: codec=RandomCodecProvider: 
{id=Pulsing(freqCutoff=14), FLOAT_32=Pulsing(freqCutoff=14), 
BYTES_FIXED_DEREF=MockSep, BYTES_VAR_DEREF=MockFixedIntBlock(blockSize=1259), 
BYTES_VAR_SORTED=MockSep, 
BYTES_VAR_STRAIGHT=MockVariableIntBlock(baseBlockSize=15), 
BYTES_FIXED_STRAIGHT=MockVariableIntBlock(baseBlockSize=15), docId=MockSep, 
FLOAT_64=Standard, INTS=Standard, BYTES_FIXED_SORTED=MockSep}, locale=uk_UA, 
timezone=EST
[junit] NOTE: all tests run in this JVM:
[junit] [TestAssertions, TestDemo, TestCachingTokenFilter, TestDocument, 
TestDeletionPolicy, TestDirectoryReader, TestDocumentWriter, TestFlex, 
TestIndexReaderCloneNorms, TestLongPostings, TestMultiReader, TestOmitTf, 
TestRollingUpdates, TestSegmentTermEnum, TestDocValuesIndexing]
[junit] NOTE: Linux 2.6.37-gentoo amd64/Sun Microsystems Inc. 1.6.0_25 
(64-bit)/cpus=8,threads=1,free=165611040,total=263258112
[junit] -  ---
[junit] TEST org.apache.lucene.index.values.TestDocValuesIndexing FAILED
{code}

> TestDocValuesIndexing.testAddIndexes failures on docvalues branch
> -
>
> Key: LUCENE-3125
> URL: https://issues.apache.org/jira/browse/LUCENE-3125
> Project: Lucene - Java
>  Issue Type: Bug
>Reporter: selckin
>
> doc values branch r1124825, reproducible 
> {code}
> [junit] Testsuite: org.apache.lucene.index.values.TestDocValuesIndexing
> [junit] Tests run: 1, Failures: 0, Errors: 1, Time elapsed: 0.716 sec
> [junit] 
> [junit] - Standard Error -
> [junit] NOTE: reproduce with: ant test -Dtestcase=TestDocValuesIndexing 
> -Dtestmethod=testAddIndexes 
> -Dtests.seed=5939035003978436534:-6429764582682717131
> [junit] NOTE: test params are: codec=RandomCodecProvider: {id=MockRandom, 
> BYTES_VAR_DEREF=MockRandom, INTS=Pulsing(freqCutoff=13)}, locale=da_DK, 
> timezone=Asia/Macao
> [junit] NOTE: all tests run in this JVM:
> [junit] [TestDocValuesIndexing]
> [junit] NOTE: Linux 2.6.37-gentoo amd64/Sun Microsystems Inc. 1.6.0_25 
> (64-bit)/cpus=8,threads=1,free=88582432,total=125632512
> [junit] -  ---
> [junit] Testcase: 
> testAddIndexes(org.apache.lucene.index.values.TestDocValuesIndexing): 
> Caused an ERROR
> [junit] null
> [junit] java.nio.channels.ClosedChannelException
> [junit] at 
> sun.nio.ch.FileChannelImpl.ensureOpen(FileChannelImpl.java:88)
> [junit] at sun.nio.ch.FileChannelImpl.read(FileChannelImpl.java:603)
> [junit] at 
> org.apache.lucene.store.NIOFSDirectory$NIOFSIndexInput.readInternal(NIOFSDirectory.java:161)
> [junit] at 
> org.apache.lucene.store.BufferedIndexInput.refill(BufferedIndexInput.java:222)
> [junit] at 
> org.apache.lucene.store.BufferedIndexInput.readByte(BufferedIndexInput.java:39)
> [junit] at 
> org.apache.lucene.store.DataInput.readInt(DataInput.java:73)
> [junit] at 
> org

Re: FST and FieldCache?

2011-05-19 Thread Dawid Weiss
> Sadly, I won't be at Lucene Revolution next week. That's where all the cool

Ah, pity. Next time.

> Yes, the use-case here is a unique integer reference to a String that can be
> looked up fairly quickly, whereas the set of all strings is in a compressed
> data structure that won't change after it's built.

Ok, so the short answer is: it's possible ;) Mike filed an issue for it,
so it's open to whoever finds the time first (I'll try myself, but I'll
be away for an extended weekend and then there's Lucene Revolution; we
will see).

Dawid
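
The contract under discussion (a unique int per String, looked up in both directions, over a set that is immutable once built) can be sketched with a sorted array standing in for the compressed FST; this is purely illustrative and not the FST implementation Mike's issue is about:

```java
import java.util.Arrays;

// Illustrative stand-in for the discussed structure: a dense unique
// ordinal per term, with ord->term and term->ord lookup, built once
// and never modified afterwards.
public class OrdinalMapSketch {

    private final String[] sorted; // never changes after construction

    OrdinalMapSketch(String[] terms) {
        sorted = terms.clone();
        Arrays.sort(sorted);
    }

    String lookupTerm(int ord) {
        return sorted[ord];
    }

    int lookupOrd(String term) {
        return Arrays.binarySearch(sorted, term); // negative if absent
    }

    public static void main(String[] args) {
        OrdinalMapSketch m = new OrdinalMapSketch(new String[] { "b", "a", "c" });
        System.out.println(m.lookupOrd("b"));
        System.out.println(m.lookupTerm(2));
    }
}
```

An FST buys the same ord/term mapping with far less memory for large, shared-prefix term sets, which is the point of the thread.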




[jira] [Created] (LUCENE-3125) TestDocValuesIndexing.testAddIndexes failures on docvalues branch

2011-05-19 Thread selckin (JIRA)
TestDocValuesIndexing.testAddIndexes failures on docvalues branch
-

 Key: LUCENE-3125
 URL: https://issues.apache.org/jira/browse/LUCENE-3125
 Project: Lucene - Java
  Issue Type: Bug
Reporter: selckin


doc values branch r1124825, reproducible 
{code}
[junit] Testsuite: org.apache.lucene.index.values.TestDocValuesIndexing
[junit] Tests run: 1, Failures: 0, Errors: 1, Time elapsed: 0.716 sec
[junit] 
[junit] - Standard Error -
[junit] NOTE: reproduce with: ant test -Dtestcase=TestDocValuesIndexing 
-Dtestmethod=testAddIndexes 
-Dtests.seed=5939035003978436534:-6429764582682717131
[junit] NOTE: test params are: codec=RandomCodecProvider: {id=MockRandom, 
BYTES_VAR_DEREF=MockRandom, INTS=Pulsing(freqCutoff=13)}, locale=da_DK, 
timezone=Asia/Macao
[junit] NOTE: all tests run in this JVM:
[junit] [TestDocValuesIndexing]
[junit] NOTE: Linux 2.6.37-gentoo amd64/Sun Microsystems Inc. 1.6.0_25 
(64-bit)/cpus=8,threads=1,free=88582432,total=125632512
[junit] -  ---
[junit] Testcase: 
testAddIndexes(org.apache.lucene.index.values.TestDocValuesIndexing): 
Caused an ERROR
[junit] null
[junit] java.nio.channels.ClosedChannelException
[junit] at 
sun.nio.ch.FileChannelImpl.ensureOpen(FileChannelImpl.java:88)
[junit] at sun.nio.ch.FileChannelImpl.read(FileChannelImpl.java:603)
[junit] at 
org.apache.lucene.store.NIOFSDirectory$NIOFSIndexInput.readInternal(NIOFSDirectory.java:161)
[junit] at 
org.apache.lucene.store.BufferedIndexInput.refill(BufferedIndexInput.java:222)
[junit] at 
org.apache.lucene.store.BufferedIndexInput.readByte(BufferedIndexInput.java:39)
[junit] at org.apache.lucene.store.DataInput.readInt(DataInput.java:73)
[junit] at 
org.apache.lucene.store.BufferedIndexInput.readInt(BufferedIndexInput.java:162)
[junit] at 
org.apache.lucene.store.DataInput.readLong(DataInput.java:115)
[junit] at 
org.apache.lucene.store.BufferedIndexInput.readLong(BufferedIndexInput.java:175)
[junit] at 
org.apache.lucene.store.MockIndexInputWrapper.readLong(MockIndexInputWrapper.java:136)
[junit] at 
org.apache.lucene.index.values.PackedIntsImpl$IntsEnumImpl.(PackedIntsImpl.java:263)
[junit] at 
org.apache.lucene.index.values.PackedIntsImpl$IntsEnumImpl.(PackedIntsImpl.java:249)
[junit] at 
org.apache.lucene.index.values.PackedIntsImpl$IntsReader.getEnum(PackedIntsImpl.java:239)
[junit] at 
org.apache.lucene.index.values.DocValues.getEnum(DocValues.java:54)
[junit] at 
org.apache.lucene.index.values.TestDocValuesIndexing.getValuesEnum(TestDocValuesIndexing.java:484)
[junit] at 
org.apache.lucene.index.values.TestDocValuesIndexing.testAddIndexes(TestDocValuesIndexing.java:202)
[junit] at 
org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1304)
[junit] at 
org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1233)
[junit] 
[junit] 
[junit] Test org.apache.lucene.index.values.TestDocValuesIndexing FAILED
{code}

and

{code}

[junit] Testsuite: org.apache.lucene.index.values.TestDocValuesIndexing
[junit] Tests run: 1, Failures: 0, Errors: 1, Time elapsed: 0.94 sec
[junit] 
[junit] - Standard Error -
[junit] NOTE: reproduce with: ant test -Dtestcase=TestDocValuesIndexing 
-Dtestmethod=testAddIndexes 
-Dtests.seed=-3677966427932339626:-4746638811786223564
[junit] NOTE: test params are: codec=RandomCodecProvider: {id=Standard, 
BYTES_FIXED_DEREF=MockSep, FLOAT_64=SimpleText}, locale=ca, 
timezone=Asia/Novosibirsk
[junit] NOTE: all tests run in this JVM:
[junit] [TestDocValuesIndexing]
[junit] NOTE: Linux 2.6.37-gentoo amd64/Sun Microsystems Inc. 1.6.0_25 
(64-bit)/cpus=8,threads=1,free=88596152,total=125632512
[junit] -  ---
[junit] Testcase: 
testAddIndexes(org.apache.lucene.index.values.TestDocValuesIndexing): 
Caused an ERROR
[junit] Bad file descriptor
[junit] java.io.IOException: Bad file descriptor
[junit] at java.io.RandomAccessFile.seek(Native Method)
[junit] at 
org.apache.lucene.store.SimpleFSDirectory$SimpleFSIndexInput.readInternal(SimpleFSDirectory.java:101)
[junit] at 
org.apache.lucene.store.BufferedIndexInput.refill(BufferedIndexInput.java:222)
[junit] at 
org.apache.lucene.store.BufferedIndexInput.readByte(BufferedIndexInput.java:39)
[junit] at 
org.apache.lucene.store.MockIndexInputWrapper.readByte(MockIndexInputWrapper.java:105)
[junit] at 
org.apache.lucene.index.values.Floats$FloatsReader.load(Floats.java:281)
[junit] at 
org.apache.lucene.index.values.SourceCache$

[jira] [Commented] (SOLR-2500) TestSolrCoreProperties sometimes fails with "no such core: core0"

2011-05-19 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2500?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13036338#comment-13036338
 ] 

Robert Muir commented on SOLR-2500:
---

ok, so I think the right way to fix this is to ensure the test works only 
from its private dir and copies the stuff it needs in there.

Then this will work fine from the IDE too (the patch only causes ant to recopy 
a clean version the next time you run 'test').

> TestSolrCoreProperties sometimes fails with "no such core: core0"
> -
>
> Key: SOLR-2500
> URL: https://issues.apache.org/jira/browse/SOLR-2500
> Project: Solr
>  Issue Type: Bug
>Affects Versions: 4.0
>Reporter: Robert Muir
> Attachments: SOLR-2500.patch, SOLR-2500.patch, 
> solr-after-1st-run.xml, solr-clean.xml
>
>
> [junit] Testsuite: 
> org.apache.solr.client.solrj.embedded.TestSolrProperties
> [junit] Testcase: 
> testProperties(org.apache.solr.client.solrj.embedded.TestSolrProperties): 
> Caused an ERROR
> [junit] No such core: core0
> [junit] org.apache.solr.common.SolrException: No such core: core0
> [junit] at 
> org.apache.solr.client.solrj.embedded.EmbeddedSolrServer.request(EmbeddedSolrServer.java:118)
> [junit] at 
> org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:105)
> [junit] at 
> org.apache.solr.client.solrj.embedded.TestSolrProperties.testProperties(TestSolrProperties.java:128)
> [junit] at 
> org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1260)
> [junit] at 
> org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1189)




[jira] [Resolved] (LUCENE-3123) TestIndexWriter.testBackgroundOptimize fails with too many open files

2011-05-19 Thread Doron Cohen (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-3123?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Doron Cohen resolved LUCENE-3123.
-

   Resolution: Fixed
Fix Version/s: 4.0
   3.2

Fixed by Mike, thanks Mike!

> TestIndexWriter.testBackgroundOptimize fails with too many open files
> -
>
> Key: LUCENE-3123
> URL: https://issues.apache.org/jira/browse/LUCENE-3123
> Project: Lucene - Java
>  Issue Type: Bug
>  Components: core/index
> Environment: Linux 2.6.32-31-generic i386/Sun Microsystems Inc. 
> 1.6.0_20 (32-bit)/cpus=1,threads=2
>Reporter: Doron Cohen
> Fix For: 3.2, 4.0
>
>
> Recreate with this line:
> ant test -Dtestcase=TestIndexWriter -Dtestmethod=testBackgroundOptimize 
> -Dtests.seed=-3981504507637360146:51354004663342240
> Might be related to LUCENE-2873 ?




[jira] [Issue Comment Edited] (SOLR-2500) TestSolrCoreProperties sometimes fails with "no such core: core0"

2011-05-19 Thread Yonik Seeley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2500?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13036333#comment-13036333
 ] 

Yonik Seeley edited comment on SOLR-2500 at 5/19/11 6:05 PM:
-

I don't have much time to look at this right now (and I don't really know the 
test), but I just tried running it directly from intellij and that failed also.

First, note that it tries to use something off the CWD... but then the core 
container is created under build/tests:
{code}
INFO: pwd: /opt/code/lusolr/.
2011-05-19 20.52.58 org.apache.solr.core.CoreContainer 
INFO: New CoreContainer 746169063
2011-05-19 20.52.58 org.apache.solr.core.SolrResourceLoader 
INFO: Solr home set to '/opt/code/lusolr/solr/build/tests/solr/shared/'
2011-05-19 20.52.58 org.apache.solr.core.SolrResourceLoader 
{code}

This causes a problem when it comes time to delete later:
{code}
WARNING:  WARNING: best effort to remove /opt/code/lusolr/solr/shared/data 
FAILED !
{code}

Of course the weird thing is that tearDown() only tries to delete the data 
directory and not the whole solr home... this seems incorrect?
That would lead to leaving around an old solr.xml file (since it's outside the 
data directory) and could cause issues the next time the test is run.

edit: crossed messages w/ robert above - looks like the issue has already been 
found+fixed.

  was (Author: ysee...@gmail.com):
I don't have much time to look at this right now (and I don't really know 
the test), but I just tried running it directly from intellij and that failed 
also.

First, note that it tries to use something off the CWD... but then the core 
container is created under build/tests:
{code}
INFO: pwd: /opt/code/lusolr/.
2011-05-19 20.52.58 org.apache.solr.core.CoreContainer 
INFO: New CoreContainer 746169063
2011-05-19 20.52.58 org.apache.solr.core.SolrResourceLoader 
INFO: Solr home set to '/opt/code/lusolr/solr/build/tests/solr/shared/'
2011-05-19 20.52.58 org.apache.solr.core.SolrResourceLoader 
{code}

This causes a problem when it comes time to delete later:
{code}
WARNING:  WARNING: best effort to remove /opt/code/lusolr/solr/shared/data 
FAILED !
{code}

Of course the weird thing is that tearDown() only tries to delete the data 
directory and not the whole solr home... this seems incorrect?
That would lead to leaving around an old solr.xml file (since it's outside the 
data directory) and could cause issues the next time the test is run.
  
> TestSolrCoreProperties sometimes fails with "no such core: core0"
> -
>
> Key: SOLR-2500
> URL: https://issues.apache.org/jira/browse/SOLR-2500
> Project: Solr
>  Issue Type: Bug
>Affects Versions: 4.0
>Reporter: Robert Muir
> Attachments: SOLR-2500.patch, SOLR-2500.patch, 
> solr-after-1st-run.xml, solr-clean.xml
>
>
> [junit] Testsuite: 
> org.apache.solr.client.solrj.embedded.TestSolrProperties
> [junit] Testcase: 
> testProperties(org.apache.solr.client.solrj.embedded.TestSolrProperties): 
> Caused an ERROR
> [junit] No such core: core0
> [junit] org.apache.solr.common.SolrException: No such core: core0
> [junit] at 
> org.apache.solr.client.solrj.embedded.EmbeddedSolrServer.request(EmbeddedSolrServer.java:118)
> [junit] at 
> org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:105)
> [junit] at 
> org.apache.solr.client.solrj.embedded.TestSolrProperties.testProperties(TestSolrProperties.java:128)
> [junit] at 
> org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1260)
> [junit] at 
> org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1189)




[jira] [Commented] (SOLR-2500) TestSolrCoreProperties sometimes fails with "no such core: core0"

2011-05-19 Thread Yonik Seeley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2500?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13036333#comment-13036333
 ] 

Yonik Seeley commented on SOLR-2500:


I don't have much time to look at this right now (and I don't really know the 
test), but I just tried running it directly from intellij and that failed also.

First, note that it tries to use something off the CWD... but then the core 
container is created under build/tests:
{code}
INFO: pwd: /opt/code/lusolr/.
2011-05-19 20.52.58 org.apache.solr.core.CoreContainer 
INFO: New CoreContainer 746169063
2011-05-19 20.52.58 org.apache.solr.core.SolrResourceLoader 
INFO: Solr home set to '/opt/code/lusolr/solr/build/tests/solr/shared/'
2011-05-19 20.52.58 org.apache.solr.core.SolrResourceLoader 
{code}

This causes a problem when it comes time to delete later:
{code}
WARNING:  WARNING: best effort to remove /opt/code/lusolr/solr/shared/data 
FAILED !
{code}

Of course the weird thing is that tearDown() only tries to delete the data 
directory and not the whole solr home... this seems incorrect?
That would lead to leaving around an old solr.xml file (since it's outside the 
data directory) and could cause issues the next time the test is run.
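
If tearDown() is meant to clean up fully, it could remove the whole solr home tree rather than just the data directory. A minimal sketch of such a best-effort recursive delete (hypothetical helper, not the actual test code):

```java
import java.io.File;

public class RecursiveDelete {
    // Best-effort recursive delete of a directory tree, e.g. the whole test
    // solr home (including solr.xml) rather than just its data/ subdirectory.
    public static boolean deleteRecursively(File dir) {
        File[] children = dir.listFiles();  // null if dir is a plain file
        if (children != null) {
            for (File child : children) {
                deleteRecursively(child);
            }
        }
        return dir.delete();
    }

    public static void main(String[] args) throws Exception {
        File home = new File(System.getProperty("java.io.tmpdir"), "solr-home-sketch");
        new File(home, "data").mkdirs();
        new File(home, "solr.xml").createNewFile();
        System.out.println(deleteRecursively(home) && !home.exists());  // prints: true
    }
}
```

That way a leftover solr.xml outside the data directory cannot survive into the next run.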

> TestSolrCoreProperties sometimes fails with "no such core: core0"
> -
>
> Key: SOLR-2500
> URL: https://issues.apache.org/jira/browse/SOLR-2500
> Project: Solr
>  Issue Type: Bug
>Affects Versions: 4.0
>Reporter: Robert Muir
> Attachments: SOLR-2500.patch, SOLR-2500.patch, 
> solr-after-1st-run.xml, solr-clean.xml
>
>
> [junit] Testsuite: 
> org.apache.solr.client.solrj.embedded.TestSolrProperties
> [junit] Testcase: 
> testProperties(org.apache.solr.client.solrj.embedded.TestSolrProperties): 
> Caused an ERROR
> [junit] No such core: core0
> [junit] org.apache.solr.common.SolrException: No such core: core0
> [junit] at 
> org.apache.solr.client.solrj.embedded.EmbeddedSolrServer.request(EmbeddedSolrServer.java:118)
> [junit] at 
> org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:105)
> [junit] at 
> org.apache.solr.client.solrj.embedded.TestSolrProperties.testProperties(TestSolrProperties.java:128)
> [junit] at 
> org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1260)
> [junit] at 
> org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1189)

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-2500) TestSolrCoreProperties sometimes fails with "no such core: core0"

2011-05-19 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2500?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13036334#comment-13036334
 ] 

Michael McCandless commented on SOLR-2500:
--

Fix works for me!!




[jira] [Commented] (SOLR-2500) TestSolrCoreProperties sometimes fails with "no such core: core0"

2011-05-19 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2500?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13036332#comment-13036332
 ] 

Robert Muir commented on SOLR-2500:
---

It's also worth mentioning that this patch won't help Eclipse at all; it's only a 
workaround for ant at the moment.




[jira] [Updated] (SOLR-2371) Add a min() function query, upgrade max() function query to take two value sources

2011-05-19 Thread Grant Ingersoll (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-2371?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Grant Ingersoll updated SOLR-2371:
--

Fix Version/s: (was: 3.2)

> Add a min() function query, upgrade max() function query to take two value 
> sources
> --
>
> Key: SOLR-2371
> URL: https://issues.apache.org/jira/browse/SOLR-2371
> Project: Solr
>  Issue Type: New Feature
>Reporter: Grant Ingersoll
>Priority: Trivial
> Fix For: 4.0
>
> Attachments: SOLR-2371.patch
>
>
> There doesn't appear to be a min() function.  Also, max() only allows a value 
> source and a constant b/c it is from before we had more flexible parsing.




Re: Moving towards Lucene 4.0

2011-05-19 Thread Earwin Burrfoot
On Thu, May 19, 2011 at 21:44, Chris Hostetter  wrote:
>
> : I think we should focus on everything that's *infrastructure* in 4.0, so
> : that we can develop additional features in subsequent 4.x releases. If we
> : end up releasing 4.0 just to discover many things will need to wait to 5.0,
> : it'll be a big loss.
>
> the catch with that approach (i'm speaking generally here, not with any of
> these particular lucene examples in mind) is that it's hard to know that
> the infrastructure really makes sense until you've built a bunch of stuff
> on it -- i think Josh Bloch has a paper where he says that you shouldn't
> publish an API abstraction until you've built at least 3 *real*
> (ie: not just toy or example) implementations of that API.
>
> it would be really easy to say "the infrastructure for X, Y, and Z is all
> in 4.0, features that leverage this infra will start coming in 4.1" and
> then discover on the way to 4.1 that we botched the APIs.

How do I express my profound love for these words, while remaining chaste? : )

> what does this mean concretely for the specific "big ticket" changes that
> we've got on trunk? ... i dunno, just my word of caution.
>
> : > we just started the discussion about Lucene 3.2 and releasing more
> : > often. Yet, I think we should also start planning for Lucene 4.0 soon.
> : > We have tons of stuff in trunk that people want to have and we can't
> : > just keep on talking about it - we need to push this out to our users.
>
> I agree, but i think the other approach we should take is to be more
> aggressive about reviewing things that would be good candidates for
> backporting.
>
> If we feel like some feature has a well defined API on trunk, and it's got
> good tests, and people have been using it and filing bugs and helping to
> make it better then we should consider it a candidate for backporting --
> if the merge itself looks like it would be a huge pain in the ass we don't
> *have* to backport, but we should at least look.
>
> That may not help for any of the "big ticket" infra changes discussed in
> this thread (where we know it really needs to wait for a major release)
> but it would definitely help with the "get features out to users faster"
> issue.
>
>
>
> -Hoss
>
> -
> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
> For additional commands, e-mail: dev-h...@lucene.apache.org
>
>



-- 
Kirill Zakharenko/Кирилл Захаренко
E-Mail/Jabber: ear...@gmail.com
Phone: +7 (495) 683-567-4
ICQ: 104465785




[jira] [Resolved] (SOLR-2371) Add a min() function query, upgrade max() function query to take two value sources

2011-05-19 Thread Grant Ingersoll (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-2371?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Grant Ingersoll resolved SOLR-2371.
---

Resolution: Fixed

> Add a min() function query, upgrade max() function query to take two value 
> sources
> --
>
> Key: SOLR-2371
> URL: https://issues.apache.org/jira/browse/SOLR-2371
> Project: Solr
>  Issue Type: New Feature
>Reporter: Grant Ingersoll
>Priority: Trivial
> Fix For: 3.2, 4.0
>
> Attachments: SOLR-2371.patch
>
>
> There doesn't appear to be a min() function.  Also, max() only allows a value 
> source and a constant b/c it is from before we had more flexible parsing.




[jira] [Updated] (SOLR-2500) TestSolrCoreProperties sometimes fails with "no such core: core0"

2011-05-19 Thread Robert Muir (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-2500?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Muir updated SOLR-2500:
--

Attachment: SOLR-2500.patch

The attached patch is a workaround for the issue for now, but we should fix the 
test to be "cleaner" as I don't like what's going on here.

What's happening is that the test changes its solr.xml configuration file, which 
is in build/tests/solr/shared/solr.xml. The next time you run the tests, it 
won't copy over this file because it has a newer timestamp.

In my opinion the test should really make its own private home so it won't 
meddle with other tests or have problems like this (we can fix the test to do 
this), but this is a simple intermediate fix if you guys don't mind testing it.
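
The "private home" idea could look roughly like this: copy the pristine config into a fresh temp directory for every run, so a previously modified solr.xml can never be picked up again. A sketch only; the class and method names are invented:

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

public class PrivateSolrHome {
    // Create a fresh, private solr home per test run by copying the pristine
    // solr.xml in, instead of reusing build/tests/solr/shared, where a prior
    // run may have left a modified file with a newer timestamp that the
    // build's copy-if-newer step will not overwrite.
    public static Path createHome(Path pristineSolrXml) throws IOException {
        Path home = Files.createTempDirectory("test-solr-home-");
        Files.copy(pristineSolrXml, home.resolve("solr.xml"),
                   StandardCopyOption.REPLACE_EXISTING);
        return home;
    }

    public static void main(String[] args) throws IOException {
        Path pristine = Files.createTempFile("solr-clean-", ".xml");
        Files.write(pristine, "<solr></solr>".getBytes(StandardCharsets.UTF_8));
        Path home = createHome(pristine);
        System.out.println(Files.exists(home.resolve("solr.xml")));  // prints: true
    }
}
```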




[jira] [Commented] (LUCENE-3123) TestIndexWriter.testBackgroundOptimize fails with too many open files

2011-05-19 Thread Doron Cohen (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13036331#comment-13036331
 ] 

Doron Cohen commented on LUCENE-3123:
-

In fact, in 3x this is not reproducible with the same seed (expected, as Robert 
once explained), and I was not able to reproduce it without a seed; I tried with 
-Dtest.iter=100 as well (though I am not sure whether a new seed is created in 
each iteration? Need to verify this...)
Anyhow, in 3x the test also passes after svn up with this fix.
So I think this can be resolved...

> TestIndexWriter.testBackgroundOptimize fails with too many open files
> -
>
> Key: LUCENE-3123
> URL: https://issues.apache.org/jira/browse/LUCENE-3123
> Project: Lucene - Java
>  Issue Type: Bug
>  Components: core/index
> Environment: Linux 2.6.32-31-generic i386/Sun Microsystems Inc. 
> 1.6.0_20 (32-bit)/cpus=1,threads=2
>Reporter: Doron Cohen
>
> Recreate with this line:
> ant test -Dtestcase=TestIndexWriter -Dtestmethod=testBackgroundOptimize 
> -Dtests.seed=-3981504507637360146:51354004663342240
> Might be related to LUCENE-2873 ?




[jira] [Assigned] (SOLR-1143) Return partial results when a connection to a shard is refused

2011-05-19 Thread Grant Ingersoll (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1143?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Grant Ingersoll reassigned SOLR-1143:
-

Assignee: (was: Grant Ingersoll)

> Return partial results when a connection to a shard is refused
> --
>
> Key: SOLR-1143
> URL: https://issues.apache.org/jira/browse/SOLR-1143
> Project: Solr
>  Issue Type: Improvement
>  Components: search
>Reporter: Nicolas Dessaigne
> Fix For: 3.2
>
> Attachments: SOLR-1143-2.patch, SOLR-1143-3.patch, SOLR-1143.patch
>
>
> If any shard is down in a distributed search, a ConnectException is thrown.
> Here's a little patch that changes this behaviour: if we can't connect to a 
> shard (ConnectException), we get partial results from the active shards. As 
> for the TimeOut parameter (https://issues.apache.org/jira/browse/SOLR-502), we 
> set the parameter "partialResults" to true.
> This patch also addresses a problem expressed in the mailing list about a year 
> ago 
> (http://www.nabble.com/partialResults,-distributed-search---SOLR-502-td19002610.html)
> We have a use case that needs this behaviour and we would like to know your 
> thoughts about such a behaviour. Should it be the default behaviour for 
> distributed search?
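
The behaviour described could be sketched like this (invented names; the real patch works inside Solr's distributed-search components, not through a generic function):

```java
import java.net.ConnectException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

public class PartialShardResults {
    // Query every shard, but if a shard is unreachable (ConnectException),
    // keep the results from the live shards and flag the response as partial
    // instead of failing the whole search. "queryShard" stands in for the
    // real per-shard request.
    public static Map<String, Object> search(List<String> shards,
                                             Function<String, List<String>> queryShard) {
        List<String> docs = new ArrayList<>();
        boolean partial = false;
        for (String shard : shards) {
            try {
                docs.addAll(queryShard.apply(shard));
            } catch (RuntimeException e) {
                if (e.getCause() instanceof ConnectException) {
                    partial = true;  // shard down: skip it, mark response partial
                } else {
                    throw e;         // other failures still propagate
                }
            }
        }
        Map<String, Object> response = new HashMap<>();
        response.put("docs", docs);
        response.put("partialResults", partial);
        return response;
    }

    public static void main(String[] args) {
        Map<String, Object> r = search(Arrays.asList("shard1", "shard2"), shard -> {
            if (shard.equals("shard2")) {
                throw new RuntimeException(new ConnectException("refused"));
            }
            return Arrays.asList(shard + "-doc1");
        });
        System.out.println(r.get("partialResults") + " " + r.get("docs"));  // prints: true [shard1-doc1]
    }
}
```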




Re: FST and FieldCache?

2011-05-19 Thread Earwin Burrfoot
On Thu, May 19, 2011 at 20:43, Michael McCandless
 wrote:
> On Thu, May 19, 2011 at 12:35 PM, Jason Rutherglen
>  wrote:
>>> And I do agree there are times when mmap is appropriate, eg if query
>>> latency is unimportant to you, but it's not a panacea and it comes
>>> with serious downsides
>>
>> Do we have a benchmark of ByteBuffer vs. byte[]'s in RAM?
>
> I don't know of a straight up comparison...
I did compare MMapDir vs RAMDir variant a couple of years ago.
Searches slowed down a teeny-weeny little bit. GC times went down
noticeably. For me it was a big win.

Whatever Mike might say, mmap is great for latency-conscious applications : )

If someone tries to create an artificial benchmark for byte[] vs.
ByteBuffer, I'd recommend going through Lucene's abstraction layer.
If you simply read/write in a loop, the JIT will optimize away boundary
checks for byte[] in some cases. This never happened with the *Buffer
family for me.
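
A minimal illustration of the loop shape being warned about (not a real benchmark; timing, warm-up, and Lucene's DataInput-style layer are all omitted):

```java
import java.nio.ByteBuffer;

public class ArrayVsBufferLoop {
    // Naive linear scan over byte[]: the JIT can hoist or eliminate the
    // bounds checks, so this loop flatters byte[] relative to real access
    // patterns that go through an abstraction layer.
    static long sumArray(byte[] data) {
        long sum = 0;
        for (int i = 0; i < data.length; i++) sum += data[i];
        return sum;
    }

    // Same scan via absolute ByteBuffer.get(i), which rarely gets the same
    // bounds-check elimination.
    static long sumBuffer(ByteBuffer buf) {
        long sum = 0;
        for (int i = 0; i < buf.limit(); i++) sum += buf.get(i);
        return sum;
    }

    public static void main(String[] args) {
        byte[] data = new byte[1 << 20];
        for (int i = 0; i < data.length; i++) data[i] = (byte) i;
        ByteBuffer buf = ByteBuffer.wrap(data);
        // Same answer either way; only the generated machine code differs.
        System.out.println(sumArray(data) == sumBuffer(buf));  // prints: true
    }
}
```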

>> There's also RAM based SSDs whose performance could be comparable with
>> well, RAM.
>
> True, though it's through layers of abstraction designed originally
> for serving files off of spinning magnets :)
>
>> Also, with our heap based field caches, the first sorted
>> search requires that they be loaded into RAM.  Then we don't unload
>> them until the reader is closed?  With MMap the unloading would happen
>> automatically?
>
> True, but really if the app knows it won't need that FC entry for a
> long time (ie, long enough to make it worth unloading/reloading) then
> it should really unload it.  MMap would still have to write all those
> pages to disk...
>
> DocValues actually makes this a lot cheaper because loading DocValues
> is much (like ~100 X from Simon's testing) faster than populating
> FieldCache since FieldCache must do all the uninverting.
>
> Mike
>
> -
> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
> For additional commands, e-mail: dev-h...@lucene.apache.org
>
>



-- 
Kirill Zakharenko/Кирилл Захаренко
E-Mail/Jabber: ear...@gmail.com
Phone: +7 (495) 683-567-4
ICQ: 104465785




[jira] [Commented] (LUCENE-3123) TestIndexWriter.testBackgroundOptimize fails with too many open files

2011-05-19 Thread Doron Cohen (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13036322#comment-13036322
 ] 

Doron Cohen commented on LUCENE-3123:
-

Yes, thanks, now it passes (trunk) - with this seed, as well as quite a few 
times without specifying a seed. 
I'll now verify on 3x.




[jira] [Commented] (SOLR-2500) TestSolrCoreProperties sometimes fails with "no such core: core0"

2011-05-19 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2500?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13036319#comment-13036319
 ] 

Robert Muir commented on SOLR-2500:
---

OK, i think you might be right... TestSolrProperties is the one that just 
failed for me.

I'll look into this test now (though I'm still confused about 
TestSolrCoreProperties but i'll let that be)

> TestSolrCoreProperties sometimes fails with "no such core: core0"
> -
>
> Key: SOLR-2500
> URL: https://issues.apache.org/jira/browse/SOLR-2500
> Project: Solr
>  Issue Type: Bug
>Affects Versions: 4.0
>Reporter: Robert Muir
> Attachments: SOLR-2500.patch, solr-after-1st-run.xml, solr-clean.xml
>
>
> [junit] Testsuite: 
> org.apache.solr.client.solrj.embedded.TestSolrProperties
> [junit] Testcase: 
> testProperties(org.apache.solr.client.solrj.embedded.TestSolrProperties): 
> Caused an ERROR
> [junit] No such core: core0
> [junit] org.apache.solr.common.SolrException: No such core: core0
> [junit] at 
> org.apache.solr.client.solrj.embedded.EmbeddedSolrServer.request(EmbeddedSolrServer.java:118)
> [junit] at 
> org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:105)
> [junit] at 
> org.apache.solr.client.solrj.embedded.TestSolrProperties.testProperties(TestSolrProperties.java:128)
> [junit] at 
> org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1260)
> [junit] at 
> org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1189)




Re: Moving towards Lucene 4.0

2011-05-19 Thread Chris Hostetter

: I think we should focus on everything that's *infrastructure* in 4.0, so
: that we can develop additional features in subsequent 4.x releases. If we
: end up releasing 4.0 just to discover many things will need to wait to 5.0,
: it'll be a big loss.

the catch with that approach (i'm speaking generally here, not with any of 
these particular lucene examples in mind) is that it's hard to know that 
the infrastructure really makes sense until you've built a bunch of stuff 
on it -- i think Josh Bloch has a paper where he says that you shouldn't 
publish an API abstraction until you've built at least 3 *real* 
(ie: not just toy or example) implementations of that API.

it would be really easy to say "the infrastructure for X, Y, and Z is all 
in 4.0, features that leverage this infra will start coming in 4.1" and 
then discover on the way to 4.1 that we botched the APIs.

what does this mean concretely for the specific "big ticket" changes that 
we've got on trunk? ... i dunno, just my word of caution.

: > we just started the discussion about Lucene 3.2 and releasing more
: > often. Yet, I think we should also start planning for Lucene 4.0 soon.
: > We have tons of stuff in trunk that people want to have and we can't
: > just keep on talking about it - we need to push this out to our users.

I agree, but i think the other approach we should take is to be more 
aggressive about reviewing things that would be good candidates for 
backporting.

If we feel like some feature has a well defined API on trunk, and it's got 
good tests, and people have been using it and filing bugs and helping to 
make it better then we should consider it a candidate for backporting -- 
if the merge itself looks like it would be a huge pain in the ass we don't 
*have* to backport, but we should at least look.

That may not help for any of the "big ticket" infra changes discussed in 
this thread (where we know it really needs to wait for a major release)
but it would definitely help with the "get features out to users faster" 
issue.



-Hoss




[jira] [Commented] (SOLR-2500) TestSolrCoreProperties sometimes fails with "no such core: core0"

2011-05-19 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2500?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13036318#comment-13036318
 ] 

Michael McCandless commented on SOLR-2500:
--

For me, it's TestSolrProperties that reliably fails if it's been run before.  
Ie, it passes on the first run after "ant clean" but then fails thereafter.

TestSolrCoreProperties seems to run fine.

(Fedora 13).




[jira] [Resolved] (SOLR-2531) remove some per-term waste in SimpleFacets

2011-05-19 Thread Robert Muir (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-2531?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Muir resolved SOLR-2531.
---

Resolution: Fixed

Committed revision 1125011.

Thanks for reviewing Yonik.

> remove some per-term waste in SimpleFacets
> --
>
> Key: SOLR-2531
> URL: https://issues.apache.org/jira/browse/SOLR-2531
> Project: Solr
>  Issue Type: Task
>Reporter: Robert Muir
> Attachments: SOLR-2531.patch
>
>
> While looking at SOLR-2530,
> Seems like in the 'use filter cache' case of SimpleFacets we:
> 1. convert the bytes from utf8-utf16
> 2. create a string from the utf16
> 3. create a Term object from the string
> doesn't seem like any of this is necessary, as the Term is unused...
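
The per-term waste described above, in miniature (illustrative stand-ins, not the actual SimpleFacets code):

```java
import java.nio.charset.StandardCharsets;

public class PerTermWaste {
    // The three steps from the issue, done per term: (1) utf8 -> utf16
    // decode, (2) String allocation, (3) a Term-like wrapper -- all
    // avoidable when the resulting object is never used.
    static int countWasteful(byte[][] termBytes) {
        int n = 0;
        for (byte[] bytes : termBytes) {
            String s = new String(bytes, StandardCharsets.UTF_8);  // steps 1+2
            String[] termLike = { "field", s };                    // step 3 (stand-in)
            n++;  // the count is the only thing actually used
        }
        return n;
    }

    static int countLean(byte[][] termBytes) {
        return termBytes.length;  // same result, no per-term allocations
    }

    public static void main(String[] args) {
        byte[][] terms = { "foo".getBytes(StandardCharsets.UTF_8),
                           "bar".getBytes(StandardCharsets.UTF_8) };
        System.out.println(countWasteful(terms) == countLean(terms));  // prints: true
    }
}
```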




[jira] [Commented] (LUCENE-3123) TestIndexWriter.testBackgroundOptimize fails with too many open files

2011-05-19 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13036308#comment-13036308
 ] 

Michael McCandless commented on LUCENE-3123:


I dropped it from 100 to 50 segs.  Can you test if that works in your env Doron?




[jira] [Commented] (LUCENE-3123) TestIndexWriter.testBackgroundOptimize fails with too many open files

2011-05-19 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13036307#comment-13036307
 ] 

Michael McCandless commented on LUCENE-3123:


Does that repro line reproduce the failure for you Doron?  It's odd because 
that test doesn't make that many fields... oh I see it makes a 100 segment 
index. I'll drop that to 50...

The nightly build also hits too-many-open-files every so often, I suspect 
because our random-per-field-codec is making too many codecs... I wonder if we 
should throttle it?  Ie if it accumulates too many codecs, to start sharing 
them b/w fields?




[jira] [Commented] (SOLR-2500) TestSolrCoreProperties sometimes fails with "no such core: core0"

2011-05-19 Thread Doron Cohen (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2500?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13036300#comment-13036300
 ] 

Doron Cohen commented on SOLR-2500:
---

Oops, just noticed I was testing all this time TestSolrProperties and not 
TestSolrCoreProperties, and, because the error message was the same as in the 
issue description *"No such core: core0"*, I was sure that it was the same 
test... Now this is confusing...

Hmmm.. the original exception reported above is 
[junit] at 
org.apache.solr.client.solrj.embedded.TestSolrProperties.testProperties(TestSolrProperties.java:128)

So perhaps I was working on the correct bug after all and just the JIRA issue 
title is inaccurate?
Or I need to call it a day... :)

Anyhow, TestSolrProperties consistently behaves as I described here, while 
TestSolrCoreProperties consistently passes (when run in standalone mode).

> TestSolrCoreProperties sometimes fails with "no such core: core0"
> -
>
> Key: SOLR-2500
> URL: https://issues.apache.org/jira/browse/SOLR-2500
> Project: Solr
>  Issue Type: Bug
>Affects Versions: 4.0
>Reporter: Robert Muir
> Attachments: SOLR-2500.patch, solr-after-1st-run.xml, solr-clean.xml
>
>
> [junit] Testsuite: 
> org.apache.solr.client.solrj.embedded.TestSolrProperties
> [junit] Testcase: 
> testProperties(org.apache.solr.client.solrj.embedded.TestSolrProperties): 
> Caused an ERROR
> [junit] No such core: core0
> [junit] org.apache.solr.common.SolrException: No such core: core0
> [junit] at 
> org.apache.solr.client.solrj.embedded.EmbeddedSolrServer.request(EmbeddedSolrServer.java:118)
> [junit] at 
> org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:105)
> [junit] at 
> org.apache.solr.client.solrj.embedded.TestSolrProperties.testProperties(TestSolrProperties.java:128)
> [junit] at 
> org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1260)
> [junit] at 
> org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1189)




[jira] [Commented] (SOLR-2531) remove some per-term waste in SimpleFacets

2011-05-19 Thread Yonik Seeley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2531?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13036299#comment-13036299
 ] 

Yonik Seeley commented on SOLR-2531:


Yep - the minDF (to use the filter cache) defaults to 0.

> remove some per-term waste in SimpleFacets
> --
>
> Key: SOLR-2531
> URL: https://issues.apache.org/jira/browse/SOLR-2531
> Project: Solr
>  Issue Type: Task
>Reporter: Robert Muir
> Attachments: SOLR-2531.patch
>
>
> While looking at SOLR-2530,
> Seems like in the 'use filter cache' case of SimpleFacets we:
> 1. convert the bytes from utf8-utf16
> 2. create a string from the utf16
> 3. create a Term object from the string
> doesn't seem like any of this is necessary, as the Term is unused...
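The three per-term allocations described in the quoted issue can be sketched in plain Java (outside Lucene; the unused Term is shown as a hypothetical holder class, not the real org.apache.lucene.index.Term):

```java
import java.nio.charset.StandardCharsets;

public class PerTermWaste {
    // Hypothetical stand-in for Lucene's Term, just to show the
    // allocation that is never read afterwards.
    static class TermHolder {
        final String field, text;
        TermHolder(String field, String text) { this.field = field; this.text = text; }
    }

    public static void main(String[] args) {
        byte[] utf8 = "electronics".getBytes(StandardCharsets.UTF_8);

        // Steps 1-3 from the issue: utf8 -> utf16 decode, String, Term.
        String text = new String(utf8, StandardCharsets.UTF_8); // steps 1 + 2
        TermHolder unused = new TermHolder("cat", text);        // step 3, never used

        // The filter-cache path only needs the raw term bytes, so all
        // three per-term allocations above can be skipped.
        System.out.println(text);
    }
}
```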




[jira] [Commented] (SOLR-2531) remove some per-term waste in SimpleFacets

2011-05-19 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2531?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13036297#comment-13036297
 ] 

Robert Muir commented on SOLR-2531:
---

do the tests cover this >= minDF case well? 

If so, I'll commit.





[jira] [Commented] (SOLR-2531) remove some per-term waste in SimpleFacets

2011-05-19 Thread Yonik Seeley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2531?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13036294#comment-13036294
 ] 

Yonik Seeley commented on SOLR-2531:


Yep - looks like dead code.





[jira] [Resolved] (SOLR-1964) Double-check and fix Maven POM dependencies

2011-05-19 Thread Steven Rowe (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1964?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steven Rowe resolved SOLR-1964.
---

   Resolution: Duplicate
Fix Version/s: 3.1
   3.2

See LUCENE-2657.

> Double-check and fix Maven POM dependencies
> ---
>
> Key: SOLR-1964
> URL: https://issues.apache.org/jira/browse/SOLR-1964
> Project: Solr
>  Issue Type: Bug
>  Components: Build
>Reporter: Erik Hatcher
>Priority: Minor
> Fix For: 3.2, 4.0, 3.1
>
>
> To include the velocity deps in solr-core-pom.xml.template, something like 
> this:
> <dependency>
>   <groupId>velocity</groupId>
>   <artifactId>velocity</artifactId>
>   <version>1.6.1</version>
> </dependency>




[jira] [Commented] (LUCENE-1877) Use NativeFSLockFactory as default for new API (direct ctors & FSDir.open)

2011-05-19 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-1877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13036289#comment-13036289
 ] 

Michael McCandless commented on LUCENE-1877:


OK.  I would strongly recommend using the lock stress test 
(LockStressTest/LockVerifyServer) in Lucene to verify whichever locking you're 
trying is in fact working properly.

> Use NativeFSLockFactory as default for new API (direct ctors & FSDir.open)
> --
>
> Key: LUCENE-1877
> URL: https://issues.apache.org/jira/browse/LUCENE-1877
> Project: Lucene - Java
>  Issue Type: Improvement
>  Components: general/javadocs
>Reporter: Mark Miller
>Assignee: Uwe Schindler
> Fix For: 2.9
>
> Attachments: LUCENE-1877.patch, LUCENE-1877.patch, LUCENE-1877.patch, 
> LUCENE-1877.patch
>
>
> A user requested we add a note in IndexWriter alerting the availability of 
> NativeFSLockFactory (allowing you to avoid retaining locks on abnormal jvm 
> exit). Seems reasonable to me - we want users to be able to easily stumble 
> upon this class. The below code looks like a good spot to add a note - could 
> also improve what's there a bit - opening an IndexWriter does not necessarily 
> create a lock file - that would depend on the LockFactory used.
> {code}  Opening an IndexWriter creates a lock file for the 
> directory in use. Trying to open
>   another IndexWriter on the same directory will lead to a
>   {@link LockObtainFailedException}. The {@link LockObtainFailedException}
>   is also thrown if an IndexReader on the same directory is used to delete 
> documents
>   from the index.{code}
> Anyone remember why NativeFSLockFactory is not the default over 
> SimpleFSLockFactory?
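For background on why the native lock avoids stale locks on abnormal exit: NativeFSLockFactory is built on java.nio FileLock, which the OS releases automatically when the process dies, whereas SimpleFSLockFactory's marker file survives a crash. A self-contained sketch of the underlying mechanism (plain JDK, not Lucene code):

```java
import java.io.File;
import java.io.RandomAccessFile;
import java.nio.channels.FileLock;

public class NativeLockSketch {
    public static void main(String[] args) throws Exception {
        File lockFile = File.createTempFile("write", ".lock");
        RandomAccessFile raf = new RandomAccessFile(lockFile, "rw");

        // tryLock() returns null if another process already holds the lock;
        // if this JVM exits abnormally, the OS releases the lock by itself,
        // so no stale lock file blocks the next IndexWriter.
        FileLock lock = raf.getChannel().tryLock();
        System.out.println(lock != null && lock.isValid());

        lock.release();
        raf.close();
        lockFile.delete();
    }
}
```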




[jira] [Commented] (LUCENE-3108) Land DocValues on trunk

2011-05-19 Thread Yonik Seeley (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3108?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13036290#comment-13036290
 ] 

Yonik Seeley commented on LUCENE-3108:
--

bq. ValueSource? (conflicts w/ FQs though) Though, maybe we can just refer to 
it as DocValues.Source, then it's clear?

Both ValueSource and DocValues have long been used by function queries.

> Land DocValues on trunk
> ---
>
> Key: LUCENE-3108
> URL: https://issues.apache.org/jira/browse/LUCENE-3108
> Project: Lucene - Java
>  Issue Type: Task
>  Components: core/index, core/search, core/store
>Affects Versions: CSF branch, 4.0
>Reporter: Simon Willnauer
>Assignee: Simon Willnauer
> Fix For: 4.0
>
> Attachments: LUCENE-3108.patch
>
>
> Its time to move another feature from branch to trunk. I want to start this 
> process now while still a couple of issues remain on the branch. Currently I 
> am down to a single nocommit (javadocs on DocValues.java) and a couple of 
> testing TODOs (explicit multithreaded tests and unoptimized with deletions) 
> but I think those are not worth separate issues so we can resolve them as we 
> go. 
> The already created issues (LUCENE-3075 and LUCENE-3074) should not block 
> this process here IMO, we can fix them once we are on trunk. 
> Here is a quick feature overview of what has been implemented:
>  * DocValues implementations for Ints (based on PackedInts), Float 32 / 64, 
> Bytes (fixed / variable size each in sorted, straight and deref variations)
>  * Integration into Flex-API, Codec provides a 
> PerDocConsumer->DocValuesConsumer (write) / PerDocValues->DocValues (read) 
>  * Enabled by default in all codecs except PreFlex
>  * Follows other flex-API patterns like non-segment reader throw UOE forcing 
> MultiPerDocValues if on DirReader etc.
>  * Integration into IndexWriter, FieldInfos etc.
>  * Random-testing enabled via RandomIW - injecting random DocValues into 
> documents
>  * Basic checks in CheckIndex (which runs after each test)
>  * FieldComparator for int and float variants (Sorting, currently directly 
> integrated into SortField, this might go into a separate DocValuesSortField 
> eventually)
>  * Extended TestSort for DocValues
>  * RAM-Resident random access API plus on-disk DocValuesEnum (currently only 
> sequential access) -> Source.java / DocValuesEnum.java
>  * Extensible Cache implementation for RAM-Resident DocValues (by-default 
> loaded into RAM only once and freed once IR is closed) -> SourceCache.java
>  
> PS: Currently the RAM resident API is named Source (Source.java) which seems 
> too generic. I think we should rename it into RamDocValues or something like 
> that, suggestion welcome!   
> Any comments, questions (rants :)) are very much appreciated.




[jira] [Commented] (SOLR-2500) TestSolrCoreProperties sometimes fails with "no such core: core0"

2011-05-19 Thread Doron Cohen (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2500?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13036288#comment-13036288
 ] 

Doron Cohen commented on SOLR-2500:
---

FWIW, also the first clean run would fail if test's tearDown() is modified like 
this:

{noformat}
-persistedFile.delete();
+assertTrue("could not delete "+persistedFile, persistedFile.delete());
{noformat}

For some reason it fails to remove that file - in both Linux and Windows.
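The pattern in the diff above generalizes: File.delete() fails by silently returning false, so tests should assert on the result. A minimal standalone illustration:

```java
import java.io.File;
import java.io.IOException;

public class CheckedDelete {
    // Fail loudly instead of ignoring File.delete()'s boolean result,
    // matching the tearDown() change in the diff above.
    static void deleteOrFail(File f) {
        if (!f.delete()) {
            throw new AssertionError("could not delete " + f);
        }
    }

    public static void main(String[] args) throws IOException {
        File tmp = File.createTempFile("solr-test", ".xml");
        deleteOrFail(tmp);              // succeeds: nothing holds the file open
        System.out.println(tmp.exists());
    }
}
```

On Windows an open file handle makes delete() return false; that the failure here shows on Linux too suggests something other than the test still holds the file.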





Re: FST and FieldCache?

2011-05-19 Thread Michael McCandless
On Thu, May 19, 2011 at 12:35 PM, Jason Rutherglen
 wrote:
>> And I do agree there are times when mmap is appropriate, eg if query
>> latency is unimportant to you, but it's not a panacea and it comes
>> with serious downsides
>
> Do we have a benchmark of ByteBuffer vs. byte[]'s in RAM?

I don't know of a straight up comparison...

> There's also RAM based SSDs whose performance could be comparable with
> well, RAM.

True, though it's through layers of abstraction designed originally
for serving files off of spinning magnets :)

> Also, with our heap based field caches, the first sorted
> search requires that they be loaded into RAM.  Then we don't unload
> them until the reader is closed?  With MMap the unloading would happen
> automatically?

True, but really if the app knows it won't need that FC entry for a
long time (ie, long enough to make it worth unloading/reloading) then
it should really unload it.  MMap would still have to write all those
pages to disk...

DocValues actually makes this a lot cheaper because loading DocValues
is much (like ~100 X from Simon's testing) faster than populating
FieldCache since FieldCache must do all the uninverting.

Mike




[jira] [Updated] (SOLR-2531) remove some per-term waste in SimpleFacets

2011-05-19 Thread Robert Muir (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-2531?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Muir updated SOLR-2531:
--

Attachment: SOLR-2531.patch





[jira] [Created] (SOLR-2531) remove some per-term waste in SimpleFacets

2011-05-19 Thread Robert Muir (JIRA)
remove some per-term waste in SimpleFacets
--

 Key: SOLR-2531
 URL: https://issues.apache.org/jira/browse/SOLR-2531
 Project: Solr
  Issue Type: Task
Reporter: Robert Muir
 Attachments: SOLR-2531.patch

While looking at SOLR-2530,

Seems like in the 'use filter cache' case of SimpleFacets we:
1. convert the bytes from utf8-utf16
2. create a string from the utf16
3. create a Term object from the string

doesn't seem like any of this is necessary, as the Term is unused...




[JENKINS] Lucene-Solr-tests-only-docvalues-branch - Build # 1145 - Failure

2011-05-19 Thread Apache Jenkins Server
Build: 
https://builds.apache.org/hudson/job/Lucene-Solr-tests-only-docvalues-branch/1145/

No tests ran.

Build Log (for compile errors):
[...truncated 55 lines...]
clean:
   [delete] Deleting directory 
/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-docvalues-branch/checkout/lucene/build

clean:

clean:
 [echo] Building analyzers-common...

clean:
   [delete] Deleting directory 
/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-docvalues-branch/checkout/modules/analysis/build/common
 [echo] Building analyzers-icu...

clean:
   [delete] Deleting directory 
/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-docvalues-branch/checkout/modules/analysis/build/icu
 [echo] Building analyzers-phonetic...

clean:
   [delete] Deleting directory 
/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-docvalues-branch/checkout/modules/analysis/build/phonetic
 [echo] Building analyzers-smartcn...

clean:
   [delete] Deleting directory 
/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-docvalues-branch/checkout/modules/analysis/build/smartcn
 [echo] Building analyzers-stempel...

clean:
   [delete] Deleting directory 
/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-docvalues-branch/checkout/modules/analysis/build/stempel
 [echo] Building benchmark...

clean:
   [delete] Deleting directory 
/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-docvalues-branch/checkout/modules/benchmark/build
 [echo] Building grouping...

clean:
   [delete] Deleting directory 
/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-docvalues-branch/checkout/modules/grouping/build

clean-contrib:

clean:
   [delete] Deleting directory 
/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-docvalues-branch/checkout/solr/contrib/analysis-extras/build
   [delete] Deleting directory 
/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-docvalues-branch/checkout/solr/contrib/analysis-extras/lucene-libs

clean:
   [delete] Deleting directory 
/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-docvalues-branch/checkout/solr/contrib/clustering/build

clean:
   [delete] Deleting directory 
/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-docvalues-branch/checkout/solr/contrib/dataimporthandler/target

clean:
   [delete] Deleting directory 
/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-docvalues-branch/checkout/solr/contrib/extraction/build

clean:
   [delete] Deleting directory 
/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-docvalues-branch/checkout/solr/contrib/uima/build

clean:
   [delete] Deleting directory 
/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-docvalues-branch/checkout/solr/build

BUILD SUCCESSFUL
Total time: 3 seconds
+ cd 
/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-docvalues-branch/checkout/lucene
+ JAVA_HOME=/home/hudson/tools/java/latest1.5 
/home/hudson/tools/ant/latest1.7/bin/ant compile compile-test build-contrib
Buildfile: build.xml

jflex-uptodate-check:

jflex-notice:

javacc-uptodate-check:

javacc-notice:

init:

clover.setup:

clover.info:
 [echo] 
 [echo]   Clover not found. Code coverage reports disabled.
 [echo] 

clover:

common.compile-core:
[mkdir] Created dir: 
/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-docvalues-branch/checkout/lucene/build/classes/java
[javac] Compiling 536 source files to 
/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-docvalues-branch/checkout/lucene/build/classes/java
[javac] 
/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-docvalues-branch/checkout/lucene/src/java/org/apache/lucene/util/Version.java:80:
 warning: [dep-ann] deprecated name isnt annotated with @Deprecated
[javac]   public boolean onOrAfter(Version other) {
[javac]  ^
[javac] 
/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-docvalues-branch/checkout/lucene/src/java/org/apache/lucene/index/PerFieldCodecWrapper.java:309:
 cannot find symbol
[javac] symbol  : constructor IOException(java.lang.Exception)
[javac] location: class java.io.IOException
[javac] err = new IOException(ioe);
[javac]   ^
[javac] 
/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-docvalues-branch/checkout/lucene/src/java/org/apache/lucene/queryParser/CharStream.java:34:
 warning: [dep-ann] deprecated name isnt annotated with @Deprecated
[javac]   int getColumn();
[javac]   ^
[javac] 
/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-docvalues-branch/checkout/lucene/src/java/org/apache/lucene/queryParser/CharStream.java:41:
 warning: [dep-ann] deprecated name isnt annotated with @Deprecated
[javac]   int getLine();
[javac]   ^
[javac] Note: Some input files use or override a deprecated API.
[javac] 

[jira] [Commented] (LUCENE-3108) Land DocValues on trunk

2011-05-19 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3108?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13036284#comment-13036284
 ] 

Michael McCandless commented on LUCENE-3108:


{quote}
bq. How come codecID changed from String to int on the branch?

due to DocValues I need to compare the ID to certain fields to see for
what field I stored and need to open docValues. I always had to parse
the given string which is kind of odd. I think it's more natural to
have the same datatype on FieldInfo, SegmentCodecs and eventually in
the Codec#files() method. Making a string out of it is way simpler /
less risky than parsing IMO.
{quote}

OK that sounds great.

{quote}
bq. Can SortField somehow detect whether the needed field was stored in FC vs DV

This is tricky though. You can have a DV field that is indexed too so its hard 
to tell if we can reliably do it. If we can't make it reliable I think we 
should not do it at all.
{quote}

It is tricky... but, eg, when someone does SortField("title",
SortField.STRING), which cache (DV or FC) should we populate?

{quote}
bq. Should we rename oal.index.values.Type -> .ValueType?

agreed. I also think we should rename Source but I don't have a good name yet. 
Any idea?
{quote}

ValueSource?  (conflicts w/ FQs though) Though, maybe we can just
refer to it as DocValues.Source, then it's clear?

{quote}
bq. Since we dynamically reserve a value to mean "unset", does that mean there 
are some datasets we cannot index?

Again, tricky! The quick answer is yes, but we can't do that anyway since I 
have not normalized the range to be 0-based since PackedInts doesn't allow 
negative values. so the range we can store is (2^63) -1. So essentially with 
the current impl we can store (2^63)-2 and the max value is Long#MAX_VALUE-1. 
Currently there is no assert for this which is needed I think but to get around 
this we need to have a different impl I think or do I miss something?
{quote}

OK, but I think if we make a "straight longs" impl (ie no packed ints at all) 
then we can handle all long values?  But in that case we'd require the app to 
pick a sentinel to mean "unset"?
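A "straight longs" impl would store values unpacked, so negative values and Long.MAX_VALUE both fit, and "unset" becomes an application-chosen sentinel. A toy sketch of that idea (names hypothetical, not the branch API):

```java
import java.util.Arrays;

public class StraightLongsSketch {
    // App-chosen sentinel meaning "no value was set for this doc".
    static final long UNSET = Long.MIN_VALUE;

    public static void main(String[] args) {
        long[] perDoc = new long[4];       // straight longs: no packed ints
        Arrays.fill(perDoc, UNSET);
        perDoc[1] = -42L;                  // negative values are fine here,
        perDoc[2] = Long.MAX_VALUE;        // unlike the packed-ints impl

        int set = 0;
        for (long v : perDoc) {
            if (v != UNSET) set++;         // docs 1 and 2 carry real values
        }
        System.out.println(set);
    }
}
```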




Re: FST and FieldCache?

2011-05-19 Thread Jason Rutherglen
> And I do agree there are times when mmap is appropriate, eg if query
> latency is unimportant to you, but it's not a panacea and it comes
> with serious downsides

Do we have a benchmark of ByteBuffer vs. byte[]'s in RAM?

There's also RAM based SSDs whose performance could be comparable with
well, RAM.  Also, with our heap based field caches, the first sorted
search requires that they be loaded into RAM.  Then we don't unload
them until the reader is closed?  With MMap the unloading would happen
automatically?

On Thu, May 19, 2011 at 8:59 AM, Michael McCandless
 wrote:
> On Thu, May 19, 2011 at 10:09 AM, Jason Rutherglen
>  wrote:
>>> When
>>> you mmap them you let the OS decide when to swap stuff out which mean
>>> you pick up potentially high query latency waiting for these pages to
>>> swap back in
>>
>> Right, however if one is using lets say SSDs, and the query time is
>> less important, then MMap'ing would be fine.  Also it prevents deadly
>> OOMs in favor of basic 'slowness' of the query.  If there is no
>> performance degradation I think MMap'ing is a great option.  A common
>> use case is an index that's far too large for a given server will
>> simply not work today, whereas with MMap'ed field caches the query
>> would complete, just extremely slowly.  If the user wishes to improve
>> performance it's easy enough to add more hardware.
>
> Well, be careful: if you just don't have enough memory to accomodate
> all the RAM data structures Lucene needs... you're gonna be in trouble
> with mmap too.  True, you won't hit OOMEs anymore, but instead you'll
> be in a swap fest and your app is nearly unusable.
>
> SSDs, while orders of magnitude faster than spinning magnets, are
> still orders of magnitude slower than RAM.
>
> But, yes, they obviously help substantially.  It's a one-way door...
> you'll never go back once you've switched to SSDs.
>
> And I do agree there are times when mmap is appropriate, eg if query
> latency is unimportant to you, but it's not a panacea and it comes
> with serious downsides.
>
> I wish I could have the opposite of mmap from Java -- the ability to
> pin the pages that hold important data structures.
>
> Mike
>
> http://blog.mikemccandless.com
>




Re: Need help building JCC on windows

2011-05-19 Thread Bill Janssen
Hi, Baseer.  Not sure what's the issue with your build, but here's a bit
of bash script which I use to build JCC with mingw on Windows:

echo "-- jcc --"
export PATH="$PATH:${javahome}/jre/bin/client"
echo "PATH is $PATH"
cd ../pylucene-3.0.*/jcc
# note that this patch still works for 3.0.1/3.0.2
patch -p0 < ${patchesdir}/jcc-2.9-mingw-PATCH
export JCC_ARGSEP=";"
export JCC_JDK="$WINSTYLEJAVAHOME"
export JCC_CFLAGS="-fno-strict-aliasing;-Wno-write-strings"
export JCC_LFLAGS="-L${WINSTYLEJAVAHOME}\\lib;-ljvm"
export 
JCC_INCLUDES="${WINSTYLEJAVAHOME}\\include;${WINSTYLEJAVAHOME}\\include\\win32"
export JCC_JAVAC="${WINSTYLEJAVAHOME}\\bin\\javac.exe"
${python} setup.py build --compiler=mingw32 install 
--single-version-externally-managed --root /c/ --prefix="${distdir}"
if [ -f jcc/jcc.lib ]; then
  cp -p jcc/jcc.lib "${sitepackages}/jcc/jcc.lib"
fi
# for 3.0.2 compiled with MinGW GCC 4.x and "--shared", we also need two
# GCC libraries
if [ -f /mingw/bin/libstdc++-6.dll ]; then
  install -m 555 /mingw/bin/libstdc++-6.dll "${distdir}/bin/"
  echo "copied libstdc++-6.dll"
fi
if [ -f /mingw/bin/libgcc_s_dw2-1.dll ]; then
  install -m 555 /mingw/bin/libgcc_s_dw2-1.dll "${distdir}/bin/"
  echo "copied libgcc_s_dw2-1.dll"
fi


The patch that I apply is this:

*** setup.py2009-10-28 15:24:16.0 -0700
--- setup.py2010-03-29 22:08:56.0 -0700
***
*** 262,268 
  elif platform == 'win32':
  jcclib = 'jcc%s.lib' %(debug and '_d' or '')
  kwds["extra_link_args"] = \
! lflags + ["/IMPLIB:%s" %(os.path.join('jcc', jcclib))]
  package_data.append(jcclib)
  else:
  kwds["extra_link_args"] = lflags
--- 262,268 
  elif platform == 'win32':
  jcclib = 'jcc%s.lib' %(debug and '_d' or '')
  kwds["extra_link_args"] = \
! lflags + ["-Wl,--out-implib,%s" %(os.path.join('jcc', 
jcclib))]
  package_data.append(jcclib)
  else:
  kwds["extra_link_args"] = lflags

It makes sure to build the jcc.lib file so that I can use it in "shared" mode.

Bill


[jira] [Commented] (SOLR-2524) Adding grouping to Solr 3x

2011-05-19 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13036270#comment-13036270
 ] 

Michael McCandless commented on SOLR-2524:
--

{quote}
bq. Was this a simple TermQuery

No a MatchDocAllQuery (:)
{quote}

Ahh OK then that makes sense -- MatchAllDocsQuery is a mighty fast query to 
execute ;)  So the work done to cache it is going to be slower.

> Adding grouping to Solr 3x
> --
>
> Key: SOLR-2524
> URL: https://issues.apache.org/jira/browse/SOLR-2524
> Project: Solr
>  Issue Type: New Feature
>Affects Versions: 3.2
>Reporter: Martijn van Groningen
>Assignee: Michael McCandless
> Attachments: SOLR-2524.patch
>
>
> Grouping was recently added to Lucene 3x. See LUCENE-1421 for more 
> information.
> I think it would be nice if we expose this functionality also to the Solr 
> users that are bound to a 3.x version.
> The grouping feature added to Lucene is currently a subset of the 
> functionality that Solr 4.0-trunk offers. Mainly it doesn't support grouping 
> by function / query.
> The work involved getting the grouping contrib to work on Solr 3x is 
> acceptable. I have it more or less running here. It supports the response 
> format and request parameters (expect: group.query and group.func) described 
> in the FieldCollapse page on the Solr wiki.
> I think it would be great if this is included in the Solr 3.2 release. Many 
> people are using grouping as patch now and this would help them a lot. Any 
> thoughts?




Re: FST and FieldCache?

2011-05-19 Thread Michael McCandless
On Thu, May 19, 2011 at 10:09 AM, Jason Rutherglen
 wrote:
>> When
>> you mmap them you let the OS decide when to swap stuff out, which means
>> you pick up potentially high query latency waiting for these pages to
>> swap back in
>
> Right, however if one is using, let's say, SSDs, and the query time is
> less important, then MMap'ing would be fine.  Also it prevents deadly
> OOMs in favor of basic 'slowness' of the query.  If there is no
> performance degradation I think MMap'ing is a great option.  A common
> use case, an index that's far too large for a given server, simply
> will not work today, whereas with MMap'ed field caches the query
> would complete, just extremely slowly.  If the user wishes to improve
> performance it's easy enough to add more hardware.

Well, be careful: if you just don't have enough memory to accommodate
all the RAM data structures Lucene needs... you're gonna be in trouble
with mmap too.  True, you won't hit OOMEs anymore, but instead you'll
be in a swap fest and your app will be nearly unusable.

SSDs, while orders of magnitude faster than spinning magnets, are
still orders of magnitude slower than RAM.

But, yes, they obviously help substantially.  It's a one-way door...
you'll never go back once you've switched to SSDs.

And I do agree there are times when mmap is appropriate, eg if query
latency is unimportant to you, but it's not a panacea and it comes
with serious downsides.

I wish I could have the opposite of mmap from Java -- the ability to
pin the pages that hold important data structures.
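The trade-off being discussed can be seen with plain NIO: a memory-mapped file of fixed-width values supports random access by ordinal without heap allocation, but any access may fault a page in. A minimal, self-contained sketch (not Lucene code; the file layout and values here are made up for illustration):

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class MmapOrdDemo {
    public static void main(String[] args) throws IOException {
        // Write a small file of fixed-width longs (a stand-in for a packed
        // on-disk field cache; the values are arbitrary test data).
        Path file = Files.createTempFile("ords", ".bin");
        file.toFile().deleteOnExit();
        ByteBuffer out = ByteBuffer.allocate(8 * 100);
        for (long ord = 0; ord < 100; ord++) {
            out.putLong(ord * 7);
        }
        out.flip();
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.WRITE)) {
            ch.write(out);
        }

        // Memory-map the file: the OS pages data in on demand instead of the
        // JVM holding it all on the heap -- no OOME, but a page fault on any
        // access can add unpredictable latency, the downside described above.
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            MappedByteBuffer map = ch.map(FileChannel.MapMode.READ_ONLY, 0, ch.size());
            System.out.println(map.getLong(42 * 8)); // random access by ordinal; prints 294
        }
    }
}
```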

Mike

http://blog.mikemccandless.com




Re: FST and FieldCache?

2011-05-19 Thread David Smiley (@MITRE.org)
Wow, 17 replies to my email overnight! This is clearly an interesting topic
to folks.

Hi Dawid.
Sadly, I won't be at Lucene Revolution next week. That's where all the cool
kids will be; I'll be home and be square. I made it to O'Reilly Strata in
February (a great conference) and I'll be presenting at Basis's "Open Source
Search Conference" (government customer focused) mid-June.  I've used up my
conference budget for this fiscal year.

Yes, the use-case here is a unique integer reference to a String that can be
looked up fairly quickly, whereas the set of all strings is kept in a compressed
data structure that won't change after it's built. A bonus benefit would be
that this integer is a sortable substitute for the String.  Your observation
that this integer is a perfect hash is astute.

I wonder if Lucene could store this FST on-disk for the bytes in a segment
instead of what it's doing now? Read-time construction would be super-fast,
though for multi-segment indexes, I suppose they'd need to be merged.

I expect that this use-case would be particularly useful when you know that the 
set of strings tends to have many prefixes in common, such as when 
EdgeNGramming (applications: query-complete, hierarchical faceting, prefix/tree 
based geospatial indexing).
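The perfect-hash contract under discussion boils down to this: sorted, deduplicated terms get their index as a unique, sortable int. The toy sketch below uses a plain sorted array where an FST with numeric outputs would provide the same ord/term mapping far more compactly (a hypothetical illustration of the contract, not the Lucene FST API):

```java
import java.util.Arrays;

public class OrdMapDemo {
    // Sorted, deduplicated terms: a term's ord is simply its index.  An FST
    // with numeric outputs can recover the same ord (and do the reverse
    // lookup) without storing each String whole, sharing prefixes instead.
    static final String[] TERMS = {"app", "apple", "apply", "banana", "band"};

    static int ord(String term) {           // String -> unique, sortable int
        int i = Arrays.binarySearch(TERMS, term);
        return i >= 0 ? i : -1;             // -1: not in the closed term set
    }

    static String lookup(int ord) {         // int -> String
        return TERMS[ord];
    }

    public static void main(String[] args) {
        int o = ord("banana");
        System.out.println(o);              // prints 3
        System.out.println(lookup(o));      // prints banana
    }
}
```

Because the ints preserve sort order, comparing ords is equivalent to comparing the Strings themselves, which is the "sortable substitute" property mentioned above.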

~ David Smiley


Dawid Weiss wrote:
> 
> Hi David,
> 
>> but with less memory.  As I understand it, FSTs are a highly compressed
>> representation of a set of Strings (among other possibilities).  The
> 
> Yep. Not only, but this is one of the use cases. Will you be at Lucene
> Revolution next week? I'll be talking about it there.
> 
>> representation of a set of Strings (among other possibilities).  The
>> fieldCache would need to point to an FST entry (an "arc"?) using
>> something
>> small, say an integer.  Is there a way to point to an FST entry with an
>> integer, and then somehow with relative efficiency construct the String
>> from
>> the arcs to get there?
> 
> Correct me if my understanding is wrong: you'd like to assign a unique
> integer to each String and then retrieve it by this integer (something
> like a Map<Integer, String>)? This would be something called perfect
> hashing, and this can be done on top of an automaton (fairly easily). I assume
> the data structure is immutable once constructed and does not change
> too often, right?
> 
> Dawid
> 
> -
> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
> For additional commands, e-mail: dev-h...@lucene.apache.org
> 

-
 Author: https://www.packtpub.com/solr-1-4-enterprise-search-server/book
--
View this message in context: 
http://lucene.472066.n3.nabble.com/FST-and-FieldCache-tp2960030p2961954.html
Sent from the Lucene - Java Developer mailing list archive at Nabble.com.




[jira] [Updated] (SOLR-2500) TestSolrCoreProperties sometimes fails with "no such core: core0"

2011-05-19 Thread Robert Muir (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-2500?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Muir updated SOLR-2500:
--

Attachment: SOLR-2500.patch

I guess the real question is: why doesn't the test work if rewritten like this?

Bug in TestHarness?
Bug in CoreContainer/properties loading functionality itself?

> TestSolrCoreProperties sometimes fails with "no such core: core0"
> -
>
> Key: SOLR-2500
> URL: https://issues.apache.org/jira/browse/SOLR-2500
> Project: Solr
>  Issue Type: Bug
>Affects Versions: 4.0
>Reporter: Robert Muir
> Attachments: SOLR-2500.patch, solr-after-1st-run.xml, solr-clean.xml
>
>
> [junit] Testsuite: 
> org.apache.solr.client.solrj.embedded.TestSolrProperties
> [junit] Testcase: 
> testProperties(org.apache.solr.client.solrj.embedded.TestSolrProperties): 
> Caused an ERROR
> [junit] No such core: core0
> [junit] org.apache.solr.common.SolrException: No such core: core0
> [junit] at 
> org.apache.solr.client.solrj.embedded.EmbeddedSolrServer.request(EmbeddedSolrServer.java:118)
> [junit] at 
> org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:105)
> [junit] at 
> org.apache.solr.client.solrj.embedded.TestSolrProperties.testProperties(TestSolrProperties.java:128)
> [junit] at 
> org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1260)
> [junit] at 
> org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1189)




[jira] [Updated] (SOLR-2500) TestSolrCoreProperties sometimes fails with "no such core: core0"

2011-05-19 Thread Doron Cohen (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-2500?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Doron Cohen updated SOLR-2500:
--

Attachment: solr-after-1st-run.xml
solr-clean.xml

solr.xml files from trunk/bin/solr/shared:
- clean - with which the test passes.
- after-1st-run - with which it fails.

> TestSolrCoreProperties sometimes fails with "no such core: core0"
> -
>
> Key: SOLR-2500
> URL: https://issues.apache.org/jira/browse/SOLR-2500
> Project: Solr
>  Issue Type: Bug
>Affects Versions: 4.0
>Reporter: Robert Muir
> Attachments: solr-after-1st-run.xml, solr-clean.xml
>
>
> [junit] Testsuite: 
> org.apache.solr.client.solrj.embedded.TestSolrProperties
> [junit] Testcase: 
> testProperties(org.apache.solr.client.solrj.embedded.TestSolrProperties): 
> Caused an ERROR
> [junit] No such core: core0
> [junit] org.apache.solr.common.SolrException: No such core: core0
> [junit] at 
> org.apache.solr.client.solrj.embedded.EmbeddedSolrServer.request(EmbeddedSolrServer.java:118)
> [junit] at 
> org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:105)
> [junit] at 
> org.apache.solr.client.solrj.embedded.TestSolrProperties.testProperties(TestSolrProperties.java:128)
> [junit] at 
> org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1260)
> [junit] at 
> org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1189)




[jira] [Commented] (SOLR-2500) TestSolrCoreProperties sometimes fails with "no such core: core0"

2011-05-19 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2500?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13036243#comment-13036243
 ] 

Robert Muir commented on SOLR-2500:
---

{quote}
In Eclipse, after cleaning the project the test passes, and then starts failing 
in all successive runs. 
{quote}

FYI this is the behavior I've noticed when running the test from Ant also... a 
'clean' seems to work around the issue...

> TestSolrCoreProperties sometimes fails with "no such core: core0"
> -
>
> Key: SOLR-2500
> URL: https://issues.apache.org/jira/browse/SOLR-2500
> Project: Solr
>  Issue Type: Bug
>Affects Versions: 4.0
>Reporter: Robert Muir
>
> [junit] Testsuite: 
> org.apache.solr.client.solrj.embedded.TestSolrProperties
> [junit] Testcase: 
> testProperties(org.apache.solr.client.solrj.embedded.TestSolrProperties): 
> Caused an ERROR
> [junit] No such core: core0
> [junit] org.apache.solr.common.SolrException: No such core: core0
> [junit] at 
> org.apache.solr.client.solrj.embedded.EmbeddedSolrServer.request(EmbeddedSolrServer.java:118)
> [junit] at 
> org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:105)
> [junit] at 
> org.apache.solr.client.solrj.embedded.TestSolrProperties.testProperties(TestSolrProperties.java:128)
> [junit] at 
> org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1260)
> [junit] at 
> org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1189)




[jira] [Commented] (SOLR-2500) TestSolrCoreProperties sometimes fails with "no such core: core0"

2011-05-19 Thread Doron Cohen (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2500?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13036242#comment-13036242
 ] 

Doron Cohen commented on SOLR-2500:
---

From Eclipse (XP), passed at the 1st attempt, failed at the 2nd!

I am not familiar with this part of the code so it would be too much work to 
track it all the way myself, but I think I can now provide sufficient 
information for solving it.

In Eclipse, after cleaning the project the test passes, and then starts failing 
in all successive runs. 
So I assume when you run it in isolation you also do a clean, which covers 
Eclipse's clean (and more). 

I tracked the content of the relevant cleaned dir before and after the test - 
it is (trunk/)bin/solr - and there's only one file that differs between the 
runs: bin/solr/shared/solr.xml.

Not sure if this is a bug in the test not cleaning up after itself or a bug in 
the code that reads the configuration...

I'll attach the two files here so that you can compare them.


> TestSolrCoreProperties sometimes fails with "no such core: core0"
> -
>
> Key: SOLR-2500
> URL: https://issues.apache.org/jira/browse/SOLR-2500
> Project: Solr
>  Issue Type: Bug
>Affects Versions: 4.0
>Reporter: Robert Muir
>
> [junit] Testsuite: 
> org.apache.solr.client.solrj.embedded.TestSolrProperties
> [junit] Testcase: 
> testProperties(org.apache.solr.client.solrj.embedded.TestSolrProperties): 
> Caused an ERROR
> [junit] No such core: core0
> [junit] org.apache.solr.common.SolrException: No such core: core0
> [junit] at 
> org.apache.solr.client.solrj.embedded.EmbeddedSolrServer.request(EmbeddedSolrServer.java:118)
> [junit] at 
> org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:105)
> [junit] at 
> org.apache.solr.client.solrj.embedded.TestSolrProperties.testProperties(TestSolrProperties.java:128)
> [junit] at 
> org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1260)
> [junit] at 
> org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1189)




[jira] [Commented] (SOLR-2530) Remove Noggit CharArr from FieldType

2011-05-19 Thread Yonik Seeley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2530?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13036240#comment-13036240
 ] 

Yonik Seeley commented on SOLR-2530:


Minor nit: renaming bigTerm to UnicodeUtil.BIG_UTF8_TERM is a bit misleading 
since it's not UTF8 at all.

> Remove Noggit CharArr from FieldType
> 
>
> Key: SOLR-2530
> URL: https://issues.apache.org/jira/browse/SOLR-2530
> Project: Solr
>  Issue Type: Improvement
>  Components: Schema and Analysis
>Affects Versions: 4.0
>Reporter: Simon Willnauer
>Assignee: Simon Willnauer
>Priority: Minor
>  Labels: api-change
> Fix For: 4.0
>
> Attachments: SOLR-2530.patch
>
>
> FieldType#indexedToReadable(BytesRef, CharArr) uses a noggit dependency that 
> also spreads into ByteUtils. The uses of this method all convert to String, 
> which makes this extra reference and the dependency unnecessary. I refactored 
> it to simply return String and removed ByteUtils entirely. The only leftover 
> from ByteUtils is a constant; I moved that one to Lucene's UnicodeUtil. I will 
> upload a patch in a second.




Re: FST and FieldCache?

2011-05-19 Thread David Smiley (@MITRE.org)

Michael McCandless-2 wrote:
> 
> I think a more productive area of exploration (to reduce RAM usage)
> would be to make a StringFieldComparator that doesn't need full access
> to all terms data, ie, operates per segment yet only does a "few" ord
> lookups when merging the results across segments.  If "few" is small
> enough we can just use the seek-by-ord from the terms dict to do
> them.  This would be a huge RAM reduction because we could then sort
> by string fields (eg "title" field) without needing all term bytes
> randomly accessible.
> 
> Mike
> 

Yes!  I don't want to put all my titles into RAM just to sort documents by
them when I know Lucene has indexed the titles in sorted order on disk
already.  Of course the devil is in the details.

~ David Smiley

-
 Author: https://www.packtpub.com/solr-1-4-enterprise-search-server/book
--
View this message in context: 
http://lucene.472066.n3.nabble.com/FST-and-FieldCache-tp2960030p2961687.html
Sent from the Lucene - Java Developer mailing list archive at Nabble.com.




[jira] [Commented] (SOLR-2530) Remove Noggit CharArr from FieldType

2011-05-19 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2530?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13036224#comment-13036224
 ] 

Robert Muir commented on SOLR-2530:
---

My recommendation: add CharsRef. We already have BytesRef and IntsRef...
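For context, such a class would presumably mirror BytesRef: a slice (array, offset, length) into a shared char[] that can be inspected and reused without allocating a String per value. A minimal hypothetical sketch, not the actual class Lucene might add:

```java
public class CharsRefDemo {
    // A slice into a shared char[]: offset + length, like BytesRef for bytes.
    // Implementing CharSequence lets it be matched against patterns etc.
    // without first materializing a String.
    static final class CharsRef implements CharSequence {
        char[] chars;
        int offset;
        int length;

        CharsRef(char[] chars, int offset, int length) {
            this.chars = chars;
            this.offset = offset;
            this.length = length;
        }

        @Override public int length() { return length; }
        @Override public char charAt(int i) { return chars[offset + i]; }
        @Override public CharSequence subSequence(int start, int end) {
            return new CharsRef(chars, offset + start, end - start);
        }
        @Override public String toString() { return new String(chars, offset, length); }
    }

    public static void main(String[] args) {
        char[] buf = "no such core: core0".toCharArray();
        CharsRef ref = new CharsRef(buf, 14, 5); // points at "core0" without copying
        System.out.println(ref);                  // prints core0
    }
}
```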

> Remove Noggit CharArr from FieldType
> 
>
> Key: SOLR-2530
> URL: https://issues.apache.org/jira/browse/SOLR-2530
> Project: Solr
>  Issue Type: Improvement
>  Components: Schema and Analysis
>Affects Versions: 4.0
>Reporter: Simon Willnauer
>Assignee: Simon Willnauer
>Priority: Minor
>  Labels: api-change
> Fix For: 4.0
>
> Attachments: SOLR-2530.patch
>
>
> FieldType#indexedToReadable(BytesRef, CharArr) uses a noggit dependency that 
> also spreads into ByteUtils. The uses of this method all convert to String, 
> which makes this extra reference and the dependency unnecessary. I refactored 
> it to simply return String and removed ByteUtils entirely. The only leftover 
> from ByteUtils is a constant; I moved that one to Lucene's UnicodeUtil. I will 
> upload a patch in a second.




[jira] [Updated] (SOLR-1942) Ability to select codec per field

2011-05-19 Thread Robert Muir (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1942?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Muir updated SOLR-1942:
--

Attachment: SOLR-1942.patch

Updated patch with Simon's previous suggestions.

A few more things I saw that I'm not sure I like:
* the CodecProvider syntax in the test config is cool, but I'm not sure this 
should be done in SolrCore? I think if you want a CP that loads up codecs by 
classname like this, it should be done in a CodecProviderFactory (which can 
parse its arguments however it wants).
* I think it's confusing how the SchemaCodecProvider answers codec requests 
in 3 ways: 1. from the 'delegate' in SolrConfig, 2. from the schema, and 3. 
from the default codecProvider. If you try to use this, it's easy to get 
yourself into a situation where solrconfig conflicts with the schema. I also 
don't think we need to bother with the 'defaultCP'; in other words, if you 
specify a custom codec provider, it should be the only one that is used.

> Ability to select codec per field
> -
>
> Key: SOLR-1942
> URL: https://issues.apache.org/jira/browse/SOLR-1942
> Project: Solr
>  Issue Type: New Feature
>Affects Versions: 4.0
>Reporter: Yonik Seeley
>Assignee: Grant Ingersoll
> Fix For: 4.0
>
> Attachments: SOLR-1942.patch, SOLR-1942.patch, SOLR-1942.patch, 
> SOLR-1942.patch, SOLR-1942.patch, SOLR-1942.patch, SOLR-1942.patch, 
> SOLR-1942.patch, SOLR-1942.patch, SOLR-1942.patch
>
>
> We should use PerFieldCodecWrapper to allow users to select the codec 
> per-field.




[jira] [Commented] (SOLR-2530) Remove Noggit CharArr from FieldType

2011-05-19 Thread Yonik Seeley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2530?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13036219#comment-13036219
 ] 

Yonik Seeley commented on SOLR-2530:


There are some efficiency losses here:
- A reusable CharArr allows one to avoid extra object creation.  See 
TermsComponent which can update a CharArr and then compare it against a pattern 
w/o having to create a String object.
- We should not replace the previous toString with BytesRef.utf8String... it's 
much slower, especially for small strings like the ones that will be common here.

So rather than just removing ByteUtils.UTF8toUTF16, how about moving it to 
BytesRef and using it in BytesRef.utf8String?
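The method in question is essentially a tight UTF-8 to UTF-16 decode loop writing into a reusable char[]. A simplified sketch of that idea (BMP-only, no 4-byte/surrogate handling and no malformed-input checks, unlike the real ByteUtils code; for illustration only):

```java
public class Utf8ToUtf16Demo {
    // Decode UTF-8 from bytes[off..off+len) into out, returning the char
    // count.  The caller owns and reuses out, so no String (or other
    // per-call object) is allocated.
    static int utf8ToUtf16(byte[] bytes, int off, int len, char[] out) {
        int outLen = 0;
        int end = off + len;
        while (off < end) {
            int b = bytes[off++] & 0xff;
            if (b < 0x80) {                 // 1-byte (ASCII)
                out[outLen++] = (char) b;
            } else if (b < 0xe0) {          // 2-byte sequence
                out[outLen++] = (char) (((b & 0x1f) << 6)
                        | (bytes[off++] & 0x3f));
            } else {                        // 3-byte sequence
                out[outLen++] = (char) (((b & 0x0f) << 12)
                        | ((bytes[off++] & 0x3f) << 6)
                        | (bytes[off++] & 0x3f));
            }
        }
        return outLen;
    }

    public static void main(String[] args) {
        byte[] utf8 = "h\u00e9llo".getBytes(java.nio.charset.StandardCharsets.UTF_8);
        char[] buf = new char[utf8.length]; // UTF-16 length <= UTF-8 byte count
        int n = utf8ToUtf16(utf8, 0, utf8.length, buf);
        System.out.println(n);              // prints 5
        System.out.println((int) buf[1]);   // prints 233 (U+00E9)
    }
}
```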

> Remove Noggit CharArr from FieldType
> 
>
> Key: SOLR-2530
> URL: https://issues.apache.org/jira/browse/SOLR-2530
> Project: Solr
>  Issue Type: Improvement
>  Components: Schema and Analysis
>Affects Versions: 4.0
>Reporter: Simon Willnauer
>Assignee: Simon Willnauer
>Priority: Minor
>  Labels: api-change
> Fix For: 4.0
>
> Attachments: SOLR-2530.patch
>
>
> FieldType#indexedToReadable(BytesRef, CharArr) uses a noggit dependency that 
> also spreads into ByteUtils. The uses of this method all convert to String, 
> which makes this extra reference and the dependency unnecessary. I refactored 
> it to simply return String and removed ByteUtils entirely. The only leftover 
> from ByteUtils is a constant; I moved that one to Lucene's UnicodeUtil. I will 
> upload a patch in a second.




Re: [jira] [Created] (SOLR-2529) DIH update trouble with sql field name "pk"

2011-05-19 Thread Erick Erickson
Could you identify what you think the problem is?

Erick

On Thu, May 19, 2011 at 9:45 AM, Thomas Gambier (JIRA)  wrote:
> DIH update trouble with sql field name "pk"
> ---
>
>                 Key: SOLR-2529
>                 URL: https://issues.apache.org/jira/browse/SOLR-2529
>             Project: Solr
>          Issue Type: Bug
>          Components: contrib - DataImportHandler
>    Affects Versions: 3.1, 3.2
>         Environment: Debian Lenny, JRE 6
>            Reporter: Thomas Gambier
>            Priority: Blocker
>
>
> We are unable to use the DIH when the database primary key column is named 
> "pk".
>
> The reported solr error is :
> "deltaQuery has no column to resolve to declared primary key pk='pk'"
>
> We made some investigations and found that the DIH has a bug when it's looking 
> for the primary key among the row's columns.
>
> private String findMatchingPkColumn(String pk, Map<String, Object> row) {
>   if (row.containsKey(pk))
>     throw new IllegalArgumentException(
>       String.format("deltaQuery returned a row with null for primary key %s", pk));
>   String resolvedPk = null;
>   for (String columnName : row.keySet()) {
>     if (columnName.endsWith("." + pk) || pk.endsWith("." + columnName)) {
>       if (resolvedPk != null)
>         throw new IllegalArgumentException(
>           String.format(
>             "deltaQuery has more than one column (%s and %s) that might resolve to declared primary key pk='%s'",
>             resolvedPk, columnName, pk));
>       resolvedPk = columnName;
>     }
>   }
>   if (resolvedPk == null)
>     throw new IllegalArgumentException(
>       String.format("deltaQuery has no column to resolve to declared primary key pk='%s'", pk));
>   LOG.info(String.format("Resolving deltaQuery column '%s' to match entity's declared pk '%s'", resolvedPk, pk));
>   return resolvedPk;
> }
>
>



[jira] [Updated] (SOLR-2530) Remove Noggit CharArr from FieldType

2011-05-19 Thread Simon Willnauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-2530?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Simon Willnauer updated SOLR-2530:
--

Attachment: SOLR-2530.patch

Here is a patch.

> Remove Noggit CharArr from FieldType
> 
>
> Key: SOLR-2530
> URL: https://issues.apache.org/jira/browse/SOLR-2530
> Project: Solr
>  Issue Type: Improvement
>  Components: Schema and Analysis
>Affects Versions: 4.0
>Reporter: Simon Willnauer
>Assignee: Simon Willnauer
>Priority: Minor
>  Labels: api-change
> Fix For: 4.0
>
> Attachments: SOLR-2530.patch
>
>
> FieldType#indexedToReadable(BytesRef, CharArr) uses a noggit dependency that 
> also spreads into ByteUtils. The uses of this method all convert to String, 
> which makes this extra reference and the dependency unnecessary. I refactored 
> it to simply return String and removed ByteUtils entirely. The only leftover 
> from ByteUtils is a constant; I moved that one to Lucene's UnicodeUtil. I will 
> upload a patch in a second.




[jira] [Created] (SOLR-2530) Remove Noggit CharArr from FieldType

2011-05-19 Thread Simon Willnauer (JIRA)
Remove Noggit CharArr from FieldType


 Key: SOLR-2530
 URL: https://issues.apache.org/jira/browse/SOLR-2530
 Project: Solr
  Issue Type: Improvement
  Components: Schema and Analysis
Affects Versions: 4.0
Reporter: Simon Willnauer
Assignee: Simon Willnauer
Priority: Minor
 Fix For: 4.0


FieldType#indexedToReadable(BytesRef, CharArr) uses a noggit dependency that 
also spreads into ByteUtils. The uses of this method all convert to String, 
which makes this extra reference and the dependency unnecessary. I refactored 
it to simply return String and removed ByteUtils entirely. The only leftover 
from ByteUtils is a constant; I moved that one to Lucene's UnicodeUtil. I will 
upload a patch in a second.




Re: FST and FieldCache?

2011-05-19 Thread Jason Rutherglen
> When
> you mmap them you let the OS decide when to swap stuff out, which means
> you pick up potentially high query latency waiting for these pages to
> swap back in

Right, however if one is using, let's say, SSDs, and the query time is
less important, then MMap'ing would be fine.  Also it prevents deadly
OOMs in favor of basic 'slowness' of the query.  If there is no
performance degradation I think MMap'ing is a great option.  A common
use case, an index that's far too large for a given server, simply
will not work today, whereas with MMap'ed field caches the query
would complete, just extremely slowly.  If the user wishes to improve
performance it's easy enough to add more hardware.

On Thu, May 19, 2011 at 6:40 AM, Michael McCandless
 wrote:
> On Thu, May 19, 2011 at 9:22 AM, Jason Rutherglen
>  wrote:
>
>>> maybe that's because we have one huge monolithic implementation
>>
>> Doesn't the DocValues branch solve this?
>
> Hopefully DocValues will replace FieldCache over time; maybe some day
> we can deprecate & remove FieldCache.
>
> But we still have work to do there, I believe; eg we don't have
> comparators for all types (on the docvalues branch) yet.
>
>> Also, instead of trying to implement clever ways of compressing
>> strings in the field cache, which probably won't bear fruit, I'd
>> prefer to look at [eventually] MMap'ing (using DV) the field caches to
>> avoid the loading and heap costs, which are significant.  I'm not sure
>> if we can easily MMap packed ints and the shared byte[], though it
>> seems fairly doable?
>
> In fact, the packed ints and the byte[] packing of terms data is very
> much amenable/necessary for using MMap, far more so than the separate
> objects we had before.
>
> I agree we should make an mmap option, though I would generally
> recommend against apps using mmap for these caches.  We load these
> caches so that we'll have fast random access to potentially a great
> many documents during collection of one query (eg for sorting).  When
> you mmap them you let the OS decide when to swap stuff out, which means
> you pick up potentially high query latency waiting for these pages to
> swap back in.  Various other data structures in Lucene need this fast
> random access (norms, del docs, terms index) and that's why we put
> them in RAM.  I do agree for all else (the large postings), MMap is
> great.
>
> Of course the OS swaps out process RAM anyway, so... it's kinda moot
> (unless you've fixed your OS to not do this, which I always do!).
>
> I think a more productive area of exploration (to reduce RAM usage)
> would be to make a StringFieldComparator that doesn't need full access
> to all terms data, ie, operates per segment yet only does a "few" ord
> lookups when merging the results across segments.  If "few" is small
> enough we can just use the seek-by-ord from the terms dict to do
> them.  This would be a huge RAM reduction because we could then sort
> by string fields (eg "title" field) without needing all term bytes
> randomly accessible.
>
> Mike
>
> http://blog.mikemccandless.com
>



[jira] [Commented] (SOLR-1942) Ability to select codec per field

2011-05-19 Thread Simon Willnauer (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1942?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13036189#comment-13036189
 ] 

Simon Willnauer commented on SOLR-1942:
---

bq. OK I opened LUCENE-3124 for this

+1 thanks! good point!

> Ability to select codec per field
> -
>
> Key: SOLR-1942
> URL: https://issues.apache.org/jira/browse/SOLR-1942
> Project: Solr
>  Issue Type: New Feature
>Affects Versions: 4.0
>Reporter: Yonik Seeley
>Assignee: Grant Ingersoll
> Fix For: 4.0
>
> Attachments: SOLR-1942.patch, SOLR-1942.patch, SOLR-1942.patch, 
> SOLR-1942.patch, SOLR-1942.patch, SOLR-1942.patch, SOLR-1942.patch, 
> SOLR-1942.patch, SOLR-1942.patch
>
>
> We should use PerFieldCodecWrapper to allow users to select the codec 
> per-field.




RE: [Lucene.Net] [jira] [Commented] (LUCENENET-412) Replacing ArrayLists, Hashtables etc. with appropriate Generics.

2011-05-19 Thread Digy
Not just this version; Lucene.Net 2.9.4 can also (in theory) read an index 
created with 3.0.3. But I haven't tested it myself.
DIGY.

-Original Message-
From: Alexander Bauer [mailto:a...@familie-bauer.info] 
Sent: Thursday, May 19, 2011 8:37 AM
To: lucene-net-...@lucene.apache.org
Subject: Re: [Lucene.Net] [jira] [Commented] (LUCENENET-412) Replacing 
ArrayLists, Hashtables etc. with appropriate Generics.


Can i use this version with an existing index based on lucene.Java 3.0.3 ?

Alex


On 19.05.2011 00:20, Digy (JIRA) wrote:
>  [ 
> https://issues.apache.org/jira/browse/LUCENENET-412?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13035795#comment-13035795
>  ]
>
> Digy commented on LUCENENET-412:
> 
>
> Hi All,
>
> Lucene.Net 2.9.4g is almost ready for testing & feedback.
>
> While injecting generics & making some cleanup in the code, I tried to stay 
> as close to lucene 3.0.3 as possible.
> Therefore its position is somewhere between lucene.Java 2.9.4 & 3.0.3
>
> DIGY
>
>
> PS: For those who might want to try this version:
> It probably won't be a drop-in replacement since there are a few API changes 
> like
> - StopAnalyzer(List  stopWords)
> - Query.ExtractTerms(ICollection)
> - TopDocs.*TotalHits*, TopDocs.*ScoreDocs*
> and some removed methods/classes like
> - Filter.Bits
> - JustCompileSearch
> - Contrib/Similarity.Net
>
>
>
>
>> Replacing ArrayLists, Hashtables etc. with appropriate Generics.
>> 
>>
>>  Key: LUCENENET-412
>>  URL: https://issues.apache.org/jira/browse/LUCENENET-412
>>  Project: Lucene.Net
>>   Issue Type: Improvement
>> Affects Versions: Lucene.Net 2.9.4
>> Reporter: Digy
>> Priority: Minor
>>  Fix For: Lucene.Net 2.9.4
>>
>>  Attachments: IEquatable for Query&Subclasses.patch, 
>> LUCENENET-412.patch, lucene_2.9.4g_exceptions_fix
>>
>>
>> This will move Lucene.Net.2.9.4 closer to lucene.3.0.3 and allow some 
>> performance gains.
> --
> This message is automatically generated by JIRA.
> For more information on JIRA, see: http://www.atlassian.com/software/jira
>



[jira] [Created] (SOLR-2529) DIH update trouble with sql field name "pk"

2011-05-19 Thread Thomas Gambier (JIRA)
DIH update trouble with sql field name "pk"
---

 Key: SOLR-2529
 URL: https://issues.apache.org/jira/browse/SOLR-2529
 Project: Solr
  Issue Type: Bug
  Components: contrib - DataImportHandler
Affects Versions: 3.1, 3.2
 Environment: Debian Lenny, JRE 6
Reporter: Thomas Gambier
Priority: Blocker


We are unable to use the DIH when the database primary key column is named "pk".

The reported Solr error is:
"deltaQuery has no column to resolve to declared primary key pk='pk'"

We investigated and found that the DIH makes a mistake when looking for the 
primary key among the row's columns.

private String findMatchingPkColumn(String pk, Map<String, Object> row) {
  if (row.containsKey(pk))
    throw new IllegalArgumentException(String.format(
        "deltaQuery returned a row with null for primary key %s", pk));
  String resolvedPk = null;
  for (String columnName : row.keySet()) {
    if (columnName.endsWith("." + pk) || pk.endsWith("." + columnName)) {
      if (resolvedPk != null)
        throw new IllegalArgumentException(String.format(
            "deltaQuery has more than one column (%s and %s) that might resolve to declared primary key pk='%s'",
            resolvedPk, columnName, pk));
      resolvedPk = columnName;
    }
  }
  if (resolvedPk == null)
    throw new IllegalArgumentException(String.format(
        "deltaQuery has no column to resolve to declared primary key pk='%s'", pk));
  LOG.info(String.format(
      "Resolving deltaQuery column '%s' to match entity's declared pk '%s'",
      resolvedPk, pk));
  return resolvedPk;
}
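The suffix-matching loop above can be exercised in isolation. The following standalone sketch (a hypothetical class, not the actual DIH source; the driver-side case-folding is an assumption, since the report does not name the root cause) shows that a column whose reported name differs from the declared pk, e.g. an upper-cased "PK", matches neither suffix branch and falls through to the "no column to resolve" error:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical standalone copy of the suffix-matching rule only (the
// containsKey and duplicate-match throws are omitted). It matches a column
// "X.pk" against declared pk "pk", or column "pk" against declared "X.pk",
// but a name that differs in case matches neither branch.
public class PkMatchSketch {
  static String findMatchingPkColumn(String pk, Map<String, Object> row) {
    String resolvedPk = null;
    for (String columnName : row.keySet()) {
      if (columnName.endsWith("." + pk) || pk.endsWith("." + columnName)) {
        resolvedPk = columnName;
      }
    }
    return resolvedPk; // null => "deltaQuery has no column to resolve ..." path
  }

  public static void main(String[] args) {
    Map<String, Object> row = new LinkedHashMap<>();
    row.put("PK", 42);                                    // e.g. a driver that upper-cases names
    System.out.println(findMatchingPkColumn("pk", row));  // null -> error path
    row.clear();
    row.put("item.pk", 42);                               // a qualified name does match
    System.out.println(findMatchingPkColumn("pk", row));  // item.pk
  }
}
```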



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-1942) Ability to select codec per field

2011-05-19 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1942?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13036184#comment-13036184
 ] 

Robert Muir commented on SOLR-1942:
---

OK I opened LUCENE-3124 for this

> Ability to select codec per field
> -
>
> Key: SOLR-1942
> URL: https://issues.apache.org/jira/browse/SOLR-1942
> Project: Solr
>  Issue Type: New Feature
>Affects Versions: 4.0
>Reporter: Yonik Seeley
>Assignee: Grant Ingersoll
> Fix For: 4.0
>
> Attachments: SOLR-1942.patch, SOLR-1942.patch, SOLR-1942.patch, 
> SOLR-1942.patch, SOLR-1942.patch, SOLR-1942.patch, SOLR-1942.patch, 
> SOLR-1942.patch, SOLR-1942.patch
>
>
> We should use PerFieldCodecWrapper to allow users to select the codec 
> per-field.




[jira] [Created] (LUCENE-3124) review CodecProvider/CoreCodecProvider/SchemaCodecProvider hierarchy

2011-05-19 Thread Robert Muir (JIRA)
review CodecProvider/CoreCodecProvider/SchemaCodecProvider hierarchy


 Key: LUCENE-3124
 URL: https://issues.apache.org/jira/browse/LUCENE-3124
 Project: Lucene - Java
  Issue Type: Task
Reporter: Robert Muir


As mentioned on SOLR-1942, I think we should revisit the CodecProvider 
hierarchy.

It's a little confusing that the class itself isn't really "abstract" but is 
rather an overridable implementation.

One idea would be to make CodecProvider an interface, with Lucene using a 
simple hashmap-backed impl and Solr using the schema-backed impl. This would be 
in line with how SimilarityProvider was done.

It would also be good to review all the methods in CodecProvider and see if we 
can minimize the interface...





Re: FST and FieldCache?

2011-05-19 Thread Michael McCandless
On Thu, May 19, 2011 at 9:22 AM, Jason Rutherglen
 wrote:

>> maybe thats because we have one huge monolithic implementation
>
> Doesn't the DocValues branch solve this?

Hopefully DocValues will replace FieldCache over time; maybe some day
we can deprecate & remove FieldCache.

But we still have work to do there, I believe; eg we don't have
comparators for all types (on the docvalues branch) yet.

> Also, instead of trying to implement clever ways of compressing
> strings in the field cache, which probably won't bear fruit, I'd
> prefer to look at [eventually] MMap'ing (using DV) the field caches to
> avoid the loading and heap costs, which are significant.  I'm not sure
> if we can easily MMap packed ints and the shared byte[], though it
> seems fairly doable?

In fact, the packed ints and the byte[] packing of terms data is very
much amenable/necessary for using MMap, far more so than the separate
objects we had before.

I agree we should make an mmap option, though I would generally
recommend against apps using mmap for these caches.  We load these
caches so that we'll have fast random access to potentially a great
many documents during collection of one query (eg for sorting).  When
you mmap them you let the OS decide when to swap stuff out, which means
you pick up potentially high query latency waiting for these pages to
swap back in.  Various other data structures in Lucene need this fast
random access (norms, del docs, terms index) and that's why we put
them in RAM.  I do agree that for all else (the large postings), MMap is
great.

Of course the OS swaps out process RAM anyway, so... it's kinda moot
(unless you've fixed your OS to not do this, which I always do!).

I think a more productive area of exploration (to reduce RAM usage)
would be to make a StringFieldComparator that doesn't need full access
to all terms data, ie, operates per segment yet only does a "few" ord
lookups when merging the results across segments.  If "few" is small
enough we can just use the seek-by-ord from the terms dict to do
them.  This would be a huge RAM reduction because we could then sort
by string fields (eg "title" field) without needing all term bytes
randomly accessible.
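A toy sketch of that comparator idea (all names here are hypothetical, not Lucene API): within a segment, docs compare by integer ord only; at merge time, just the per-segment winners are resolved to term bytes via a seek-by-ord-style lookup.

```java
import java.util.*;

public class OrdMergeSketch {
  // A toy "segment": a sorted term dictionary plus a per-doc ord. lookupOrd
  // stands in for the terms dict's seek-by-ord, and counts how often it is hit.
  static class Segment {
    final String[] termDict;   // sorted unique terms
    final int[] docOrds;       // ord per document
    int lookups = 0;
    Segment(String[] termDict, int[] docOrds) {
      this.termDict = termDict;
      this.docOrds = docOrds;
    }
    String lookupOrd(int ord) { lookups++; return termDict[ord]; }
  }

  // Find the globally smallest term: int-only compares inside each segment,
  // one ord-to-term lookup per segment when merging across segments.
  static String mergedMin(List<Segment> segments) {
    String best = null;
    for (Segment s : segments) {
      int minOrd = Arrays.stream(s.docOrds).min().getAsInt(); // int compares only
      String term = s.lookupOrd(minOrd);                      // one lookup
      if (best == null || term.compareTo(best) < 0) best = term;
    }
    return best;
  }

  public static void main(String[] args) {
    Segment a = new Segment(new String[]{"apple", "pear"}, new int[]{1, 0, 1});
    Segment b = new Segment(new String[]{"fig", "plum"}, new int[]{0, 1});
    System.out.println(mergedMin(List.of(a, b)));  // apple
    System.out.println(a.lookups + b.lookups);     // 2 lookups for 5 docs
  }
}
```

Only two ord-to-term lookups happen for five documents here; with real segments the lookup count at merge time scales with the number of competitive slots, not with the number of terms held in RAM.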

Mike

http://blog.mikemccandless.com




Re: FST and FieldCache?

2011-05-19 Thread Jason Rutherglen
> This is more about compressing strings in TermsIndex, I think.

Ah, because they're sorted.  I think if the string lookup cost
degrades then it's not worth it?  That's something that needs to be
tested in the MMap case as well, eg, are ByteBuffers somehow slowing
everything down by 10%?

On Thu, May 19, 2011 at 6:30 AM, Earwin Burrfoot  wrote:
> This is more about compressing strings in TermsIndex, I think.
> And ability to use said TermsIndex directly in some cases that
> required FieldCache before. (Maybe FC is still needed, but it can be
> degraded to docId->ord map, storing actual strings in TI).
> This yields fat space savings when we, eg,  need to both lookup on a
> field and build facets out of it.
>
> mmap is cool :)  What I want to see is a FST-based TermsDict that is
> simply mmaped into memory, without building intermediate indexes, like
> Lucene does now.
> And docvalues are orthogonal to that, no?
>




[jira] [Commented] (LUCENE-2981) Review and potentially remove unused/unsupported Contribs

2011-05-19 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13036183#comment-13036183
 ] 

Robert Muir commented on LUCENE-2981:
-

ok there does seem to be some consensus now, thanks guys.

Ryan, can you elaborate on your concerns (reason for your -0)? Maybe there is 
something we can do to address them.

> Review and potentially remove unused/unsupported Contribs
> -
>
> Key: LUCENE-2981
> URL: https://issues.apache.org/jira/browse/LUCENE-2981
> Project: Lucene - Java
>  Issue Type: Improvement
>Reporter: Grant Ingersoll
> Fix For: 3.2, 4.0
>
> Attachments: LUCENE-2981.patch
>
>
> Some of our contribs appear to be lacking for development/support or are 
> missing tests.  We should review whether they are even pertinent these days 
> and potentially deprecate and remove them.
> One of the things we did in Mahout when bringing in Colt code was to mark all 
> code that didn't have tests as @deprecated and then we removed the 
> deprecation once tests were added.  Those that didn't get tests added over 
> about a 6 mos. period of time were removed.
> I would suggest taking a hard look at:
> ant
> db
> lucli
> swing
> (spatial should be gutted to some extent and moved to modules)




[jira] [Commented] (SOLR-1942) Ability to select codec per field

2011-05-19 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1942?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13036182#comment-13036182
 ] 

Michael McCandless commented on SOLR-1942:
--

I agree the CodecProvider/CoreCodecProvider is a scary potential delegation 
trap... Robert can you open a new issue?  I agree it should not block this one.

> Ability to select codec per field
> -
>
> Key: SOLR-1942
> URL: https://issues.apache.org/jira/browse/SOLR-1942
> Project: Solr
>  Issue Type: New Feature
>Affects Versions: 4.0
>Reporter: Yonik Seeley
>Assignee: Grant Ingersoll
> Fix For: 4.0
>
> Attachments: SOLR-1942.patch, SOLR-1942.patch, SOLR-1942.patch, 
> SOLR-1942.patch, SOLR-1942.patch, SOLR-1942.patch, SOLR-1942.patch, 
> SOLR-1942.patch, SOLR-1942.patch
>
>
> We should use PerFieldCodecWrapper to allow users to select the codec 
> per-field.




Re: FST and FieldCache?

2011-05-19 Thread Earwin Burrfoot
This is more about compressing strings in TermsIndex, I think.
And ability to use said TermsIndex directly in some cases that
required FieldCache before. (Maybe FC is still needed, but it can be
degraded to docId->ord map, storing actual strings in TI).
This yields fat space savings when we, eg, need to both look up on a
field and build facets out of it.
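That degraded field cache can be sketched as follows (illustrative only, hypothetical names): the strings live once in the sorted terms index, and a docId -> ord int map serves both faceting and sorting.

```java
import java.util.*;

public class OrdFacetSketch {
  // Faceting: count ords; term bytes are only needed for the final labels.
  static int[] countFacets(int[] docToOrd, int numTerms) {
    int[] counts = new int[numTerms];
    for (int ord : docToOrd) counts[ord]++;
    return counts;
  }

  // Sorting by field value: compare ints, never touch the term bytes.
  static Integer[] sortDocsByOrd(int[] docToOrd) {
    Integer[] docs = new Integer[docToOrd.length];
    for (int i = 0; i < docs.length; i++) docs[i] = i;
    Arrays.sort(docs, Comparator.comparingInt(d -> docToOrd[d]));
    return docs;
  }

  public static void main(String[] args) {
    String[] termsIndex = {"au", "de", "ru", "us"}; // sorted terms, stored once
    int[] docToOrd = {3, 0, 3, 2, 0, 3};            // the whole "field cache"

    int[] counts = countFacets(docToOrd, termsIndex.length);
    for (int ord = 0; ord < counts.length; ord++)
      if (counts[ord] > 0)
        System.out.println(termsIndex[ord] + "=" + counts[ord]);
    System.out.println(Arrays.toString(sortDocsByOrd(docToOrd)));
  }
}
```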

mmap is cool :)  What I want to see is a FST-based TermsDict that is
simply mmaped into memory, without building intermediate indexes, like
Lucene does now.
And docvalues are orthogonal to that, no?

On Thu, May 19, 2011 at 17:22, Jason Rutherglen
 wrote:
>> maybe thats because we have one huge monolithic implementation
>
> Doesn't the DocValues branch solve this?
>
> Also, instead of trying to implement clever ways of compressing
> strings in the field cache, which probably won't bear fruit, I'd
> prefer to look at [eventually] MMap'ing (using DV) the field caches to
> avoid the loading and heap costs, which are significant.  I'm not sure
> if we can easily MMap packed ints and the shared byte[], though it
> seems fairly doable?
>
> On Thu, May 19, 2011 at 6:05 AM, Robert Muir  wrote:
>> 2011/5/19 Michael McCandless :
>>
>>> Of course, for
>>> certain apps that perf hit is justified, so probably we should make
>>> this an option when populating field cache (ie, in-memory storage
>>> option of using an FST vs using packed ints/byte[]).
>>>
>>
>> or should we actually try to have different fieldcacheimpls?
>>
>> I see all these missions to refactor the thing, which always fail.
>>
>> maybe thats because we have one huge monolithic implementation.
>>
>>
>>
>
>
>



-- 
Kirill Zakharenko/Кирилл Захаренко
E-Mail/Jabber: ear...@gmail.com
Phone: +7 (495) 683-567-4
ICQ: 104465785




[jira] [Commented] (SOLR-1942) Ability to select codec per field

2011-05-19 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1942?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13036177#comment-13036177
 ] 

Robert Muir commented on SOLR-1942:
---

Hi Simon, 

after reviewing the patch I have some concerns about CodecProvider. I think it's 
a little confusing how the CodecProvider/CoreCodecProvider hierarchy works 
today, and a bit dangerous how we delegate through this class.

For example, if we add a new method to CodecProvider, we need to be sure we add 
the 'delegation' here every time, or stuff will start acting strange.

For this reason, I wonder if CodecProvider should be an interface: the simple 
implementation we have in lucene is a hashmap, but Solr uses fieldType lookup. 
This would parallel how SimilarityProvider works.

If we want to do this, I think we should open a separate issue... in fact I'm 
not even sure it should block this issue, since in my opinion it's a shame you 
cannot manipulate codecs in Solr right now... but I just wanted to bring it up 
here.
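The interface idea could look roughly like this (illustrative only; every name here is hypothetical, not the actual Lucene/Solr API): with an interface, nothing delegates through an overridable base class, so a new method can't be silently "forgotten" by a wrapper.

```java
import java.util.*;

public class CodecProviderSketch {
  interface Codec { String name(); }

  // Deliberately minimal surface, per the "minimize the interface" point.
  interface CodecProvider {
    Codec lookup(String field);
  }

  // Lucene-style: a plain hashmap with a default codec.
  static class MapCodecProvider implements CodecProvider {
    private final Map<String, Codec> perField = new HashMap<>();
    private final Codec defaultCodec;
    MapCodecProvider(Codec defaultCodec) { this.defaultCodec = defaultCodec; }
    void set(String field, Codec codec) { perField.put(field, codec); }
    public Codec lookup(String field) {
      return perField.getOrDefault(field, defaultCodec);
    }
  }
  // A Solr-style provider would implement the same interface but resolve the
  // codec from the schema's fieldType instead of a map.

  public static void main(String[] args) {
    MapCodecProvider cp = new MapCodecProvider(() -> "Standard");
    cp.set("id", () -> "Pulsing");
    System.out.println(cp.lookup("id").name());    // Pulsing
    System.out.println(cp.lookup("body").name());  // Standard
  }
}
```

This parallels how SimilarityProvider ended up: callers depend on the minimal interface, and each project picks its own lookup strategy.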


> Ability to select codec per field
> -
>
> Key: SOLR-1942
> URL: https://issues.apache.org/jira/browse/SOLR-1942
> Project: Solr
>  Issue Type: New Feature
>Affects Versions: 4.0
>Reporter: Yonik Seeley
>Assignee: Grant Ingersoll
> Fix For: 4.0
>
> Attachments: SOLR-1942.patch, SOLR-1942.patch, SOLR-1942.patch, 
> SOLR-1942.patch, SOLR-1942.patch, SOLR-1942.patch, SOLR-1942.patch, 
> SOLR-1942.patch, SOLR-1942.patch
>
>
> We should use PerFieldCodecWrapper to allow users to select the codec 
> per-field.



