[jira] [Created] (SOLR-2452) rewrite solr build system
rewrite solr build system

Key: SOLR-2452
URL: https://issues.apache.org/jira/browse/SOLR-2452
Project: Solr
Issue Type: Task
Components: Build
Reporter: Robert Muir
Fix For: 3.2, 4.0

As discussed in SOLR-2002 (but that issue is long and hard to follow), I think we should rewrite the Solr build system. It's slow, cumbersome, and messy, and makes it hard for us to improve things.

-- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-2452) rewrite solr build system
[ https://issues.apache.org/jira/browse/SOLR-2452?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1301#comment-1301 ] Robert Muir commented on SOLR-2452:

I brought my previous patch up to date and committed it to https://svn.apache.org/repos/asf/lucene/dev/branches/solr2452

I ripped all the existing stuff out of the Solr build: we can always add things back, but I wanted to start lean and mean. compile/test/clean/dependencies/etc. should work, and are extended from Lucene's build. I'd appreciate anyone who wants to spend time trying to help.
Does solr support secure enterprise search?
Hello, Does Solr support secure enterprise search? That is to say, each person can only see the information they are authorized to access. If I want to achieve this, what should I do? Thanks for your help. 2011-04-01 Best wishes Zhenpeng Fang 方 振鹏 Dept. Software Engineering Xiamen University
[jira] [Commented] (LUCENE-2959) [GSoC] Implementing State of the Art Ranking for Lucene
[ https://issues.apache.org/jira/browse/LUCENE-2959?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13014472#comment-13014472 ] David Mark Nemeskey commented on LUCENE-2959:

Robert, as for the problems with BM25F:

{quote} * for any field, Lucene has a per-field terms dictionary that contains that term's docFreq. To compute BM25F's IDF method would be challenging, because it wants a docFreq across all the fields. * the same issue applies to length normalization: Lucene has a field length but really no concept of document length. {quote}

One thing that is not clear to me is why these limitations would not be a problem for BM25. As I see it, the difference between the two methods is that BM25 simply computes tfs, idfs and document length from the whole document -- which, according to what you said, is not available in Lucene. That's why I figured that a variant of BM25F would actually be more straightforward to implement.

{quote} (its not clear to me at a glance either from the original paper, if this should be across only the fields in the query, across all the fields in the document, and if a static schema is implied in this scoring system (in lucene document 1 can have 3 fields and document 2 can have 40 different ones, even with different properties). {quote}

Actually I am not sure there is a consensus on what BM25F actually is. :) For example, the BM25 formula can be applied to the weighted sum of field tfs, or alternatively, the per-field BM25 scores can be summed after normalization. I've seen both called (maybe incorrectly) BM25F. If I understand correctly, the current scoring algorithm takes into account only the fields explicitly specified in the query. Is that right? If so, I see no reason why BM25 should behave otherwise. Which of course also means that we probably won't be able to save the summarized doc length and idf. Robert, would you be so kind as to have a look at my proposal?
It can be found at http://www.google-melange.com/gsoc/proposal/review/google/gsoc2011/davidnemeskey/1. It's basically the same as what I sent to the mailing list. I wrote that I want to implement BM25, BM25F and DFR (the framework, I mean, with one or two smoothing models), as well as to convert the original scoring to the new framework. In light of the thread here, I guess it would be better to modify these goals, perhaps by:
* deleting the conversion part?
* committing myself to BM25/BM25F only?
* explicitly stating that I want a higher-level API based on the low-level one?
As for the last item, that applies only if I continue / join the work in 2392. Since I guess nobody wants two ranking frameworks, of course I will, but then in this part of the proposal should I just concentrate on the higher-level API? Thanks!

[GSoC] Implementing State of the Art Ranking for Lucene
Key: LUCENE-2959
URL: https://issues.apache.org/jira/browse/LUCENE-2959
Project: Lucene - Java
Issue Type: New Feature
Components: Examples, Javadocs, Query/Scoring
Reporter: David Mark Nemeskey
Labels: gsoc2011, lucene-gsoc-11, mentor
Attachments: LUCENE-2959_mockdfr.patch, implementation_plan.pdf, proposal.pdf

Lucene employs the Vector Space Model (VSM) to rank documents, which compares unfavorably to state of the art algorithms, such as BM25. Moreover, the architecture is tailored specifically to VSM, which makes the addition of new ranking functions a non-trivial task. This project aims to bring state of the art ranking methods to Lucene and to implement a query architecture with pluggable ranking functions.
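For reference in the discussion above, a single-term textbook BM25 score can be sketched as follows. This is a generic illustration under stated assumptions, not Lucene code: the class name, parameter names, and the particular smoothed IDF variant are all choices made for this sketch.

```java
public class Bm25Sketch {
    // Textbook BM25 score contribution of one query term in one document.
    // k1 controls term-frequency saturation; b controls length normalization.
    static double score(int tf, int docFreq, int numDocs,
                        double docLen, double avgDocLen,
                        double k1, double b) {
        // Robertson/Sparck Jones style IDF, smoothed to stay non-negative
        double idf = Math.log(1 + (numDocs - docFreq + 0.5) / (docFreq + 0.5));
        // Document length normalization of the term frequency
        double norm = k1 * ((1 - b) + b * docLen / avgDocLen);
        return idf * (tf * (k1 + 1)) / (tf + norm);
    }

    public static void main(String[] args) {
        // A term occurring 3 times in an average-length document
        System.out.println(Bm25Sketch.score(3, 100, 100000, 250.0, 250.0, 1.2, 0.75));
    }
}
```

Note how the formula needs docLen and avgDocLen over the whole document and a docFreq over all fields -- exactly the cross-field statistics the comments above say Lucene only tracks per field.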
[HUDSON] Lucene-Solr-tests-only-3.x - Build # 6591 - Failure
Build: https://hudson.apache.org/hudson/job/Lucene-Solr-tests-only-3.x/6591/

1 tests failed.

REGRESSION: org.apache.lucene.index.TestIndexWriterMergePolicy.testMaxBufferedDocsChange

Error Message: null

Stack Trace:
junit.framework.AssertionFailedError:
    at org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1076)
    at org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1008)
    at org.apache.lucene.index.TestIndexWriterMergePolicy.checkInvariants(TestIndexWriterMergePolicy.java:236)
    at org.apache.lucene.index.TestIndexWriterMergePolicy.testMaxBufferedDocsChange(TestIndexWriterMergePolicy.java:168)

Build Log (for compile errors): [...truncated 9184 lines...]
[jira] [Updated] (SOLR-2378) FST-based Lookup (suggestions) for prefix matches.
[ https://issues.apache.org/jira/browse/SOLR-2378?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dawid Weiss updated SOLR-2378:

Description: Implement a subclass of Lookup based on finite state automata/transducers (Lucene FST package). This issue is for implementing a relatively basic prefix matcher; we will handle infixes and other types of input matches gradually. Impl. phases:
- write a DFA-based suggester effectively identical to the ternary-tree-based solution right now,
- baseline benchmark against the ternary tree (memory consumption, rebuilding speed, indexing speed; reuse Andrzej's benchmark code),
- modify the DFA to encode term weights directly in the automaton (optimize for the onlyMostPopular case),
- benchmark again,
- add infix suggestion support with prefix matches boosted higher (?),
- benchmark again,
- modify the tutorial on the wiki [http://wiki.apache.org/solr/Suggester]

was: Implement a subclass of Lookup based on finite state automata/transducers (Lucene FST package). This issue is for implementing a relatively basic prefix matcher; we will handle infixes and other types of input matches gradually.

FST-based Lookup (suggestions) for prefix matches.
Key: SOLR-2378
URL: https://issues.apache.org/jira/browse/SOLR-2378
Project: Solr
Issue Type: New Feature
Components: spellchecker
Reporter: Dawid Weiss
Assignee: Dawid Weiss
Labels: lookup, prefix
Fix For: 4.0
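As a rough mental model of what a prefix Lookup must return, here is a sorted-array baseline sketch -- deliberately not the FST implementation the issue describes, just the simplest structure that produces the same suggestions (class and method names are hypothetical):

```java
import java.util.Arrays;

public class PrefixLookupDemo {
    // Return up to max terms starting with prefix, from a sorted term list.
    // An FST/DFA suggester must match this behavior, only faster and smaller.
    static String[] suggest(String[] sortedTerms, String prefix, int max) {
        int lo = Arrays.binarySearch(sortedTerms, prefix);
        if (lo < 0) lo = -lo - 1;  // negative result encodes the insertion point
        int hi = lo;
        while (hi < sortedTerms.length && hi - lo < max
               && sortedTerms[hi].startsWith(prefix)) {
            hi++;
        }
        return Arrays.copyOfRange(sortedTerms, lo, hi);
    }

    public static void main(String[] args) {
        String[] terms = {"apache", "append", "apple", "lucene", "solr"};
        System.out.println(Arrays.toString(suggest(terms, "ap", 10)));
    }
}
```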
Re: Does solr support secure enterprise search?
You should really be asking this on the user list solr-u...@lucene.apache.org. Solr does not provide any security features - you would be expected to implement security within your application that you put in front of Solr. LucidWorks Enterprise (a commercial package based upon Solr) does offer security features. Upayavira On Fri, 01 Apr 2011 16:37 +0800, michong900617 michong900...@xmu.edu.cn wrote: Hello, Does solr support secure enterprise search? That's to say, person can only visit to the concerns of the information within their authorities. If I wanna meet the goal, what can I do? Thanks for your help. 2011-04-01 Best wishes Zhenpeng Fang 方 振鹏 Dept. Software Engineering Xiamen University --- Enterprise Search Consultant at Sourcesense UK, Making Sense of Open Source
Re: Does solr support secure enterprise search?
The user list solr-u...@lucene.apache.org may be the best place to ask. But if I understand correctly, there are multiple questions in your request:
- one is related to secure access, and for that I would suggest using HTTPS
- user login: for this I suggest using a portal that manages user logins (out of Solr's scope) and integrates Solr
- access restriction for each user: this is possible but not provided out of the box with Solr.
That's my own answer, but others may provide more up-to-date information. 2011/4/1 michong900617 michong900...@xmu.edu.cn Hello, Does solr support secure enterprise search? That's to say, person can only visit to the concerns of the information within their authorities. If I wanna meet the goal, what can I do? Thanks for your help. 2011-04-01 -- Best wishes Zhenpeng Fang 方 振鹏 Dept. Software Engineering Xiamen University -- Gérard Dupont Information Processing Control and Cognition (IPCC) CASSIDIAN - an EADS company Document Learning team - LITIS Laboratory
add(CharSequence) in automaton builder
Mike, can you remember what ordering is required for add(CharSequence)? I see it requires INPUT_TYPE.BYTE4: assert fst.getInputType() == FST.INPUT_TYPE.BYTE4; but this would imply an ordering by full Unicode code points on the input? Is this what String comparators do by default? (I doubt it, but wanted to check if you know first.) Dawid
Re: add(CharSequence) in automaton builder
On Fri, Apr 1, 2011 at 7:58 AM, Dawid Weiss dawid.we...@gmail.com wrote: Mike, can you remember what ordering is required for add(CharSequence)? I see it requires INPUT_TYPE.BYTE4 assert fst.getInputType() == FST.INPUT_TYPE.BYTE4; but this would imply the order of full unicode codepoints on the input? Is this what String comparators do by default (I doubt, but wanted to check if you know first). (sorry, not Mike, but) you are right, String.compareTo() compares in UTF-16 order by default. This is not consistent with the order the FST builder expects (UTF-8/UTF-32 order). So if you are going to order the terms before passing them to Builder, you should use a UTF-16-in-UTF-8-order comparator* (or simply use codePointAt and friends and compare those ints, probably slower...). Different ways of implementing the comparator below:
* http://icu-project.org/docs/papers/utf16_code_point_order.html
* http://www.unicode.org/versions/Unicode6.0.0/ch05.pdf
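A small sketch of the discrepancy Robert describes: String.compareTo() orders by UTF-16 code units, which disagrees with code point (UTF-8/UTF-32) order as soon as supplementary characters are involved. The comparator below is an illustrative stand-in written for this sketch, not the ICU or Lucene one:

```java
public class CodePointOrderDemo {
    // Compare two strings by Unicode code point (UTF-8/UTF-32) order,
    // rather than Java's default UTF-16 code unit order.
    static int compareByCodePoint(String a, String b) {
        int i = 0, j = 0;
        while (i < a.length() && j < b.length()) {
            int ca = a.codePointAt(i);
            int cb = b.codePointAt(j);
            if (ca != cb) return Integer.compare(ca, cb);
            i += Character.charCount(ca);
            j += Character.charCount(cb);
        }
        return Integer.compare(a.length() - i, b.length() - j);
    }

    public static void main(String[] args) {
        String bmp = "\uFF61";         // U+FF61, a BMP character
        String supp = "\uD800\uDC00";  // U+10000, a surrogate pair
        // UTF-16 order: code unit 0xFF61 sorts AFTER high surrogate 0xD800
        System.out.println(bmp.compareTo(supp) > 0);
        // Code point order: U+FF61 sorts BEFORE U+10000
        System.out.println(compareByCodePoint(bmp, supp) < 0);
    }
}
```

The two orders disagree exactly on the range U+E000..U+FFFF versus supplementary characters, which is why a dedicated comparator (or byte-level comparison, see below in the thread) is needed before feeding terms to the FST builder.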
Re: add(CharSequence) in automaton builder
> (sorry not mike, but) you are right, String.compareTo() compares in

He, he, thanks Robert. We have these anti-child-abuse commercials on TV right now: you never know who's on the other side... how appropriate for this situation.

> utf-16 order by default. this is not consistent with the order the FST builder expects (utf8/utf32 order)

Yes, this is what I also figured out. The Unicode code point order is also implemented in BytesRef.getUTF8SortedAsUnicodeComparator, correct? For what I need I'll use raw UTF-8 byte order; it doesn't matter as long as it's consistent. Dawid
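To illustrate why raw UTF-8 byte order is safe here: comparing UTF-8 encodings bytewise (as unsigned values) yields the same ordering as comparing code points. This sketch mimics what a UTF-8-sorted comparator does, without using the Lucene BytesRef class:

```java
import java.nio.charset.StandardCharsets;

public class Utf8OrderDemo {
    // Unsigned byte-wise comparison of UTF-8 encodings. Because UTF-8 is
    // designed so that byte order equals code point order, this agrees with
    // CodePointOrderDemo-style comparison, not with String.compareTo().
    static int compareUtf8(String a, String b) {
        byte[] ba = a.getBytes(StandardCharsets.UTF_8);
        byte[] bb = b.getBytes(StandardCharsets.UTF_8);
        int n = Math.min(ba.length, bb.length);
        for (int i = 0; i < n; i++) {
            int d = (ba[i] & 0xff) - (bb[i] & 0xff);  // unsigned compare
            if (d != 0) return d;
        }
        return ba.length - bb.length;
    }

    public static void main(String[] args) {
        // U+FF61 encodes as EF BD A1; U+10000 encodes as F0 90 80 80.
        // 0xEF < 0xF0, so byte order matches code point order here:
        System.out.println(compareUtf8("\uFF61", "\uD800\uDC00") < 0);
    }
}
```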
Re: [HUDSON] Lucene-Solr-tests-only-3.x - Build # 6591 - Failure
I committed a fix -- this was in the backwards tests... Mike http://blog.mikemccandless.com On Fri, Apr 1, 2011 at 6:35 AM, Apache Hudson Server hud...@hudson.apache.org wrote: Build: https://hudson.apache.org/hudson/job/Lucene-Solr-tests-only-3.x/6591/ 1 tests failed. REGRESSION: org.apache.lucene.index.TestIndexWriterMergePolicy.testMaxBufferedDocsChange
Re: add(CharSequence) in automaton builder
On Fri, Apr 1, 2011 at 8:25 AM, Dawid Weiss dawid.we...@cs.put.poznan.pl wrote: Yes, this is what I also figured out. The unicode code point order is also impl. in BytesRef.getUTF8SortedAsUnicodeComparator, correct? For what I need I'll use raw utf8 byte order, it doesn't matter as long as it's consistent. Yes, if you are already working with bytes, definitely just stay with binary order (UTF-8 and UTF-32 are the same order; it's only UTF-16/String/chars that are the wackos). Sorry, since you were talking about the CharSequence API to Builder, I assumed for a second you were working with chars/Strings, and forgot about how this is confusingly mixed with, yet distinct from, the whole BYTE1/BYTE4 selection in Builder :)
Questions about 3.1.0 release, SVN and common-build.xml
Hi I noticed that 3.1.0's tag in svn is http://svn.apache.org/repos/asf/lucene/dev/tags/lucene_solr_3_1. Should it not be http://svn.apache.org/repos/asf/lucene/dev/tags/lucene_solr_3_1_0? At least, that's what's specified under Publishing on ReleaseTodo wiki. Also, the common-build.xml under the tag and the downloaded sources specifies version to be 3.1-SNAPSHOT. On the ReleaseTodo I found this: ... and the default version in lucene/common-build.xml on the branch to X.Y (remove the -SNAPSHOT suffix), so I guess 'SNAPSHOT' should have been removed, but also version should be set to 3.1.0. I apologize for finding these *after* the release has been created. I don't think it's critical that we fix the common-build.xml, but perhaps update the ReleaseTodo accordingly, so we do it right on 3.2? Can we rename the tag? Is it critical? Shai
Re: add(CharSequence) in automaton builder
On Fri, Apr 1, 2011 at 8:29 AM, Robert Muir rcm...@gmail.com wrote: sorry, since you were talking about the charsequence api to builder, i assumed for a second you were working with chars/Strings, and forgot about how this is confusingly mixed with, yet distinct from, the whole BYTE1/BYTE4 selection in builder :) It IS really confusing! Really, the Builder/FST needs to be parameterized on the input type too (it's already parameterized on the output type), but confronting the generics required to accomplish this was... scary. Mike http://blog.mikemccandless.com
[jira] [Commented] (LUCENE-2573) Tiered flushing of DWPTs by RAM with low/high water marks
[ https://issues.apache.org/jira/browse/LUCENE-2573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13014538#comment-13014538 ] Simon Willnauer commented on LUCENE-2573:

bq. Thanks, Simon, for running the benchmarks! Good results overall, even though it's puzzling why flushing would be CPU intensive.

Well, during flush we are encoding lots of VInts; that's making it CPU intensive. I actually ran the benchmark through a profiler and found out what the problem was with my benchmarks. When I indexed with DWPT, my HDD was so busy flushing segments concurrently that the read performance suffered, and my indexing threads blocked on the line-doc file where I read the records from. This explains the large number of spikes towards 0 docs/sec. The profiler also showed that we are constantly waiting on ThreadState#lock() with at least 3 threads. I changed the current behavior of the threadpool to not clear the thread bindings when I replace a DWPT for flushing, and voila! We have a comparable peak ingest rate. !http://people.apache.org/~simonw/DocumentsWriterPerThread_dps_01.png! Note the difference: DWPT indexes the documents in 6 min 15 seconds! !http://people.apache.org/~simonw/Trunk_dps_01.png! Here we have 13 min 40 seconds! NICE! !http://people.apache.org/~simonw/DocumentsWriterPerThread_flush_01.png!

Tiered flushing of DWPTs by RAM with low/high water marks
Key: LUCENE-2573
URL: https://issues.apache.org/jira/browse/LUCENE-2573
Project: Lucene - Java
Issue Type: Improvement
Components: Index
Reporter: Michael Busch
Assignee: Simon Willnauer
Priority: Minor
Fix For: Realtime Branch
Attachments: LUCENE-2573.patch, LUCENE-2573.patch, LUCENE-2573.patch, LUCENE-2573.patch, LUCENE-2573.patch, LUCENE-2573.patch, LUCENE-2573.patch, LUCENE-2573.patch, LUCENE-2573.patch

Now that we have DocumentsWriterPerThreads we need to track total consumed RAM across all DWPTs.
A flushing strategy idea that was discussed in LUCENE-2324 was to use a tiered approach:
- Flush the first DWPT at a low water mark (e.g. at 90% of allowed RAM)
- Flush all DWPTs at a high water mark (e.g. at 110%)
- Use linear steps between the low and high water mark: e.g. when 5 DWPTs are used, flush at 90%, 95%, 100%, 105% and 110%.
Should we allow the user to configure the low and high water mark values explicitly using total values (e.g. low water mark at 120MB, high water mark at 140MB)? Or shall we keep, for simplicity, the single setRAMBufferSizeMB() config method and use something like 90% and 110% for the water marks?
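The linear-step idea in the description can be sketched as a tiny threshold calculation. This is a hypothetical helper written for illustration, not Lucene's actual flush policy API:

```java
import java.util.Arrays;

public class TieredFlushDemo {
    // Evenly spaced flush thresholds between a low and a high water mark:
    // the i-th DWPT flushes once total RAM use crosses thresholds[i].
    static double[] flushThresholds(double lowPct, double highPct, int numDWPTs) {
        double[] t = new double[numDWPTs];
        double step = (numDWPTs > 1) ? (highPct - lowPct) / (numDWPTs - 1) : 0.0;
        for (int i = 0; i < numDWPTs; i++) {
            t[i] = lowPct + i * step;
        }
        return t;
    }

    public static void main(String[] args) {
        // 5 DWPTs between 90% and 110%, as in the issue description:
        System.out.println(Arrays.toString(flushThresholds(90, 110, 5)));
        // -> [90.0, 95.0, 100.0, 105.0, 110.0]
    }
}
```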
Re: Questions about 3.1.0 release, SVN and common-build.xml
On Fri, Apr 1, 2011 at 8:31 AM, Shai Erera ser...@gmail.com wrote: Hi I noticed that 3.1.0's tag in svn is http://svn.apache.org/repos/asf/lucene/dev/tags/lucene_solr_3_1. Should it not be http://svn.apache.org/repos/asf/lucene/dev/tags/lucene_solr_3_1_0? At least, that's what's specified under Publishing on ReleaseTodo wiki. Yes, I did this intentionally to try to discourage a 3.1.1. Is it really necessary to have confusing 3-part bugfix releases when branch_3x itself is a stable branch?! Shouldn't we just work on 3.2 now?
Re: Questions about 3.1.0 release, SVN and common-build.xml
On Fri, Apr 1, 2011 at 8:42 AM, Robert Muir rcm...@gmail.com wrote: Yes, I did this intentionally to try to discourage a 3.1.1. Is it really necessary to have confusing 3-part bugfix releases when branch_3x itself is a stable branch?! Shouldn't we just work on 3.2 now? (sorry, I refer to the branch, not the tag here, but I think it still makes sense).
Re: Questions about 3.1.0 release, SVN and common-build.xml
On Apr 1, 2011, at 8:31 AM, Shai Erera wrote: Hi I noticed that 3.1.0's tag in svn is http://svn.apache.org/repos/asf/lucene/dev/tags/lucene_solr_3_1. Should it not be http://svn.apache.org/repos/asf/lucene/dev/tags/lucene_solr_3_1_0? At least, that's what's specified under Publishing on ReleaseTodo wiki. Yeah, we can move it.
Re: Questions about 3.1.0 release, SVN and common-build.xml
The branch is OK -- the 3_1 branch is indeed intended for future 3.1.x releases. If we can commit to releasing 3.2 instead of 3.1.1 in case only bug fixes are present, then I'm OK with it. We'd also need to commit, in general, to releasing more often. So if we decide to release, say, every 3 months, then 3.2 can include all the bug fixes for 3.1. If that's the case (and I support it wholeheartedly), why create a branch for 3.1 at all - we could just tag branches_3x? Also, the release artifacts are named 3.1.0, suggesting there will be a 3.1.1 -- hence this email. But again, +1 on:
* Not releasing 3.1.1, but instead 3.2
* Not branching 3x, but instead only tagging it
* Naming the artifacts of future releases x.y only.
Shai On Fri, Apr 1, 2011 at 2:43 PM, Robert Muir rcm...@gmail.com wrote: (sorry, I refer to the branch, not the tag here, but I think it still makes sense).
Re: add(CharSequence) in automaton builder
> sorry, since you were talking about the charsequence api to builder, i assumed for a second you were working with chars/Strings, and forgot about how this is confusingly mixed with, yet distinct from, the whole BYTE1/BYTE4 selection in builder :)

I am working with Strings because that's what the Lookup API is providing... which I think should change, but that's something for another round of patches. The BYTE1/BYTE4 selection is confusing, and I believe at least some documentation should be added there to clarify what it's for and how it should be used. Again -- something to clarify as part of another task. I should have that Lookup impl. ready tomorrow; I had to reiterate over certain things first and it took me longer than expected. Dawid
Re: Questions about 3.1.0 release, SVN and common-build.xml
On Fri, Apr 1, 2011 at 8:49 AM, Shai Erera ser...@gmail.com wrote: The branch is ok -- 3_1 branch is intended for 3.1.x future releases indeed. If we can commit to releasing 3.2 instead of 3.1.1 in case only bug fixes are present, then I'm ok with it. We'd also need to commit, in general, to release more often. So if we decide to release say every 3 months, then 3.2 can include all the bug fixes for 3.1. I don't think we have to commit to anything explicitly - maybe we should just see how things go? Releasing Lucene and Solr is a heavy-duty job, so why make bugfix-only branches (a lot of merging and other work for committers) when we can issue releases with bug fixes and also a couple of stable improvements? Personally, I decided today to stop putting bugs in my code in the first place :)
[jira] [Commented] (LUCENE-2959) [GSoC] Implementing State of the Art Ranking for Lucene
[ https://issues.apache.org/jira/browse/LUCENE-2959?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13014547#comment-13014547 ] Robert Muir commented on LUCENE-2959:

{quote} One thing that is not clear for me is why these limitations would not be a problem for BM25. As I see it, the difference between the two methods is that BM25 simply computes tfs, idfs and document length from the whole document -- which, according to what you said, is not available Lucene. That's why I figured that a variant of BM25F would actually be more straightforward to implement. {quote}

A variant sounds really interesting? I think you know better than me here; I just looked at the original paper and thought to myself that implementing this by the book might not be feasible for a while.

{quote} Robert, would you be so kind to have a look at my proposal? It can be found at http://www.google-melange.com/gsoc/proposal/review/google/gsoc2011/davidnemeskey/1. It's basically the same as what I sent to the mailing list. I wrote that I want to implement BM25, BM25F and DFR (the framework, I meant with one or two smoothing models), as well as to convert the original scoring to the new framework. In light of the thread here, I guess it would be better to modify these goals, perhaps by: deleting the conversion part? committing myself to BM25/BM25F only? explicitly stating that I want a higher level API based on the low-level one? {quote}

I think you can decide what you want to do? Obviously I would love to see all of it done :) But it's your choice; I could see you going a couple of different ways:
* closer to your original proposal, you could still develop a flexible scoring API on top of Similarity. Hey, all I did was move stuff from Scorer to Similarity really, which does give flexibility, but it's probably not what an IR researcher would want (it's low-level and confusing).
So you could make a SimpleSimilarity or EasySimilarity or something that presents a much simpler API (something closer to what Terrier/Indri present) on top of this, for easily implementing ranking functions? I think this would be extremely valuable long-term: who cares if we have a low-level flexible scoring API that only speed demons like, but IR practitioners find confusing and hideous? Someone who is trying to experiment with an enhancement to relevance likely doesn't care if their TREC run takes 30 seconds instead of 20 seconds if the API is really easy and they aren't wasting time fighting with Lucene? If you go this route, you could implement BM25, DFR, etc. as you suggested, as examples of how to use this API, and there would be more of a focus on API quality and simplicity instead of performance.
* or alternatively, you could refine your proposal to implement a really production-strength version of one of these scoring systems on top of the low-level API, one that would ideally have competitive performance/documentation/etc. with Lucene's default scoring today. If you decide to do this, then yes, I would definitely suggest picking only one, because I think it's a ton of work as I listed above, and I think there would be more focus on practical things (some probably being nuances of Lucene) and performance.
RE: Questions about 3.1.0 release, SVN and common-build.xml
Hi Shai, On 4/1/2011 at 8:32 AM, Shai Erera wrote: Also, the common-build.xml under the tag and the downloaded sources specifies version to be 3.1-SNAPSHOT. On the ReleaseTodo I found this: ... and the default version in lucene/common-build.xml on the branch to X.Y (remove the -SNAPSHOT suffix), so I guess 'SNAPSHOT' should have been removed, but also version should be set to 3.1.0. I'm pretty sure the ReleaseTodo page is wrong on this. Building from a source distribution should *not* produce artifacts that have the same version in their names as the binary release. We don't want same-version-but-different binary artifacts being accidentally produced. There's nothing stopping people from changing this themselves, of course, so leaving the pre-release version name, including the -SNAPSHOT suffix, in the source release is just a passive defense against this kind of mistake. I'll change the ReleaseTodo page if there are no objections. Steve
[Lucene.Net] Incubator PMC/Board report for April 2011 (lucene-net-...@lucene.apache.org)
Dear Lucene.NET Developers, This email was sent by an automated system on behalf of the Apache Incubator PMC. It is an initial reminder to give you plenty of time to prepare your quarterly board report. The board meeting is scheduled for Wed, 20 April 2011, 10 am Pacific. The report for your podling will form a part of the Incubator PMC report. The Incubator PMC requires your report to be submitted one week before the board meeting, to allow sufficient time for review. Please submit your report with sufficient time to allow the Incubator PMC, and subsequently board members, to review and digest it. Again, the very latest you should submit your report is one week prior to the board meeting. Thanks, The Apache Incubator PMC

Submitting your Report
--
Your report should contain the following:
* Your project name
* A brief description of your project, which assumes no knowledge of the project or necessarily of its field
* A list of the three most important issues to address in the move towards graduation
* Any issues that the Incubator PMC or ASF Board might wish/need to be aware of
* How the community has developed since the last report
* How the project has developed since the last report
This should be appended to the Incubator Wiki page at: http://wiki.apache.org/incubator/April2011 Note: this page is manually populated. You may need to wait a little before this page is created from a template.

Mentors
---
Mentors should review reports for their project(s) and sign them off on the Incubator wiki page. Signing off reports shows that you are following the project - projects that are not signed off may raise alarms for the Incubator PMC. Incubator PMC
Re: Unsupported encoding GB18030
On Fri, Apr 1, 2011 at 9:22 AM, Jan Høydahl jan@cominvent.com wrote: Testing the new Solr 3.1 release under Windows XP and Java 1.6.0_23. When trying to post example\exampledocs\gb18030-example.xml using post.jar I get this error: % java -jar post.jar gb18030-example.xml SimplePostTool: version 1.3 SimplePostTool: POSTing files to http://localhost:8983/solr/update.. SimplePostTool: POSTing file gb18030-example.xml SimplePostTool: FATAL: Solr returned an error #400 Unsupported encoding: GB18030 From the stack it is caused by com.ctc.wstx.exc.WstxIOException: Unsupported encoding: GB18030 The same works on my MacBook with Java 1.6.0_24. Interesting - things seem fine for me on Win7 Java 1.6.0_24, but I don't have XP around any longer to see if that's the factor somehow. -Yonik http://www.lucenerevolution.org -- Lucene/Solr User Conference, May 25-26, San Francisco - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: Unsupported encoding GB18030
On Fri, Apr 1, 2011 at 10:00 AM, Yonik Seeley yo...@lucidimagination.com wrote: Interesting - things seem fine for me on Win7 Java 1.6.0_24, but I don't have XP around any longer to see if that's the factor somehow. It's worth mentioning, there is no guarantee the JRE will support the GB18030 encoding. There are only 6 charsets guaranteed to exist: http://download.oracle.com/javase/6/docs/api/java/nio/charset/Charset.html - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
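Robert's point about guaranteed charsets is easy to check programmatically. A minimal sketch (class and method names are mine, not from this thread), using only java.nio.charset:

```java
import java.nio.charset.Charset;

public class CharsetCheck {
    // The six charsets every conforming JVM must provide, per the
    // java.nio.charset.Charset javadoc.
    static final String[] REQUIRED = {
        "US-ASCII", "ISO-8859-1", "UTF-8", "UTF-16BE", "UTF-16LE", "UTF-16"
    };

    // Returns true if this JVM can decode/encode the given charset name.
    static boolean supports(String name) {
        return Charset.isSupported(name);
    }

    public static void main(String[] args) {
        for (String name : REQUIRED) {
            System.out.println(name + " -> " + supports(name));
        }
        // GB18030 is optional: present in most full JDK installs,
        // but a JVM is not required to ship it.
        System.out.println("GB18030 -> " + supports("GB18030"));
    }
}
```

On a JRE that lacks the optional charsets, supports("GB18030") returns false, which is presumably the condition that surfaces as the WstxIOException in the report above.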
RE: Unsupported encoding GB18030
On Fri, Apr 1, 2011 at 9:22 AM, Jan Høydahl jan@cominvent.com wrote: Testing the new Solr 3.1 release under Windows XP and Java 1.6.0_23 ... SimplePostTool: FATAL: Solr returned an error #400 Unsupported encoding: GB18030 It seems that the JVM used on this Windows machine does not support the particular encoding. This is not Solr's fault; maybe it's some stripped-down foreign JDK like IBM's or whatever. Even Sun only guarantees certain encodings to be present in any JVM, and GB18030 is for sure very optional. As far as I remember, in early JDK days there were extra eastern JDKs around that had extra charsets - maybe that's still the case for Win XP? Uwe - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: Unsupported encoding GB18030
On Fri, Apr 1, 2011 at 10:07 AM, Robert Muir rcm...@gmail.com wrote: Its worth mentioning, there is no guarantee the JRE will support GB18030 encoding. There are only 6 charsets guaranteed to exist: http://download.oracle.com/javase/6/docs/api/java/nio/charset/Charset.html Indexing *.xml is a very common thing for new users to do. If this is likely to fail for enough users, we should move, remove, or at least change the filename to something like gb18030-example.xml.gb18030 so it won't get picked up by accident. -Yonik http://www.lucenerevolution.org -- Lucene/Solr User Conference, May 25-26, San Francisco - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[JENKINS-MAVEN] Lucene-Solr-Maven-trunk #76: POMs out of sync
Build: https://hudson.apache.org/hudson/job/Lucene-Solr-Maven-trunk/76/ No tests ran. Build Log (for compile errors): [...truncated 9507 lines...] - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-2959) [GSoC] Implementing State of the Art Ranking for Lucene
[ https://issues.apache.org/jira/browse/LUCENE-2959?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13014619#comment-13014619 ] David Mark Nemeskey commented on LUCENE-2959: - {quote} I think you can decide what you want to do? {quote} Fair enough. :) I guess I'll stick with my original proposal then, though I might change a few things here and there; maybe change the focus from flexibility (as it seems to be already underway) to simplicity. [GSoC] Implementing State of the Art Ranking for Lucene --- Key: LUCENE-2959 URL: https://issues.apache.org/jira/browse/LUCENE-2959 Project: Lucene - Java Issue Type: New Feature Components: Examples, Javadocs, Query/Scoring Reporter: David Mark Nemeskey Labels: gsoc2011, lucene-gsoc-11, mentor Attachments: LUCENE-2959_mockdfr.patch, implementation_plan.pdf, proposal.pdf Lucene employs the Vector Space Model (VSM) to rank documents, which compares unfavorably to state of the art algorithms, such as BM25. Moreover, the architecture is tailored specifically to VSM, which makes the addition of new ranking functions a non-trivial task. This project aims to bring state of the art ranking methods to Lucene and to implement a query architecture with pluggable ranking functions. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
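For readers unfamiliar with the ranking function named in the proposal, a common textbook form of the BM25 per-term weight looks like this. This is a reference sketch, not Lucene code or the GSoC patch; parameter names k1 and b follow the usual convention:

```java
public class Bm25 {
    // BM25 weight of one query term for one document.
    // tf: term frequency in the doc; docLen: doc length in tokens;
    // avgDocLen: average doc length in the collection;
    // docFreq: number of docs containing the term; numDocs: total docs;
    // k1, b: free parameters (typical values ~1.2 and ~0.75).
    static double score(double tf, double docLen, double avgDocLen,
                        long docFreq, long numDocs, double k1, double b) {
        // Smoothed IDF component (kept non-negative by the +1 inside the log).
        double idf = Math.log(1.0 + (numDocs - docFreq + 0.5) / (docFreq + 0.5));
        // Length normalization: longer-than-average docs are penalized.
        double norm = k1 * (1.0 - b + b * docLen / avgDocLen);
        // tf saturation: the weight grows sublinearly in tf.
        return idf * tf * (k1 + 1.0) / (tf + norm);
    }

    public static void main(String[] args) {
        System.out.println(score(3, 120, 100, 50, 10000, 1.2, 0.75));
    }
}
```

The saturation in tf and the explicit length normalization are the two properties the proposal contrasts with Lucene's VSM-tailored scoring.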
[jira] [Updated] (SOLR-2061) Generate jar containing test classes.
[ https://issues.apache.org/jira/browse/SOLR-2061?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steven Rowe updated SOLR-2061: -- Attachment: SOLR-2061.patch This patch includes a new Test Framework Javadoc link from the Solr website's index.html. Committing shortly. Generate jar containing test classes. - Key: SOLR-2061 URL: https://issues.apache.org/jira/browse/SOLR-2061 Project: Solr Issue Type: Improvement Components: Build Affects Versions: 3.1 Reporter: Drew Farris Assignee: Steven Rowe Priority: Minor Fix For: 3.2, 4.0 Attachments: SOLR-2061.patch, SOLR-2061.patch, SOLR-2061.patch, SOLR-2061.patch, SOLR-2061.patch, SOLR-2061.patch Follow-on to LUCENE-2609 for the solr build -- it would be useful to generate and deploy a jar containing the test classes so other projects could write unit tests using the framework in Solr. This may take care of SOLR-717 as well. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: Questions about 3.1.0 release, SVN and common-build.xml
Ok, we can keep SNAPSHOT. I was just thinking it'll be nice if the tag sets the right version, for convenience. It's not so hard to do. BTW, I don't build the code, just run jar-src so I can attach the source to the jars for debugging purposes. If we had packaged them already (not that I propose that we do that), I wouldn't be downloading the source at all. Thanks, Shai On Friday, April 1, 2011, Uwe Schindler u...@thetaphi.de wrote: +1. In all previous releases we were leaving the -dev in the common-build.xml, it's simply now -SNAPSHOT. Whenever somebody compiles the code himself, there might be changes in it so the reproduced JAR files are not identical to the released ones. This was the same for all releases before (at least since 2.9.0 where we had the discussion, too). Uwe - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
RE: Unsupported encoding GB18030
Hi Yonik, I started my virtual box with a fresh Windows XP snapshot, downloaded JDK 1.6.0_24 and Solr 3.1.0, started Solr and then ran java -jar post.jar *.xml - success. Before we start fixing something that may not be an issue, we should ask this person exactly which JDK he uses and where he downloaded it. Is it maybe not an Oracle one? (This GB encoding is very common - if a JVM does not support it (it is not required to), it can only be some western-European one like I mentioned in my mail.) Uwe - Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de eMail: u...@thetaphi.de - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
RE: Questions about 3.1.0 release, SVN and common-build.xml
Shai, the source jars are available from the maven central repo, e.g.: http://repo2.maven.org/maven2/org/apache/lucene/lucene-core/3.1.0/lucene-core-3.1.0-sources.jar - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-3003) Move UnInvertedField into Lucene core
[ https://issues.apache.org/jira/browse/LUCENE-3003?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13014703#comment-13014703 ] Yonik Seeley commented on LUCENE-3003: -- bq. Attached: 32-bit results Ah, bummer. It's every 8 bytes, but with a 4 byte offset! I guess we could make it based on if we detect 32 vs 64 bit jvm... but maybe first see if anyone has any ideas about how to use something like pagedbytes instead. Move UnInvertedField into Lucene core - Key: LUCENE-3003 URL: https://issues.apache.org/jira/browse/LUCENE-3003 Project: Lucene - Java Issue Type: Improvement Components: Index Reporter: Michael McCandless Assignee: Michael McCandless Priority: Minor Fix For: 3.2, 4.0 Attachments: LUCENE-3003.patch, LUCENE-3003.patch, byte_size_32-bit-openjdk6.txt Solr's UnInvertedField lets you quickly lookup all terms ords for a given doc/field. Like, FieldCache, it inverts the index to produce this, and creates a RAM-resident data structure holding the bits; but, unlike FieldCache, it can handle multiple values per doc, and, it does not hold the term bytes in RAM. Rather, it holds only term ords, and then uses TermsEnum to resolve ord - term. This is great eg for faceting, where you want to use int ords for all of your counting, and then only at the end you need to resolve the top N ords to their text. I think this is a useful core functionality, and we should move most of it into Lucene's core. It's a good complement to FieldCache. For this first baby step, I just move it into core and refactor Solr's usage of it. After this, as separate issues, I think there are some things we could explore/improve: * The first-pass that allocates lots of tiny byte[] looks like it could be inefficient. Maybe we could use the byte slices from the indexer for this... * We can improve the RAM efficiency of the TermIndex: if the codec supports ords, and we are operating on one segment, we should just use it. 
If not, we can use a more RAM-efficient data structure, eg an FST mapping to the ord. * We may be able to improve on the main byte[] representation by using packed ints instead of delta-vInt? * Eventually we should fold this ability into docvalues, ie we'd write the byte[] image at indexing time, and then loading would be fast, instead of uninverting -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-2052) Allow for a list of filter queries and a single docset filter in QueryComponent
[ https://issues.apache.org/jira/browse/SOLR-2052?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] tylerw updated SOLR-2052: - Attachment: SOLR-2052-4.patch Updated patch that applies cleanly against trunk and works with groups/field collapsing features. Allow for a list of filter queries and a single docset filter in QueryComponent --- Key: SOLR-2052 URL: https://issues.apache.org/jira/browse/SOLR-2052 Project: Solr Issue Type: Improvement Components: search Affects Versions: 4.0 Environment: Mac OS X, Java 1.6 Reporter: Stephen Green Priority: Minor Fix For: Next Attachments: SOLR-2052-2.patch, SOLR-2052-3.patch, SOLR-2052-4.patch, SOLR-2052.patch SolrIndexSearcher.QueryCommand allows you to specify a list of filter queries or a single filter (as a DocSet), but not both. This restriction seems arbitrary, and there are cases where we can have both a list of filter queries and a DocSet generated by some other non-query process (e.g., filtering documents according to IDs pulled from some other source like a database.) Fixing this requires a few small changes to SolrIndexSearcher to allow both of these to be set for a QueryCommand and to take both into account when evaluating the query. It also requires a modification to ResponseBuilder to allow setting the single filter at query time. I've run into this against 1.4, but the same holds true for the trunk. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: Questions about 3.1.0 release, SVN and common-build.xml
Thanks ! Shai On Friday, April 1, 2011, Steven A Rowe sar...@syr.edu wrote: Shai, the source jars are available from the maven central repo, e.g.: http://repo2.maven.org/maven2/org/apache/lucene/lucene-core/3.1.0/lucene-core-3.1.0-sources.jar - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
RE: Questions about 3.1.0 release, SVN and common-build.xml
Yeah, have noticed this shortly ago, too. - Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de eMail: u...@thetaphi.de - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-3003) Move UnInvertedField into Lucene core
[ https://issues.apache.org/jira/browse/LUCENE-3003?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13014723#comment-13014723 ] Michael McCandless commented on LUCENE-3003: bq. It is inefficient - but I never saw a way around it since the lists are all being built in parallel (due to the fact that we are uninverting). Lucene's indexer (TermsHashPerField) has precisely this same problem -- every unique term must point to two (well, one if omitTFAP) growable byte arrays. We use slices into a single big (paged) byte[], where first slice is tiny and can only hold like 5 bytes, but then points to the next slice which is a bit bigger, etc. We could look @ refactoring that for this use too... Though this is just the one-time startup cost. bq. Another small easy optimization I hadn't gotten around to yet was to lower the indexIntervalBits and make it configurable. I did make it configurable to the Lucene class (you can pass it in to ctor), but for Solr I left it using every 128th term. {quote} Another small optimization would be to store an array of offsets to length-prefixed byte arrays, rather than a BytesRef[]. At least the values are already in packed byte arrays via PagedBytes. {quote} Both FieldCache and docvalues (branch) store an array-of-terms like this (the array of offsets is packed ints). We should also look at using an FST, which'd be the most compact but the ord - term lookup cost goes up. Anyway I think we can pursue these cool ideas on new [future] issues... Move UnInvertedField into Lucene core - Key: LUCENE-3003 URL: https://issues.apache.org/jira/browse/LUCENE-3003 Project: Lucene - Java Issue Type: Improvement Components: Index Reporter: Michael McCandless Assignee: Michael McCandless Priority: Minor Fix For: 3.2, 4.0 Attachments: LUCENE-3003.patch, LUCENE-3003.patch, byte_size_32-bit-openjdk6.txt Solr's UnInvertedField lets you quickly lookup all terms ords for a given doc/field. 
- To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
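The slice scheme Mike describes (a tiny first slice per term that points to progressively larger ones) can be illustrated with a toy model. The real TermsHashPerField/ByteBlockPool logic differs in its details (it interleaves forwarding addresses inside a single paged byte[]), so treat this purely as a sketch of the growth pattern; the class and constants below are mine:

```java
public class SliceSizes {
    // Toy growth rule: each subsequent slice roughly doubles in size,
    // capped at a maximum, so rare terms cost only a few bytes while
    // frequent terms amortize the per-slice overhead.
    static int nextSliceSize(int current, int max) {
        return Math.min(current * 2, max);
    }

    public static void main(String[] args) {
        int size = 5;   // first slice is tiny, as in the indexer
        int total = 0;
        for (int level = 0; level < 6; level++) {
            System.out.println("level " + level + ": " + size + " bytes");
            total += size;
            size = nextSliceSize(size, 200);
        }
        System.out.println("capacity after 6 slices: " + total + " bytes");
    }
}
```

The point of the pattern is the one Mike makes above: per-term cost starts near zero, instead of allocating a growable byte[] object per unique term up front.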
[jira] [Commented] (LUCENE-2573) Tiered flushing of DWPTs by RAM with low/high water marks
[ https://issues.apache.org/jira/browse/LUCENE-2573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13014724#comment-13014724 ] Michael Busch commented on LUCENE-2573: --- Awesome speedup! Finally all this work shows great results!! What's surprising is that the merge time is lower with DWPT. How can that be, considering we're doing more merges? Tiered flushing of DWPTs by RAM with low/high water marks - Key: LUCENE-2573 URL: https://issues.apache.org/jira/browse/LUCENE-2573 Project: Lucene - Java Issue Type: Improvement Components: Index Reporter: Michael Busch Assignee: Simon Willnauer Priority: Minor Fix For: Realtime Branch Attachments: LUCENE-2573.patch, LUCENE-2573.patch, LUCENE-2573.patch, LUCENE-2573.patch, LUCENE-2573.patch, LUCENE-2573.patch, LUCENE-2573.patch, LUCENE-2573.patch, LUCENE-2573.patch Now that we have DocumentsWriterPerThreads we need to track total consumed RAM across all DWPTs. A flushing strategy idea that was discussed in LUCENE-2324 was to use a tiered approach: - Flush the first DWPT at a low water mark (e.g. at 90% of allowed RAM) - Flush all DWPTs at a high water mark (e.g. at 110%) - Use linear steps in between high and low watermark: E.g. when 5 DWPTs are used, flush at 90%, 95%, 100%, 105% and 110%. Should we allow the user to configure the low and high water mark values explicitly using total values (e.g. low water mark at 120MB, high water mark at 140MB)? Or shall we keep for simplicity the single setRAMBufferSizeMB() config method and use something like 90% and 110% for the water marks? -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
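The linear-steps idea in the issue description (5 DWPTs flushing at 90%, 95%, 100%, 105%, 110% of the RAM budget) amounts to simple interpolation between the two water marks. A hypothetical helper illustrating the arithmetic - not code from the patch:

```java
public class FlushThresholds {
    // Given low/high water marks (as fractions of the RAM budget) and
    // the number of DWPTs, return per-DWPT flush thresholds linearly
    // spaced from low to high, so DWPTs flush one at a time as total
    // RAM use rises, and all flush by the high water mark.
    static double[] thresholds(double low, double high, int numDwpts) {
        double[] t = new double[numDwpts];
        if (numDwpts == 1) {
            t[0] = low;
            return t;
        }
        double step = (high - low) / (numDwpts - 1);
        for (int i = 0; i < numDwpts; i++) {
            t[i] = low + i * step;
        }
        return t;
    }

    public static void main(String[] args) {
        // The example from the issue: 5 DWPTs between 90% and 110%.
        for (double t : thresholds(0.90, 1.10, 5)) {
            System.out.printf("%.2f%n", t);
        }
    }
}
```

Whether the marks come from explicit totals (120MB/140MB) or from fixed percentages of setRAMBufferSizeMB() only changes how low and high are computed, not this interpolation.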
[jira] [Updated] (LUCENE-1076) Allow MergePolicy to select non-contiguous merges
[ https://issues.apache.org/jira/browse/LUCENE-1076?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael McCandless updated LUCENE-1076: --- Attachment: LUCENE-1076.patch Phew, this patch almost fell below the event horizon of my TODO list... I'm attaching a new modernized one. I also mod'd the policy to not select two max-sized merges at once. I think it's ready to commit... Allow MergePolicy to select non-contiguous merges - Key: LUCENE-1076 URL: https://issues.apache.org/jira/browse/LUCENE-1076 Project: Lucene - Java Issue Type: Improvement Components: Index Affects Versions: 2.3 Reporter: Michael McCandless Assignee: Michael McCandless Priority: Minor Fix For: 3.2, 4.0 Attachments: LUCENE-1076.patch, LUCENE-1076.patch, LUCENE-1076.patch I started work on this but with LUCENE-1044 I won't make much progress on it for a while, so I want to checkpoint my current state/patch. For backwards compatibility we must leave the default MergePolicy as selecting contiguous merges. This is necessary because some applications rely on temporal monotonicity of doc IDs, which means even though merges can re-number documents, the renumbering will always reflect the order in which the documents were added to the index. Still, for those apps that do not rely on this, we should offer a MergePolicy that is free to select the best merges regardless of whether they are contiguous. This requires fixing IndexWriter to accept such a merge, and fixing LogMergePolicy to optionally allow it the freedom to do so. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (LUCENE-3006) Javadocs warnings should fail the build
[ https://issues.apache.org/jira/browse/LUCENE-3006?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steven Rowe updated LUCENE-3006: Attachment: LUCENE-3006-modules-javadoc-warning-cleanup.patch Patch annihilating modules/ javadoc warnings (in analysis/icu/ and benchmark/). Committing shortly. Javadocs warnings should fail the build --- Key: LUCENE-3006 URL: https://issues.apache.org/jira/browse/LUCENE-3006 Project: Lucene - Java Issue Type: Improvement Affects Versions: 3.2, 4.0 Reporter: Grant Ingersoll Attachments: LUCENE-3006-javadoc-warning-cleanup.patch, LUCENE-3006-modules-javadoc-warning-cleanup.patch, LUCENE-3006.patch, LUCENE-3006.patch, LUCENE-3006.patch We should fail the build when there are javadocs warnings, as this should not be the Release Manager's job to fix all at once right before the release. See http://www.lucidimagination.com/search/document/14bd01e519f39aff/brainstorming_on_improving_the_release_process -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-3006) Javadocs warnings should fail the build
[ https://issues.apache.org/jira/browse/LUCENE-3006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13014767#comment-13014767 ] Steven Rowe commented on LUCENE-3006: - bq. Patch annihilating modules/ javadoc warnings (in analysis/icu/ and benchmark/). Committed on trunk r1087830. Javadocs warnings should fail the build --- Key: LUCENE-3006 URL: https://issues.apache.org/jira/browse/LUCENE-3006 Project: Lucene - Java Issue Type: Improvement Affects Versions: 3.2, 4.0 Reporter: Grant Ingersoll Attachments: LUCENE-3006-javadoc-warning-cleanup.patch, LUCENE-3006-modules-javadoc-warning-cleanup.patch, LUCENE-3006.patch, LUCENE-3006.patch, LUCENE-3006.patch We should fail the build when there are javadocs warnings, as this should not be the Release Manager's job to fix all at once right before the release. See http://www.lucidimagination.com/search/document/14bd01e519f39aff/brainstorming_on_improving_the_release_process -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
RE: [JENKINS-MAVEN] Lucene-Solr-Maven-trunk #76: POMs out of sync
This build failed because of javadocs warnings under modules/. I committed fixes under LUCENE-3006. I guess the nightly Ant Lucene trunk build doesn't build modules/ javadocs? Steve -Original Message- From: Apache Hudson Server [mailto:hud...@hudson.apache.org] Sent: Friday, April 01, 2011 10:32 AM To: dev@lucene.apache.org Subject: [JENKINS-MAVEN] Lucene-Solr-Maven-trunk #76: POMs out of sync Build: https://hudson.apache.org/hudson/job/Lucene-Solr-Maven-trunk/76/ No tests ran. Build Log (for compile errors): [...truncated 9507 lines...] - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Resolved] (SOLR-2061) Generate jar containing test classes.
[ https://issues.apache.org/jira/browse/SOLR-2061?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steven Rowe resolved SOLR-2061. --- Resolution: Fixed Committed: - trunk: r1087722, r1087723, r1087834 - branch_3x: r1087833 Generate jar containing test classes. - Key: SOLR-2061 URL: https://issues.apache.org/jira/browse/SOLR-2061 Project: Solr Issue Type: Improvement Components: Build Affects Versions: 3.1 Reporter: Drew Farris Assignee: Steven Rowe Priority: Minor Fix For: 3.2, 4.0 Attachments: SOLR-2061.patch, SOLR-2061.patch, SOLR-2061.patch, SOLR-2061.patch, SOLR-2061.patch, SOLR-2061.patch Follow-on to LUCENE-2609 for the solr build -- it would be useful to generate and deploy a jar containing the test classes so other projects could write unit tests using the framework in Solr. This may take care of SOLR-717 as well. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Issue Comment Edited] (SOLR-2061) Generate jar containing test classes.
[ https://issues.apache.org/jira/browse/SOLR-2061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13014773#comment-13014773 ] Steven Rowe edited comment on SOLR-2061 at 4/1/11 6:02 PM: --- Committed: - trunk: r1087722, r1087723, r1087834 - branch_3x: r1087833, r1087834 was (Author: steve_rowe): Committed: - trunk: r1087722, r1087723, r1087834 - branch_3x: r1087833 Generate jar containing test classes. - Key: SOLR-2061 URL: https://issues.apache.org/jira/browse/SOLR-2061 Project: Solr Issue Type: Improvement Components: Build Affects Versions: 3.1 Reporter: Drew Farris Assignee: Steven Rowe Priority: Minor Fix For: 3.2, 4.0 Attachments: SOLR-2061.patch, SOLR-2061.patch, SOLR-2061.patch, SOLR-2061.patch, SOLR-2061.patch, SOLR-2061.patch Follow-on to LUCENE-2609 for the solr build -- it would be useful to generate and deploy a jar containing the test classes so other projects could write unit tests using the framework in Solr. This may take care of SOLR-717 as well. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Issue Comment Edited] (SOLR-2061) Generate jar containing test classes.
[ https://issues.apache.org/jira/browse/SOLR-2061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13014773#comment-13014773 ] Steven Rowe edited comment on SOLR-2061 at 4/1/11 6:04 PM: --- Committed: - trunk: r1087722, r1087723, r1087834 - branch_3x: r1087833 was (Author: steve_rowe): Committed: - trunk: r1087722, r1087723, r1087834 - branch_3x: r1087833, r1087834 Generate jar containing test classes. - Key: SOLR-2061 URL: https://issues.apache.org/jira/browse/SOLR-2061 Project: Solr Issue Type: Improvement Components: Build Affects Versions: 3.1 Reporter: Drew Farris Assignee: Steven Rowe Priority: Minor Fix For: 3.2, 4.0 Attachments: SOLR-2061.patch, SOLR-2061.patch, SOLR-2061.patch, SOLR-2061.patch, SOLR-2061.patch, SOLR-2061.patch Follow-on to LUCENE-2609 for the solr build -- it would be useful to generate and deploy a jar containing the test classes so other projects could write unit tests using the framework in Solr. This may take care of SOLR-717 as well. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[HUDSON] Lucene-Solr-tests-only-trunk - Build # 6613 - Failure
Build: https://hudson.apache.org/hudson/job/Lucene-Solr-tests-only-trunk/6613/ 1 tests failed. FAILED: org.apache.lucene.index.TestIndexWriter.testIndexingThenDeleting Error Message: Java heap space Stack Trace: java.lang.OutOfMemoryError: Java heap space at java.util.Arrays.copyOf(Arrays.java:2746) at java.util.ArrayList.ensureCapacity(ArrayList.java:187) at java.util.ArrayList.add(ArrayList.java:378) at org.apache.lucene.store.RAMFile.addBuffer(RAMFile.java:60) at org.apache.lucene.store.RAMOutputStream.switchCurrentBuffer(RAMOutputStream.java:132) at org.apache.lucene.store.RAMOutputStream.copyBytes(RAMOutputStream.java:171) at org.apache.lucene.store.MockIndexOutputWrapper.copyBytes(MockIndexOutputWrapper.java:155) at org.apache.lucene.index.CompoundFileWriter.copyFile(CompoundFileWriter.java:222) at org.apache.lucene.index.CompoundFileWriter.close(CompoundFileWriter.java:188) at org.apache.lucene.index.SegmentMerger.createCompoundFile(SegmentMerger.java:140) at org.apache.lucene.index.IndexWriter.mergeMiddle(IndexWriter.java:3195) at org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:2828) at org.apache.lucene.index.SerialMergeScheduler.merge(SerialMergeScheduler.java:37) at org.apache.lucene.index.IndexWriter.maybeMerge(IndexWriter.java:1747) at org.apache.lucene.index.IndexWriter.maybeMerge(IndexWriter.java:1742) at org.apache.lucene.index.IndexWriter.maybeMerge(IndexWriter.java:1738) at org.apache.lucene.index.IndexWriter.flush(IndexWriter.java:2457) at org.apache.lucene.index.IndexWriter.addDocument(IndexWriter.java:1211) at org.apache.lucene.index.IndexWriter.addDocument(IndexWriter.java:1180) at org.apache.lucene.index.TestIndexWriter.testIndexingThenDeleting(TestIndexWriter.java:2688) Build Log (for compile errors): [...truncated 3159 lines...] - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[HUDSON] Lucene-Solr-tests-only-trunk - Build # 6615 - Failure
Build: https://hudson.apache.org/hudson/job/Lucene-Solr-tests-only-trunk/6615/ 1 tests failed. REGRESSION: org.apache.solr.spelling.suggest.SuggesterTest.testBenchmark Error Message: Java heap space Stack Trace: java.lang.OutOfMemoryError: Java heap space at java.util.IdentityHashMap.resize(IdentityHashMap.java:469) at java.util.IdentityHashMap.put(IdentityHashMap.java:445) at org.apache.lucene.util.RamUsageEstimator.size(RamUsageEstimator.java:128) at org.apache.lucene.util.RamUsageEstimator.size(RamUsageEstimator.java:153) at org.apache.lucene.util.RamUsageEstimator.sizeOfArray(RamUsageEstimator.java:178) at org.apache.lucene.util.RamUsageEstimator.size(RamUsageEstimator.java:132) at org.apache.lucene.util.RamUsageEstimator.size(RamUsageEstimator.java:153) at org.apache.lucene.util.RamUsageEstimator.sizeOfArray(RamUsageEstimator.java:178) at org.apache.lucene.util.RamUsageEstimator.size(RamUsageEstimator.java:132) at org.apache.lucene.util.RamUsageEstimator.size(RamUsageEstimator.java:153) at org.apache.lucene.util.RamUsageEstimator.sizeOfArray(RamUsageEstimator.java:178) at org.apache.lucene.util.RamUsageEstimator.size(RamUsageEstimator.java:132) at org.apache.lucene.util.RamUsageEstimator.size(RamUsageEstimator.java:153) at org.apache.lucene.util.RamUsageEstimator.sizeOfArray(RamUsageEstimator.java:178) at org.apache.lucene.util.RamUsageEstimator.size(RamUsageEstimator.java:132) at org.apache.lucene.util.RamUsageEstimator.size(RamUsageEstimator.java:153) at org.apache.lucene.util.RamUsageEstimator.sizeOfArray(RamUsageEstimator.java:178) at org.apache.lucene.util.RamUsageEstimator.size(RamUsageEstimator.java:132) at org.apache.lucene.util.RamUsageEstimator.size(RamUsageEstimator.java:153) at org.apache.lucene.util.RamUsageEstimator.sizeOfArray(RamUsageEstimator.java:178) at org.apache.lucene.util.RamUsageEstimator.size(RamUsageEstimator.java:132) at org.apache.lucene.util.RamUsageEstimator.size(RamUsageEstimator.java:153) at 
org.apache.lucene.util.RamUsageEstimator.sizeOfArray(RamUsageEstimator.java:178) at org.apache.lucene.util.RamUsageEstimator.size(RamUsageEstimator.java:132) at org.apache.lucene.util.RamUsageEstimator.size(RamUsageEstimator.java:153) at org.apache.lucene.util.RamUsageEstimator.sizeOfArray(RamUsageEstimator.java:178) at org.apache.lucene.util.RamUsageEstimator.size(RamUsageEstimator.java:132) at org.apache.lucene.util.RamUsageEstimator.size(RamUsageEstimator.java:153) at org.apache.lucene.util.RamUsageEstimator.sizeOfArray(RamUsageEstimator.java:178) at org.apache.lucene.util.RamUsageEstimator.size(RamUsageEstimator.java:132) at org.apache.lucene.util.RamUsageEstimator.size(RamUsageEstimator.java:153) at org.apache.lucene.util.RamUsageEstimator.sizeOfArray(RamUsageEstimator.java:178) Build Log (for compile errors): [...truncated 8750 lines...] - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[HUDSON] Lucene-Solr-tests-only-trunk - Build # 6616 - Still Failing
Build: https://hudson.apache.org/hudson/job/Lucene-Solr-tests-only-trunk/6616/ 1 tests failed. FAILED: org.apache.solr.spelling.suggest.SuggesterTest.testBenchmark Error Message: Java heap space Stack Trace: java.lang.OutOfMemoryError: Java heap space at java.util.IdentityHashMap.resize(IdentityHashMap.java:469) at java.util.IdentityHashMap.put(IdentityHashMap.java:445) at org.apache.lucene.util.RamUsageEstimator.size(RamUsageEstimator.java:128) at org.apache.lucene.util.RamUsageEstimator.size(RamUsageEstimator.java:153) at org.apache.lucene.util.RamUsageEstimator.sizeOfArray(RamUsageEstimator.java:178) at org.apache.lucene.util.RamUsageEstimator.size(RamUsageEstimator.java:132) at org.apache.lucene.util.RamUsageEstimator.size(RamUsageEstimator.java:153) at org.apache.lucene.util.RamUsageEstimator.sizeOfArray(RamUsageEstimator.java:178) at org.apache.lucene.util.RamUsageEstimator.size(RamUsageEstimator.java:132) at org.apache.lucene.util.RamUsageEstimator.size(RamUsageEstimator.java:153) at org.apache.lucene.util.RamUsageEstimator.sizeOfArray(RamUsageEstimator.java:178) at org.apache.lucene.util.RamUsageEstimator.size(RamUsageEstimator.java:132) at org.apache.lucene.util.RamUsageEstimator.size(RamUsageEstimator.java:153) at org.apache.lucene.util.RamUsageEstimator.sizeOfArray(RamUsageEstimator.java:178) at org.apache.lucene.util.RamUsageEstimator.size(RamUsageEstimator.java:132) at org.apache.lucene.util.RamUsageEstimator.size(RamUsageEstimator.java:153) at org.apache.lucene.util.RamUsageEstimator.sizeOfArray(RamUsageEstimator.java:178) at org.apache.lucene.util.RamUsageEstimator.size(RamUsageEstimator.java:132) at org.apache.lucene.util.RamUsageEstimator.size(RamUsageEstimator.java:153) at org.apache.lucene.util.RamUsageEstimator.sizeOfArray(RamUsageEstimator.java:178) at org.apache.lucene.util.RamUsageEstimator.size(RamUsageEstimator.java:132) at org.apache.lucene.util.RamUsageEstimator.size(RamUsageEstimator.java:153) at 
org.apache.lucene.util.RamUsageEstimator.sizeOfArray(RamUsageEstimator.java:178) at org.apache.lucene.util.RamUsageEstimator.size(RamUsageEstimator.java:132) at org.apache.lucene.util.RamUsageEstimator.size(RamUsageEstimator.java:153) at org.apache.lucene.util.RamUsageEstimator.sizeOfArray(RamUsageEstimator.java:178) at org.apache.lucene.util.RamUsageEstimator.size(RamUsageEstimator.java:132) at org.apache.lucene.util.RamUsageEstimator.size(RamUsageEstimator.java:153) at org.apache.lucene.util.RamUsageEstimator.sizeOfArray(RamUsageEstimator.java:178) at org.apache.lucene.util.RamUsageEstimator.size(RamUsageEstimator.java:132) at org.apache.lucene.util.RamUsageEstimator.size(RamUsageEstimator.java:153) at org.apache.lucene.util.RamUsageEstimator.sizeOfArray(RamUsageEstimator.java:178) Build Log (for compile errors): [...truncated 8744 lines...] - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-2444) Update fl syntax to support: pseudo fields, AS, transformers, and wildcards
[ https://issues.apache.org/jira/browse/SOLR-2444?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ryan McKinley updated SOLR-2444: Summary: Update fl syntax to support: pseudo fields, AS, transformers, and wildcards (was: support wildcards in fl parameter, improve DocTransformer parsing) I just started a new branch and implemented some of the things we have suggested. Check: https://svn.apache.org/repos/asf/lucene/dev/branches/pseudo/ This implements: h3. SQL style AS {code} ?fl=id,field AS display {code} will display 'field' with the name 'display' h3. Pseudo Fields You can define pseudo fields with ?fl.pseudo=key:value Any key that matches something in the fl param gets replaced with value. For example: {code} ?fl=id,price&fl.pseudo=price:real_price_field {code} is the same as {code} ?fl=id,real_price_field AS price {code} h3. Transformer Syntax [name] The previous underscore syntax is replaced with brackets. {code} ?fl=id,[value:10] AS 10 {code} Hopefully this will make it more clear that it is calling a function. Update fl syntax to support: pseudo fields, AS, transformers, and wildcards --- Key: SOLR-2444 URL: https://issues.apache.org/jira/browse/SOLR-2444 Project: Solr Issue Type: New Feature Reporter: Ryan McKinley Attachments: SOLR-2444-fl-parsing.patch, SOLR-2444-fl-parsing.patch The ReturnFields parsing needs to be improved. It should also support wildcards -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-2444) Update fl syntax to support: pseudo fields, AS, transformers, and wildcards
[ https://issues.apache.org/jira/browse/SOLR-2444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13014838#comment-13014838 ] Ryan McKinley commented on SOLR-2444: - Just committed the changes -- yonik, i replaced your fancy parsing with something i can understand (StringTokenizer and indexOf) I figure we should agree on a syntax first, and then optimize the fl parsing (out of my league) Update fl syntax to support: pseudo fields, AS, transformers, and wildcards --- Key: SOLR-2444 URL: https://issues.apache.org/jira/browse/SOLR-2444 Project: Solr Issue Type: New Feature Reporter: Ryan McKinley Attachments: SOLR-2444-fl-parsing.patch, SOLR-2444-fl-parsing.patch The ReturnFields parsing needs to be improved. It should also support wildcards -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
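A minimal version of that StringTokenizer-plus-indexOf approach might look like the sketch below (invented names, not the actual ReturnFields code; it ignores transformers, quoting, pseudo fields, and wildcards):

```java
import java.util.*;

public class FlParseSketch {
    // Parse "id,field AS display" into field -> display-name pairs.
    static Map<String, String> parse(String fl) {
        Map<String, String> out = new LinkedHashMap<>();
        for (StringTokenizer tok = new StringTokenizer(fl, ","); tok.hasMoreTokens(); ) {
            String part = tok.nextToken().trim();
            int as = part.indexOf(" AS ");
            if (as >= 0)
                out.put(part.substring(0, as).trim(), part.substring(as + 4).trim());
            else
                out.put(part, part);  // no alias: display under the field's own name
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(parse("id,real_price_field AS price"));
    }
}
```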
[jira] [Commented] (SOLR-2155) Geospatial search using geohash prefixes
[ https://issues.apache.org/jira/browse/SOLR-2155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13014841#comment-13014841 ] David Smiley commented on SOLR-2155: To anyone listening: I'll continue to support my latest patch here with any bug fixes or basic things. As of today I'll principally be working directly with Ryan McKinley on his lucene-spatial-playground code-base. He ported my patch to this framework as the predominant means of searching for points (single or multi-value) and I'm going to finish what he started. This new framework is superior to the geospatial mess in Lucene/Solr right now (no offense to any involved). It won't be long before it's ready for broad use as a replacement for anything existing. I look forward to exploring new indexing techniques with this framework, and for it to eventually become part of Lucene/Solr. Geospatial search using geohash prefixes Key: SOLR-2155 URL: https://issues.apache.org/jira/browse/SOLR-2155 Project: Solr Issue Type: Improvement Reporter: David Smiley Assignee: Grant Ingersoll Attachments: GeoHashPrefixFilter.patch, GeoHashPrefixFilter.patch, GeoHashPrefixFilter.patch, SOLR-2155_GeoHashPrefixFilter_with_sorting_no_poly.patch, SOLR.2155.p3.patch, SOLR.2155.p3tests.patch There currently isn't a solution in Solr for doing geospatial filtering on documents that have a variable number of points. This scenario occurs when there is location extraction (i.e. via a gazetteer) occurring on free text. None, one, or many geospatial locations might be extracted from any given document and users want to limit their search results to those occurring in a user-specified area. I've implemented this by furthering the GeoHash based work in Lucene/Solr with a geohash prefix based filter. A geohash refers to a lat-lon box on the earth. Each successive character added further subdivides the box into a 4x8 (or 8x4 depending on the even/odd length of the geohash) grid. 
The first step in this scheme is figuring out which geohash grid squares cover the user's search query. I've added various extra methods to GeoHashUtils (and added tests) to assist in this purpose. The next step is an actual Lucene Filter, GeoHashPrefixFilter, that uses these geohash prefixes in TermsEnum.seek() to skip to relevant grid squares in the index. Once a matching geohash grid is found, the points therein are compared against the user's query to see if it matches. I created an abstraction GeoShape extended by subclasses named PointDistance... and CartesianBox to support different queried shapes so that the filter need not care about these details. This work was presented at LuceneRevolution in Boston on October 8th. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
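The prefix filter works because geohashes nest: every longer geohash of a point starts with each shorter geohash of the same point, so a grid square's prefix selects exactly the indexed terms inside it. The textbook base-32 encoder below (a sketch, not code from the patch) makes the property easy to check:

```java
public class GeohashSketch {
    static final String BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz";

    // Standard geohash: interleave longitude/latitude bisection bits
    // (longitude first) and emit one base-32 character per 5 bits.
    static String encode(double lat, double lon, int precision) {
        double latMin = -90, latMax = 90, lonMin = -180, lonMax = 180;
        StringBuilder sb = new StringBuilder();
        boolean evenBit = true;  // even bit positions split longitude
        int bit = 0, ch = 0;
        while (sb.length() < precision) {
            if (evenBit) {
                double mid = (lonMin + lonMax) / 2;
                if (lon >= mid) { ch = (ch << 1) | 1; lonMin = mid; }
                else            { ch = ch << 1;       lonMax = mid; }
            } else {
                double mid = (latMin + latMax) / 2;
                if (lat >= mid) { ch = (ch << 1) | 1; latMin = mid; }
                else            { ch = ch << 1;       latMax = mid; }
            }
            evenBit = !evenBit;
            if (++bit == 5) { sb.append(BASE32.charAt(ch)); bit = 0; ch = 0; }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String fine = encode(57.64911, 10.40744, 11);
        String coarse = encode(57.64911, 10.40744, 5);
        // The fine hash always starts with the coarse hash of the same point.
        System.out.println(fine + " startsWith " + coarse + ": " + fine.startsWith(coarse));
    }
}
```

Each extra character narrows the box by a factor of 32, which is where the 4x8 / 8x4 subdivision in the description comes from.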
[HUDSON] Lucene-Solr-tests-only-trunk - Build # 6619 - Failure
Build: https://hudson.apache.org/hudson/job/Lucene-Solr-tests-only-trunk/6619/ 1 tests failed. REGRESSION: org.apache.solr.spelling.suggest.SuggesterTest.testBenchmark Error Message: Java heap space Stack Trace: java.lang.OutOfMemoryError: Java heap space at java.util.IdentityHashMap.resize(IdentityHashMap.java:469) at java.util.IdentityHashMap.put(IdentityHashMap.java:445) at org.apache.lucene.util.RamUsageEstimator.size(RamUsageEstimator.java:128) at org.apache.lucene.util.RamUsageEstimator.size(RamUsageEstimator.java:153) at org.apache.lucene.util.RamUsageEstimator.sizeOfArray(RamUsageEstimator.java:178) at org.apache.lucene.util.RamUsageEstimator.size(RamUsageEstimator.java:132) at org.apache.lucene.util.RamUsageEstimator.size(RamUsageEstimator.java:153) at org.apache.lucene.util.RamUsageEstimator.sizeOfArray(RamUsageEstimator.java:178) at org.apache.lucene.util.RamUsageEstimator.size(RamUsageEstimator.java:132) at org.apache.lucene.util.RamUsageEstimator.size(RamUsageEstimator.java:153) at org.apache.lucene.util.RamUsageEstimator.sizeOfArray(RamUsageEstimator.java:178) at org.apache.lucene.util.RamUsageEstimator.size(RamUsageEstimator.java:132) at org.apache.lucene.util.RamUsageEstimator.size(RamUsageEstimator.java:153) at org.apache.lucene.util.RamUsageEstimator.sizeOfArray(RamUsageEstimator.java:178) at org.apache.lucene.util.RamUsageEstimator.size(RamUsageEstimator.java:132) at org.apache.lucene.util.RamUsageEstimator.size(RamUsageEstimator.java:153) at org.apache.lucene.util.RamUsageEstimator.sizeOfArray(RamUsageEstimator.java:178) at org.apache.lucene.util.RamUsageEstimator.size(RamUsageEstimator.java:132) at org.apache.lucene.util.RamUsageEstimator.size(RamUsageEstimator.java:153) at org.apache.lucene.util.RamUsageEstimator.sizeOfArray(RamUsageEstimator.java:178) at org.apache.lucene.util.RamUsageEstimator.size(RamUsageEstimator.java:132) at org.apache.lucene.util.RamUsageEstimator.size(RamUsageEstimator.java:153) at 
org.apache.lucene.util.RamUsageEstimator.sizeOfArray(RamUsageEstimator.java:178) at org.apache.lucene.util.RamUsageEstimator.size(RamUsageEstimator.java:132) at org.apache.lucene.util.RamUsageEstimator.size(RamUsageEstimator.java:153) at org.apache.lucene.util.RamUsageEstimator.sizeOfArray(RamUsageEstimator.java:178) at org.apache.lucene.util.RamUsageEstimator.size(RamUsageEstimator.java:132) at org.apache.lucene.util.RamUsageEstimator.size(RamUsageEstimator.java:153) at org.apache.lucene.util.RamUsageEstimator.sizeOfArray(RamUsageEstimator.java:178) at org.apache.lucene.util.RamUsageEstimator.size(RamUsageEstimator.java:132) at org.apache.lucene.util.RamUsageEstimator.size(RamUsageEstimator.java:153) at org.apache.lucene.util.RamUsageEstimator.sizeOfArray(RamUsageEstimator.java:178) Build Log (for compile errors): [...truncated 8764 lines...] - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Resolved] (SOLR-2433) Make FieldProperties bit masks protected
[ https://issues.apache.org/jira/browse/SOLR-2433?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ryan McKinley resolved SOLR-2433. - Resolution: Fixed Fix Version/s: 3.2 Assignee: Ryan McKinley Make FieldProperties bit masks protected Key: SOLR-2433 URL: https://issues.apache.org/jira/browse/SOLR-2433 Project: Solr Issue Type: Improvement Reporter: Ryan McKinley Assignee: Ryan McKinley Priority: Trivial Fix For: 3.2, 4.0 Attachments: SOLR-2433-ProtectedFieldProperties.patch bit mask values are now package protected, so we have to duplicate: {code:java} final static int INDEXED = 0x0001; final static int TOKENIZED = 0x0002; final static int STORED = 0x0004; final static int BINARY = 0x0008; final static int OMIT_NORMS = 0x0010; ... {code} to set these fields explicitly. This is important for complex fields like LatLonType and poly fields in general -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
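For illustration, a custom FieldType combines and tests these masks in the usual bitwise way (a sketch with the constants copied from the issue text, not Solr's FieldProperties class):

```java
public class FieldPropsSketch {
    // Bit masks as listed in the issue description.
    static final int INDEXED    = 0x0001;
    static final int TOKENIZED  = 0x0002;
    static final int STORED     = 0x0004;
    static final int BINARY     = 0x0008;
    static final int OMIT_NORMS = 0x0010;

    public static void main(String[] args) {
        // A complex type like LatLonType might set its sub-fields explicitly:
        int props = INDEXED | STORED | OMIT_NORMS;
        System.out.println("stored:    " + ((props & STORED) != 0));
        System.out.println("tokenized: " + ((props & TOKENIZED) != 0));
    }
}
```

Keeping the constants protected (rather than package-private) is exactly what lets subclasses outside org.apache.solr.schema write this without duplicating the values.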
[jira] [Resolved] (SOLR-332) Visibility of static int fields in FieldProperties should be increased to allow custom FieldTypes to use them
[ https://issues.apache.org/jira/browse/SOLR-332?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ryan McKinley resolved SOLR-332. Resolution: Fixed made protected in SOLR-2433 Visibility of static int fields in FieldProperties should be increased to allow custom FieldTypes to use them - Key: SOLR-332 URL: https://issues.apache.org/jira/browse/SOLR-332 Project: Solr Issue Type: Improvement Affects Versions: 1.3 Reporter: Jonathan Woods Priority: Minor Constants in org.apache.solr.schema aren't visible to classes outside that package, yet they're useful e.g. for custom FieldTypes. Could their visibility be increased? -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-2155) Geospatial search using geohash prefixes
[ https://issues.apache.org/jira/browse/SOLR-2155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13014873#comment-13014873 ] Lance Norskog commented on SOLR-2155: - Excellent! Geo is a complex topic, too big for a one-man project. Lance Geospatial search using geohash prefixes Key: SOLR-2155 URL: https://issues.apache.org/jira/browse/SOLR-2155 Project: Solr Issue Type: Improvement Reporter: David Smiley Assignee: Grant Ingersoll Attachments: GeoHashPrefixFilter.patch, GeoHashPrefixFilter.patch, GeoHashPrefixFilter.patch, SOLR-2155_GeoHashPrefixFilter_with_sorting_no_poly.patch, SOLR.2155.p3.patch, SOLR.2155.p3tests.patch There currently isn't a solution in Solr for doing geospatial filtering on documents that have a variable number of points. This scenario occurs when there is location extraction (i.e. via a gazetteer) occurring on free text. None, one, or many geospatial locations might be extracted from any given document and users want to limit their search results to those occurring in a user-specified area. I've implemented this by furthering the GeoHash based work in Lucene/Solr with a geohash prefix based filter. A geohash refers to a lat-lon box on the earth. Each successive character added further subdivides the box into a 4x8 (or 8x4 depending on the even/odd length of the geohash) grid. The first step in this scheme is figuring out which geohash grid squares cover the user's search query. I've added various extra methods to GeoHashUtils (and added tests) to assist in this purpose. The next step is an actual Lucene Filter, GeoHashPrefixFilter, that uses these geohash prefixes in TermsEnum.seek() to skip to relevant grid squares in the index. Once a matching geohash grid is found, the points therein are compared against the user's query to see if it matches. I created an abstraction GeoShape extended by subclasses named PointDistance... 
and CartesianBox to support different queried shapes so that the filter need not care about these details. This work was presented at LuceneRevolution in Boston on October 8th. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: [Solr Wiki] Update of Troubleshooting by YonikSeeley
I'm confused ... this isn't a troubleshooting page, it's a request for help diagnosing an error -- there's no tips/tricks/advice here, just someone getting confused between solr.xml and tomcat context files. shouldn't we just delete this? : The Troubleshooting page has been changed by YonikSeeley. : The comment on this change is: add troubleshooting page. : http://wiki.apache.org/solr/Troubleshooting : : -- : : New page: : : * [[Troubleshooting HTTP Status 404 - missing core name in path]] : -Hoss - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-2335) FunctionQParser can't handle fieldnames containing whitespace
[ https://issues.apache.org/jira/browse/SOLR-2335?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hoss Man updated SOLR-2335: --- Description: FunctionQParser has some simplistic assumptions about what types of field names it will deal with, in particular it can't deal with field names containing whitespace. was: We use an external file field configured as dynamic field. The dynamic field name (and so the name of the provided file) may contain spaces. But currently it is not possible to query for such fields. The following query results in a ParseException: q=_val_:(experience_foo\ bar) org.apache.lucene.queryParser.ParseException: Cannot parse '_val_:(experience_foo\ bar)': Expected ',' at position 15 in 'experience_foo bar' We use following configuration for the externalFileField: types ... fieldType name=experienceRankFile keyField=id defVal=0 stored=false indexed=false class=solr.ExternalFileField valType=float/ /types fields dynamicField name=experience_* type=experienceRankFile / ... /field Summary: FunctionQParser can't handle fieldnames containing whitespace (was: External file field name containing whitespace not supported) Updating summary/description based on root of problem. Description from original bug reporter... {quote} We use an external file field configured as dynamic field. The dynamic field name (and so the name of the provided file) may contain spaces. But currently it is not possible to query for such fields. The following query results in a ParseException: q=_val_:(experience_foo\ bar) org.apache.lucene.queryParser.ParseException: Cannot parse '_val_:(experience_foo\ bar)': Expected ',' at position 15 in 'experience_foo bar' We use following configuration for the externalFileField: types ... fieldType name=experienceRankFile keyField=id defVal=0 stored=false indexed=false class=solr.ExternalFileField valType=float/ /types fields dynamicField name=experience_* type=experienceRankFile / ... 
/field {quote} The original reasons for these assumptions in FunctionQParser are still generally good: it helps keep the syntax and the parsing simpler then they would otherwise need to be. I think an easy improvement we could make is to leave the current parsing logic the way it is, but provide a new FieldValueSourceParaser that expects a single (quoted) string as input, and just returns the FieldValueSource for that field. So these two would be equivilent... {code} {!func}myFieldName {!func}field(myFieldName) {code} ...but it would also be possible to write... {code} {!func}field(1 my wacky Field*Name) {code} FunctionQParser can't handle fieldnames containing whitespace - Key: SOLR-2335 URL: https://issues.apache.org/jira/browse/SOLR-2335 Project: Solr Issue Type: Bug Affects Versions: 1.4.1 Reporter: Miriam Doelle Priority: Minor FunctionQParser has some simplistic assumptions about what types of field names it will deal with, in particular it can't deal with field names containing whitespaces. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
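The quoted-argument handling such a value source parser would need can be sketched in isolation. This is a minimal standalone illustration, not Solr's actual FunctionQParser code; the class QuotedFieldNameParser and its method are invented for the example, which simply extracts a quoted field name (with backslash escapes) from input like field("my wacky Field*Name"):

```java
// Hypothetical sketch: pull a quoted field name out of a function-style
// argument such as field("my wacky Field*Name"). Not Solr's real parser.
public class QuotedFieldNameParser {

    /** Extracts the quoted argument from input of the form field("..."). */
    public static String parseFieldArgument(String input) {
        String prefix = "field(";
        if (!input.startsWith(prefix) || !input.endsWith(")")) {
            throw new IllegalArgumentException("expected field(...): " + input);
        }
        String arg = input.substring(prefix.length(), input.length() - 1).trim();
        if (arg.length() < 2 || arg.charAt(0) != '"' || arg.charAt(arg.length() - 1) != '"') {
            throw new IllegalArgumentException("expected a quoted string: " + arg);
        }
        // Unescape \" and \\ inside the surrounding quotes.
        StringBuilder out = new StringBuilder();
        for (int i = 1; i < arg.length() - 1; i++) {
            char c = arg.charAt(i);
            if (c == '\\' && i + 1 < arg.length() - 1) {
                out.append(arg.charAt(++i));
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Whitespace and wildcard characters survive because they sit inside quotes.
        System.out.println(parseFieldArgument("field(\"my wacky Field*Name\")"));
    }
}
```

Because the field name is a single quoted token, the surrounding function syntax stays simple while arbitrary characters (spaces, `*`, etc.) remain legal inside the name.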
[HUDSON] Lucene-Solr-tests-only-trunk - Build # 6624 - Failure
Build: https://hudson.apache.org/hudson/job/Lucene-Solr-tests-only-trunk/6624/

1 tests failed.

REGRESSION: org.apache.solr.spelling.suggest.SuggesterTest.testBenchmark

Error Message: Java heap space

Stack Trace:
java.lang.OutOfMemoryError: Java heap space
    at java.util.IdentityHashMap.resize(IdentityHashMap.java:469)
    at java.util.IdentityHashMap.put(IdentityHashMap.java:445)
    at org.apache.lucene.util.RamUsageEstimator.size(RamUsageEstimator.java:128)
    at org.apache.lucene.util.RamUsageEstimator.size(RamUsageEstimator.java:153)
    at org.apache.lucene.util.RamUsageEstimator.sizeOfArray(RamUsageEstimator.java:178)
    at org.apache.lucene.util.RamUsageEstimator.size(RamUsageEstimator.java:132)
    at org.apache.lucene.util.RamUsageEstimator.size(RamUsageEstimator.java:153)
    at org.apache.lucene.util.RamUsageEstimator.sizeOfArray(RamUsageEstimator.java:178)
    at org.apache.lucene.util.RamUsageEstimator.size(RamUsageEstimator.java:132)
    at org.apache.lucene.util.RamUsageEstimator.size(RamUsageEstimator.java:153)
    at org.apache.lucene.util.RamUsageEstimator.sizeOfArray(RamUsageEstimator.java:178)
    at org.apache.lucene.util.RamUsageEstimator.size(RamUsageEstimator.java:132)
    at org.apache.lucene.util.RamUsageEstimator.size(RamUsageEstimator.java:153)
    at org.apache.lucene.util.RamUsageEstimator.sizeOfArray(RamUsageEstimator.java:178)
    at org.apache.lucene.util.RamUsageEstimator.size(RamUsageEstimator.java:132)
    at org.apache.lucene.util.RamUsageEstimator.size(RamUsageEstimator.java:153)
    at org.apache.lucene.util.RamUsageEstimator.sizeOfArray(RamUsageEstimator.java:178)
    at org.apache.lucene.util.RamUsageEstimator.size(RamUsageEstimator.java:132)
    at org.apache.lucene.util.RamUsageEstimator.size(RamUsageEstimator.java:153)
    at org.apache.lucene.util.RamUsageEstimator.sizeOfArray(RamUsageEstimator.java:178)
    at org.apache.lucene.util.RamUsageEstimator.size(RamUsageEstimator.java:132)
    at org.apache.lucene.util.RamUsageEstimator.size(RamUsageEstimator.java:153)
    at org.apache.lucene.util.RamUsageEstimator.sizeOfArray(RamUsageEstimator.java:178)
    at org.apache.lucene.util.RamUsageEstimator.size(RamUsageEstimator.java:132)
    at org.apache.lucene.util.RamUsageEstimator.size(RamUsageEstimator.java:153)
    at org.apache.lucene.util.RamUsageEstimator.sizeOfArray(RamUsageEstimator.java:178)
    at org.apache.lucene.util.RamUsageEstimator.size(RamUsageEstimator.java:132)
    at org.apache.lucene.util.RamUsageEstimator.size(RamUsageEstimator.java:153)
    at org.apache.lucene.util.RamUsageEstimator.sizeOfArray(RamUsageEstimator.java:178)
    at org.apache.lucene.util.RamUsageEstimator.size(RamUsageEstimator.java:132)
    at org.apache.lucene.util.RamUsageEstimator.size(RamUsageEstimator.java:153)
    at org.apache.lucene.util.RamUsageEstimator.sizeOfArray(RamUsageEstimator.java:178)
    at org.apache.lucene.util.RamUsageEstimator.size(RamUsageEstimator.java:132)
    at org.apache.lucene.util.RamUsageEstimator.size(RamUsageEstimator.java:153)
    at org.apache.lucene.util.RamUsageEstimator.sizeOfArray(RamUsageEstimator.java:178)

Build Log (for compile errors): [...truncated 8754 lines...] - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: [Solr Wiki] Update of Troubleshooting by YonikSeeley
On Fri, Apr 1, 2011 at 8:55 PM, Chris Hostetter hossman_luc...@fucit.org wrote: I'm confused ... this isn't a troubleshooting page, it's a request for help diagnosing an error -- there are no tips/tricks/advice here, just someone getting confused between solr.xml and tomcat context files. Shouldn't we just delete this? Heh - I only scanned it quickly enough to realize it shouldn't be a top-level link. -Yonik - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-2335) FunctionQParser can't handle fieldnames containing whitespace
[ https://issues.apache.org/jira/browse/SOLR-2335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13014935#comment-13014935 ] Yonik Seeley commented on SOLR-2335: oh, that's clever. I like it! FunctionQParser can't handle fieldnames containing whitespace - Key: SOLR-2335 URL: https://issues.apache.org/jira/browse/SOLR-2335 Project: Solr Issue Type: Bug Affects Versions: 1.4.1 Reporter: Miriam Doelle Priority: Minor Attachments: SOLR-2335.patch FunctionQParser has some simplistic assumptions about what types of field names it will deal with, in particular it can't deal with field names containing whitespaces. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-2335) FunctionQParser can't handle fieldnames containing whitespace
[ https://issues.apache.org/jira/browse/SOLR-2335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13014936#comment-13014936 ] Hoss Man commented on SOLR-2335: the other thing this should make possible is sorting on fields that historically haven't been sortable... {code} sort=field("my wacky Field*Name") desc {code} ... the sort parsing code *could* even be optimized to detect when a function sort results in a FieldValueSource and swap it out for a regular sort ... but I'm not sure if there are any gotchas there. FunctionQParser can't handle fieldnames containing whitespace - Key: SOLR-2335 URL: https://issues.apache.org/jira/browse/SOLR-2335 Project: Solr Issue Type: Bug Affects Versions: 1.4.1 Reporter: Miriam Doelle Priority: Minor Attachments: SOLR-2335.patch FunctionQParser has some simplistic assumptions about what types of field names it will deal with, in particular it can't deal with field names containing whitespace. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
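The optimization suggested above — noticing that a function sort is really just a field reference and substituting a plain field sort — could look roughly like the following. All types here (ValueSource, FieldValueSource, SortSpec, SortOptimizer) are simplified stand-ins invented for illustration; Solr/Lucene's real ValueSource and SortField APIs differ:

```java
// Illustrative sketch only: swap a function sort for a regular field sort
// when the function's value source is just a field reference.
// These classes are stand-ins, not the actual Solr/Lucene API.
interface ValueSource {}

class FieldValueSource implements ValueSource {
    final String fieldName;
    FieldValueSource(String fieldName) { this.fieldName = fieldName; }
}

class SortSpec {
    final String description;
    SortSpec(String description) { this.description = description; }
}

public class SortOptimizer {
    /** If the parsed function is just a field reference, sort on the field directly. */
    static SortSpec sortFor(ValueSource vs, boolean descending) {
        String dir = descending ? "desc" : "asc";
        if (vs instanceof FieldValueSource) {
            // A plain field sort avoids evaluating a function per document.
            return new SortSpec("field:" + ((FieldValueSource) vs).fieldName + " " + dir);
        }
        return new SortSpec("function " + dir);
    }

    public static void main(String[] args) {
        System.out.println(sortFor(new FieldValueSource("popularity"), true).description);
    }
}
```

The "gotchas" the comment worries about would live in the instanceof check: a real implementation would also have to confirm the field's type actually supports ordinary sorting before making the swap.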
[HUDSON] Lucene-Solr-tests-only-3.x - Build # 6613 - Failure
Build: https://hudson.apache.org/hudson/job/Lucene-Solr-tests-only-3.x/6613/

1 tests failed.

REGRESSION: org.apache.lucene.collation.TestCollationKeyAnalyzer.testThreadSafe

Error Message: Java heap space

Stack Trace:
java.lang.OutOfMemoryError: Java heap space
    at java.util.Arrays.copyOf(Arrays.java:2894)
    at java.lang.AbstractStringBuilder.expandCapacity(AbstractStringBuilder.java:117)
    at java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:589)
    at java.lang.StringBuffer.append(StringBuffer.java:337)
    at java.text.RuleBasedCollator.getCollationKey(RuleBasedCollator.java:617)
    at org.apache.lucene.collation.CollationKeyFilter.incrementToken(CollationKeyFilter.java:93)
    at org.apache.lucene.collation.CollationTestBase.assertThreadSafe(CollationTestBase.java:304)
    at org.apache.lucene.collation.TestCollationKeyAnalyzer.testThreadSafe(TestCollationKeyAnalyzer.java:89)
    at org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1082)
    at org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1010)

Build Log (for compile errors): [...truncated 5264 lines...] - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[HUDSON] Lucene-Solr-tests-only-trunk - Build # 6625 - Still Failing
Build: https://hudson.apache.org/hudson/job/Lucene-Solr-tests-only-trunk/6625/

1 tests failed.

FAILED: org.apache.solr.spelling.suggest.SuggesterTest.testBenchmark

Error Message: Java heap space

Stack Trace:
java.lang.OutOfMemoryError: Java heap space
    at java.util.IdentityHashMap.resize(IdentityHashMap.java:469)
    at java.util.IdentityHashMap.put(IdentityHashMap.java:445)
    at org.apache.lucene.util.RamUsageEstimator.size(RamUsageEstimator.java:128)
    at org.apache.lucene.util.RamUsageEstimator.size(RamUsageEstimator.java:153)
    at org.apache.lucene.util.RamUsageEstimator.sizeOfArray(RamUsageEstimator.java:178)
    at org.apache.lucene.util.RamUsageEstimator.size(RamUsageEstimator.java:132)
    at org.apache.lucene.util.RamUsageEstimator.size(RamUsageEstimator.java:153)
    at org.apache.lucene.util.RamUsageEstimator.sizeOfArray(RamUsageEstimator.java:178)
    at org.apache.lucene.util.RamUsageEstimator.size(RamUsageEstimator.java:132)
    at org.apache.lucene.util.RamUsageEstimator.size(RamUsageEstimator.java:153)
    at org.apache.lucene.util.RamUsageEstimator.sizeOfArray(RamUsageEstimator.java:178)
    at org.apache.lucene.util.RamUsageEstimator.size(RamUsageEstimator.java:132)
    at org.apache.lucene.util.RamUsageEstimator.size(RamUsageEstimator.java:153)
    at org.apache.lucene.util.RamUsageEstimator.sizeOfArray(RamUsageEstimator.java:178)
    at org.apache.lucene.util.RamUsageEstimator.size(RamUsageEstimator.java:132)
    at org.apache.lucene.util.RamUsageEstimator.size(RamUsageEstimator.java:153)
    at org.apache.lucene.util.RamUsageEstimator.sizeOfArray(RamUsageEstimator.java:178)
    at org.apache.lucene.util.RamUsageEstimator.size(RamUsageEstimator.java:132)
    at org.apache.lucene.util.RamUsageEstimator.size(RamUsageEstimator.java:153)
    at org.apache.lucene.util.RamUsageEstimator.sizeOfArray(RamUsageEstimator.java:178)
    at org.apache.lucene.util.RamUsageEstimator.size(RamUsageEstimator.java:132)
    at org.apache.lucene.util.RamUsageEstimator.size(RamUsageEstimator.java:153)
    at org.apache.lucene.util.RamUsageEstimator.sizeOfArray(RamUsageEstimator.java:178)
    at org.apache.lucene.util.RamUsageEstimator.size(RamUsageEstimator.java:132)
    at org.apache.lucene.util.RamUsageEstimator.size(RamUsageEstimator.java:153)
    at org.apache.lucene.util.RamUsageEstimator.sizeOfArray(RamUsageEstimator.java:178)
    at org.apache.lucene.util.RamUsageEstimator.size(RamUsageEstimator.java:132)
    at org.apache.lucene.util.RamUsageEstimator.size(RamUsageEstimator.java:153)
    at org.apache.lucene.util.RamUsageEstimator.sizeOfArray(RamUsageEstimator.java:178)
    at org.apache.lucene.util.RamUsageEstimator.size(RamUsageEstimator.java:132)
    at org.apache.lucene.util.RamUsageEstimator.size(RamUsageEstimator.java:153)
    at org.apache.lucene.util.RamUsageEstimator.sizeOfArray(RamUsageEstimator.java:178)
    at org.apache.lucene.util.RamUsageEstimator.size(RamUsageEstimator.java:132)
    at org.apache.lucene.util.RamUsageEstimator.size(RamUsageEstimator.java:153)
    at org.apache.lucene.util.RamUsageEstimator.sizeOfArray(RamUsageEstimator.java:178)

Build Log (for compile errors): [...truncated 8761 lines...] - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org