[jira] [Updated] (LUCENE-5337) Add Payload support to FileDictionary (Suggest) and make it more configurable

2013-11-11 Thread Areek Zillur (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-5337?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Areek Zillur updated LUCENE-5337:
-

Attachment: LUCENE-5337.patch

Initial Patch:
  - Added payload support for FileDictionary
  - Improved javadocs
  - Made the field delimiter configurable
  - Added tests for FileDictionary
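A usage sketch for the patched class (hypothetical: the delimiter-taking constructor and the payload accessor follow this patch's intent, and the exact names are assumptions, not final API):

{code}
import java.io.FileInputStream;
import java.io.InputStream;
import org.apache.lucene.search.suggest.FileDictionary;
import org.apache.lucene.search.suggest.InputIterator;
import org.apache.lucene.util.BytesRef;

// suggestions.txt, one entry per line: term<DELIM>weight<DELIM>payload
InputStream in = new FileInputStream("suggestions.txt");
FileDictionary dict = new FileDictionary(in, "|");       // "|" as the field delimiter (hypothetical overload)
InputIterator it = (InputIterator) dict.getWordsIterator();
BytesRef term;
while ((term = it.next()) != null) {
  long weight = it.weight();          // second field
  BytesRef payload = it.payload();    // third field, carried as the payload (hypothetical accessor)
}
{code}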

> Add Payload support to FileDictionary (Suggest) and make it more configurable
> -
>
> Key: LUCENE-5337
> URL: https://issues.apache.org/jira/browse/LUCENE-5337
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: core/search
>Reporter: Areek Zillur
> Attachments: LUCENE-5337.patch
>
>
> It would be nice to add payload support to FileDictionary, so users can pass 
> in an associated payload with each suggestion entry. 
> Currently FileDictionary has a hard-coded field delimiter (TAB); it would 
> be nice to let users configure the field delimiter as well.






Re: Other MacOSX bug - was: [JENKINS] Lucene-Solr-trunk-MacOSX (64bit/jdk1.7.0) - Build # 999 - Failure!

2013-11-11 Thread Seán Coffey
Thanks for the mail, Rory. I actually ran into a similar issue last week 
while investigating a JNI/exception-reporting issue on Mac. We don't 
seem to have many similar reports in JBS (JIRA); we may have a Mac 
kernel issue. I will investigate further and log a bug if needed.


regards,
Sean.

On 09/11/2013 20:05, Rory O'Donnell wrote:

Sean,

Can you take a look at this?

Rgds, Rory



On 9 Nov 2013, at 18:59, Uwe Schindler  wrote:

Hi Rory,

in recent weeks (also on developers' machines) we have been seeing this bug on 
OS X with Java 7:

http://jenkins.thetaphi.de/job/Lucene-Solr-trunk-MacOSX/999/

   [junit4] java(186,0x14ebd5000) malloc: *** error for object 0x14ebc3f90: 
pointer being freed was not allocated
   [junit4] *** set a breakpoint in malloc_error_break to debug

This happens sometimes, but only on OS X (their malloc/free implementation in 
libc seems to be more picky than Windows' or Linux's). The JVM crashes, but produces no hs_err 
file. I investigated your bug tracker, but all of those issues were closed as "not 
reproducible": http://goo.gl/dsZBrs

Can you reopen them or open a new one? This issue does not reproduce reliably, 
but seems to be a major problem on OS X: this bug and the other one make JDK 7 
on OS X unusable for server apps (like app servers or Lucene search engine 
installations), because they crash randomly.
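If it does strike locally, a sketch of how one might catch it (assumes lldb on OS X; the java arguments are whatever reproduces the crash, and MallocStackLogging is Apple's malloc diagnostic variable):

   export MallocStackLogging=1          # malloc records allocation/free stacks
   lldb -- java <args that reproduce>
   (lldb) breakpoint set --name malloc_error_break
   (lldb) run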

Uwe

-
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de



-Original Message-
From: Policeman Jenkins Server [mailto:jenk...@thetaphi.de]
Sent: Saturday, November 09, 2013 1:33 AM
To: dev@lucene.apache.org; rjer...@apache.org
Subject: [JENKINS] Lucene-Solr-trunk-MacOSX (64bit/jdk1.7.0) - Build # 999 -
Failure!

Build: http://jenkins.thetaphi.de/job/Lucene-Solr-trunk-MacOSX/999/
Java: 64bit/jdk1.7.0 -XX:+UseCompressedOops -XX:+UseG1GC

All tests passed

Build Log:
[...truncated 10462 lines...]
   [junit4] JVM J0: stderr was not empty, see:
/Users/jenkins/workspace/Lucene-Solr-trunk-MacOSX/solr/build/solr-
core/test/temp/junit4-J0-20131108_235715_812.syserr
   [junit4] >>> JVM J0: stderr (verbatim) 
   [junit4] java(186,0x14ebd5000) malloc: *** error for object 0x14ebc3f90:
pointer being freed was not allocated
   [junit4] *** set a breakpoint in malloc_error_break to debug
   [junit4] <<< JVM J0: EOF 

[...truncated 1 lines...]
   [junit4] ERROR: JVM J0 ended with an exception, command line:
/Library/Java/JavaVirtualMachines/jdk1.7.0_45.jdk/Contents/Home/jre/bin/
java -XX:+UseCompressedOops -XX:+UseG1GC -
XX:+HeapDumpOnOutOfMemoryError -
XX:HeapDumpPath=/Users/jenkins/workspace/Lucene-Solr-trunk-
MacOSX/heapdumps -Dtests.prefix=tests -Dtests.seed=2CD7DE196210E842 -
Xmx512M -Dtests.iters= -Dtests.verbose=false -Dtests.infostream=false -
Dtests.codec=random -Dtests.postingsformat=random -
Dtests.docvaluesformat=random -Dtests.locale=random -
Dtests.timezone=random -Dtests.directory=random -
Dtests.linedocsfile=europarl.lines.txt.gz -Dtests.luceneMatchVersion=5.0 -
Dtests.cleanthreads=perClass -
Djava.util.logging.config.file=/Users/jenkins/workspace/Lucene-Solr-trunk-
MacOSX/lucene/tools/junit4/logging.properties -Dtests.nightly=false -
Dtests.weekly=false -Dtests.slow=true -Dtests.asserts.gracious=false -
Dtests.multiplier=1 -DtempDir=. -Djava.io.tmpdir=. -
Djunit4.tempDir=/Users/jenkins/workspace/Lucene-Solr-trunk-
MacOSX/solr/build/solr-core/test/temp -
Dclover.db.dir=/Users/jenkins/workspace/Lucene-Solr-trunk-
MacOSX/lucene/build/clover/db -
Djava.security.manager=org.apache.lucene.util.TestSecurityManager -
Djava.security.policy=/Users/jenkins/workspace/Lucene-Solr-trunk-
MacOSX/lucene/tools/junit4/tests.policy -Dlucene.version=5.0-SNAPSHOT -
Djetty.testMode=1 -Djetty.insecurerandom=1 -
Dsolr.directoryFactory=org.apache.solr.core.MockDirectoryFactory -
Djava.awt.headless=true -Dtests.disableHdfs=true -classpath
/Users/jenkins/workspace/Lucene-Solr-trunk-MacOSX/solr/build/solr-
core/classes/test:/Users/jenkins/workspace/Lucene-Solr-trunk-
MacOSX/solr/build/solr-test-
framework/classes/java:/Users/jenkins/workspace/Lucene-Solr-trunk-
MacOSX/solr/test-framework/lib/junit4-ant-
2.0.13.jar:/Users/jenkins/workspace/Lucene-Solr-trunk-
MacOSX/solr/build/solr-core/test-files:/Users/jenkins/workspace/Lucene-
Solr-trunk-MacOSX/lucene/build/test-
framework/classes/java:/Users/jenkins/workspace/Lucene-Solr-trunk-
MacOSX/lucene/build/codecs/classes/java:/Users/jenkins/workspace/Lucen
e-Solr-trunk-MacOSX/solr/build/solr-
solrj/classes/java:/Users/jenkins/workspace/Lucene-Solr-trunk-
MacOSX/solr/build/solr-
core/classes/java:/Users/jenkins/workspace/Lucene-Solr-trunk-
MacOSX/lucene/build/analysis/common/lucene-analyzers-common-5.0-
SNAPSHOT.jar:/Users/jenkins/workspace/Lucene-Solr-trunk-
MacOSX/lucene/build/analysis/kuromoji/lucene-analyzers-kuromoji-5.0-
SNAPSHOT.jar:/Users/jenkins/workspace/Lucene-Solr-trunk-
MacOSX/lucene/build/analysis/phonetic/lucene-analyzers-phonetic-5.0-
SNAPSHOT.jar:/Users/jenkins/workspace/Lucene-Solr-trunk-
Ma

Re: Other MacOSX bug - was: [JENKINS] Lucene-Solr-trunk-MacOSX (64bit/jdk1.7.0) - Build # 999 - Failure!

2013-11-11 Thread Dawid Weiss
For what it's worth, this bug has been rare, but does show up from
time to time. Forget about reproducing it ;)

Dawid

On Mon, Nov 11, 2013 at 10:41 AM, Seán Coffey  wrote:
> Thanks for the mail, Rory. I actually ran into a similar issue last week while
> investigating a JNI/exception-reporting issue on Mac. We don't seem to have
> many similar reports in JBS (JIRA); we may have a Mac kernel issue. I will
> investigate further and log a bug if needed.
>
> regards,
> Sean.
>
>
> On 09/11/2013 20:05, Rory O'Donnell wrote:
>>
>> Sean,
>>
>> Can you take a look at this?
>>
>> Rgds, Rory
>>
>>
>>> On 9 Nov 2013, at 18:59, Uwe Schindler  wrote:
>>>
>>> Hi Rory,
>>>
>>> in recent weeks (also on developers' machines) we have been seeing this bug
>>> on OS X with Java 7:
>>>
>>> http://jenkins.thetaphi.de/job/Lucene-Solr-trunk-MacOSX/999/
>>>
>>>[junit4] java(186,0x14ebd5000) malloc: *** error for object
>>> 0x14ebc3f90: pointer being freed was not allocated
>>>[junit4] *** set a breakpoint in malloc_error_break to debug
>>>
>>> This happens sometimes, but only on OSX (their malloc/free implementation
>>> in the libc seems to be more picky than Windows' or Linux's). The JVM
>>> crashes, but produces no hs_err file. I investigated your bug tracker, but
>>> all of those issues were closed as "not reproducible": http://goo.gl/dsZBrs
>>>
>>> Can you reopen them or open a new one? This issue does not reproduce
>>> reliably, but seems to be a major problem on OS X: this bug and the other
>>> one make JDK 7 on OS X unusable for server apps (like app servers or Lucene
>>> search engine installations), because they crash randomly.
>>>
>>> Uwe
>>>
>>> -
>>> Uwe Schindler
>>> H.-H.-Meier-Allee 63, D-28213 Bremen
>>> http://www.thetaphi.de
>>> eMail: u...@thetaphi.de
>>>
>>>
 -Original Message-
 From: Policeman Jenkins Server [mailto:jenk...@thetaphi.de]
 Sent: Saturday, November 09, 2013 1:33 AM
 To: dev@lucene.apache.org; rjer...@apache.org
 Subject: [JENKINS] Lucene-Solr-trunk-MacOSX (64bit/jdk1.7.0) - Build #
 999 -
 Failure!

 Build: http://jenkins.thetaphi.de/job/Lucene-Solr-trunk-MacOSX/999/
 Java: 64bit/jdk1.7.0 -XX:+UseCompressedOops -XX:+UseG1GC

 All tests passed

 Build Log:
 [...truncated 10462 lines...]
[junit4] JVM J0: stderr was not empty, see:
 /Users/jenkins/workspace/Lucene-Solr-trunk-MacOSX/solr/build/solr-
 core/test/temp/junit4-J0-20131108_235715_812.syserr
[junit4] >>> JVM J0: stderr (verbatim) 
[junit4] java(186,0x14ebd5000) malloc: *** error for object
 0x14ebc3f90:
 pointer being freed was not allocated
[junit4] *** set a breakpoint in malloc_error_break to debug
[junit4] <<< JVM J0: EOF 

 [...truncated 1 lines...]
[junit4] ERROR: JVM J0 ended with an exception, command line:
 /Library/Java/JavaVirtualMachines/jdk1.7.0_45.jdk/Contents/Home/jre/bin/
 java -XX:+UseCompressedOops -XX:+UseG1GC -
 XX:+HeapDumpOnOutOfMemoryError -
 XX:HeapDumpPath=/Users/jenkins/workspace/Lucene-Solr-trunk-
 MacOSX/heapdumps -Dtests.prefix=tests -Dtests.seed=2CD7DE196210E842 -
 Xmx512M -Dtests.iters= -Dtests.verbose=false -Dtests.infostream=false -
 Dtests.codec=random -Dtests.postingsformat=random -
 Dtests.docvaluesformat=random -Dtests.locale=random -
 Dtests.timezone=random -Dtests.directory=random -
 Dtests.linedocsfile=europarl.lines.txt.gz -Dtests.luceneMatchVersion=5.0
 -
 Dtests.cleanthreads=perClass -

 Djava.util.logging.config.file=/Users/jenkins/workspace/Lucene-Solr-trunk-
 MacOSX/lucene/tools/junit4/logging.properties -Dtests.nightly=false -
 Dtests.weekly=false -Dtests.slow=true -Dtests.asserts.gracious=false -
 Dtests.multiplier=1 -DtempDir=. -Djava.io.tmpdir=. -
 Djunit4.tempDir=/Users/jenkins/workspace/Lucene-Solr-trunk-
 MacOSX/solr/build/solr-core/test/temp -
 Dclover.db.dir=/Users/jenkins/workspace/Lucene-Solr-trunk-
 MacOSX/lucene/build/clover/db -
 Djava.security.manager=org.apache.lucene.util.TestSecurityManager -
 Djava.security.policy=/Users/jenkins/workspace/Lucene-Solr-trunk-
 MacOSX/lucene/tools/junit4/tests.policy -Dlucene.version=5.0-SNAPSHOT -
 Djetty.testMode=1 -Djetty.insecurerandom=1 -
 Dsolr.directoryFactory=org.apache.solr.core.MockDirectoryFactory -
 Djava.awt.headless=true -Dtests.disableHdfs=true -classpath
 /Users/jenkins/workspace/Lucene-Solr-trunk-MacOSX/solr/build/solr-
 core/classes/test:/Users/jenkins/workspace/Lucene-Solr-trunk-
 MacOSX/solr/build/solr-test-
 framework/classes/java:/Users/jenkins/workspace/Lucene-Solr-trunk-
 MacOSX/solr/test-framework/lib/junit4-ant-
 2.0.13.jar:/Users/jenkins/workspace/Lucene-Solr-trunk-
 MacOSX/solr/build/solr-core/test-files:/Users/jenkins/workspace/Lucene-
 Solr-trunk-MacOSX/lucene/build/test-
 framework/classes/java:/Users/jenkins/workspace/Lucene-Solr-trunk-
 Ma

[jira] [Commented] (LUCENE-5333) Support sparse faceting for heterogeneous indices

2013-11-11 Thread Shai Erera (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5333?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13818801#comment-13818801
 ] 

Shai Erera commented on LUCENE-5333:


I talked with Gilad about it and he suggested a nice solution, with some 
limitations: you can create whatever FacetRequest you want, e.g. a 
CountFacetRequest over the ROOT category, and set its depth to 2. That way, if 
you ask for numResults=10, you basically say "give me the top-10 dimensions 
(children of ROOT) and, for each, its top-10 children".

This isn't perfect: if you want to get all available dimensions, you have to 
guess what numResults should be set to. And if you ask for a high number, e.g. 
100, you ask for the top-100 children of ROOT and, for each, its top-100 
children. Still, you might not get all dimensions, but it's a very easy way to 
do this (see the sketch below); no need for any custom code. Another limitation 
is that this is currently supported by TaxonomyFacetsAccumulator, but 
SortedSetDVAccumulator limits the depth to 1 for all given requests.
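A sketch of the trick (package and method names per the 4.x facet module, from memory; searcher, query and taxoReader assumed in scope):

{code}
import java.util.List;
import org.apache.lucene.facet.params.FacetSearchParams;
import org.apache.lucene.facet.search.CountFacetRequest;
import org.apache.lucene.facet.search.FacetResult;
import org.apache.lucene.facet.search.FacetsCollector;
import org.apache.lucene.facet.taxonomy.CategoryPath;

// Count request over ROOT: top-10 dimensions and, via depth=2,
// the top-10 children of each dimension.
CountFacetRequest request = new CountFacetRequest(CategoryPath.EMPTY, 10);
request.setDepth(2);
FacetSearchParams fsp = new FacetSearchParams(request);
FacetsCollector fc = FacetsCollector.create(fsp, searcher.getIndexReader(), taxoReader);
searcher.search(query, fc);
List<FacetResult> results = fc.getFacetResults();
{code}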

In that spirit, I can propose another solution: write a FacetResultsHandler 
which skips the first level of children and returns a FacetResult with a tree 
structure, such that the first level holds the dimensions and the second level 
the actual children. That way, doing new CountFacetRequest(ROOT, 
10).setDepth(2) will return all available dimensions in the first level, but 
the top-10 for each in the second level. To implement such a 
FacetResultsHandler we'd need to iterate over ROOT's children and compute the 
top-K for each, using e.g. DepthOneFacetResultsHandler...

> Support sparse faceting for heterogeneous indices
> -
>
> Key: LUCENE-5333
> URL: https://issues.apache.org/jira/browse/LUCENE-5333
> Project: Lucene - Core
>  Issue Type: New Feature
>  Components: modules/facet
>Reporter: Michael McCandless
> Attachments: LUCENE-5333.patch, LUCENE-5333.patch
>
>
> In some search apps, e.g. a large e-commerce site, the index can have
> a mix of wildly different product categories and facet dimensions, and
> the number of dimensions could be huge.
> E.g. maybe the index has shirts, computer memory, hard drives, etc.,
> and each of these many categories has different attributes.
> In such an index, when someone searches for "so dimm", which should
> match a bunch of laptop memory modules, you can't (easily) know up
> front which facet dimensions will be important.
> But, I think this is very easy for the facet module, since ords are
> stored "row stride" (each doc lists all facet labels it has), we could
> simply count all facets that the hits actually saw, and then in the
> end see which ones "got traction" and return facet results for these
> top dims.
> I'm not sure what the API would look like, but conceptually this
> should work very well, because of how the facet module works.
> You shouldn't have to state up front exactly which facet dimensions
> to count...






[jira] [Updated] (LUCENE-5333) Support sparse faceting for heterogeneous indices

2013-11-11 Thread Shai Erera (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-5333?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shai Erera updated LUCENE-5333:
---

Attachment: LUCENE-5333.patch

Patch adds AllDimensionsFacetResultsHandler as a quick prototype of how this 
can be done. I also modified testTaxonomy to use it instead of 
AllFacetsAccumulator, and it passes.

If we want to proceed with this approach, we can do the following:

* Add a new AllDimensionsFacetRequest which either:
** extends CountFacetRequest, but then we limit it to counting only, or
** wraps another FacetRequest, so that you can do any aggregation you want.
** Either way, it calls setDepth(2) internally.
* Move FacetResultsHandler into FacetRequest, instead of 
TaxonomyFacetsAccumulator.createFacetResultsHandler. I'll admit that originally 
that's where it was (in FR), but I moved it to FA in order to simplify FR 
implementations. But perhaps it does belong with FR...

The only non-trivial part of this is that you get back a FacetResult whose 
children are the actual results, so you cannot simply iterate over 
res.subResults; you need to realize you should iterate over each 
subResults.subResults (see the sketch below). I don't know if this is 
considered complicated or not (I didn't find it very complicated, but maybe 
I'm biased :)).
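A sketch of that nested iteration (4.x facet API, field names as I recall them; result is assumed to be one of the returned FacetResults):

{code}
FacetResultNode root = result.getFacetResultNode();
for (FacetResultNode dim : root.subResults) {        // first level: the dimensions
  for (FacetResultNode child : dim.subResults) {     // second level: the actual top children
    System.out.println(dim.label + " / " + child.label + " = " + child.value);
  }
}
{code}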

All in all, I think this is somewhat better than the accumulator approach, as 
defining a FacetRequest is more intuitive. In the faceted search module, 
FacetRequest == Query (in the content-search jargon), and is therefore more 
user-level than the underlying accumulator.

The downside is that it's not automatically supported by 
SortedSetDVAccumulator, since the latter doesn't respect any FacetRequest other 
than CountFacetRequest, and also does not let you specify your own 
FacetResultsHandler, but I think that's solvable.

> Support sparse faceting for heterogeneous indices
> -
>
> Key: LUCENE-5333
> URL: https://issues.apache.org/jira/browse/LUCENE-5333
> Project: Lucene - Core
>  Issue Type: New Feature
>  Components: modules/facet
>Reporter: Michael McCandless
> Attachments: LUCENE-5333.patch, LUCENE-5333.patch, LUCENE-5333.patch
>
>
> In some search apps, e.g. a large e-commerce site, the index can have
> a mix of wildly different product categories and facet dimensions, and
> the number of dimensions could be huge.
> E.g. maybe the index has shirts, computer memory, hard drives, etc.,
> and each of these many categories has different attributes.
> In such an index, when someone searches for "so dimm", which should
> match a bunch of laptop memory modules, you can't (easily) know up
> front which facet dimensions will be important.
> But, I think this is very easy for the facet module, since ords are
> stored "row stride" (each doc lists all facet labels it has), we could
> simply count all facets that the hits actually saw, and then in the
> end see which ones "got traction" and return facet results for these
> top dims.
> I'm not sure what the API would look like, but conceptually this
> should work very well, because of how the facet module works.
> You shouldn't have to state up front exactly which facet dimensions
> to count...






[jira] [Updated] (LUCENE-5311) Make it possible to train / run classification over multiple fields

2013-11-11 Thread Tommaso Teofili (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-5311?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tommaso Teofili updated LUCENE-5311:


Fix Version/s: 5.0
   4.6

> Make it possible to train / run classification over multiple fields
> ---
>
> Key: LUCENE-5311
> URL: https://issues.apache.org/jira/browse/LUCENE-5311
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: modules/classification
>Reporter: Tommaso Teofili
>Assignee: Tommaso Teofili
> Fix For: 4.6, 5.0
>
>
> It'd be nice to be able to use multiple fields instead of just one for 
> training / running each classifier.






[jira] [Resolved] (LUCENE-5311) Make it possible to train / run classification over multiple fields

2013-11-11 Thread Tommaso Teofili (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-5311?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tommaso Teofili resolved LUCENE-5311.
-

Resolution: Fixed

For now resolving for naive Bayes and kNN; work on the perceptron will come in 
a separate issue.

> Make it possible to train / run classification over multiple fields
> ---
>
> Key: LUCENE-5311
> URL: https://issues.apache.org/jira/browse/LUCENE-5311
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: modules/classification
>Reporter: Tommaso Teofili
>Assignee: Tommaso Teofili
> Fix For: 4.6, 5.0
>
>
> It'd be nice to be able to use multiple fields instead of just one for 
> training / running each classifier.






[jira] [Created] (LUCENE-5338) Let classifiers filter unlabeled documents during training

2013-11-11 Thread Tommaso Teofili (JIRA)
Tommaso Teofili created LUCENE-5338:
---

 Summary: Let classifiers filter unlabeled documents during training
 Key: LUCENE-5338
 URL: https://issues.apache.org/jira/browse/LUCENE-5338
 Project: Lucene - Core
  Issue Type: Bug
  Components: modules/classification
Affects Versions: 4.5
Reporter: Tommaso Teofili
Assignee: Tommaso Teofili
 Fix For: 4.6


Only labeled documents (those with a 'class' field) should be used during 
training; each classifier should therefore filter out unlabeled documents 
(rather than fail when run against them).
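A minimal sketch of one way to match only labeled documents (hypothetical: the actual filtering will live inside each classifier's training code; FieldValueFilter matches documents that have a value for the given field):

{code}
import org.apache.lucene.search.FieldValueFilter;
import org.apache.lucene.search.FilteredQuery;
import org.apache.lucene.search.MatchAllDocsQuery;
import org.apache.lucene.search.Query;

// Only documents that actually have the 'class' field qualify for training.
Query trainingDocs = new FilteredQuery(new MatchAllDocsQuery(),
                                       new FieldValueFilter("class"));
{code}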






[jira] [Commented] (LUCENE-5333) Support sparse faceting for heterogeneous indices

2013-11-11 Thread Shai Erera (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5333?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13818844#comment-13818844
 ] 

Shai Erera commented on LUCENE-5333:


Actually, if we move FacetResultsHandler to FacetRequest and create the new 
AllDimensionsFR, it doesn't need to setDepth() at all; it only needs to 
override createFacetResultsHandler. And we can add a flattenResults() method to 
AllDimsFR which takes a FacetResult and returns a List, to 
simplify the app's life. Just an idea.

> Support sparse faceting for heterogeneous indices
> -
>
> Key: LUCENE-5333
> URL: https://issues.apache.org/jira/browse/LUCENE-5333
> Project: Lucene - Core
>  Issue Type: New Feature
>  Components: modules/facet
>Reporter: Michael McCandless
> Attachments: LUCENE-5333.patch, LUCENE-5333.patch, LUCENE-5333.patch
>
>
> In some search apps, e.g. a large e-commerce site, the index can have
> a mix of wildly different product categories and facet dimensions, and
> the number of dimensions could be huge.
> E.g. maybe the index has shirts, computer memory, hard drives, etc.,
> and each of these many categories has different attributes.
> In such an index, when someone searches for "so dimm", which should
> match a bunch of laptop memory modules, you can't (easily) know up
> front which facet dimensions will be important.
> But, I think this is very easy for the facet module, since ords are
> stored "row stride" (each doc lists all facet labels it has), we could
> simply count all facets that the hits actually saw, and then in the
> end see which ones "got traction" and return facet results for these
> top dims.
> I'm not sure what the API would look like, but conceptually this
> should work very well, because of how the facet module works.
> You shouldn't have to state up front exactly which facet dimensions
> to count...






[jira] [Commented] (SOLR-5435) An edismax query wrapped in parentheses parsed wrong

2013-11-11 Thread JIRA

[ 
https://issues.apache.org/jira/browse/SOLR-5435?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13818869#comment-13818869
 ] 

Anssi Törmä commented on SOLR-5435:
---

This may be related to SOLR-3377. At least that's where I found the workaround 
to leave a space between parentheses.

> An edismax query wrapped in parentheses parsed wrong
> 
>
> Key: SOLR-5435
> URL: https://issues.apache.org/jira/browse/SOLR-5435
> Project: Solr
>  Issue Type: Bug
>  Components: query parsers
>Affects Versions: 4.3.1
>Reporter: Anssi Törmä
>
> I have an edismax query with the following parameters:
> * q={{("jenkins " OR text:"jenkins")}}
> ** Yes, there is a space in {{"jenkins "}}
> * qf={{used_name^7 text}}
> Queries to the field {{used_name}} are analyzed like this:
> {noformat}
> 
>   
>   pattern="(,|\s)+" 
>replacement=" "/>
>   
>   
> 
> {noformat}
> Queries to the field {{text}} are analyzed like this:
> {noformat}
> 
>   
>generateWordParts="0"
> generateNumberParts="0"
> catenateWords="1"
> catenateNumbers="0"
> catenateAll="0"
> preserveOriginal="1"/>
>   
>   
> 
> {noformat}
> In the Solr admin console, I can see the query is parsed incorrectly:
> {{+((used_name:jenkins^7.0 | text:jenkins) (used_name:text:^7.0 | (text:text: 
> text:text)) (used_name:jenkins^7.0 | text:jenkins))}}
> See that {{(text:text: text:text)}}?
> As a workaround, if I leave a space between the parentheses and what they 
> enclose, i.e. q={{( "jenkins " OR text:"jenkins" )}}, then the query is parsed 
> as I expect, i.e.
> {{+((used_name:jenkins^7.0 | text:jenkins) text:jenkins)}}
> The query is also parsed correctly if there's no space in {{"jenkins"}}.






[jira] [Created] (SOLR-5435) An edismax query wrapped in parentheses parsed wrong

2013-11-11 Thread JIRA
Anssi Törmä created SOLR-5435:
-

 Summary: An edismax query wrapped in parentheses parsed wrong
 Key: SOLR-5435
 URL: https://issues.apache.org/jira/browse/SOLR-5435
 Project: Solr
  Issue Type: Bug
  Components: query parsers
Affects Versions: 4.3.1
Reporter: Anssi Törmä


I have an edismax query with the following parameters:
* q={{("jenkins " OR text:"jenkins")}}
** Yes, there is a space in {{"jenkins "}}
* qf={{used_name^7 text}}

Queries to the field {{used_name}} are analyzed like this:
{noformat}

  
  
  
  

{noformat}
Queries to the field {{text}} are analyzed like this:
{noformat}

  
  
  
  

{noformat}

In the Solr admin console, I can see the query is parsed incorrectly:
{{+((used_name:jenkins^7.0 | text:jenkins) (used_name:text:^7.0 | (text:text: 
text:text)) (used_name:jenkins^7.0 | text:jenkins))}}
See that {{(text:text: text:text)}}?

As a workaround, if I leave a space between the parentheses and what they 
enclose, i.e. q={{( "jenkins " OR text:"jenkins" )}}, then the query is parsed 
as I expect, i.e.
{{+((used_name:jenkins^7.0 | text:jenkins) text:jenkins)}}

The query is also parsed correctly if there's no space in {{"jenkins"}}.
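Spelled out as request parameters, the workaround amounts to the following (other parameters omitted; values as in this report):

{noformat}
defType=edismax
qf=used_name^7 text
q=( "jenkins " OR text:"jenkins" )
{noformat}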






[jira] [Updated] (LUCENE-5338) Let classifiers filter unlabeled documents during training

2013-11-11 Thread Tommaso Teofili (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-5338?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tommaso Teofili updated LUCENE-5338:


Fix Version/s: 5.0

> Let classifiers filter unlabeled documents during training
> --
>
> Key: LUCENE-5338
> URL: https://issues.apache.org/jira/browse/LUCENE-5338
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: modules/classification
>Affects Versions: 4.5
>Reporter: Tommaso Teofili
>Assignee: Tommaso Teofili
> Fix For: 4.6, 5.0
>
>
> Only labeled documents (those with a 'class' field) should be used during 
> training; each classifier should therefore filter out unlabeled documents 
> (rather than fail when run against them).






[jira] [Commented] (LUCENE-5311) Make it possible to train / run classification over multiple fields

2013-11-11 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13818890#comment-13818890
 ] 

ASF subversion and git services commented on LUCENE-5311:
-

Commit 1540675 from [~teofili] in branch 'dev/branches/branch_4x'
[ https://svn.apache.org/r1540675 ]

LUCENE-5311 - backport to branch_4x

> Make it possible to train / run classification over multiple fields
> ---
>
> Key: LUCENE-5311
> URL: https://issues.apache.org/jira/browse/LUCENE-5311
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: modules/classification
>Reporter: Tommaso Teofili
>Assignee: Tommaso Teofili
> Fix For: 4.6, 5.0
>
>
> It'd be nice to be able to use multiple fields instead of just one for 
> training / running each classifier.






Solr Release Management Process

2013-11-11 Thread Furkan KAMACI
Hi;

I've resolved two issues last week. One of them was created by me and the
other was an existing issue. There is also a third issue that is a duplicate
of the second one.

When I create an issue I have the right to edit Fix Version/s. I set 4.6 as
the fix version of the first issue. The second issue was not created by me, so
I cannot edit its Fix Version/s.

I'm wondering about the commit process of the Solr project. What do
committers do before a new release process starts? If they filter the resolved
issues by the Fix Version/s of the new release (see the sketch below), they
will not see resolved issues that lack it. If they filter the issues resolved
since the last release, then they are not using the benefits of the Fix
Version/s field. People have the right to edit the Fix Version/s field when
they create an issue, but not for existing ones (issues created by other
people).
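For illustration, a filter of that kind in JIRA's query language (the version value is just an example):

   project = SOLR AND resolution = Fixed AND fixVersion = "4.6" ORDER BY updated DESC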

There are many issues in the Solr project and frequent commits every day.
Should I mention the responsible user in comments (with an @ tag) in such
situations (I follow who is responsible for the next release on the dev list),
or do you handle it yourselves (as you have handled it up to now)?

I just wanted to learn the internal process of release management.

Thanks;
Furkan KAMACI


[jira] [Commented] (LUCENE-4753) Make forbidden API checks per-module

2013-11-11 Thread Markus Jelsma (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13818926#comment-13818926
 ] 

Markus Jelsma commented on LUCENE-4753:
---

Hi Uwe, I can't compile anymore since something was changed in build.xml

{code}
BUILD FAILED
Target "install-forbidden-apis" does not exist in the project "solr". It is 
used from target "check-forbidden-apis".

Total time: 0 seconds
{code}

> Make forbidden API checks per-module
> 
>
> Key: LUCENE-4753
> URL: https://issues.apache.org/jira/browse/LUCENE-4753
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: general/build
>Reporter: Uwe Schindler
>Assignee: Uwe Schindler
>Priority: Critical
> Fix For: 4.6
>
> Attachments: LUCENE-4753.patch, LUCENE-4753.patch, LUCENE-4753.patch
>
>
> After the forbidden API checker was released separately from Lucene as a 
> Google Code project (forked and improved), including Maven support, the 
> checks on Lucene should be changed to work per-module.
> The reason for this is: The improved checker is more picky about e.g. 
> extending classes that are forbidden or overriding methods and calling 
> super.method() if they are on the forbidden signatures list. For these 
> checks, it is not enough to have the class files and the rt.jar, you need the 
> whole classpath. The forbidden APIs 1.0 now by default complains if classes 
> are missing from the classpath.
> With the module architecture of Lucene/Solr it is very hard to make an 
> uber-classpath; instead, the checks should be done per module, so the default 
> compile/test classpath of the module can be used and no crazy path statements 
> with **/*.jar are needed. This needs some refactoring in the exclusion lists, 
> but the Lucene checks could be done by a macro in common-build that allows 
> custom exclusion lists for specific modules.
> Currently, the "strict" checking is disabled for Solr, so the checker only 
> complains about missing classes but does not fail the build:
> {noformat}
> -check-forbidden-java-apis:
> [forbidden-apis] Reading bundled API signatures: jdk-unsafe-1.6
> [forbidden-apis] Reading bundled API signatures: jdk-deprecated-1.6
> [forbidden-apis] Reading bundled API signatures: commons-io-unsafe-2.1
> [forbidden-apis] Reading API signatures: C:\Users\Uwe 
> Schindler\Projects\lucene\trunk-lusolr3\lucene\tools\forbiddenApis\executors.txt
> [forbidden-apis] Reading API signatures: C:\Users\Uwe 
> Schindler\Projects\lucene\trunk-lusolr3\lucene\tools\forbiddenApis\servlet-api.txt
> [forbidden-apis] Loading classes to check...
> [forbidden-apis] Scanning for API signatures and dependencies...
> [forbidden-apis] WARNING: The referenced class 
> 'org.apache.lucene.analysis.uima.ae.AEProviderFactory' cannot be loaded. 
> Please fix the classpath!
> [forbidden-apis] WARNING: The referenced class 
> 'org.apache.lucene.analysis.uima.ae.AEProviderFactory' cannot be loaded. 
> Please fix the classpath!
> [forbidden-apis] WARNING: The referenced class 
> 'org.apache.lucene.analysis.uima.ae.AEProvider' cannot be loaded. Please fix 
> the classpath!
> [forbidden-apis] WARNING: The referenced class 
> 'org.apache.lucene.collation.ICUCollationKeyAnalyzer' cannot be loaded. 
> Please fix the classpath!
> [forbidden-apis] Scanned 2177 (and 1222 related) class file(s) for forbidden 
> API invocations (in 1.80s), 0 error(s).
> {noformat}
> I added almost all missing jars, but those do not seem to be in the solr part 
> of the source tree (I think they are only copied when building artifacts). 
> By making the whole thing per module, we can use the default classpath of 
> the module, which makes it much easier.






[jira] [Commented] (LUCENE-4753) Make forbidden API checks per-module

2013-11-11 Thread Uwe Schindler (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13818933#comment-13818933
 ] 

Uwe Schindler commented on LUCENE-4753:
---

Hi,

which build target did you call, and from where? The outdated target 
"install-forbidden-apis" no longer exists (it was renamed). It looks like you 
have a checkout with mixed svn revisions, or you have changed some build.xml 
files yourself and they conflicted.

Be sure to:
# revert all changes (make sure you save your changes in a diff before doing 
this)
# update your checkout from the root folder (where the lucene, dev-tools, and 
solr subfolders are visible); updating only the solr or lucene subfolder leads 
to inconsistency, as dependencies inside Ant no longer work
# if nothing helps, use a fresh checkout and try again; you can apply the patch 
from step 1 to restore your changes

Jenkins already verified that everything is fine, and I cannot find any 
problems either: I can call "ant compile/test/check-forbidden-apis/..." from 
everywhere and it works.
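A sketch of the steps above as shell commands (run from the checkout root; file names are placeholders):

{noformat}
svn diff > my-changes.patch      # step 1: save local changes,
svn revert -R .                  #         then revert them
svn up                           # step 2: update lucene, dev-tools and solr together
# step 3: if nothing helps, start from a fresh checkout and re-apply the patch
svn checkout https://svn.apache.org/repos/asf/lucene/dev/trunk lucene-trunk
{noformat}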

> Make forbidden API checks per-module
> 
>
> Key: LUCENE-4753
> URL: https://issues.apache.org/jira/browse/LUCENE-4753
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: general/build
>Reporter: Uwe Schindler
>Assignee: Uwe Schindler
>Priority: Critical
> Fix For: 4.6
>
> Attachments: LUCENE-4753.patch, LUCENE-4753.patch, LUCENE-4753.patch
>
>
> After the forbidden API checker was released separately from Lucene as a 
> Google Code project (forked and improved), including Maven support, the 
> checks on Lucene should be changed to work per-module.
> The reason for this is: The improved checker is more picky about e.g. 
> extending classes that are forbidden or overriding methods and calling 
> super.method() if they are on the forbidden signatures list. For these 
> checks, it is not enough to have the class files and the rt.jar, you need the 
> whole classpath. The forbidden APIs 1.0 now by default complains if classes 
> are missing from the classpath.
> With the module architecture of Lucene/Solr it is very hard to make an 
> uber-classpath; instead, the checks should be done per module, so the default 
> compile/test classpath of the module can be used and no crazy path statements 
> with **/*.jar are needed. This needs some refactoring in the exclusion lists, 
> but the Lucene checks could be done by a macro in common-build that allows 
> custom exclusion lists for specific modules.
> Currently, the "strict" checking is disabled for Solr, so the checker only 
> complains about missing classes but does not fail the build:
> {noformat}
> -check-forbidden-java-apis:
> [forbidden-apis] Reading bundled API signatures: jdk-unsafe-1.6
> [forbidden-apis] Reading bundled API signatures: jdk-deprecated-1.6
> [forbidden-apis] Reading bundled API signatures: commons-io-unsafe-2.1
> [forbidden-apis] Reading API signatures: C:\Users\Uwe 
> Schindler\Projects\lucene\trunk-lusolr3\lucene\tools\forbiddenApis\executors.txt
> [forbidden-apis] Reading API signatures: C:\Users\Uwe 
> Schindler\Projects\lucene\trunk-lusolr3\lucene\tools\forbiddenApis\servlet-api.txt
> [forbidden-apis] Loading classes to check...
> [forbidden-apis] Scanning for API signatures and dependencies...
> [forbidden-apis] WARNING: The referenced class 
> 'org.apache.lucene.analysis.uima.ae.AEProviderFactory' cannot be loaded. 
> Please fix the classpath!
> [forbidden-apis] WARNING: The referenced class 
> 'org.apache.lucene.analysis.uima.ae.AEProviderFactory' cannot be loaded. 
> Please fix the classpath!
> [forbidden-apis] WARNING: The referenced class 
> 'org.apache.lucene.analysis.uima.ae.AEProvider' cannot be loaded. Please fix 
> the classpath!
> [forbidden-apis] WARNING: The referenced class 
> 'org.apache.lucene.collation.ICUCollationKeyAnalyzer' cannot be loaded. 
> Please fix the classpath!
> [forbidden-apis] Scanned 2177 (and 1222 related) class file(s) for forbidden 
> API invocations (in 1.80s), 0 error(s).
> {noformat}
> I added almost all missing jars, but those do not seem to be in the solr part 
> of the source tree (I think they are only copied when building artifacts). 
> By making the whole thing per module, we can use the default classpath of 
> the module, which makes it much easier.






[jira] [Commented] (LUCENE-5338) Let classifiers filter unlabeled documents during training

2013-11-11 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13818938#comment-13818938
 ] 

ASF subversion and git services commented on LUCENE-5338:
-

Commit 1540703 from [~teofili] in branch 'dev/trunk'
[ https://svn.apache.org/r1540703 ]

LUCENE-5338 - avoid considering unlabeled documents for training

> Let classifiers filter unlabeled documents during training
> --
>
> Key: LUCENE-5338
> URL: https://issues.apache.org/jira/browse/LUCENE-5338
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: modules/classification
>Affects Versions: 4.5
>Reporter: Tommaso Teofili
>Assignee: Tommaso Teofili
> Fix For: 4.6, 5.0
>
>
> Only labeled documents (those with a 'class' field) should be used during 
> training; each classifier should therefore filter out unlabeled documents 
> (rather than fail when run against them).






[jira] [Commented] (LUCENE-4753) Make forbidden API checks per-module

2013-11-11 Thread Markus Jelsma (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13818947#comment-13818947
 ] 

Markus Jelsma commented on LUCENE-4753:
---

"ant example" from /solr. I svn-upped my checkout not long ago and got no 
updated build.xml. I upped again and finally received your commit; svn must 
have been behind.
Thanks

> Make forbidden API checks per-module
> 
>
> Key: LUCENE-4753
> URL: https://issues.apache.org/jira/browse/LUCENE-4753
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: general/build
>Reporter: Uwe Schindler
>Assignee: Uwe Schindler
>Priority: Critical
> Fix For: 4.6
>
> Attachments: LUCENE-4753.patch, LUCENE-4753.patch, LUCENE-4753.patch
>
>
> After the forbidden API checker was released separately from Lucene as a 
> Google Code project (forked and improved), including Maven support, the 
> checks on Lucene should be changed to work per-module.
> The reason for this is: The improved checker is more picky about e.g. 
> extending classes that are forbidden or overriding methods and calling 
> super.method() if they are on the forbidden signatures list. For these 
> checks, it is not enough to have the class files and the rt.jar, you need the 
> whole classpath. The forbidden APIs 1.0 now by default complains if classes 
> are missing from the classpath.
> With the module architecture of Lucene/Solr it is very hard to make an 
> uber-classpath; instead, the checks should be done per module, so the default 
> compile/test classpath of the module can be used and no crazy path statements 
> with **/*.jar are needed. This needs some refactoring in the exclusion lists, 
> but the Lucene checks could be done by a macro in common-build that allows 
> custom exclusion lists for specific modules.
> Currently, the "strict" checking is disabled for Solr, so the checker only 
> complains about missing classes but does not fail the build:
> {noformat}
> -check-forbidden-java-apis:
> [forbidden-apis] Reading bundled API signatures: jdk-unsafe-1.6
> [forbidden-apis] Reading bundled API signatures: jdk-deprecated-1.6
> [forbidden-apis] Reading bundled API signatures: commons-io-unsafe-2.1
> [forbidden-apis] Reading API signatures: C:\Users\Uwe 
> Schindler\Projects\lucene\trunk-lusolr3\lucene\tools\forbiddenApis\executors.txt
> [forbidden-apis] Reading API signatures: C:\Users\Uwe 
> Schindler\Projects\lucene\trunk-lusolr3\lucene\tools\forbiddenApis\servlet-api.txt
> [forbidden-apis] Loading classes to check...
> [forbidden-apis] Scanning for API signatures and dependencies...
> [forbidden-apis] WARNING: The referenced class 
> 'org.apache.lucene.analysis.uima.ae.AEProviderFactory' cannot be loaded. 
> Please fix the classpath!
> [forbidden-apis] WARNING: The referenced class 
> 'org.apache.lucene.analysis.uima.ae.AEProviderFactory' cannot be loaded. 
> Please fix the classpath!
> [forbidden-apis] WARNING: The referenced class 
> 'org.apache.lucene.analysis.uima.ae.AEProvider' cannot be loaded. Please fix 
> the classpath!
> [forbidden-apis] WARNING: The referenced class 
> 'org.apache.lucene.collation.ICUCollationKeyAnalyzer' cannot be loaded. 
> Please fix the classpath!
> [forbidden-apis] Scanned 2177 (and 1222 related) class file(s) for forbidden 
> API invocations (in 1.80s), 0 error(s).
> {noformat}
> I added almost all missing jars, but those do not seem to be in the solr part 
> of the source tree (I think they are only copied when building artifacts). 
> By making the whole thing per module, we can use the default classpath of 
> the module, which makes it much easier.






[jira] [Commented] (LUCENE-5338) Let classifiers filter unlabeled documents during training

2013-11-11 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13818946#comment-13818946
 ] 

ASF subversion and git services commented on LUCENE-5338:
-

Commit 1540706 from [~teofili] in branch 'dev/branches/branch_4x'
[ https://svn.apache.org/r1540706 ]

LUCENE-5338 - backport to branch_4x

> Let classifiers filter unlabeled documents during training
> --
>
> Key: LUCENE-5338
> URL: https://issues.apache.org/jira/browse/LUCENE-5338
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: modules/classification
>Affects Versions: 4.5
>Reporter: Tommaso Teofili
>Assignee: Tommaso Teofili
> Fix For: 4.6, 5.0
>
>
> Only labeled documents (those with a 'class' field) should be used during 
> training; each classifier should therefore filter out unlabeled documents 
> (rather than fail when run against them).






[jira] [Resolved] (LUCENE-5338) Let classifiers filter unlabeled documents during training

2013-11-11 Thread Tommaso Teofili (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-5338?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tommaso Teofili resolved LUCENE-5338.
-

Resolution: Fixed

> Let classifiers filter unlabeled documents during training
> --
>
> Key: LUCENE-5338
> URL: https://issues.apache.org/jira/browse/LUCENE-5338
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: modules/classification
>Affects Versions: 4.5
>Reporter: Tommaso Teofili
>Assignee: Tommaso Teofili
> Fix For: 4.6, 5.0
>
>
> Only labeled documents (those with a 'class' field) should be used during 
> training; each classifier should therefore filter out unlabeled documents 
> (rather than fail when run against them).






[jira] [Commented] (LUCENE-4753) Make forbidden API checks per-module

2013-11-11 Thread Uwe Schindler (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13818970#comment-13818970
 ] 

Uwe Schindler commented on LUCENE-4753:
---

It is still strange, because the whole change was one single commit, so you 
would have either none or all of the changes. From your error message, it looks 
as if you only updated the lucene folder and not solr, because this old target 
only existed in lucene/common-build.xml; once that file was updated, solr would 
no longer find it under the old name.

> Make forbidden API checks per-module
> 
>
> Key: LUCENE-4753
> URL: https://issues.apache.org/jira/browse/LUCENE-4753
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: general/build
>Reporter: Uwe Schindler
>Assignee: Uwe Schindler
>Priority: Critical
> Fix For: 4.6
>
> Attachments: LUCENE-4753.patch, LUCENE-4753.patch, LUCENE-4753.patch
>
>
> After the forbidden API checker was released separately from Lucene as a 
> Google Code project (forked and improved), including Maven support, the 
> checks on Lucene should be changed to work per-module.
> The reason for this is: The improved checker is more picky about e.g. 
> extending classes that are forbidden or overriding methods and calling 
> super.method() if they are on the forbidden signatures list. For these 
> checks, it is not enough to have the class files and the rt.jar, you need the 
> whole classpath. The forbidden APIs 1.0 now by default complains if classes 
> are missing from the classpath.
> With the module architecture of Lucene/Solr it is very hard to make an 
> uber-classpath; instead, the checks should be done per module, so the default 
> compile/test classpath of the module can be used and no crazy path statements 
> with **/*.jar are needed. This needs some refactoring in the exclusion lists, 
> but the Lucene checks could be done by a macro in common-build that allows 
> custom exclusion lists for specific modules.
> Currently, the "strict" checking is disabled for Solr, so the checker only 
> complains about missing classes but does not fail the build:
> {noformat}
> -check-forbidden-java-apis:
> [forbidden-apis] Reading bundled API signatures: jdk-unsafe-1.6
> [forbidden-apis] Reading bundled API signatures: jdk-deprecated-1.6
> [forbidden-apis] Reading bundled API signatures: commons-io-unsafe-2.1
> [forbidden-apis] Reading API signatures: C:\Users\Uwe 
> Schindler\Projects\lucene\trunk-lusolr3\lucene\tools\forbiddenApis\executors.txt
> [forbidden-apis] Reading API signatures: C:\Users\Uwe 
> Schindler\Projects\lucene\trunk-lusolr3\lucene\tools\forbiddenApis\servlet-api.txt
> [forbidden-apis] Loading classes to check...
> [forbidden-apis] Scanning for API signatures and dependencies...
> [forbidden-apis] WARNING: The referenced class 
> 'org.apache.lucene.analysis.uima.ae.AEProviderFactory' cannot be loaded. 
> Please fix the classpath!
> [forbidden-apis] WARNING: The referenced class 
> 'org.apache.lucene.analysis.uima.ae.AEProviderFactory' cannot be loaded. 
> Please fix the classpath!
> [forbidden-apis] WARNING: The referenced class 
> 'org.apache.lucene.analysis.uima.ae.AEProvider' cannot be loaded. Please fix 
> the classpath!
> [forbidden-apis] WARNING: The referenced class 
> 'org.apache.lucene.collation.ICUCollationKeyAnalyzer' cannot be loaded. 
> Please fix the classpath!
> [forbidden-apis] Scanned 2177 (and 1222 related) class file(s) for forbidden 
> API invocations (in 1.80s), 0 error(s).
> {noformat}
> I added almost all missing jars, but those do not seem to be in the solr part 
> of the source tree (I think they are only copied when building artifacts). 
> By making the whole thing per module, we can use the default classpath of 
> the module, which makes it much easier.






[jira] [Commented] (LUCENE-5337) Add Payload support to FileDictionary (Suggest) and make it more configurable

2013-11-11 Thread Erick Erickson (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5337?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13818969#comment-13818969
 ] 

Erick Erickson commented on LUCENE-5337:


Areek:

There are a couple of problems with this patch...

1> It won't compile on 4.x since it uses some Java 7 constructs;
I stopped at the "diamond" bit. Unless this is intended for trunk only, could 
you fix these?

2> On trunk, running "ant precommit" shows the following errors. These are 
pretty easy to fix; it just takes specifying the UTF-8 charset, as I remember 
(see the sketch after the log). They're all in the test code, but still...

[forbidden-apis] Reading bundled API signatures: jdk-unsafe-1.7
[forbidden-apis] Reading bundled API signatures: jdk-deprecated-1.7
[forbidden-apis] Reading API signatures: 
/Users/Erick/apache/trunk_5337/lucene/tools/forbiddenApis/base.txt
[forbidden-apis] Loading classes to check...
[forbidden-apis] Scanning for API signatures and dependencies...
[forbidden-apis] Forbidden method invocation: java.lang.String#getBytes() [Uses 
default charset]
[forbidden-apis]   in org.apache.lucene.search.suggest.FileDictionaryTest 
(FileDictionaryTest.java:76)
[forbidden-apis] Forbidden method invocation: java.lang.String#getBytes() [Uses 
default charset]
[forbidden-apis]   in org.apache.lucene.search.suggest.FileDictionaryTest 
(FileDictionaryTest.java:98)
[forbidden-apis] Forbidden method invocation: java.lang.String#getBytes() [Uses 
default charset]
[forbidden-apis]   in org.apache.lucene.search.suggest.FileDictionaryTest 
(FileDictionaryTest.java:120)
[forbidden-apis] Forbidden method invocation: java.lang.String#getBytes() [Uses 
default charset]
[forbidden-apis]   in org.apache.lucene.search.suggest.FileDictionaryTest 
(FileDictionaryTest.java:146)
[forbidden-apis] Forbidden method invocation: java.lang.String#getBytes() [Uses 
default charset]
[forbidden-apis]   in org.apache.lucene.search.suggest.FileDictionaryTest 
(FileDictionaryTest.java:173)
[forbidden-apis] Scanned 179 (and 405 related) class file(s) for forbidden API 
invocations (in 0.10s), 5 error(s).
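A minimal sketch of the fix the checker is asking for (the string literal and variable names here are illustrative; on branch_4x, which is Java 6, "UTF-8" or Lucene's IOUtils.CHARSET_UTF_8 would stand in for StandardCharsets):

{code}
import java.nio.charset.StandardCharsets;

byte[] bad  = "some suggestion".getBytes();                       // forbidden: uses the platform default charset
byte[] good = "some suggestion".getBytes(StandardCharsets.UTF_8); // fixed: charset stated explicitly
{code}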


I can take care of the secretarial stuff here and get this committed. I glanced 
over the code, but I don't know the area deeply enough to make any deeper 
comments; anyone want to chime in on that score?

> Add Payload support to FileDictionary (Suggest) and make it more configurable
> -
>
> Key: LUCENE-5337
> URL: https://issues.apache.org/jira/browse/LUCENE-5337
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: core/search
>Reporter: Areek Zillur
> Attachments: LUCENE-5337.patch
>
>
> It would be nice to add payload support to FileDictionary, so users can pass 
> in an associated payload with each suggestion entry. 
> Currently FileDictionary has a hard-coded field delimiter (TAB); it would 
> be nice to let users configure the field delimiter as well.






[jira] [Commented] (LUCENE-4753) Make forbidden API checks per-module

2013-11-11 Thread Markus Jelsma (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13818976#comment-13818976
 ] 

Markus Jelsma commented on LUCENE-4753:
---

I usually update both, but perhaps I didn't this time, or I didn't get all 
commits; the latter sometimes happens and then I have to update twice.
It's fixed now.

> Make forbidden API checks per-module
> 
>
> Key: LUCENE-4753
> URL: https://issues.apache.org/jira/browse/LUCENE-4753
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: general/build
>Reporter: Uwe Schindler
>Assignee: Uwe Schindler
>Priority: Critical
> Fix For: 4.6
>
> Attachments: LUCENE-4753.patch, LUCENE-4753.patch, LUCENE-4753.patch
>
>
> After the forbidden API checker was released separately from Lucene as a 
> Google Code project (forked and improved), including Maven support, the 
> checks on Lucene should be changed to work per-module.
> The reason for this is: The improved checker is more picky about e.g. 
> extending classes that are forbidden or overriding methods and calling 
> super.method() if they are on the forbidden signatures list. For these 
> checks, it is not enough to have the class files and the rt.jar, you need the 
> whole classpath. The forbidden APIs 1.0 now by default complains if classes 
> are missing from the classpath.
> With the module architecture of Lucene/Solr it is very hard to make an 
> uber-classpath; instead, the checks should be done per module, so the default 
> compile/test classpath of the module can be used and no crazy path statements 
> with **/*.jar are needed. This needs some refactoring in the exclusion lists, 
> but the Lucene checks could be done by a macro in common-build that allows 
> custom exclusion lists for specific modules.
> Currently, the "strict" checking is disabled for Solr, so the checker only 
> complains about missing classes but does not fail the build:
> {noformat}
> -check-forbidden-java-apis:
> [forbidden-apis] Reading bundled API signatures: jdk-unsafe-1.6
> [forbidden-apis] Reading bundled API signatures: jdk-deprecated-1.6
> [forbidden-apis] Reading bundled API signatures: commons-io-unsafe-2.1
> [forbidden-apis] Reading API signatures: C:\Users\Uwe 
> Schindler\Projects\lucene\trunk-lusolr3\lucene\tools\forbiddenApis\executors.txt
> [forbidden-apis] Reading API signatures: C:\Users\Uwe 
> Schindler\Projects\lucene\trunk-lusolr3\lucene\tools\forbiddenApis\servlet-api.txt
> [forbidden-apis] Loading classes to check...
> [forbidden-apis] Scanning for API signatures and dependencies...
> [forbidden-apis] WARNING: The referenced class 
> 'org.apache.lucene.analysis.uima.ae.AEProviderFactory' cannot be loaded. 
> Please fix the classpath!
> [forbidden-apis] WARNING: The referenced class 
> 'org.apache.lucene.analysis.uima.ae.AEProviderFactory' cannot be loaded. 
> Please fix the classpath!
> [forbidden-apis] WARNING: The referenced class 
> 'org.apache.lucene.analysis.uima.ae.AEProvider' cannot be loaded. Please fix 
> the classpath!
> [forbidden-apis] WARNING: The referenced class 
> 'org.apache.lucene.collation.ICUCollationKeyAnalyzer' cannot be loaded. 
> Please fix the classpath!
> [forbidden-apis] Scanned 2177 (and 1222 related) class file(s) for forbidden 
> API invocations (in 1.80s), 0 error(s).
> {noformat}
> I added almost all missing jars, but those do not seem to be in the solr part 
> of the source tree (I think they are only copied when building artifacts). 
> By making the whole thing per module, we can use the default classpath of 
> the module, which makes it much easier.






[jira] [Commented] (LUCENE-5337) Add Payload support to FileDictionary (Suggest) and make it more configurable

2013-11-11 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5337?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13818982#comment-13818982
 ] 

Robert Muir commented on LUCENE-5337:
-

{quote}
1> It won't compile in 4x since it uses some Java 7 constructs,
I stopped at the "diamond" bit. Unless this is intended for trunk only, could 
you fix these?
{quote}

Erick: FYI, trunk is on Java 7, so Java 7 syntax is actually welcome there and 
good to use. It's the committer's job to remove such syntax when/if backporting 
to some branch that doesn't support Java 7.
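
For example (an illustrative snippet, not from the patch), here is the Java 7 
"diamond" shorthand next to the Java 6 form a committer would rewrite it to 
when backporting:

{code}
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class DiamondDemo {
  public static void main(String[] args) {
    // Java 7: type arguments on the right-hand side are inferred.
    List<String> suggestions = new ArrayList<>();
    Map<String, Long> payloads = new HashMap<>();

    // Java 6 equivalent, spelled out for a backport:
    List<String> suggestions6 = new ArrayList<String>();
    Map<String, Long> payloads6 = new HashMap<String, Long>();

    suggestions.add("lucene");
    payloads.put("lucene", 1L);
    suggestions6.addAll(suggestions);
    payloads6.putAll(payloads);
    System.out.println(suggestions6 + " " + payloads6);
  }
}
{code}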

> Add Payload support to FileDictionary (Suggest) and make it more configurable
> -
>
> Key: LUCENE-5337
> URL: https://issues.apache.org/jira/browse/LUCENE-5337
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: core/search
>Reporter: Areek Zillur
> Attachments: LUCENE-5337.patch
>
>
> It would be nice to add payload support to FileDictionary, so user can pass 
> in associated payload with suggestion entries. 
> Currently the FileDictionary has a hard-coded field-delimiter (TAB), it would 
> be nice to let the users configure the field delimiter as well.



--
This message was sent by Atlassian JIRA
(v6.1#6144)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-5205) MoreLikeThis doesn't escape shard queries

2013-11-11 Thread Markus Jelsma (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13819006#comment-13819006
 ] 

Markus Jelsma commented on SOLR-5205:
-

Shawn, I hadn't noticed bad performance for distributed MLT until just now. It 
looks like in a 5-shard cluster it fires about 12 queries, most of which are 
really slow. Doing MLT on a non-distributed node with a large number of 
documents is lightning fast! Many of the queries fired are distrib=true.

> MoreLikeThis doesn't escape shard queries
> -
>
> Key: SOLR-5205
> URL: https://issues.apache.org/jira/browse/SOLR-5205
> Project: Solr
>  Issue Type: Bug
>  Components: MoreLikeThis
>Affects Versions: 4.4
>Reporter: Markus Jelsma
> Fix For: 4.6
>
> Attachments: SOLR-5205-trunk.patch, SOLR-5205.patch
>
>
> MoreLikeThis does not support Lucene special characters as ID in distributed 
> search. ID's containing special characters such as URL's need to be escaped 
> in the first place. They are then unescaped and get sent to shards in an 
> unescaped form, causing the org.apache.solr.search.SyntaxError exception.



--
This message was sent by Atlassian JIRA
(v6.1#6144)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[JENKINS-MAVEN] Lucene-Solr-Maven-4.x #503: POMs out of sync

2013-11-11 Thread Apache Jenkins Server
Build: https://builds.apache.org/job/Lucene-Solr-Maven-4.x/503/

No tests ran.

Build Log:
[...truncated 23667 lines...]



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

[jira] [Closed] (SOLR-5427) SolrCloud leaking (many) filehandles to deleted files

2013-11-11 Thread Eric Bus (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-5427?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Bus closed SOLR-5427.
--

Resolution: Not A Problem

This problem seems to be related to running SOLR inside a Tomcat server. I 
switched to the bundled Jetty, and the problems are gone: no open file handles 
to deleted files after running the server for about 2 days. Normally, the 
first open file handles would appear within a few minutes or hours.

> SolrCloud leaking (many) filehandles to deleted files
> -
>
> Key: SOLR-5427
> URL: https://issues.apache.org/jira/browse/SOLR-5427
> Project: Solr
>  Issue Type: Bug
>  Components: SolrCloud
>Affects Versions: 4.3, 4.4, 4.5
> Environment: Debian Linux 6.0 running on VMWare
> Tomcat 6
>Reporter: Eric Bus
>
> I'm running SolrCloud on three nodes. I've been experiencing strange problems 
> on these nodes. The main problem is that my disk is filling up, because old 
> tlog files are not being released by SOLR.
> I suspect this problem is caused by a lot of open connections between the 
> nodes in CLOSE_WAIT status. After running a node for only 2 days, the node 
> already has 33 connections and about 11,000 deleted files that are still open.
> I'm running about 100 cores on each node. Could this be causing the rate at 
> which things are going wrong? I suspect that on a setup with only 1 
> collection and 3 shards, the problem stays hidden for quite some time.
> lsof -p 15452 -n | grep -i tcp | grep CLOSE_WAIT
> java15452 root   45u  IPv6   706925770t0  TCP 
> 11.1.0.12:46533->11.1.0.13:http-alt (CLOSE_WAIT)
> java15452 root   48u  IPv6   706925790t0  TCP 
> 11.1.0.12:46535->11.1.0.13:http-alt (CLOSE_WAIT)
> java15452 root  205u  IPv6   727594340t0  TCP 
> 11.1.0.12:41744->11.1.0.12:http-alt (CLOSE_WAIT)
> java15452 root  378u  IPv6   723591150t0  TCP 
> 11.1.0.12:44767->11.1.0.11:http-alt (CLOSE_WAIT)
> java15452 root  381u  IPv6   723591160t0  TCP 
> 11.1.0.12:44768->11.1.0.11:http-alt (CLOSE_WAIT)
> java15452 root 5252u  IPv6   727594450t0  TCP 
> 11.1.0.12:41751->11.1.0.12:http-alt (CLOSE_WAIT)
> java15452 root 6193u  IPv6   740216510t0  TCP 
> 11.1.0.12:39170->11.1.0.11:http-alt (CLOSE_WAIT)
> java15452 root *150u  IPv6   740216480t0  TCP 
> 11.1.0.12:53865->11.1.0.13:http-alt (CLOSE_WAIT)
> java15452 root *152u  IPv6   727594240t0  TCP 
> 11.1.0.12:41737->11.1.0.12:http-alt (CLOSE_WAIT)
> java15452 root *526u  IPv6   740279950t0  TCP 
> 11.1.0.12:53965->11.1.0.13:http-alt (CLOSE_WAIT)
> java15452 root *986u  IPv6   727686370t0  TCP 
> 11.1.0.12:42246->11.1.0.12:http-alt (CLOSE_WAIT)
> java15452 root *626u  IPv6   727499830t0  TCP 
> 11.1.0.12:41297->11.1.0.12:http-alt (CLOSE_WAIT)
> java15452 root *476u  IPv6   727686330t0  TCP 
> 11.1.0.12:42243->11.1.0.12:http-alt (CLOSE_WAIT)
> java15452 root *567u  IPv6   727686220t0  TCP 
> 11.1.0.12:42234->11.1.0.12:http-alt (CLOSE_WAIT)
> java15452 root *732u  IPv6   727685990t0  TCP 
> 11.1.0.12:42230->11.1.0.12:http-alt (CLOSE_WAIT)
> java15452 root *799u  IPv6   727594270t0  TCP 
> 11.1.0.12:41739->11.1.0.12:http-alt (CLOSE_WAIT)
> java15452 root *259u  IPv6   727686260t0  TCP 
> 11.1.0.12:42237->11.1.0.12:http-alt (CLOSE_WAIT)
> java15452 root *272u  IPv6   727689970t0  TCP 
> 11.1.0.12:42263->11.1.0.12:http-alt (CLOSE_WAIT)
> java15452 root *493u  IPv6   727594070t0  TCP 
> 11.1.0.12:41729->11.1.0.12:http-alt (CLOSE_WAIT)
> java15452 root *693u  IPv6   740209090t0  TCP 
> 11.1.0.12:53853->11.1.0.13:http-alt (CLOSE_WAIT)
> java15452 root *740u  IPv6   727499960t0  TCP 
> 11.1.0.12:41306->11.1.0.12:http-alt (CLOSE_WAIT)
> java15452 root *749u  IPv6   739752300t0  TCP 
> 11.1.0.12:38825->11.1.0.11:http-alt (CLOSE_WAIT)
> java15452 root *750u  IPv6   739746190t0  TCP 
> 11.1.0.12:53499->11.1.0.13:http-alt (CLOSE_WAIT)
> java15452 root *771u  IPv6   727594200t0  TCP 
> 11.1.0.12:41734->11.1.0.12:http-alt (CLOSE_WAIT)
> java15452 root *793u  IPv6   727686530t0  TCP 
> 11.1.0.12:42256->11.1.0.12:http-alt (CLOSE_WAIT)
> java15452 root *900u  IPv6   727686180t0  TCP 
> 11.1.0.12:42233->11.1.0.12:http-alt (CLOSE_WAIT)
> java15452 root *045u  IPv6   727664770t0  TCP 
> 11.1.0.12:41181->11.1.0.11:http-alt (CLOSE_WAIT)
> jav

[jira] [Commented] (LUCENE-5337) Add Payload support to FileDictionary (Suggest) and make it more configurable

2013-11-11 Thread Erick Erickson (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5337?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13819038#comment-13819038
 ] 

Erick Erickson commented on LUCENE-5337:


Robert:

OK, I didn't realize that was the case; I was guessing that there'd be some
kind of cutover point, probably where we decided to move development pretty
much to trunk and start backporting fewer JIRAs...

That said, things like diamonds etc. are trivial; I'm quite willing to do
those. More complicated things I'll be willing to do on a case-by-case
basis, depending on how adventurous I'm feeling at the time and
how complex the code rearrangement looks.

Erick





> Add Payload support to FileDictionary (Suggest) and make it more configurable
> -
>
> Key: LUCENE-5337
> URL: https://issues.apache.org/jira/browse/LUCENE-5337
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: core/search
>Reporter: Areek Zillur
> Attachments: LUCENE-5337.patch
>
>
> It would be nice to add payload support to FileDictionary, so user can pass 
> in associated payload with suggestion entries. 
> Currently the FileDictionary has a hard-coded field-delimiter (TAB), it would 
> be nice to let the users configure the field delimiter as well.



--
This message was sent by Atlassian JIRA
(v6.1#6144)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: Estimating peak memory use for UnInvertedField faceting

2013-11-11 Thread Tom Burton-West
Thanks Otis,

I'm looking forward to the presentation videos.

I'll look into using DocValues. Re-indexing 200 million docs will take a
while, though :).
Will Solr automatically use DocValues for faceting if you have DocValues
for the field or is there some configuration or parameter that needs to be
set?

Tom


On Sat, Nov 9, 2013 at 9:57 AM, Otis Gospodnetic  wrote:

> Hi Tom,
>
> Check http://blog.sematext.com/2013/11/09/presentation-solr-for-analytics/
> .  It includes info about our experiment with DocValues, which clearly
> shows lower heap usage, which means you'll get further without getting
> this OOM.  In our experiments we didn't sort, facet, or group, and I
> see you are faceting, which means that DocValues, which are more
> efficient than FieldCache, should help you even more than it helped
> us.
>
> The graphs are from SPM, which you could use to monitor your Solr
> cluster, at least while you are tuning it.
>
> Otis
> --
> Performance Monitoring * Log Analytics * Search Analytics
> Solr & Elasticsearch Support * http://sematext.com/
>
>
> On Fri, Nov 8, 2013 at 2:41 PM, Tom Burton-West 
> wrote:
> > Hi Yonik,
> >
> > I don't know enough about JVM tuning and monitoring to do this in a clean
> > way, so I just tried setting the max heap at 8GB and then 6GB to force
> > garbage collection. With it set to 6GB it goes into a long GC loop and
> > then runs out of heap (see below). The stack trace says the issue is with
> > DocTermOrds.uninvert:
> > Caused by: java.lang.OutOfMemoryError: GC overhead limit exceeded
> > at org.apache.lucene.index.DocTermOrds.uninvert(DocTermOrds.java:405)
> >
> >  I'm guessing the actual peak is somewhere between 6 and 8 GB.
> >
> > BTW: is there some documentation somewhere that explains what the stats
> > output to INFO mean?
> >
> > Tom
> >
> >
> > java.lang.OutOfMemoryError: GC overhead limit exceeded</str><str
> > name="trace">java.lang.RuntimeException: java.lang.OutOfMemoryError: GC
> > overhead limit exceeded
> > at
> >
> org.apache.solr.servlet.SolrDispatchFilter.sendError(SolrDispatchFilter.java:653)
> > at
> >
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:366)
> > at
> >
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:141)
> > at
> >
> org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:215)
> > at
> >
> org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:188)
> > at
> >
> org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:213)
> > at
> >
> org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:172)
> > at
> org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:548)
> > at
> >
> org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127)
> > at
> >
> org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:117)
> > at
> >
> org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:108)
> > at
> >
> org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:174)
> > at
> >
> org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:875)
> > at
> >
> org.apache.coyote.http11.Http11BaseProtocol$Http11ConnectionHandler.processConnection(Http11BaseProtocol.java:665)
> > at
> >
> org.apache.tomcat.util.net.PoolTcpEndpoint.processSocket(PoolTcpEndpoint.java:528)
> > at
> >
> org.apache.tomcat.util.net.LeaderFollowerWorkerThread.runIt(LeaderFollowerWorkerThread.java:81)
> > at
> >
> org.apache.tomcat.util.threads.ThreadPool$ControlRunnable.run(ThreadPool.java:689)
> > at java.lang.Thread.run(Thread.java:724)
> > Caused by: java.lang.OutOfMemoryError: GC overhead limit exceeded
> > at org.apache.lucene.index.DocTermOrds.uninvert(DocTermOrds.java:405)
> > at
> org.apache.solr.request.UnInvertedField.(UnInvertedField.java:179)
> > at
> >
> org.apache.solr.request.UnInvertedField.getUnInvertedField(UnInvertedField.java:664)
> > at
> org.apache.solr.request.SimpleFacets.getTermCounts(SimpleFacets.java:426)
> > at
> >
> org.apache.solr.request.SimpleFacets.getFacetFieldCounts(SimpleFacets.java:517)
> > at
> >
> org.apache.solr.request.SimpleFacets.getFacetCounts(SimpleFacets.java:252)
> > at
> >
> org.apache.solr.handler.component.FacetComponent.process(FacetComponent.java:78)
> > at
> >
> org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:208)
> > at
> >
> org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
> > at org.apache.solr.core.SolrCore.execute(SolrCore.java:1817)
> > at
> >
> org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:639)
> > at
> >
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:345)
> > ... 16 more
> > 
> >
> > ---
> > Nov 08, 2013 1:39:26 PM org.apache.solr.request.UnInvertedField 
> > INFO: UnInverted multi-valued field {field=topicStr,
> > memSize=1,768,10

[jira] [Commented] (SOLR-5416) CollapsingQParserPlugin bug with Tagging

2013-11-11 Thread David (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13819180#comment-13819180
 ] 

David commented on SOLR-5416:
-

Joel,

I'm currently using my patch; the facet counts are correct and the 
performance is good. We were looking to roll this out in production where I 
work. Would you advise against it? What kind of problems could this cause?

> CollapsingQParserPlugin bug with Tagging
> 
>
> Key: SOLR-5416
> URL: https://issues.apache.org/jira/browse/SOLR-5416
> Project: Solr
>  Issue Type: Bug
>  Components: search
>Affects Versions: 4.6
>Reporter: David
>Assignee: Joel Bernstein
>  Labels: group, grouping
> Fix For: 4.6, 5.0
>
> Attachments: SOLR-5416.patch
>
>   Original Estimate: 48h
>  Remaining Estimate: 48h
>
> Trying to use CollapsingQParserPlugin with facet tagging throws an exception. 
> {code}
>  ModifiableSolrParams params = new ModifiableSolrParams();
> params.add("q", "*:*");
> params.add("fq", "{!collapse field=group_s}");
> params.add("defType", "edismax");
> params.add("bf", "field(test_ti)");
> params.add("fq","{!tag=test_ti}test_ti:5");
> params.add("facet","true");
> params.add("facet.field","{!ex=test_ti}test_ti");
> assertQ(req(params), "*[count(//doc)=1]", 
> "//doc[./int[@name='test_ti']='5']");
> {code}



--
This message was sent by Atlassian JIRA
(v6.1#6144)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (LUCENE-5337) Add Payload support to FileDictionary (Suggest) and make it more configurable

2013-11-11 Thread Areek Zillur (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-5337?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Areek Zillur updated LUCENE-5337:
-

Attachment: LUCENE-5337.patch

Updated Patch:
  - minor changes to fix forbidden api checks and documentation lint

Thanks Erick and Robert for the review. I updated the patch so that it will 
pass the validations. I still have the diamond operators in the test code; 
let me know if there is anything I can do about that.

> Add Payload support to FileDictionary (Suggest) and make it more configurable
> -
>
> Key: LUCENE-5337
> URL: https://issues.apache.org/jira/browse/LUCENE-5337
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: core/search
>Reporter: Areek Zillur
> Attachments: LUCENE-5337.patch, LUCENE-5337.patch
>
>
> It would be nice to add payload support to FileDictionary, so user can pass 
> in associated payload with suggestion entries. 
> Currently the FileDictionary has a hard-coded field-delimiter (TAB), it would 
> be nice to let the users configure the field delimiter as well.



--
This message was sent by Atlassian JIRA
(v6.1#6144)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-5416) CollapsingQParserPlugin bug with Tagging

2013-11-11 Thread Joel Bernstein (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13819224#comment-13819224
 ] 

Joel Bernstein commented on SOLR-5416:
--

David,

Here is the flow as I see it with the patch:
1) The initial search executes and produces a result based on collapsing the 
groups by score.
2) The facet component needs to regenerate the docset because of the 
tag/exclude parameters. But the scorer is not present when regenerating the 
docset, so it is using logic that overwrites the group-head each time. This 
will result in the document found latest in the index becoming the 
group-head for each group.

So, the result set used to calculate the facets will be different from the 
result set used to generate the search results.

To keep these in sync, you would need to have step 2 also collapse based on 
score.
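
A toy standalone illustration of the divergence (hypothetical code, not the 
plugin's actual implementation; the doc ids, groups, and scores are made up):

{code}
import java.util.Map;
import java.util.TreeMap;

public class GroupHeadDemo {
  public static void main(String[] args) {
    int[] docs     = {0, 1, 2, 3};
    String[] group = {"a", "a", "b", "b"};
    float[] score  = {0.9f, 0.2f, 0.1f, 0.8f};

    Map<String, Integer> byScore = new TreeMap<String, Integer>();
    Map<String, Integer> byOrder = new TreeMap<String, Integer>();
    Map<String, Float> best = new TreeMap<String, Float>();

    for (int i = 0; i < docs.length; i++) {
      // With a scorer: the highest-scoring doc per group becomes the head.
      Float b = best.get(group[i]);
      if (b == null || score[i] > b.floatValue()) {
        best.put(group[i], Float.valueOf(score[i]));
        byScore.put(group[i], Integer.valueOf(docs[i]));
      }
      // Without a scorer: each later doc simply overwrites the group-head.
      byOrder.put(group[i], Integer.valueOf(docs[i]));
    }
    System.out.println("search results collapse to: " + byScore); // {a=0, b=3}
    System.out.println("facet docset collapses to:  " + byOrder); // {a=1, b=3}
  }
}
{code}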

> CollapsingQParserPlugin bug with Tagging
> 
>
> Key: SOLR-5416
> URL: https://issues.apache.org/jira/browse/SOLR-5416
> Project: Solr
>  Issue Type: Bug
>  Components: search
>Affects Versions: 4.6
>Reporter: David
>Assignee: Joel Bernstein
>  Labels: group, grouping
> Fix For: 4.6, 5.0
>
> Attachments: SOLR-5416.patch
>
>   Original Estimate: 48h
>  Remaining Estimate: 48h
>
> Trying to use CollapsingQParserPlugin with facet tagging throws an exception. 
> {code}
>  ModifiableSolrParams params = new ModifiableSolrParams();
> params.add("q", "*:*");
> params.add("fq", "{!collapse field=group_s}");
> params.add("defType", "edismax");
> params.add("bf", "field(test_ti)");
> params.add("fq","{!tag=test_ti}test_ti:5");
> params.add("facet","true");
> params.add("facet.field","{!ex=test_ti}test_ti");
> assertQ(req(params), "*[count(//doc)=1]", 
> "//doc[./int[@name='test_ti']='5']");
> {code}



--
This message was sent by Atlassian JIRA
(v6.1#6144)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-5416) CollapsingQParserPlugin bug with Tagging

2013-11-11 Thread David (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13819237#comment-13819237
 ] 

David commented on SOLR-5416:
-

Oh, I see: that won't actually be a problem for me, since all of the documents 
in the group should have the same facet counts. Thanks for the reply. I will 
wait for a fix. But for now, if I understand your reply correctly, I don't 
think that will affect my facet counts in a negative manner.

> CollapsingQParserPlugin bug with Tagging
> 
>
> Key: SOLR-5416
> URL: https://issues.apache.org/jira/browse/SOLR-5416
> Project: Solr
>  Issue Type: Bug
>  Components: search
>Affects Versions: 4.6
>Reporter: David
>Assignee: Joel Bernstein
>  Labels: group, grouping
> Fix For: 4.6, 5.0
>
> Attachments: SOLR-5416.patch
>
>   Original Estimate: 48h
>  Remaining Estimate: 48h
>
> Trying to use CollapsingQParserPlugin with facet tagging throws an exception. 
> {code}
>  ModifiableSolrParams params = new ModifiableSolrParams();
> params.add("q", "*:*");
> params.add("fq", "{!collapse field=group_s}");
> params.add("defType", "edismax");
> params.add("bf", "field(test_ti)");
> params.add("fq","{!tag=test_ti}test_ti:5");
> params.add("facet","true");
> params.add("facet.field","{!ex=test_ti}test_ti");
> assertQ(req(params), "*[count(//doc)=1]", 
> "//doc[./int[@name='test_ti']='5']");
> {code}



--
This message was sent by Atlassian JIRA
(v6.1#6144)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-5416) CollapsingQParserPlugin bug with Tagging

2013-11-11 Thread Joel Bernstein (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13819255#comment-13819255
 ] 

Joel Bernstein commented on SOLR-5416:
--

Yes, if all the documents in the same group have the same facet counts then you 
won't notice this problem.

> CollapsingQParserPlugin bug with Tagging
> 
>
> Key: SOLR-5416
> URL: https://issues.apache.org/jira/browse/SOLR-5416
> Project: Solr
>  Issue Type: Bug
>  Components: search
>Affects Versions: 4.6
>Reporter: David
>Assignee: Joel Bernstein
>  Labels: group, grouping
> Fix For: 4.6, 5.0
>
> Attachments: SOLR-5416.patch
>
>   Original Estimate: 48h
>  Remaining Estimate: 48h
>
> Trying to use CollapsingQParserPlugin with facet tagging throws an exception. 
> {code}
>  ModifiableSolrParams params = new ModifiableSolrParams();
> params.add("q", "*:*");
> params.add("fq", "{!collapse field=group_s}");
> params.add("defType", "edismax");
> params.add("bf", "field(test_ti)");
> params.add("fq","{!tag=test_ti}test_ti:5");
> params.add("facet","true");
> params.add("facet.field","{!ex=test_ti}test_ti");
> assertQ(req(params), "*[count(//doc)=1]", 
> "//doc[./int[@name='test_ti']='5']");
> {code}



--
This message was sent by Atlassian JIRA
(v6.1#6144)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-5336) Add a simple QueryParser to parse human-entered queries.

2013-11-11 Thread Jack Conradson (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13819275#comment-13819275
 ] 

Jack Conradson commented on LUCENE-5336:


Thanks for the feedback.

To answer the malformed input question --

If 
"foo bar
is given as the query, the double quote will be dropped; if whitespace is an 
operator, it will make term queries for both 'foo' and 'bar', otherwise it 
will make a single term query 'foo bar'.
If
foo"bar
is given as the query, the double quote will be dropped, and term queries will 
be made for both 'foo' and 'bar'.

The reason it's done this way is that the parser only backtracks as far as 
the malformed input (in this case the extraneous double quote), so 'foo' would 
already be part of the query tree. This is because only a single pass is made 
for each query. The parser could be changed to do two passes to remove 
extraneous characters, but I believe that would only make the code more 
complex without necessarily interpreting the query any better for a user, 
since the malformed character gives no hint as to what he/she really intended to do.
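
To make the first case concrete, here is a tiny standalone sketch (not the 
parser itself) of the described behavior when whitespace is treated as an 
operator: the stray quote is dropped and the rest is split into terms.

{code}
import java.util.Arrays;

public class BestEffortDemo {
  public static void main(String[] args) {
    for (String q : new String[] {"\"foo bar", "foo\"bar"}) {
      // Drop the quote that never gets closed, then split on whitespace.
      String[] terms = q.replace("\"", " ").trim().split("\\s+");
      System.out.println(q + " -> " + Arrays.toString(terms));
      // "foo bar -> [foo, bar]
      // foo"bar  -> [foo, bar]
    }
  }
}
{code}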

I will try to post another patch today or tomorrow.

I plan to do the following:
* Fix the Javadoc comment
* Add more tests for random operators
* Rename the class to SimpleQueryParser and rename the package to .simple

> Add a simple QueryParser to parse human-entered queries.
> 
>
> Key: LUCENE-5336
> URL: https://issues.apache.org/jira/browse/LUCENE-5336
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Jack Conradson
> Attachments: LUCENE-5336.patch
>
>
> I would like to add a new simple QueryParser to Lucene that is designed to 
> parse human-entered queries.  This parser will operate on an entire entered 
> query using a specified single field or a set of weighted fields (using term 
> boost).
> All features/operations in this parser can be enabled or disabled depending 
> on what is necessary for the user.  A default operator may be specified as 
> either 'MUST' representing 'and' or 'SHOULD' representing 'or.'  The 
> features/operations that this parser will include are the following:
> * AND specified as '+'
> * OR specified as '|'
> * NOT specified as '-'
> * PHRASE surrounded by double quotes
> * PREFIX specified as '*'
> * PRECEDENCE surrounded by '(' and ')'
> * WHITESPACE specified as ' ' '\n' '\r' and '\t' will cause the default 
> operator to be used
> * ESCAPE specified as '\' will allow operators to be used in terms
> The key differences between this parser and other existing parsers will be 
> the following:
> * No exceptions will be thrown, and errors in syntax will be ignored.  The 
> parser will do a best-effort interpretation of any query entered.
> * It uses minimal syntax to express queries.  All available operators are 
> single characters or pairs of single characters.
> * The parser is hand-written and in a single Java file making it easy to 
> modify.



--
This message was sent by Atlassian JIRA
(v6.1#6144)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (LUCENE-5322) Clean up / simplify Maven-related Ant targets

2013-11-11 Thread Steve Rowe (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-5322?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Rowe updated LUCENE-5322:
---

Attachment: LUCENE-5322.validate-maven-artifacts.patch

Currently {{validate-maven-artifacts}} invokes {{filter-pom-templates}} once 
per POM, which is way too much; also, {{validate-maven-artifacts}} depends on 
{{generate-maven-artifacts}}, even though it only needs the filtered POMs, and 
not the built artifacts.  

This patch fixes both issues.

Committing shortly.

> Clean up / simplify Maven-related Ant targets
> -
>
> Key: LUCENE-5322
> URL: https://issues.apache.org/jira/browse/LUCENE-5322
> Project: Lucene - Core
>  Issue Type: Task
>  Components: general/build
>Reporter: Steve Rowe
>Assignee: Steve Rowe
>Priority: Minor
> Fix For: 4.6, 5.0
>
> Attachments: LUCENE-5322.patch, 
> LUCENE-5322.validate-maven-artifacts.patch
>
>
> Many Maven-related Ant targets are public when they don't need to be, e.g. 
> dist-maven and filter-pom-templates, m2-deploy-lucene-parent-pom, etc.
> The arrangement of these targets could be simplified if the directories that 
> have public entry points were minimized.
> generate-maven-artifacts should be runnable from the top level and from 
> lucene/ and solr/. 



--
This message was sent by Atlassian JIRA
(v6.1#6144)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Comment Edited] (LUCENE-5322) Clean up / simplify Maven-related Ant targets

2013-11-11 Thread Steve Rowe (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13819327#comment-13819327
 ] 

Steve Rowe edited comment on LUCENE-5322 at 11/11/13 8:19 PM:
--

Currently {{validate-maven-artifacts}} invokes {{filter-pom-templates}} once 
per POM, which is way too much; also, {{validate-maven-artifacts}} depends on 
{{generate-maven-artifacts}}, even though it only needs the filtered POMs, and 
not the built artifacts.  

This patch fixes both issues.

Committing shortly.


was (Author: steve_rowe):
Currently {{validate-maven-artifacts}} invokes {{filter-pom-templates}} once 
per POM, which is way too much; also, {{validate-maven-artifacts}} depends on 
{{generate-maven-artifacts}}, even though it only needs to filtered POMs, and 
not the built artifacts.  

This patch fixes both issues.

Committing shortly.

> Clean up / simplify Maven-related Ant targets
> -
>
> Key: LUCENE-5322
> URL: https://issues.apache.org/jira/browse/LUCENE-5322
> Project: Lucene - Core
>  Issue Type: Task
>  Components: general/build
>Reporter: Steve Rowe
>Assignee: Steve Rowe
>Priority: Minor
> Fix For: 4.6, 5.0
>
> Attachments: LUCENE-5322.patch, 
> LUCENE-5322.validate-maven-artifacts.patch
>
>
> Many Maven-related Ant targets are public when they don't need to be, e.g. 
> dist-maven and filter-pom-templates, m2-deploy-lucene-parent-pom, etc.
> The arrangement of these targets could be simplified if the directories that 
> have public entry points were minimized.
> generate-maven-artifacts should be runnable from the top level and from 
> lucene/ and solr/. 



--
This message was sent by Atlassian JIRA
(v6.1#6144)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-5322) Clean up / simplify Maven-related Ant targets

2013-11-11 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13819332#comment-13819332
 ] 

ASF subversion and git services commented on LUCENE-5322:
-

Commit 1540832 from [~steve_rowe] in branch 'dev/trunk'
[ https://svn.apache.org/r1540832 ]

LUCENE-5322: make 'ant validate-maven-artifacts' run faster

> Clean up / simplify Maven-related Ant targets
> -
>
> Key: LUCENE-5322
> URL: https://issues.apache.org/jira/browse/LUCENE-5322
> Project: Lucene - Core
>  Issue Type: Task
>  Components: general/build
>Reporter: Steve Rowe
>Assignee: Steve Rowe
>Priority: Minor
> Fix For: 4.6, 5.0
>
> Attachments: LUCENE-5322.patch, 
> LUCENE-5322.validate-maven-artifacts.patch
>
>
> Many Maven-related Ant targets are public when they don't need to be, e.g. 
> dist-maven and filter-pom-templates, m2-deploy-lucene-parent-pom, etc.
> The arrangement of these targets could be simplified if the directories that 
> have public entry points were minimized.
> generate-maven-artifacts should be runnable from the top level and from 
> lucene/ and solr/. 



--
This message was sent by Atlassian JIRA
(v6.1#6144)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (LUCENE-5337) Add Payload support to FileDictionary (Suggest) and make it more configurable

2013-11-11 Thread Erick Erickson (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-5337?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erick Erickson updated LUCENE-5337:
---

Attachment: LUCENE-5337.patch

Comment-only change so precommit succeeds (Javadoc).

So this cleanly precommits and tests on my machine. Unless there are 
objections, I'll commit this Wednesday or so, after 4.6 is tagged; I don't see 
a good reason to rush this into the 4.6 release.

> Add Payload support to FileDictionary (Suggest) and make it more configurable
> -
>
> Key: LUCENE-5337
> URL: https://issues.apache.org/jira/browse/LUCENE-5337
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: core/search
>Reporter: Areek Zillur
> Attachments: LUCENE-5337.patch, LUCENE-5337.patch, LUCENE-5337.patch
>
>
> It would be nice to add payload support to FileDictionary, so user can pass 
> in associated payload with suggestion entries. 
> Currently the FileDictionary has a hard-coded field-delimiter (TAB), it would 
> be nice to let the users configure the field delimiter as well.



--
This message was sent by Atlassian JIRA
(v6.1#6144)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[JENKINS-MAVEN] Lucene-Solr-Maven-trunk #1023: POMs out of sync

2013-11-11 Thread Apache Jenkins Server
Build: https://builds.apache.org/job/Lucene-Solr-Maven-trunk/1023/

No tests ran.

Build Log:
[...truncated 1227 lines...]



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

Re: [JENKINS-MAVEN] Lucene-Solr-Maven-trunk #1023: POMs out of sync

2013-11-11 Thread Steve Rowe
Looks like I jumped the gun with my latest LUCENE-5322 commit in assuming that 
‘ant validate-maven-artifacts’ doesn't need to depend on 
‘generate-maven-artifacts’. This problem didn’t surface for me locally since I 
don’t remove Lucene/Solr artifacts from my local Maven repository before 
running the target.

I’ll put back the ‘generate-maven-artifacts’ dependency.

From the Jenkins log:
——
[artifact:dependencies] Unable to locate resource in repository
[artifact:dependencies] [INFO] Unable to find resource 
'org.apache.lucene:lucene-core:jar:5.0-SNAPSHOT' in repository 
sonatype.releases (
http://oss.sonatype.org/content/repositories/releases
)
[artifact:dependencies] An error has occurred while processing the Maven 
artifact tasks.
[artifact:dependencies]  Diagnosis:
[artifact:dependencies] 
[artifact:dependencies] Unable to resolve artifact: Missing:
[artifact:dependencies] --
[artifact:dependencies] 1) org.apache.lucene:lucene-codecs:jar:5.0-SNAPSHOT
…
[artifact:dependencies] 2) org.apache.lucene:lucene-core:jar:5.0-SNAPSHOT
——



On Nov 11, 2013, at 3:38 PM, Apache Jenkins Server  
wrote:

> Build: https://builds.apache.org/job/Lucene-Solr-Maven-trunk/1023/
> 
> No tests ran.
> 
> Build Log:
> [...truncated 1227 lines...]
> 
> 
> 
> -
> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
> For additional commands, e-mail: dev-h...@lucene.apache.org


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-5322) Clean up / simplify Maven-related Ant targets

2013-11-11 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13819393#comment-13819393
 ] 

ASF subversion and git services commented on LUCENE-5322:
-

Commit 1540846 from [~steve_rowe] in branch 'dev/trunk'
[ https://svn.apache.org/r1540846 ]

LUCENE-5322: 'ant validate-maven-artifacts' should depend on 
'generate-maven-artifacts'

> Clean up / simplify Maven-related Ant targets
> -
>
> Key: LUCENE-5322
> URL: https://issues.apache.org/jira/browse/LUCENE-5322
> Project: Lucene - Core
>  Issue Type: Task
>  Components: general/build
>Reporter: Steve Rowe
>Assignee: Steve Rowe
>Priority: Minor
> Fix For: 4.6, 5.0
>
> Attachments: LUCENE-5322.patch, 
> LUCENE-5322.validate-maven-artifacts.patch
>
>
> Many Maven-related Ant targets are public when they don't need to be, e.g. 
> dist-maven and filter-pom-templates, m2-deploy-lucene-parent-pom, etc.
> The arrangement of these targets could be simplified if the directories that 
> have public entry points were minimized.
> generate-maven-artifacts should be runnable from the top level and from 
> lucene/ and solr/. 



--
This message was sent by Atlassian JIRA
(v6.1#6144)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-5322) Clean up / simplify Maven-related Ant targets

2013-11-11 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13819424#comment-13819424
 ] 

ASF subversion and git services commented on LUCENE-5322:
-

Commit 1540849 from [~steve_rowe] in branch 'dev/trunk'
[ https://svn.apache.org/r1540849 ]

LUCENE-5322: 'ant validate-maven-dependencies' doesn't need to call 
'filter-pom-templates' directly, since 'generate-maven-artifacts' already does 
it

> Clean up / simplify Maven-related Ant targets
> -
>
> Key: LUCENE-5322
> URL: https://issues.apache.org/jira/browse/LUCENE-5322
> Project: Lucene - Core
>  Issue Type: Task
>  Components: general/build
>Reporter: Steve Rowe
>Assignee: Steve Rowe
>Priority: Minor
> Fix For: 4.6, 5.0
>
> Attachments: LUCENE-5322.patch, 
> LUCENE-5322.validate-maven-artifacts.patch
>
>
> Many Maven-related Ant targets are public when they don't need to be, e.g. 
> dist-maven and filter-pom-templates, m2-deploy-lucene-parent-pom, etc.
> The arrangement of these targets could be simplified if the directories that 
> have public entry points were minimized.
> generate-maven-artifacts should be runnable from the top level and from 
> lucene/ and solr/. 



--
This message was sent by Atlassian JIRA
(v6.1#6144)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (LUCENE-5336) Add a simple QueryParser to parse human-entered queries.

2013-11-11 Thread Jack Conradson (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-5336?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jack Conradson updated LUCENE-5336:
---

Attachment: LUCENE-5336.patch

Attached an updated version of the patch with the three modifications from my 
previous comment.

> Add a simple QueryParser to parse human-entered queries.
> 
>
> Key: LUCENE-5336
> URL: https://issues.apache.org/jira/browse/LUCENE-5336
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Jack Conradson
> Attachments: LUCENE-5336.patch, LUCENE-5336.patch
>
>
> I would like to add a new simple QueryParser to Lucene that is designed to 
> parse human-entered queries.  This parser will operate on an entire entered 
> query using a specified single field or a set of weighted fields (using term 
> boost).
> All features/operations in this parser can be enabled or disabled depending 
> on what is necessary for the user.  A default operator may be specified as 
> either 'MUST' representing 'and' or 'SHOULD' representing 'or.'  The 
> features/operations that this parser will include are the following:
> * AND specified as '+'
> * OR specified as '|'
> * NOT specified as '-'
> * PHRASE surrounded by double quotes
> * PREFIX specified as '*'
> * PRECEDENCE surrounded by '(' and ')'
> * WHITESPACE specified as ' ' '\n' '\r' and '\t' will cause the default 
> operator to be used
> * ESCAPE specified as '\' will allow operators to be used in terms
> The key differences between this parser and other existing parsers will be 
> the following:
> * No exceptions will be thrown, and errors in syntax will be ignored.  The 
> parser will do a best-effort interpretation of any query entered.
> * It uses minimal syntax to express queries.  All available operators are 
> single characters or pairs of single characters.
> * The parser is hand-written and in a single Java file making it easy to 
> modify.



--
This message was sent by Atlassian JIRA
(v6.1#6144)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Assigned] (SOLR-5408) Collapsing Query Parser does not respect multiple Sort fields

2013-11-11 Thread Joel Bernstein (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-5408?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joel Bernstein reassigned SOLR-5408:


Assignee: Joel Bernstein

> Collapsing Query Parser does not respect multiple Sort fields
> -
>
> Key: SOLR-5408
> URL: https://issues.apache.org/jira/browse/SOLR-5408
> Project: Solr
>  Issue Type: Bug
>Affects Versions: 4.5
>Reporter: Brandon Chapman
>Assignee: Joel Bernstein
>Priority: Critical
>
> When using the collapsing query parser, only the last sort field appears to 
> be used.
> http://172.18.0.10:8080/solr/product/select_eng?sort=score%20desc,name_sort_eng%20desc&qf=name_eng^3+brand^2+categories_term_eng+sku+upc+promoTag+model+related_terms_eng&pf2=name_eng^2&defType=edismax&rows=12&pf=name_eng~5^3&start=0&q=ipad&boost=sqrt(popularity)&qt=/select_eng&fq=productType:MERCHANDISE&fq=merchant:bestbuycanada&fq=(*:*+AND+-all_all_suppressed_b_ovly:[*+TO+*]+AND+-rbc_all_suppressed_b_ovly:[*+TO+*]+AND+-rbc_cpx_suppressed_b_ovly:[*+TO+*])+OR+(all_all_suppressed_b_ovly:false+AND+-rbc_all_suppressed_b_ovly:[*+TO+*]+AND+-rbc_cpx_suppressed_b_ovly:[*+TO+*])+OR+(rbc_all_suppressed_b_ovly:false+AND+-rbc_cpx_suppressed_b_ovly:[*+TO+*])+OR+(rbc_cpx_suppressed_b_ovly:false)&fq=translations:eng&fl=psid,name_eng,score&debug=true&debugQuery=true&fq={!collapse+field%3DgroupId+nullPolicy%3Dexpand}
> 
> 
> 3002010250210
> 
> ZOTAC ZBOX nano XS AD13 Plus All-In-One PC (AMD E2-1800/2GB RAM/64GB SSD)
> 
> 0.41423172
> 
> The same query without using the collapsing query parser produces the 
> expected result.



--
This message was sent by Atlassian JIRA
(v6.1#6144)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-5408) Collapsing Query Parser does not respect multiple Sort fields

2013-11-11 Thread Joel Bernstein (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5408?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13819524#comment-13819524
 ] 

Joel Bernstein commented on SOLR-5408:
--

I was able to reproduce and am investigating what the issue is.

> Collapsing Query Parser does not respect multiple Sort fields
> -
>
> Key: SOLR-5408
> URL: https://issues.apache.org/jira/browse/SOLR-5408
> Project: Solr
>  Issue Type: Bug
>Affects Versions: 4.5
>Reporter: Brandon Chapman
>Assignee: Joel Bernstein
>Priority: Critical
>
> When using the collapsing query parser, only the last sort field appears to 
> be used.
> http://172.18.0.10:8080/solr/product/select_eng?sort=score%20desc,name_sort_eng%20desc&qf=name_eng^3+brand^2+categories_term_eng+sku+upc+promoTag+model+related_terms_eng&pf2=name_eng^2&defType=edismax&rows=12&pf=name_eng~5^3&start=0&q=ipad&boost=sqrt(popularity)&qt=/select_eng&fq=productType:MERCHANDISE&fq=merchant:bestbuycanada&fq=(*:*+AND+-all_all_suppressed_b_ovly:[*+TO+*]+AND+-rbc_all_suppressed_b_ovly:[*+TO+*]+AND+-rbc_cpx_suppressed_b_ovly:[*+TO+*])+OR+(all_all_suppressed_b_ovly:false+AND+-rbc_all_suppressed_b_ovly:[*+TO+*]+AND+-rbc_cpx_suppressed_b_ovly:[*+TO+*])+OR+(rbc_all_suppressed_b_ovly:false+AND+-rbc_cpx_suppressed_b_ovly:[*+TO+*])+OR+(rbc_cpx_suppressed_b_ovly:false)&fq=translations:eng&fl=psid,name_eng,score&debug=true&debugQuery=true&fq={!collapse+field%3DgroupId+nullPolicy%3Dexpand}
> 
> 
> 3002010250210
> 
> ZOTAC ZBOX nano XS AD13 Plus All-In-One PC (AMD E2-1800/2GB RAM/64GB SSD)
> 
> 0.41423172
> 
> The same query without using the collapsing query parser produces the 
> expected result.



--
This message was sent by Atlassian JIRA
(v6.1#6144)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-5408) Collapsing Query Parser does not respect multiple Sort fields

2013-11-11 Thread Joel Bernstein (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-5408?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joel Bernstein updated SOLR-5408:
-

Attachment: SOLR-5408.patch

> Collapsing Query Parser does not respect multiple Sort fields
> -
>
> Key: SOLR-5408
> URL: https://issues.apache.org/jira/browse/SOLR-5408
> Project: Solr
>  Issue Type: Bug
>Affects Versions: 4.5
>Reporter: Brandon Chapman
>Assignee: Joel Bernstein
>Priority: Critical
> Attachments: SOLR-5408.patch
>
>
> When using the collapsing query parser, only the last sort field appears to 
> be used.
> http://172.18.0.10:8080/solr/product/select_eng?sort=score%20desc,name_sort_eng%20desc&qf=name_eng^3+brand^2+categories_term_eng+sku+upc+promoTag+model+related_terms_eng&pf2=name_eng^2&defType=edismax&rows=12&pf=name_eng~5^3&start=0&q=ipad&boost=sqrt(popularity)&qt=/select_eng&fq=productType:MERCHANDISE&fq=merchant:bestbuycanada&fq=(*:*+AND+-all_all_suppressed_b_ovly:[*+TO+*]+AND+-rbc_all_suppressed_b_ovly:[*+TO+*]+AND+-rbc_cpx_suppressed_b_ovly:[*+TO+*])+OR+(all_all_suppressed_b_ovly:false+AND+-rbc_all_suppressed_b_ovly:[*+TO+*]+AND+-rbc_cpx_suppressed_b_ovly:[*+TO+*])+OR+(rbc_all_suppressed_b_ovly:false+AND+-rbc_cpx_suppressed_b_ovly:[*+TO+*])+OR+(rbc_cpx_suppressed_b_ovly:false)&fq=translations:eng&fl=psid,name_eng,score&debug=true&debugQuery=true&fq={!collapse+field%3DgroupId+nullPolicy%3Dexpand}
> 
> 
> 3002010250210
> 
> ZOTAC ZBOX nano XS AD13 Plus All-In-One PC (AMD E2-1800/2GB RAM/64GB SSD)
> 
> 0.41423172
> 
> The same query without using the collapsing query parser produces the 
> expected result.



--
This message was sent by Atlassian JIRA
(v6.1#6144)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-5408) Collapsing Query Parser does not respect multiple Sort fields

2013-11-11 Thread Joel Bernstein (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5408?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13819545#comment-13819545
 ] 

Joel Bernstein commented on SOLR-5408:
--

I'll add a test case for this as well going forward.

> Collapsing Query Parser does not respect multiple Sort fields
> -
>
> Key: SOLR-5408
> URL: https://issues.apache.org/jira/browse/SOLR-5408
> Project: Solr
>  Issue Type: Bug
>Affects Versions: 4.5
>Reporter: Brandon Chapman
>Assignee: Joel Bernstein
>Priority: Critical
> Attachments: SOLR-5408.patch
>
>
> When using the collapsing query parser, only the last sort field appears to 
> be used.
> http://172.18.0.10:8080/solr/product/select_eng?sort=score%20desc,name_sort_eng%20desc&qf=name_eng^3+brand^2+categories_term_eng+sku+upc+promoTag+model+related_terms_eng&pf2=name_eng^2&defType=edismax&rows=12&pf=name_eng~5^3&start=0&q=ipad&boost=sqrt(popularity)&qt=/select_eng&fq=productType:MERCHANDISE&fq=merchant:bestbuycanada&fq=(*:*+AND+-all_all_suppressed_b_ovly:[*+TO+*]+AND+-rbc_all_suppressed_b_ovly:[*+TO+*]+AND+-rbc_cpx_suppressed_b_ovly:[*+TO+*])+OR+(all_all_suppressed_b_ovly:false+AND+-rbc_all_suppressed_b_ovly:[*+TO+*]+AND+-rbc_cpx_suppressed_b_ovly:[*+TO+*])+OR+(rbc_all_suppressed_b_ovly:false+AND+-rbc_cpx_suppressed_b_ovly:[*+TO+*])+OR+(rbc_cpx_suppressed_b_ovly:false)&fq=translations:eng&fl=psid,name_eng,score&debug=true&debugQuery=true&fq={!collapse+field%3DgroupId+nullPolicy%3Dexpand}
> 
> 
> 3002010250210
> 
> ZOTAC ZBOX nano XS AD13 Plus All-In-One PC (AMD E2-1800/2GB RAM/64GB SSD)
> 
> 0.41423172
> 
> The same query without using the collapsing query parser produces the 
> expected result.



--
This message was sent by Atlassian JIRA
(v6.1#6144)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-5408) Collapsing Query Parser does not respect multiple Sort fields

2013-11-11 Thread Joel Bernstein (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5408?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13819543#comment-13819543
 ] 

Joel Bernstein commented on SOLR-5408:
--

Brandon,

I believe this patch should resolve the issue. It was created on branch_4x. If 
it doesn't apply to your build, let me know and I'll create a patch for the 
version you're working with.

The problem was that the scorer needed to be set on the delegate collector 
after each segment reader was set. The initial code was setting the scorer on 
the delegate collector only once, which worked fine for a single sort criterion. 
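
A minimal sketch of the pattern (illustrative only, written against the Lucene 
4.x Collector API; this is not the actual CollapsingQParserPlugin code):

{code}
import java.io.IOException;

import org.apache.lucene.index.AtomicReaderContext;
import org.apache.lucene.search.Collector;
import org.apache.lucene.search.Scorer;

/** Re-pushes the current Scorer to the delegate whenever a new
 *  segment reader is set, instead of forwarding it only once. */
public class ScorerForwardingCollector extends Collector {
  private final Collector delegate;
  private Scorer scorer;

  public ScorerForwardingCollector(Collector delegate) {
    this.delegate = delegate;
  }

  @Override
  public void setScorer(Scorer scorer) throws IOException {
    this.scorer = scorer;
    delegate.setScorer(scorer);
  }

  @Override
  public void setNextReader(AtomicReaderContext context) throws IOException {
    delegate.setNextReader(context);
    // Without this, the delegate would keep a stale scorer after the
    // first segment, breaking compound sort criteria.
    if (scorer != null) {
      delegate.setScorer(scorer);
    }
  }

  @Override
  public void collect(int doc) throws IOException {
    delegate.collect(doc);
  }

  @Override
  public boolean acceptsDocsOutOfOrder() {
    return delegate.acceptsDocsOutOfOrder();
  }
}
{code}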

Joel



> Collapsing Query Parser does not respect multiple Sort fields
> -
>
> Key: SOLR-5408
> URL: https://issues.apache.org/jira/browse/SOLR-5408
> Project: Solr
>  Issue Type: Bug
>Affects Versions: 4.5
>Reporter: Brandon Chapman
>Assignee: Joel Bernstein
>Priority: Critical
> Attachments: SOLR-5408.patch
>
>
> When using the collapsing query parser, only the last sort field appears to 
> be used.
> http://172.18.0.10:8080/solr/product/select_eng?sort=score%20desc,name_sort_eng%20desc&qf=name_eng^3+brand^2+categories_term_eng+sku+upc+promoTag+model+related_terms_eng&pf2=name_eng^2&defType=edismax&rows=12&pf=name_eng~5^3&start=0&q=ipad&boost=sqrt(popularity)&qt=/select_eng&fq=productType:MERCHANDISE&fq=merchant:bestbuycanada&fq=(*:*+AND+-all_all_suppressed_b_ovly:[*+TO+*]+AND+-rbc_all_suppressed_b_ovly:[*+TO+*]+AND+-rbc_cpx_suppressed_b_ovly:[*+TO+*])+OR+(all_all_suppressed_b_ovly:false+AND+-rbc_all_suppressed_b_ovly:[*+TO+*]+AND+-rbc_cpx_suppressed_b_ovly:[*+TO+*])+OR+(rbc_all_suppressed_b_ovly:false+AND+-rbc_cpx_suppressed_b_ovly:[*+TO+*])+OR+(rbc_cpx_suppressed_b_ovly:false)&fq=translations:eng&fl=psid,name_eng,score&debug=true&debugQuery=true&fq={!collapse+field%3DgroupId+nullPolicy%3Dexpand}
> 
> 
> 3002010250210
> 
> ZOTAC ZBOX nano XS AD13 Plus All-In-One PC (AMD E2-1800/2GB RAM/64GB SSD)
> 
> 0.41423172
> 
> The same query without using the collapsing query parser produces the 
> expected result.



--
This message was sent by Atlassian JIRA
(v6.1#6144)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-3397) Insure that Replication and Solr Cloud are compatible

2013-11-11 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-3397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13819589#comment-13819589
 ] 

ASF subversion and git services commented on SOLR-3397:
---

Commit 1540881 from [~erickoerickson] in branch 'dev/trunk'
[ https://svn.apache.org/r1540881 ]

SOLR-3397: Insure that replication and SolrCloud are compatible. Actually, just 
log a warning if SolrCloud is detected and master or slave is configured in 
solrconfig.xml

> Insure that Replication and Solr Cloud are compatible
> -
>
> Key: SOLR-3397
> URL: https://issues.apache.org/jira/browse/SOLR-3397
> Project: Solr
>  Issue Type: Improvement
>  Components: replication (java), SolrCloud
>Affects Versions: 4.0-ALPHA
>Reporter: Erick Erickson
>Assignee: Erick Erickson
>
> There has been at least one report of an early-adopter having replication (as 
> in master/slave) configured with SolrCloud and having very odd results. 
> Experienced Solr users could reasonably try this (or just have their 
> configurations from 3.x Solr installations hanging around). Since SolrCloud 
> takes this functionality over completely, it seems like replication needs to 
> be made smart enough to disable itself if running under SolrCloud.



--
This message was sent by Atlassian JIRA
(v6.1#6144)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-3397) Insure that Replication and Solr Cloud are compatible

2013-11-11 Thread Erick Erickson (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-3397?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erick Erickson updated SOLR-3397:
-

Attachment: SOLR-3397.patch

Patch that logs a warning if master or slave is configured and a zkController 
is detected.

> Insure that Replication and Solr Cloud are compatible
> -
>
> Key: SOLR-3397
> URL: https://issues.apache.org/jira/browse/SOLR-3397
> Project: Solr
>  Issue Type: Improvement
>  Components: replication (java), SolrCloud
>Affects Versions: 4.0-ALPHA
>Reporter: Erick Erickson
>Assignee: Erick Erickson
> Attachments: SOLR-3397.patch
>
>
> There has been at least one report of an early-adopter having replication (as 
> in master/slave) configured with SolrCloud and having very odd results. 
> Experienced Solr users could reasonably try this (or just have their 
> configurations from 3.x Solr installations hanging around). Since SolrCloud 
> takes this functionality over completely, it seems like replication needs to 
> be made smart enough to disable itself if running under SolrCloud.



--
This message was sent by Atlassian JIRA
(v6.1#6144)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-5408) Collapsing Query Parser does not respect multiple Sort fields

2013-11-11 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5408?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13819677#comment-13819677
 ] 

ASF subversion and git services commented on SOLR-5408:
---

Commit 1540904 from [~joel.bernstein] in branch 'dev/trunk'
[ https://svn.apache.org/r1540904 ]

SOLR-5408 Fix CollapsingQParserPlugin issue with compound sort criteria

> Collapsing Query Parser does not respect multiple Sort fields
> -
>
> Key: SOLR-5408
> URL: https://issues.apache.org/jira/browse/SOLR-5408
> Project: Solr
>  Issue Type: Bug
>Affects Versions: 4.5
>Reporter: Brandon Chapman
>Assignee: Joel Bernstein
>Priority: Critical
> Attachments: SOLR-5408.patch
>
>
> When using the collapsing query parser, only the last sort field appears to 
> be used.
> http://172.18.0.10:8080/solr/product/select_eng?sort=score%20desc,name_sort_eng%20desc&qf=name_eng^3+brand^2+categories_term_eng+sku+upc+promoTag+model+related_terms_eng&pf2=name_eng^2&defType=edismax&rows=12&pf=name_eng~5^3&start=0&q=ipad&boost=sqrt(popularity)&qt=/select_eng&fq=productType:MERCHANDISE&fq=merchant:bestbuycanada&fq=(*:*+AND+-all_all_suppressed_b_ovly:[*+TO+*]+AND+-rbc_all_suppressed_b_ovly:[*+TO+*]+AND+-rbc_cpx_suppressed_b_ovly:[*+TO+*])+OR+(all_all_suppressed_b_ovly:false+AND+-rbc_all_suppressed_b_ovly:[*+TO+*]+AND+-rbc_cpx_suppressed_b_ovly:[*+TO+*])+OR+(rbc_all_suppressed_b_ovly:false+AND+-rbc_cpx_suppressed_b_ovly:[*+TO+*])+OR+(rbc_cpx_suppressed_b_ovly:false)&fq=translations:eng&fl=psid,name_eng,score&debug=true&debugQuery=true&fq={!collapse+field%3DgroupId+nullPolicy%3Dexpand}
> 
> 
> 3002010250210
> 
> ZOTAC ZBOX nano XS AD13 Plus All-In-One PC (AMD E2-1800/2GB RAM/64GB SSD)
> 
> 0.41423172
> 
> The same query without using the collapsing query parser produces the 
> expected result.



--
This message was sent by Atlassian JIRA
(v6.1#6144)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-5408) Collapsing Query Parser does not respect multiple Sort fields

2013-11-11 Thread Erik Hatcher (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-5408?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erik Hatcher updated SOLR-5408:
---

Attachment: SOLR-5408.patch

Here's a test case with Joel's fix merged in too.

> Collapsing Query Parser does not respect multiple Sort fields
> -
>
> Key: SOLR-5408
> URL: https://issues.apache.org/jira/browse/SOLR-5408
> Project: Solr
>  Issue Type: Bug
>Affects Versions: 4.5
>Reporter: Brandon Chapman
>Assignee: Joel Bernstein
>Priority: Critical
> Attachments: SOLR-5408.patch, SOLR-5408.patch
>
>
> When using the collapsing query parser, only the last sort field appears to 
> be used.
> http://172.18.0.10:8080/solr/product/select_eng?sort=score%20desc,name_sort_eng%20desc&qf=name_eng^3+brand^2+categories_term_eng+sku+upc+promoTag+model+related_terms_eng&pf2=name_eng^2&defType=edismax&rows=12&pf=name_eng~5^3&start=0&q=ipad&boost=sqrt(popularity)&qt=/select_eng&fq=productType:MERCHANDISE&fq=merchant:bestbuycanada&fq=(*:*+AND+-all_all_suppressed_b_ovly:[*+TO+*]+AND+-rbc_all_suppressed_b_ovly:[*+TO+*]+AND+-rbc_cpx_suppressed_b_ovly:[*+TO+*])+OR+(all_all_suppressed_b_ovly:false+AND+-rbc_all_suppressed_b_ovly:[*+TO+*]+AND+-rbc_cpx_suppressed_b_ovly:[*+TO+*])+OR+(rbc_all_suppressed_b_ovly:false+AND+-rbc_cpx_suppressed_b_ovly:[*+TO+*])+OR+(rbc_cpx_suppressed_b_ovly:false)&fq=translations:eng&fl=psid,name_eng,score&debug=true&debugQuery=true&fq={!collapse+field%3DgroupId+nullPolicy%3Dexpand}
> 
> 
> 3002010250210
> 
> ZOTAC ZBOX nano XS AD13 Plus All-In-One PC (AMD E2-1800/2GB RAM/64GB SSD)
> 
> 0.41423172
> 
> The same query without using the collapsing query parser produces the 
> expected result.



--
This message was sent by Atlassian JIRA
(v6.1#6144)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-5408) Collapsing Query Parser does not respect multiple Sort fields

2013-11-11 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5408?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13819746#comment-13819746
 ] 

ASF subversion and git services commented on SOLR-5408:
---

Commit 1540922 from [~joel.bernstein] in branch 'dev/branches/branch_4x'
[ https://svn.apache.org/r1540922 ]

SOLR-5408 Fix CollapsingQParserPlugin issue with compound sort criteria

> Collapsing Query Parser does not respect multiple Sort fields
> -
>
> Key: SOLR-5408
> URL: https://issues.apache.org/jira/browse/SOLR-5408
> Project: Solr
>  Issue Type: Bug
>Affects Versions: 4.5
>Reporter: Brandon Chapman
>Assignee: Joel Bernstein
>Priority: Critical
> Attachments: SOLR-5408.patch, SOLR-5408.patch
>
>
> When using the collapsing query parser, only the last sort field appears to 
> be used.
> http://172.18.0.10:8080/solr/product/select_eng?sort=score%20desc,name_sort_eng%20desc&qf=name_eng^3+brand^2+categories_term_eng+sku+upc+promoTag+model+related_terms_eng&pf2=name_eng^2&defType=edismax&rows=12&pf=name_eng~5^3&start=0&q=ipad&boost=sqrt(popularity)&qt=/select_eng&fq=productType:MERCHANDISE&fq=merchant:bestbuycanada&fq=(*:*+AND+-all_all_suppressed_b_ovly:[*+TO+*]+AND+-rbc_all_suppressed_b_ovly:[*+TO+*]+AND+-rbc_cpx_suppressed_b_ovly:[*+TO+*])+OR+(all_all_suppressed_b_ovly:false+AND+-rbc_all_suppressed_b_ovly:[*+TO+*]+AND+-rbc_cpx_suppressed_b_ovly:[*+TO+*])+OR+(rbc_all_suppressed_b_ovly:false+AND+-rbc_cpx_suppressed_b_ovly:[*+TO+*])+OR+(rbc_cpx_suppressed_b_ovly:false)&fq=translations:eng&fl=psid,name_eng,score&debug=true&debugQuery=true&fq={!collapse+field%3DgroupId+nullPolicy%3Dexpand}
> <result>
>   <doc>
>     <str name="psid">3002010250210</str>
>     <str name="name_eng">ZOTAC ZBOX nano XS AD13 Plus All-In-One PC (AMD E2-1800/2GB RAM/64GB SSD)</str>
>     <float name="score">0.41423172</float>
>   </doc>
> </result>
> The same query without using the collapsing query parser produces the 
> expected result.






[JENKINS] Lucene-Solr-4.x-Linux (32bit/jdk1.8.0-ea-b114) - Build # 8180 - Failure!

2013-11-11 Thread Policeman Jenkins Server
Build: http://jenkins.thetaphi.de/job/Lucene-Solr-4.x-Linux/8180/
Java: 32bit/jdk1.8.0-ea-b114 -server -XX:+UseSerialGC

1 tests failed.
REGRESSION:  org.apache.solr.core.TestNonNRTOpen.testReaderIsNotNRT

Error Message:
expected:<3> but was:<2>

Stack Trace:
java.lang.AssertionError: expected:<3> but was:<2>
at 
__randomizedtesting.SeedInfo.seed([C99468FACBB041CE:7C12097D7471F33A]:0)
at org.junit.Assert.fail(Assert.java:93)
at org.junit.Assert.failNotEquals(Assert.java:647)
at org.junit.Assert.assertEquals(Assert.java:128)
at org.junit.Assert.assertEquals(Assert.java:472)
at org.junit.Assert.assertEquals(Assert.java:456)
at 
org.apache.solr.core.TestNonNRTOpen.assertNotNRT(TestNonNRTOpen.java:133)
at 
org.apache.solr.core.TestNonNRTOpen.testReaderIsNotNRT(TestNonNRTOpen.java:94)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:483)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1559)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.access$600(RandomizedRunner.java:79)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:737)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:773)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:787)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53)
at 
org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50)
at 
org.apache.lucene.util.TestRuleFieldCacheSanity$1.evaluate(TestRuleFieldCacheSanity.java:51)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
at 
org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:49)
at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:70)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:358)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:782)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:442)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:746)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:648)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:682)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:693)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
at 
org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:42)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:43)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:70)
at 
org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:55)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementR

[jira] [Resolved] (SOLR-3397) Insure that Replication and Solr Cloud are compatible

2013-11-11 Thread Erick Erickson (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-3397?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erick Erickson resolved SOLR-3397.
--

   Resolution: Fixed
Fix Version/s: 5.0
   4.6

> Insure that Replication and Solr Cloud are compatible
> -
>
> Key: SOLR-3397
> URL: https://issues.apache.org/jira/browse/SOLR-3397
> Project: Solr
>  Issue Type: Improvement
>  Components: replication (java), SolrCloud
>Affects Versions: 4.0-ALPHA
>Reporter: Erick Erickson
>Assignee: Erick Erickson
> Fix For: 4.6, 5.0
>
> Attachments: SOLR-3397.patch
>
>
> There has been at least one report of an early-adopter having replication (as 
> in master/slave) configured with SolrCloud and having very odd results. 
> Experienced Solr users could reasonably try this (or just have their 
> configurations from 3.x Solr installations hanging around). Since SolrCloud 
> takes this functionality over completely, it seems like replication needs to 
> be made smart enough to disable itself if running under SolrCloud.






[jira] [Commented] (SOLR-3397) Insure that Replication and Solr Cloud are compatible

2013-11-11 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-3397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13819769#comment-13819769
 ] 

ASF subversion and git services commented on SOLR-3397:
---

Commit 1540930 from [~erickoerickson] in branch 'dev/branches/branch_4x'
[ https://svn.apache.org/r1540930 ]

SOLR-3397: Insure that replication and SolrCloud are compatible. Actually, just 
log a warning if SolrCloud is detected and master or slave is configured in 
solrconfig.xml
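
A rough sketch of the idea (illustrative only, not the committed code; the
class, method and log wording are made up, but the ZooKeeper-aware check is the
standard way to detect cloud mode in 4.x):

{code:java}
import org.apache.solr.core.SolrCore;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class ReplicationCloudCheck {
  private static final Logger LOG = LoggerFactory.getLogger(ReplicationCloudCheck.class);

  // Warn when old-style master/slave replication is configured on a core
  // whose container is ZooKeeper-aware, i.e. running under SolrCloud.
  public static void warnIfCloudWithLegacyReplication(SolrCore core,
                                                      boolean masterEnabled,
                                                      boolean slaveEnabled) {
    boolean cloudMode =
        core.getCoreDescriptor().getCoreContainer().isZooKeeperAware();
    if (cloudMode && (masterEnabled || slaveEnabled)) {
      LOG.warn("SolrCloud is enabled for core {} but so is old-style replication"
          + " (master/slave in solrconfig.xml); SolrCloud handles replication"
          + " itself, so check your configuration.", core.getName());
    }
  }
}
{code}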

> Insure that Replication and Solr Cloud are compatible
> -
>
> Key: SOLR-3397
> URL: https://issues.apache.org/jira/browse/SOLR-3397
> Project: Solr
>  Issue Type: Improvement
>  Components: replication (java), SolrCloud
>Affects Versions: 4.0-ALPHA
>Reporter: Erick Erickson
>Assignee: Erick Erickson
> Fix For: 4.6, 5.0
>
> Attachments: SOLR-3397.patch
>
>
> There has been at least one report of an early-adopter having replication (as 
> in master/slave) configured with SolrCloud and having very odd results. 
> Experienced Solr users could reasonably try this (or just have their 
> configurations from 3.x Solr installations hanging around). Since SolrCloud 
> takes this functionality over completely, it seems like replication needs to 
> be made smart enough to disable itself if running under SolrCloud.






[jira] [Resolved] (LUCENE-5212) java 7u40 causes sigsegv and corrupt term vectors

2013-11-11 Thread Robert Muir (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-5212?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Muir resolved LUCENE-5212.
-

Resolution: Done

The fix is committed to openjdk trunk.

> java 7u40 causes sigsegv and corrupt term vectors
> -
>
> Key: LUCENE-5212
> URL: https://issues.apache.org/jira/browse/LUCENE-5212
> Project: Lucene - Core
>  Issue Type: Bug
>Reporter: Robert Muir
> Attachments: crashFaster.patch, crashFaster2.0.patch, 
> hs_err_pid32714.log, jenkins.txt
>
>







[jira] [Commented] (LUCENE-2899) Add OpenNLP Analysis capabilities as a module

2013-11-11 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13819791#comment-13819791
 ] 

Robert Muir commented on LUCENE-2899:
-

Hi Markus: I haven't looked at this patch. I'll review it now and give my 
thoughts.

> Add OpenNLP Analysis capabilities as a module
> -
>
> Key: LUCENE-2899
> URL: https://issues.apache.org/jira/browse/LUCENE-2899
> Project: Lucene - Core
>  Issue Type: New Feature
>  Components: modules/analysis
>Reporter: Grant Ingersoll
>Assignee: Grant Ingersoll
>Priority: Minor
> Fix For: 4.6
>
> Attachments: LUCENE-2899-RJN.patch, LUCENE-2899.patch, 
> OpenNLPFilter.java, OpenNLPTokenizer.java
>
>
> Now that OpenNLP is an ASF project and has a nice license, it would be nice 
> to have a submodule (under analysis) that exposed capabilities for it. Drew 
> Farris, Tom Morton and I have code that does:
> * Sentence Detection as a Tokenizer (could also be a TokenFilter, although it 
> would have to change slightly to buffer tokens)
> * NamedEntity recognition as a TokenFilter
> We are also planning a Tokenizer/TokenFilter that can put parts of speech as 
> either payloads (PartOfSpeechAttribute?) on a token or at the same position.
> I'd propose it go under:
> modules/analysis/opennlp






[jira] [Commented] (LUCENE-2899) Add OpenNLP Analysis capabilities as a module

2013-11-11 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13819813#comment-13819813
 ] 

Robert Muir commented on LUCENE-2899:
-

Just some thoughts:

I think it would be best to split out the different functionality here into 
subtasks for each piece, and figure out how each should best be integrated.

The current patch does strange things to try to deal with some impedance 
mismatch due to the design here, such as the tokenfilter which consumes the 
entire analysis chain and then replays the whole thing back with POS or NER as 
payloads. Is it really necessary to give this thing more scope than a single 
sentence? Typically such tagging models (at least the ones I've worked with) 
tend to be trained only within sentence scope.

Also, payloads should not be used internally; instead, things like TypeAttribute 
should be used for POS tags. If someone wants to filter out or keep certain POS 
they can use already existing stuff like TypeTokenFilter; if they want to index 
the type as a payload, they can use TypeAsPayloadTokenFilter, and so on.
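
To make that concrete, a minimal sketch of such a chain against the Lucene 4.x
analysis API. A plain whitespace tokenizer stands in for whatever upstream
component sets TypeAttribute to the POS tag, since the OpenNLP pieces exist
only in the patch:

{code:java}
import java.io.Reader;

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.Tokenizer;
import org.apache.lucene.analysis.core.WhitespaceTokenizer;
import org.apache.lucene.analysis.payloads.TypeAsPayloadTokenFilter;
import org.apache.lucene.util.Version;

public final class PosPayloadAnalyzer extends Analyzer {
  @Override
  protected TokenStreamComponents createComponents(String fieldName, Reader reader) {
    // Stand-in tokenizer; a POS-tagging filter would set each token's
    // TypeAttribute to its tag instead of the default "word" type.
    Tokenizer source = new WhitespaceTokenizer(Version.LUCENE_46, reader);
    // Copies whatever is in TypeAttribute onto the token's payload at index
    // time -- no payload plumbing needed inside the tagger itself.
    TokenStream sink = new TypeAsPayloadTokenFilter(source);
    return new TokenStreamComponents(source, sink);
  }
}
{code}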

While I can see this POS tagging being useful inside the analysis chain, the 
NER case is much less clear. I think it's more important for NER to be 
integrated outside of the analysis chain, so that named entities/mentions can 
be faceted on, added to separate fields for search (likely with a different 
analysis chain for that), etc. So for Lucene that would be an easier way to add 
these as facets; for Solr it probably makes more sense as an UpdateProcessor 
than as part of the analysis chain.

Finally: I'm confused as to what benefit we get from using OpenNLP directly, 
versus integrating with it via opennlp-uima. Our UIMA integration at various 
levels (analysis chain/update processor) is already there, so I'm just 
wondering if that's a much shorter path.


> Add OpenNLP Analysis capabilities as a module
> -
>
> Key: LUCENE-2899
> URL: https://issues.apache.org/jira/browse/LUCENE-2899
> Project: Lucene - Core
>  Issue Type: New Feature
>  Components: modules/analysis
>Reporter: Grant Ingersoll
>Assignee: Grant Ingersoll
>Priority: Minor
> Fix For: 4.6
>
> Attachments: LUCENE-2899-RJN.patch, LUCENE-2899.patch, 
> OpenNLPFilter.java, OpenNLPTokenizer.java
>
>
> Now that OpenNLP is an ASF project and has a nice license, it would be nice 
> to have a submodule (under analysis) that exposed capabilities for it. Drew 
> Farris, Tom Morton and I have code that does:
> * Sentence Detection as a Tokenizer (could also be a TokenFilter, although it 
> would have to change slightly to buffer tokens)
> * NamedEntity recognition as a TokenFilter
> We are also planning a Tokenizer/TokenFilter that can put parts of speech as 
> either payloads (PartOfSpeechAttribute?) on a token or at the same position.
> I'd propose it go under:
> modules/analysis/opennlp






Re: Estimating peak memory use for UnInvertedField faceting

2013-11-11 Thread Otis Gospodnetic
Hi Tom,

I believe Solr will automatically use DocValues for faceting if you've
defined them in the schema.

Otis
--
Performance Monitoring * Log Analytics * Search Analytics
Solr & Elasticsearch Support * http://sematext.com/


On Mon, Nov 11, 2013 at 11:33 AM, Tom Burton-West  wrote:
> Thanks Otis,
>
>  I'm looking forward to the presentation videos.
>
> I'll look into using DocValues. Re-indexing 200 million docs will take a
> while though :).
> Will Solr automatically use DocValues for faceting if you have DocValues for
> the field or is there some configuration or parameter that needs to be set?
>
> Tom
>
>
> On Sat, Nov 9, 2013 at 9:57 AM, Otis Gospodnetic
>  wrote:
>>
>> Hi Tom,
>>
>> Check http://blog.sematext.com/2013/11/09/presentation-solr-for-analytics/
>> .  It includes info about our experiment with DocValues, which clearly
>> shows lower heap usage, which means you'll get further without getting
>> this OOM.  In our experiments we didn't sort, facet, or group, and I
>> see you are faceting, which means that DocValues, which are more
>> efficient than FieldCache, should help you even more than it helped
>> us.
>>
>> The graphs are from SPM, which you could use to monitor your Solr
>> cluster, at least while you are tuning it.
>>
>> Otis
>> --
>> Performance Monitoring * Log Analytics * Search Analytics
>> Solr & Elasticsearch Support * http://sematext.com/
>>
>>
>> On Fri, Nov 8, 2013 at 2:41 PM, Tom Burton-West 
>> wrote:
>> > Hi Yonik,
>> >
>> > I don't know enough about JVM tuning and monitoring to do this in a
>> > clean
>> > way, so I just tried setting the max heap at 8GB and then 6GB to force
>> > garbage collection.  With it set to 6GB it goes into a long GC loop and
>> > then runs out of heap (see below). The stack trace says the issue is with
>> > DocTermOrds.uninvert:
>> > Caused by: java.lang.OutOfMemoryError: GC overhead limit exceeded
>> > at org.apache.lucene.index.DocTermOrds.uninvert(DocTermOrds.java:405)
>> >
>> >  I'm guessing the actual peak is somewhere between 6 and 8 GB.
>> >
>> > BTW: is there some documentation somewhere that explains what the stats
>> > output to INFO mean?
>> >
>> > Tom
>> >
>> >
>> > <str name="msg">java.lang.OutOfMemoryError: GC overhead limit exceeded</str><str
>> > name="trace">java.lang.RuntimeException: java.lang.OutOfMemoryError: GC
>> > overhead limit exceeded
>> > at
>> >
>> > org.apache.solr.servlet.SolrDispatchFilter.sendError(SolrDispatchFilter.java:653)
>> > at
>> >
>> > org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:366)
>> > at
>> >
>> > org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:141)
>> > at
>> >
>> > org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:215)
>> > at
>> >
>> > org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:188)
>> > at
>> >
>> > org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:213)
>> > at
>> >
>> > org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:172)
>> > at
>> > org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:548)
>> > at
>> >
>> > org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127)
>> > at
>> >
>> > org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:117)
>> > at
>> >
>> > org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:108)
>> > at
>> >
>> > org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:174)
>> > at
>> >
>> > org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:875)
>> > at
>> >
>> > org.apache.coyote.http11.Http11BaseProtocol$Http11ConnectionHandler.processConnection(Http11BaseProtocol.java:665)
>> > at
>> >
>> > org.apache.tomcat.util.net.PoolTcpEndpoint.processSocket(PoolTcpEndpoint.java:528)
>> > at
>> >
>> > org.apache.tomcat.util.net.LeaderFollowerWorkerThread.runIt(LeaderFollowerWorkerThread.java:81)
>> > at
>> >
>> > org.apache.tomcat.util.threads.ThreadPool$ControlRunnable.run(ThreadPool.java:689)
>> > at java.lang.Thread.run(Thread.java:724)
>> > Caused by: java.lang.OutOfMemoryError: GC overhead limit exceeded
>> > at org.apache.lucene.index.DocTermOrds.uninvert(DocTermOrds.java:405)
>> > at
>> > org.apache.solr.request.UnInvertedField.<init>(UnInvertedField.java:179)
>> > at
>> >
>> > org.apache.solr.request.UnInvertedField.getUnInvertedField(UnInvertedField.java:664)
>> > at
>> > org.apache.solr.request.SimpleFacets.getTermCounts(SimpleFacets.java:426)
>> > at
>> >
>> > org.apache.solr.request.SimpleFacets.getFacetFieldCounts(SimpleFacets.java:517)
>> > at
>> >
>> > org.apache.solr.request.SimpleFacets.getFacetCounts(SimpleFacets.java:252)
>> > at
>> >
>> > org.apache.solr.handler.component.FacetComponent.process(FacetComponent.java:78)
>> > at
>> >
>> > org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:208)
>> > at
>> >
>> > org.apache.solr.handler.Reques

[jira] [Updated] (SOLR-5399) Improve DebugComponent for distributed requests

2013-11-11 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/SOLR-5399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tomás Fernández Löbbe updated SOLR-5399:


Attachment: SOLR-5399.patch

Added some unit tests. Also, for now I'm including the complete shard response 
in the track section.

> Improve DebugComponent for distributed requests
> ---
>
> Key: SOLR-5399
> URL: https://issues.apache.org/jira/browse/SOLR-5399
> Project: Solr
>  Issue Type: Improvement
>Affects Versions: 5.0
>Reporter: Tomás Fernández Löbbe
> Attachments: SOLR-5399.patch, SOLR-5399.patch
>
>
> I'm working on extending the DebugComponent to add some useful information 
> for tracking distributed requests better. I'm adding two different things. 
> First, the request can generate a "request ID" that will be printed in the 
> logs for the main query and all the different internal requests to the 
> different shards. This should make it easier to find the different parts of a 
> single user request in the logs. It would also add the "purpose" of each 
> internal request to the logs, like: RequestPurpose=GET_FIELDS,GET_DEBUG or 
> RequestPurpose=GET_TOP_IDS. 
> Also, I'm adding a "track" section to the debug info with information about 
> the different phases of the distributed request (right now, I'm only 
> including QTime, but could eventually include more information), like: 
> {code:xml}
> <lst name="track">
>   <lst name="EXECUTE_QUERY">
>     <str name="shard1">QTime: 10</str>
>     <str name="shard2">QTime: 25</str>
>   </lst>
>   <lst name="GET_FIELDS">
>     <str name="shard1">QTime: 1</str>
>   </lst>
> </lst>
> {code}
> To get this, debugQuery must be set to true, or the debug parameter must 
> include "track" (debug=track). This information is only added to distributed 
> requests. I would like to get feedback on this.
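
For what it's worth, exercising this from SolrJ would look roughly like the
sketch below, assuming the patch is applied (the ZK address and collection name
are hypothetical):

{code:java}
import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.CloudSolrServer;
import org.apache.solr.client.solrj.response.QueryResponse;

public class TrackDebugExample {
  public static void main(String[] args) throws Exception {
    // Any distributed request works; ZK address/collection are placeholders.
    CloudSolrServer server = new CloudSolrServer("localhost:2181");
    server.setDefaultCollection("collection1");

    SolrQuery q = new SolrQuery("*:*");
    q.set("debug", "track"); // or debugQuery=true, which would include it

    QueryResponse rsp = server.query(q);
    // With the patch, the per-phase QTime entries show up under "track".
    System.out.println(rsp.getDebugMap().get("track"));
    server.shutdown();
  }
}
{code}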


