Re: Question on -Dtests.iters needing the glob

2012-12-10 Thread Shai Erera
Thanks Dawid. I read LUCENE-4463; I'm glad that others want this feature
too, and I hope that someone will find a solution.

I'll continue to use -Dtests.iters=N and tests.method=X*, in a single JVM.

At the moment, I don't think that -Dtests.dups is very useful, at least not
to me, as:
1) I often use 'iters' to try and find a seed that breaks my changes
2) If the failure in question is in a concurrent test, tests.dups isn't
very efficient

This is a great framework. Parallelizing w/ different seeds will make it
even better!
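The seed behavior discussed in this thread can be sketched with a toy model (an illustrative sketch only, not the actual RandomizedRunner code; the class and method names below are invented): a single master seed deterministically derives one sub-seed per repetition, so a fixed -Dtests.seed replays identical iterations while each iteration still gets its own seed.

```java
import java.util.Random;

// Toy model of seed derivation: NOT the actual RandomizedRunner implementation.
public class SeedDerivation {

    // Derive one sub-seed per repetition from a single master seed.
    static long[] deriveSeeds(long masterSeed, int iters) {
        Random master = new Random(masterSeed);
        long[] seeds = new long[iters];
        for (int i = 0; i < iters; i++) {
            seeds[i] = master.nextLong(); // each iteration gets its own derived seed
        }
        return seeds;
    }

    public static void main(String[] args) {
        long[] first = deriveSeeds(0xdeadbeefL, 3);
        long[] second = deriveSeeds(0xdeadbeefL, 3);
        for (int i = 0; i < 3; i++) {
            // same master seed -> identical derived seeds, hence reproducible runs
            if (first[i] != second[i]) throw new AssertionError("runs differ");
        }
        System.out.println("derived seeds are reproducible");
    }
}
```

Under this model, re-running a duplicated suite with the same master seed necessarily replays the same derived seeds, which matches the description of tests.dups producing identical runs.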

Shai

On Sun, Dec 9, 2012 at 3:52 PM, Dawid Weiss wrote:

> > So now I'm even more confused. What is tests.dups good for then? Just
> running the same suite multiple times, but not changing the seed?
>
> Yes. This was originally requested by Mark (I believe) -- we had a
> non-deterministic test for which many repetitions were needed. Because
> tests.iters cannot be parallelised (they're a sequence of tests under
> a single suite) we duplicated the entire suites -- these are
> independent and could be distributed across forked jvms.
>
> > Dawid, can tests.dups use different seeds? Then it'd be really useful
> cause it can run these iters in parallel.
>
> No, not at the moment. I do realize it's a pain and I keep in mind
> that it'd be great to have such a possibility, but it conflicts
> with the architecture of the runner right now. I don't have a clear
> vision of how to solve this, but I know it's not going to be trivial.
>
> There is an issue, check out the discussion:
> https://issues.apache.org/jira/browse/LUCENE-4463
>
> in particular this one:
>
> https://issues.apache.org/jira/browse/LUCENE-4463?focusedCommentId=13470937&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13470937
>
> > -Dtests.iters generate different seeds for each suite/run. It's good if
> you
> > have a random test and want to test its (or the code's) robustness.
>
> tests.iters just multiplies every test N times. Whether it gets a
> constant seed or a derivative seed depends on annotations and/or
> -Dtests.seed override. By default what you say is true but try running
> your tests with:
>
> -Dtests.iters=10 -Dtests.seed=deadbeef:deadf00d
>
> and you'll see what I mean.
>
> > -Dtests.dups uses the same seed for every suite, but can run in parallel
> JVMs.
>
> tests.dups is essentially feeding the same class name to the runner
> twice. There is one master seed, so both classes should run
> IDENTICALLY. You can speed up stress testing if you have multiple
> cores or if you wish to run the same test over and over (even with a
> single JVM), but it's going to be the same test (unless it's
> non-deterministic or in some other way does not depend on the seed).
>
> > But I'm sure Dawid has thought about it, and there's some JUnit
> limitation
> > that does not allow it :).
>
> This time it's not even JUnit but the code I wrote -- it strictly
> follows the principle of a single master seed. That worked great for
> other things, but I didn't think of repeating the same suite many
> times, each time with a different seed, and it's hard to integrate
> right now.
>
> I'm sure there's a way to rewrite the code, I just didn't have the
> time to look into it -- lots of stuff piling up, sorry.
>
> Dawid
>
> -
> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
> For additional commands, e-mail: dev-h...@lucene.apache.org
>
>


[jira] [Commented] (LUCENE-4601) ivy availability check isn't quite right

2012-12-10 Thread Dawid Weiss (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4601?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13527763#comment-13527763
 ] 

Dawid Weiss commented on LUCENE-4601:
-

Nice, didn't know about it either.

> ivy availability check isn't quite right
> 
>
> Key: LUCENE-4601
> URL: https://issues.apache.org/jira/browse/LUCENE-4601
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: general/build
>Reporter: Robert Muir
> Attachments: LUCENE-4601.patch
>
>
> remove ivy from your .ant/lib but load it up on a build file like so:
> You have to lie to Lucene's build, overriding ivy.available, because for some 
> reason the detection is wrong and will tell you ivy is not available, when it 
> actually is.
> I tried changing the detector to use available classname=some.ivy.class and 
> this didn't work either... so I don't actually know what the correct fix is.
> {noformat}
> <project name="test" default="resolve" xmlns:ivy="antlib:org.apache.ivy.ant">
>   <path id="ivy.lib.path">
>     <fileset dir="lib" includes="ivy-*.jar"/>
>   </path>
>   <taskdef resource="org/apache/ivy/ant/antlib.xml"
>            uri="antlib:org.apache.ivy.ant" classpathref="ivy.lib.path"/>
>   <target name="resolve">
>     <ant dir="lucene" inheritall="false" failonerror="true">
>       <property name="ivy.available" value="true"/>
>     </ant>
>   </target>
> </project>
> {noformat}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-4345) Create a Classification module

2012-12-10 Thread Commit Tag Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13527765#comment-13527765
 ] 

Commit Tag Bot commented on LUCENE-4345:


[trunk commit] Tommaso Teofili
http://svn.apache.org/viewvc?view=revision&revision=1419258

[LUCENE-4345] - improved DS performance by doing commits only once


> Create a Classification module
> --
>
> Key: LUCENE-4345
> URL: https://issues.apache.org/jira/browse/LUCENE-4345
> Project: Lucene - Core
>  Issue Type: New Feature
>Reporter: Tommaso Teofili
>Assignee: Tommaso Teofili
>Priority: Minor
> Attachments: LUCENE-4345_2.patch, LUCENE-4345.patch, 
> SOLR-3700_2.patch, SOLR-3700.patch
>
>
> Lucene/Solr can host huge sets of documents containing lots of information in 
> fields so that these can be used as training examples (w/ features) in order 
> to very quickly create classifiers algorithms to use on new documents and / 
> or to provide an additional service.
> So the idea is to create a contrib module (called 'classification') to host a 
> ClassificationComponent that will use already seen data (the indexed 
> documents / fields) to classify new documents / text fragments.
> The first version will contain a (simplistic) Lucene based Naive Bayes 
> classifier but more implementations should be added in the future.




[jira] [Updated] (LUCENE-4598) Change PayloadIterator to not use top-level reader API

2012-12-10 Thread Shai Erera (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-4598?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shai Erera updated LUCENE-4598:
---

Summary: Change PayloadIterator to not use top-level reader API  (was: 
Facet aggregation should work per segment)

> Change PayloadIterator to not use top-level reader API
> --
>
> Key: LUCENE-4598
> URL: https://issues.apache.org/jira/browse/LUCENE-4598
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: modules/facet
>Reporter: Michael McCandless
> Attachments: LUCENE-4598.patch, LUCENE-4598.patch
>
>
> Currently the facet module uses MultiFields.* to pull the D&PEnum in 
> PayloadIterator, to access the payloads that store the facet ords.
> It then makes heavy use of .advance and .getPayload to visit all docIDs in 
> the result set.
> I think we should get some speedup if we go segment by segment instead ...




[jira] [Commented] (LUCENE-4603) The test framework should report forked JVM PIDs upon heartbeats

2012-12-10 Thread Commit Tag Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4603?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13527781#comment-13527781
 ] 

Commit Tag Bot commented on LUCENE-4603:


[trunk commit] Dawid Weiss
http://svn.apache.org/viewvc?view=revision&revision=1419261

LUCENE-4603: Upgrade randomized testing to version 2.0.5: print forked JVM PIDs 
on heartbeat from hung tests (Dawid Weiss)




> The test framework should report forked JVM PIDs upon heartbeats
> 
>
> Key: LUCENE-4603
> URL: https://issues.apache.org/jira/browse/LUCENE-4603
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: general/test
>Reporter: Dawid Weiss
>Assignee: Dawid Weiss
>Priority: Trivial
> Fix For: 4.1, 5.0
>
>
> This would help in getting a stack trace of a hung JVM before the timeout 
> and/or in killing the offending JVM.
> RR issue:
> https://github.com/carrotsearch/randomizedtesting/issues/135




[jira] [Commented] (LUCENE-4603) The test framework should report forked JVM PIDs upon heartbeats

2012-12-10 Thread Commit Tag Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4603?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13527786#comment-13527786
 ] 

Commit Tag Bot commented on LUCENE-4603:


[branch_4x commit] Dawid Weiss
http://svn.apache.org/viewvc?view=revision&revision=1419263

LUCENE-4603: Upgrade randomized testing to version 2.0.5: print forked JVM PIDs 
on heartbeat from hung tests (Dawid Weiss)





> The test framework should report forked JVM PIDs upon heartbeats
> 
>
> Key: LUCENE-4603
> URL: https://issues.apache.org/jira/browse/LUCENE-4603
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: general/test
>Reporter: Dawid Weiss
>Assignee: Dawid Weiss
>Priority: Trivial
> Fix For: 4.1, 5.0
>
>
> This would help in getting a stack trace of a hung JVM before the timeout 
> and/or in killing the offending JVM.
> RR issue:
> https://github.com/carrotsearch/randomizedtesting/issues/135




[jira] [Updated] (LUCENE-4598) Change PayloadIterator to not use top-level reader API

2012-12-10 Thread Shai Erera (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-4598?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shai Erera updated LUCENE-4598:
---

Attachment: LUCENE-4598.patch

I've decided not to do two commits, so the attached patch covers the previous one 
and moves to per-segment postings iteration.

I'd appreciate it if someone could review the changes, especially 
PayloadIterator.nextSegment and .setdoc().

I ran tests w/ many iters, all seem ok so far.
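The per-segment approach can be illustrated with a toy model (an assumption-laden sketch, not Lucene's actual API; Segment, docBase layout, and globalDocs are invented names): each segment is visited with segment-local doc IDs, and a per-segment docBase maps them to global IDs, instead of advancing through one merged top-level enum.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of per-segment iteration: NOT Lucene's actual API.
public class PerSegmentSketch {

    static class Segment {
        final int docBase;     // first global doc ID of this segment
        final int[] localDocs; // segment-local doc IDs that match
        Segment(int docBase, int[] localDocs) {
            this.docBase = docBase;
            this.localDocs = localDocs;
        }
    }

    // Visit matching docs segment by segment, mapping local -> global IDs.
    static List<Integer> globalDocs(List<Segment> segments) {
        List<Integer> result = new ArrayList<>();
        for (Segment seg : segments) {
            for (int local : seg.localDocs) {
                result.add(seg.docBase + local);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<Segment> index = List.of(
            new Segment(0, new int[] {0, 3}),   // global docs 0, 3
            new Segment(5, new int[] {1, 2}));  // global docs 6, 7
        System.out.println(globalDocs(index)); // [0, 3, 6, 7]
    }
}
```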

> Change PayloadIterator to not use top-level reader API
> --
>
> Key: LUCENE-4598
> URL: https://issues.apache.org/jira/browse/LUCENE-4598
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: modules/facet
>Reporter: Michael McCandless
> Attachments: LUCENE-4598.patch, LUCENE-4598.patch, LUCENE-4598.patch
>
>
> Currently the facet module uses MultiFields.* to pull the D&PEnum in 
> PayloadIterator, to access the payloads that store the facet ords.
> It then makes heavy use of .advance and .getPayload to visit all docIDs in 
> the result set.
> I think we should get some speedup if we go segment by segment instead ...




[jira] [Resolved] (LUCENE-4603) The test framework should report forked JVM PIDs upon heartbeats

2012-12-10 Thread Dawid Weiss (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-4603?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dawid Weiss resolved LUCENE-4603.
-

Resolution: Fixed

> The test framework should report forked JVM PIDs upon heartbeats
> 
>
> Key: LUCENE-4603
> URL: https://issues.apache.org/jira/browse/LUCENE-4603
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: general/test
>Reporter: Dawid Weiss
>Assignee: Dawid Weiss
>Priority: Trivial
> Fix For: 4.1, 5.0
>
>
> This would help in getting a stack trace of a hung JVM before the timeout 
> and/or in killing the offending JVM.
> RR issue:
> https://github.com/carrotsearch/randomizedtesting/issues/135




[jira] [Created] (LUCENE-4606) The test framework should report forked JVM PIDs at the start of test logs

2012-12-10 Thread Dawid Weiss (JIRA)
Dawid Weiss created LUCENE-4606:
---

 Summary: The test framework should report forked JVM PIDs at the 
start of test logs
 Key: LUCENE-4606
 URL: https://issues.apache.org/jira/browse/LUCENE-4606
 Project: Lucene - Core
  Issue Type: Improvement
Reporter: Dawid Weiss
Assignee: Dawid Weiss
Priority: Trivial
 Fix For: 4.1, 5.0


A follow-up to LUCENE-4603




[JENKINS] Lucene-Solr-Tests-4.x-java7 - Build # 780 - Failure

2012-12-10 Thread Apache Jenkins Server
Build: https://builds.apache.org/job/Lucene-Solr-Tests-4.x-java7/780/

All tests passed

Build Log:
[...truncated 24344 lines...]
BUILD FAILED
/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-Tests-4.x-java7/build.xml:294:
 The following error occurred while executing this line:
/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-Tests-4.x-java7/extra-targets.xml:117:
 The following files are missing svn:eol-style (or binary svn:mime-type):
* lucene/licenses/junit4-ant-2.0.5.jar.sha1
* lucene/licenses/randomizedtesting-runner-2.0.5.jar.sha1
* solr/licenses/junit4-ant-2.0.5.jar.sha1
* solr/licenses/randomizedtesting-runner-2.0.5.jar.sha1

Total time: 31 minutes 13 seconds
Build step 'Invoke Ant' marked build as failure
Archiving artifacts
Recording test results
Email was triggered for: Failure
Sending email for trigger: Failure




[JENKINS] Lucene-Solr-trunk-Linux (32bit/jdk1.7.0_09) - Build # 3152 - Failure!

2012-12-10 Thread Policeman Jenkins Server
Build: http://jenkins.sd-datasolutions.de/job/Lucene-Solr-trunk-Linux/3152/
Java: 32bit/jdk1.7.0_09 -server -XX:+UseSerialGC

All tests passed

Build Log:
[...truncated 29821 lines...]
BUILD FAILED
/mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/build.xml:294: The following 
error occurred while executing this line:
/mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/extra-targets.xml:117: The 
following files are missing svn:eol-style (or binary svn:mime-type):
* lucene/licenses/junit4-ant-2.0.5.jar.sha1
* lucene/licenses/randomizedtesting-runner-2.0.5.jar.sha1
* solr/licenses/junit4-ant-2.0.5.jar.sha1
* solr/licenses/randomizedtesting-runner-2.0.5.jar.sha1

Total time: 32 minutes 53 seconds
Build step 'Invoke Ant' marked build as failure
Archiving artifacts
Recording test results
Description set: Java: 32bit/jdk1.7.0_09 -server -XX:+UseSerialGC
Email was triggered for: Failure
Sending email for trigger: Failure




Re: [JENKINS] Lucene-Solr-Tests-4.x-java7 - Build # 780 - Failure

2012-12-10 Thread Dawid Weiss
My bad, corrected.

Dawid

On Mon, Dec 10, 2012 at 10:28 AM, Apache Jenkins Server
 wrote:
> Build: https://builds.apache.org/job/Lucene-Solr-Tests-4.x-java7/780/
>
> All tests passed
>
> Build Log:
> [...truncated 24344 lines...]
> BUILD FAILED
> /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-Tests-4.x-java7/build.xml:294:
>  The following error occurred while executing this line:
> /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-Tests-4.x-java7/extra-targets.xml:117:
>  The following files are missing svn:eol-style (or binary svn:mime-type):
> * lucene/licenses/junit4-ant-2.0.5.jar.sha1
> * lucene/licenses/randomizedtesting-runner-2.0.5.jar.sha1
> * solr/licenses/junit4-ant-2.0.5.jar.sha1
> * solr/licenses/randomizedtesting-runner-2.0.5.jar.sha1
>
> Total time: 31 minutes 13 seconds
> Build step 'Invoke Ant' marked build as failure
> Archiving artifacts
> Recording test results
> Email was triggered for: Failure
> Sending email for trigger: Failure
>
>
>
>
> -
> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
> For additional commands, e-mail: dev-h...@lucene.apache.org




[JENKINS] Lucene-Solr-Tests-trunk-Java6 - Build # 15683 - Failure

2012-12-10 Thread Apache Jenkins Server
Build: https://builds.apache.org/job/Lucene-Solr-Tests-trunk-Java6/15683/

All tests passed

Build Log:
[...truncated 23954 lines...]
BUILD FAILED
/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-Tests-trunk-Java6/build.xml:294:
 The following error occurred while executing this line:
/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-Tests-trunk-Java6/extra-targets.xml:117:
 The following files are missing svn:eol-style (or binary svn:mime-type):
* lucene/licenses/junit4-ant-2.0.5.jar.sha1
* lucene/licenses/randomizedtesting-runner-2.0.5.jar.sha1
* solr/licenses/junit4-ant-2.0.5.jar.sha1
* solr/licenses/randomizedtesting-runner-2.0.5.jar.sha1

Total time: 22 minutes 55 seconds
Build step 'Invoke Ant' marked build as failure
Archiving artifacts
Recording test results
Email was triggered for: Failure
Sending email for trigger: Failure




[jira] [Reopened] (LUCENE-4590) WriteEnwikiLineDoc which writes Wikipedia category pages to a separate file

2012-12-10 Thread Doron Cohen (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-4590?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Doron Cohen reopened LUCENE-4590:
-

Lucene Fields:   (was: New)

Reopening the issue to make the categories file name method, 
categoriesLineFile(), public so that it can easily be modified in the future 
without breaking application logic.

> WriteEnwikiLineDoc which writes Wikipedia category pages to a separate file
> ---
>
> Key: LUCENE-4590
> URL: https://issues.apache.org/jira/browse/LUCENE-4590
> Project: Lucene - Core
>  Issue Type: New Feature
>  Components: modules/benchmark
>Reporter: Doron Cohen
>Assignee: Doron Cohen
>Priority: Minor
> Attachments: LUCENE-4590.patch
>
>
> It may be convenient to split Wikipedia's line file into two separate files: 
> category-pages and non-category ones. 
> It is possible to split the original line file with grep or such.
> It is more efficient to do it in advance.




[jira] [Commented] (LUCENE-4598) Change PayloadIterator to not use top-level reader API

2012-12-10 Thread Adrien Grand (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4598?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13527826#comment-13527826
 ] 

Adrien Grand commented on LUCENE-4598:
--

Nice that PayloadIterator now returns a direct reference to payloads instead of 
copying data!

> Change PayloadIterator to not use top-level reader API
> --
>
> Key: LUCENE-4598
> URL: https://issues.apache.org/jira/browse/LUCENE-4598
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: modules/facet
>Reporter: Michael McCandless
> Attachments: LUCENE-4598.patch, LUCENE-4598.patch, LUCENE-4598.patch
>
>
> Currently the facet module uses MultiFields.* to pull the D&PEnum in 
> PayloadIterator, to access the payloads that store the facet ords.
> It then makes heavy use of .advance and .getPayload to visit all docIDs in 
> the result set.
> I think we should get some speedup if we go segment by segment instead ...




[JENKINS] Lucene-Solr-4.x-Linux (64bit/jdk1.6.0_37) - Build # 3143 - Still Failing!

2012-12-10 Thread Policeman Jenkins Server
Build: http://jenkins.sd-datasolutions.de/job/Lucene-Solr-4.x-Linux/3143/
Java: 64bit/jdk1.6.0_37 -XX:+UseParallelGC

All tests passed

Build Log:
[...truncated 28873 lines...]
BUILD FAILED
/mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux/build.xml:294: The following 
error occurred while executing this line:
/mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux/extra-targets.xml:117: The 
following files are missing svn:eol-style (or binary svn:mime-type):
* lucene/licenses/junit4-ant-2.0.5.jar.sha1
* lucene/licenses/randomizedtesting-runner-2.0.5.jar.sha1
* solr/licenses/junit4-ant-2.0.5.jar.sha1
* solr/licenses/randomizedtesting-runner-2.0.5.jar.sha1

Total time: 33 minutes 47 seconds
Build step 'Invoke Ant' marked build as failure
Archiving artifacts
Recording test results
Description set: Java: 64bit/jdk1.6.0_37 -XX:+UseParallelGC
Email was triggered for: Failure
Sending email for trigger: Failure




[jira] [Commented] (LUCENE-4598) Change PayloadIterator to not use top-level reader API

2012-12-10 Thread Shai Erera (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4598?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13527831#comment-13527831
 ] 

Shai Erera commented on LUCENE-4598:


Yes ... well, in the past TermPositions didn't maintain a byte[] for payloads; 
you had to give it one. Now that it does, it's stupid to copy the array again :)
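The copy-vs-reference difference being discussed can be shown with a toy example (the class and method names below are invented, not Lucene's API): returning a direct reference to the internal buffer avoids an allocation and a copy, at the cost that the bytes are only valid until the enum advances.

```java
// Toy illustration of returning payload bytes by copy vs. by reference.
public class PayloadRefSketch {
    private final byte[] buffer = new byte[] {1, 2, 3}; // internal payload buffer

    byte[] payloadByCopy() {      // old style: caller gets a private copy
        return buffer.clone();
    }

    byte[] payloadByReference() { // new style: no allocation, no copy
        return buffer;
    }

    public static void main(String[] args) {
        PayloadRefSketch p = new PayloadRefSketch();
        // the copy is a fresh array; the reference is the same backing array
        System.out.println(p.payloadByCopy() == p.payloadByReference());      // false
        System.out.println(p.payloadByReference() == p.payloadByReference()); // true
    }
}
```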

> Change PayloadIterator to not use top-level reader API
> --
>
> Key: LUCENE-4598
> URL: https://issues.apache.org/jira/browse/LUCENE-4598
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: modules/facet
>Reporter: Michael McCandless
> Attachments: LUCENE-4598.patch, LUCENE-4598.patch, LUCENE-4598.patch
>
>
> Currently the facet module uses MultiFields.* to pull the D&PEnum in 
> PayloadIterator, to access the payloads that store the facet ords.
> It then makes heavy use of .advance and .getPayload to visit all docIDs in 
> the result set.
> I think we should get some speedup if we go segment by segment instead ...




[jira] [Commented] (LUCENE-4590) WriteEnwikiLineDoc which writes Wikipedia category pages to a separate file

2012-12-10 Thread Commit Tag Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13527833#comment-13527833
 ] 

Commit Tag Bot commented on LUCENE-4590:


[trunk commit] Doron Cohen
http://svn.apache.org/viewvc?view=revision&revision=1419317

LUCENE-4590: WriteEnwikiLineDoc "trailing change": make 
categoriesLineFile(File) public.



> WriteEnwikiLineDoc which writes Wikipedia category pages to a separate file
> ---
>
> Key: LUCENE-4590
> URL: https://issues.apache.org/jira/browse/LUCENE-4590
> Project: Lucene - Core
>  Issue Type: New Feature
>  Components: modules/benchmark
>Reporter: Doron Cohen
>Assignee: Doron Cohen
>Priority: Minor
> Attachments: LUCENE-4590.patch
>
>
> It may be convenient to split Wikipedia's line file into two separate files: 
> category-pages and non-category ones. 
> It is possible to split the original line file with grep or such.
> It is more efficient to do it in advance.




[jira] [Commented] (LUCENE-4590) WriteEnwikiLineDoc which writes Wikipedia category pages to a separate file

2012-12-10 Thread Commit Tag Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13527856#comment-13527856
 ] 

Commit Tag Bot commented on LUCENE-4590:


[branch_4x commit] Doron Cohen
http://svn.apache.org/viewvc?view=revision&revision=1419323

LUCENE-4590: WriteEnwikiLineDoc "trailing change": make 
categoriesLineFile(File) public.



> WriteEnwikiLineDoc which writes Wikipedia category pages to a separate file
> ---
>
> Key: LUCENE-4590
> URL: https://issues.apache.org/jira/browse/LUCENE-4590
> Project: Lucene - Core
>  Issue Type: New Feature
>  Components: modules/benchmark
>Reporter: Doron Cohen
>Assignee: Doron Cohen
>Priority: Minor
> Attachments: LUCENE-4590.patch
>
>
> It may be convenient to split Wikipedia's line file into two separate files: 
> category-pages and non-category ones. 
> It is possible to split the original line file with grep or such.
> It is more efficient to do it in advance.




[JENKINS] Lucene-Solr-4.x-Windows (64bit/jdk1.7.0_09) - Build # 2116 - Failure!

2012-12-10 Thread Policeman Jenkins Server
Build: http://jenkins.sd-datasolutions.de/job/Lucene-Solr-4.x-Windows/2116/
Java: 64bit/jdk1.7.0_09 -XX:+UseParallelGC

All tests passed

Build Log:
[...truncated 29650 lines...]
BUILD FAILED
C:\Users\JenkinsSlave\workspace\Lucene-Solr-4.x-Windows\build.xml:294: The 
following error occurred while executing this line:
C:\Users\JenkinsSlave\workspace\Lucene-Solr-4.x-Windows\extra-targets.xml:117: 
The following files are missing svn:eol-style (or binary svn:mime-type):
* lucene/licenses/junit4-ant-2.0.5.jar.sha1
* lucene/licenses/randomizedtesting-runner-2.0.5.jar.sha1
* solr/licenses/junit4-ant-2.0.5.jar.sha1
* solr/licenses/randomizedtesting-runner-2.0.5.jar.sha1

Total time: 62 minutes 58 seconds
Build step 'Invoke Ant' marked build as failure
Archiving artifacts
Recording test results
Description set: Java: 64bit/jdk1.7.0_09 -XX:+UseParallelGC
Email was triggered for: Failure
Sending email for trigger: Failure




[jira] [Resolved] (LUCENE-4590) WriteEnwikiLineDoc which writes Wikipedia category pages to a separate file

2012-12-10 Thread Doron Cohen (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-4590?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Doron Cohen resolved LUCENE-4590.
-

Resolution: Fixed

done.

> WriteEnwikiLineDoc which writes Wikipedia category pages to a separate file
> ---
>
> Key: LUCENE-4590
> URL: https://issues.apache.org/jira/browse/LUCENE-4590
> Project: Lucene - Core
>  Issue Type: New Feature
>  Components: modules/benchmark
>Reporter: Doron Cohen
>Assignee: Doron Cohen
>Priority: Minor
> Attachments: LUCENE-4590.patch
>
>
> It may be convenient to split Wikipedia's line file into two separate files: 
> category-pages and non-category ones. 
> It is possible to split the original line file with grep or such.
> It is more efficient to do it in advance.




[jira] [Commented] (LUCENE-4591) Make StoredFieldsFormat more configurable

2012-12-10 Thread Renaud Delbru (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13527894#comment-13527894
 ] 

Renaud Delbru commented on LUCENE-4591:
---

That is fine for me. Thanks for your help Adrien.

> Make StoredFieldsFormat more configurable
> -
>
> Key: LUCENE-4591
> URL: https://issues.apache.org/jira/browse/LUCENE-4591
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: core/codecs
>Affects Versions: 4.1
>Reporter: Renaud Delbru
> Fix For: 4.1
>
> Attachments: LUCENE-4591.patch, LUCENE-4591.patch, LUCENE-4591.patch, 
> PerFieldStoredFieldsFormat.java, PerFieldStoredFieldsReader.java, 
> PerFieldStoredFieldsWriter.java
>
>
> The current StoredFieldsFormat is implemented with the assumption that only 
> one type of StoredFieldsFormat is used by the index.
> We would like to be able to configure a StoredFieldsFormat per field, 
> similarly to the PostingsFormat.
> There are a few issues that need to be solved to allow that:
> 1) allowing a segment suffix to be configured on the StoredFieldsFormat
> 2) implementing the SPI interface in StoredFieldsFormat 
> 3) creating a PerFieldStoredFieldsFormat
> We propose to start with 1) by modifying the signatures of 
> StoredFieldsFormat#fieldsReader and StoredFieldsFormat#fieldsWriter so that 
> they use SegmentReadState and SegmentWriteState instead of the current set of 
> parameters.
> Let us know what you think about this idea. If this is of interest, we can 
> contribute a first patch for 1).




[jira] [Created] (SOLR-4160) eDismax should not split search terms between letters and digits

2012-12-10 Thread Leonhard Maylein (JIRA)
Leonhard Maylein created SOLR-4160:
--

 Summary: eDismax should not split search terms between letters and 
digits
 Key: SOLR-4160
 URL: https://issues.apache.org/jira/browse/SOLR-4160
 Project: Solr
  Issue Type: Bug
  Components: query parsers
Affects Versions: 4.0
Reporter: Leonhard Maylein


The eDismax handler parses the query
is:038729080x into
+((is:038729080 is:x)~2)

The query parser should not split camel-case
words or mixtures of letters and digits;
that is the job of the analyzers.

Otherwise, special types of data (like ISBN
or ISSN numbers) cannot be searched via the
eDismax query parser.




[jira] [Commented] (LUCENE-4601) ivy availability check isn't quite right

2012-12-10 Thread Commit Tag Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4601?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13527899#comment-13527899
 ] 

Commit Tag Bot commented on LUCENE-4601:


[trunk commit] Robert Muir
http://svn.apache.org/viewvc?view=revision&revision=1419366

LUCENE-4601: fix ivy availability check to use typefound


> ivy availability check isn't quite right
> 
>
> Key: LUCENE-4601
> URL: https://issues.apache.org/jira/browse/LUCENE-4601
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: general/build
>Reporter: Robert Muir
> Attachments: LUCENE-4601.patch
>
>
> Remove ivy from your .ant/lib but load it up in a build file like so: you 
> have to lie to Lucene's build, overriding ivy.available, because for some 
> reason the detection is wrong and will tell you ivy is not available when it 
> actually is.
> I tried changing the detector to use available classname=some.ivy.class and 
> this didn't work either... so I don't actually know what the correct fix is.
> {noformat}
> 
>   
> 
>   
>uri="antlib:org.apache.ivy.ant" classpathref="ivy.lib.path" />
>   
>  failonerror="true">
>   
>   
>   
> 
>   
> 
> {noformat}




[jira] [Resolved] (LUCENE-4601) ivy availability check isn't quite right

2012-12-10 Thread Robert Muir (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-4601?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Muir resolved LUCENE-4601.
-

   Resolution: Fixed
Fix Version/s: 5.0
   4.1

Thanks Ryan!

> ivy availability check isn't quite right
> 
>
> Key: LUCENE-4601
> URL: https://issues.apache.org/jira/browse/LUCENE-4601
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: general/build
>Reporter: Robert Muir
> Fix For: 4.1, 5.0
>
> Attachments: LUCENE-4601.patch
>
>
> Remove ivy from your .ant/lib but load it up in a build file like so: you 
> have to lie to Lucene's build, overriding ivy.available, because for some 
> reason the detection is wrong and will tell you ivy is not available when it 
> actually is.
> I tried changing the detector to use available classname=some.ivy.class and 
> this didn't work either... so I don't actually know what the correct fix is.
> {noformat}
> 
>   
> 
>   
>uri="antlib:org.apache.ivy.ant" classpathref="ivy.lib.path" />
>   
>  failonerror="true">
>   
>   
>   
> 
>   
> 
> {noformat}




[jira] [Commented] (LUCENE-4598) Change PayloadIterator to not use top-level reader API

2012-12-10 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4598?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13527905#comment-13527905
 ] 

Michael McCandless commented on LUCENE-4598:


+1, looks great!

And it looks like it's a bit faster than before:

{noformat}
Task        QPS base  StdDev    QPS comp  StdDev        Pct diff
LowTerm        28.35  (1.4%)       29.42  (0.8%)    3.8% (1% - 6%)
HighTerm        2.46  (0.6%)        2.57  (0.5%)    4.8% (3% - 5%)
MedTerm        13.09  (1.4%)       13.92  (0.5%)    6.4% (4% - 8%)
{noformat}

I think we could speed things up more if this code "owned" the iteration, e.g. 
with some sort of "bulk accumulate" method, rather than 
StandardFacetAccumulator going through CategoryListIterator down to 
PayloadIterator per hit. That way it could iterate by segment on the outer 
loop and then, inside, iterate over all docs in that segment, etc.  But save 
that for another day ...
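A minimal sketch of what that "bulk accumulate" shape could look like. All names here (SegmentAccumulatorSketch, the int[][] toy segment model) are made up for illustration and are not the facet module's actual API: the outer loop walks segments, the inner loop walks the hits of that segment, so the per-hit dispatch through a top-level iterator disappears.

```java
// Hypothetical sketch, not the real facet accumulator. A "segment" is
// modeled as the list of matching docIDs plus a per-doc ordinal lookup.
public class SegmentAccumulatorSketch {

    // segMatchingDocs[seg] = docIDs that matched in segment seg;
    // segOrdOfDoc[seg][doc] = the facet ordinal stored for that doc.
    public static int[] accumulate(int[][] segMatchingDocs,
                                   int[][] segOrdOfDoc,
                                   int numOrds) {
        int[] counts = new int[numOrds];
        for (int seg = 0; seg < segMatchingDocs.length; seg++) { // outer loop: segments
            int[] ords = segOrdOfDoc[seg];
            for (int doc : segMatchingDocs[seg]) {               // inner loop: hits in segment
                counts[ords[doc]]++;
            }
        }
        return counts;
    }
}
```

The point of the shape is that all per-segment state (here just `ords`) is resolved once per segment instead of once per hit.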

> Change PayloadIterator to not use top-level reader API
> --
>
> Key: LUCENE-4598
> URL: https://issues.apache.org/jira/browse/LUCENE-4598
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: modules/facet
>Reporter: Michael McCandless
> Attachments: LUCENE-4598.patch, LUCENE-4598.patch, LUCENE-4598.patch
>
>
> Currently the facet module uses MultiFields.* to pull the D&PEnum in 
> PayloadIterator, to access the payloads that store the facet ords.
> It then makes heavy use of .advance and .getPayload to visit all docIDs in 
> the result set.
> I think we should get some speedup if we go segment by segment instead ...




[jira] [Commented] (LUCENE-4601) ivy availability check isn't quite right

2012-12-10 Thread Commit Tag Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4601?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13527903#comment-13527903
 ] 

Commit Tag Bot commented on LUCENE-4601:


[branch_4x commit] Robert Muir
http://svn.apache.org/viewvc?view=revision&revision=1419368

LUCENE-4601: fix ivy availability check to use typefound


> ivy availability check isn't quite right
> 
>
> Key: LUCENE-4601
> URL: https://issues.apache.org/jira/browse/LUCENE-4601
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: general/build
>Reporter: Robert Muir
> Fix For: 4.1, 5.0
>
> Attachments: LUCENE-4601.patch
>
>
> Remove ivy from your .ant/lib but load it up in a build file like so: you 
> have to lie to Lucene's build, overriding ivy.available, because for some 
> reason the detection is wrong and will tell you ivy is not available when it 
> actually is.
> I tried changing the detector to use available classname=some.ivy.class and 
> this didn't work either... so I don't actually know what the correct fix is.
> {noformat}
> 
>   
> 
>   
>uri="antlib:org.apache.ivy.ant" classpathref="ivy.lib.path" />
>   
>  failonerror="true">
>   
>   
>   
> 
>   
> 
> {noformat}




[jira] [Commented] (SOLR-2894) Implement distributed pivot faceting

2012-12-10 Thread Shahar Davidson (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13527907#comment-13527907
 ] 

Shahar Davidson commented on SOLR-2894:
---

Just thought I'd add some feedback on this valuable patch.

I ran some tests with the latest patch (Nov. 12) and the limit-per-field 
feature seems to be working all right. (Nice job Chris!)

I did, however, encounter two other issues (which I guess are related):
(1) There is no default sorting method, i.e. if no facet.sort is specified 
then results are not sorted. (This is a deviation from the current 
non-distributed pivot faceting behavior.)
(2) Sorting per field does not work (i.e. f..facet.sort= does 
not work).

> Implement distributed pivot faceting
> 
>
> Key: SOLR-2894
> URL: https://issues.apache.org/jira/browse/SOLR-2894
> Project: Solr
>  Issue Type: Improvement
>Reporter: Erik Hatcher
> Fix For: 4.1
>
> Attachments: SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, 
> SOLR-2894-reworked.patch
>
>
> Following up on SOLR-792, pivot faceting currently only supports 
> undistributed mode.  Distributed pivot faceting needs to be implemented.




[jira] [Commented] (LUCENE-4598) Change PayloadIterator to not use top-level reader API

2012-12-10 Thread Shai Erera (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4598?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13527910#comment-13527910
 ] 

Shai Erera commented on LUCENE-4598:


Thanks, I'll run tests again just to make sure and commit.

I'm glad that it sped things up. I was skeptical that duplicating MultiDPE 
logic into PayloadIterator would improve anything, but perhaps these results 
show that people should consider moving to the per-segment API. Maybe we need 
a separate benchmark to prove that ...

PayloadIterator is just a utility class that encapsulates logic similar to 
TermsEnum.seekExact. I.e., if DISI had an advanceExact(doc), I'm quite sure 
that we wouldn't need that class. I think that DISI.advanceExact is not that 
complicated to implement, possibly even as a final method (so it doesn't 
affect any DISI out there), by calling advance() and docID(). I'll think 
about it and perhaps open an issue for it.

Then, PayloadIterator could be annihilated completely, and maybe we can make 
CLI a per-segment thing too, and FacetsAccumulator will iterate per-segment. 
It's a bigger change though, so I agree with "save that for another day" :).
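A rough sketch of the advanceExact(doc) idea on a deliberately minimal iterator model. MiniDisi is a made-up stand-in, not Lucene's DocIdSetIterator: the helper is final and built only from advance() and docID(), so concrete iterators would not need to change. As with advance(), targets are assumed to be non-decreasing.

```java
// Hypothetical minimal DISI-like iterator with the proposed advanceExact().
public abstract class MiniDisi {
    public static final int NO_MORE_DOCS = Integer.MAX_VALUE;

    public abstract int docID();             // current doc, -1 before first advance
    public abstract int advance(int target); // move to the first doc >= target

    // The proposed final helper: true iff 'target' is in the doc set.
    // Only advances when the iterator is behind the target.
    public final boolean advanceExact(int target) {
        int doc = docID();
        if (doc < target) {
            doc = advance(target);
        }
        return doc == target;
    }

    // Trivial implementation over a sorted int[] for demonstration only.
    public static MiniDisi fromSorted(int[] docs) {
        return new MiniDisi() {
            private int idx = -1;
            @Override public int docID() {
                return idx < 0 ? -1 : (idx >= docs.length ? NO_MORE_DOCS : docs[idx]);
            }
            @Override public int advance(int target) {
                if (idx < 0) idx = 0;
                while (idx < docs.length && docs[idx] < target) idx++;
                return docID();
            }
        };
    }
}
```

Because advanceExact only calls advance() and docID(), it would not affect any existing iterator implementation, which is the attraction of making it final.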

> Change PayloadIterator to not use top-level reader API
> --
>
> Key: LUCENE-4598
> URL: https://issues.apache.org/jira/browse/LUCENE-4598
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: modules/facet
>Reporter: Michael McCandless
> Attachments: LUCENE-4598.patch, LUCENE-4598.patch, LUCENE-4598.patch
>
>
> Currently the facet module uses MultiFields.* to pull the D&PEnum in 
> PayloadIterator, to access the payloads that store the facet ords.
> It then makes heavy use of .advance and .getPayload to visit all docIDs in 
> the result set.
> I think we should get some speedup if we go segment by segment instead ...




[jira] [Commented] (LUCENE-4598) Change PayloadIterator to not use top-level reader API

2012-12-10 Thread Commit Tag Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4598?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13527918#comment-13527918
 ] 

Commit Tag Bot commented on LUCENE-4598:


[trunk commit] Shai Erera
http://svn.apache.org/viewvc?view=revision&revision=1419397

LUCENE-4598: Change PayloadIterator to not use top-level reader API


> Change PayloadIterator to not use top-level reader API
> --
>
> Key: LUCENE-4598
> URL: https://issues.apache.org/jira/browse/LUCENE-4598
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: modules/facet
>Reporter: Michael McCandless
> Attachments: LUCENE-4598.patch, LUCENE-4598.patch, LUCENE-4598.patch
>
>
> Currently the facet module uses MultiFields.* to pull the D&PEnum in 
> PayloadIterator, to access the payloads that store the facet ords.
> It then makes heavy use of .advance and .getPayload to visit all docIDs in 
> the result set.
> I think we should get some speedup if we go segment by segment instead ...




[jira] [Updated] (LUCENE-4604) Implement OrdinalPolicy.NO_PARENTS

2012-12-10 Thread Shai Erera (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-4604?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shai Erera updated LUCENE-4604:
---

Summary: Implement OrdinalPolicy.NO_PARENTS  (was: Implement 
LeavesOnlyOrdinalPolicy)

> Implement OrdinalPolicy.NO_PARENTS
> --
>
> Key: LUCENE-4604
> URL: https://issues.apache.org/jira/browse/LUCENE-4604
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: modules/facet
>Reporter: Shai Erera
>Assignee: Shai Erera
>Priority: Minor
> Fix For: 4.1, 5.0
>
>
> Over at LUCENE-4602, Mike explored the idea of writing just the leaf nodes in 
> the fulltree posting, rather than the full hierarchy. I wrote this simple 
> OrdinalPolicy which achieves that:
> {code}
> DefaultFacetIndexingParams indexingParams = new DefaultFacetIndexingParams() {
>   @Override
>   protected OrdinalPolicy fixedOrdinalPolicy() {
> return new OrdinalPolicy() {
>   public void init(TaxonomyWriter taxonomyWriter) {}
>   public boolean shouldAdd(int ordinal) { return false; }
> };
>   }
> };
> {code}
> I think that we should add it as a singleton class, 
> OrdinalPolicy.EXACT_CATEGORIES_ONLY, as well as make DefaultOrdinalPolicy a 
> singleton too, under the name FULL_HIERARCHY (feel free to suggest a better 
> name).




[jira] [Resolved] (LUCENE-4598) Change PayloadIterator to not use top-level reader API

2012-12-10 Thread Shai Erera (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-4598?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shai Erera resolved LUCENE-4598.


   Resolution: Fixed
Fix Version/s: 5.0
   4.1
 Assignee: Shai Erera
Lucene Fields: New,Patch Available  (was: New)

Committed to trunk and 4x. Thanks Mike !

> Change PayloadIterator to not use top-level reader API
> --
>
> Key: LUCENE-4598
> URL: https://issues.apache.org/jira/browse/LUCENE-4598
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: modules/facet
>Reporter: Michael McCandless
>Assignee: Shai Erera
> Fix For: 4.1, 5.0
>
> Attachments: LUCENE-4598.patch, LUCENE-4598.patch, LUCENE-4598.patch
>
>
> Currently the facet module uses MultiFields.* to pull the D&PEnum in 
> PayloadIterator, to access the payloads that store the facet ords.
> It then makes heavy use of .advance and .getPayload to visit all docIDs in 
> the result set.
> I think we should get some speedup if we go segment by segment instead ...




[jira] [Commented] (LUCENE-4598) Change PayloadIterator to not use top-level reader API

2012-12-10 Thread Commit Tag Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4598?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13527929#comment-13527929
 ] 

Commit Tag Bot commented on LUCENE-4598:


[branch_4x commit] Shai Erera
http://svn.apache.org/viewvc?view=revision&revision=1419446

LUCENE-4598: Change PayloadIterator to not use top-level reader API


> Change PayloadIterator to not use top-level reader API
> --
>
> Key: LUCENE-4598
> URL: https://issues.apache.org/jira/browse/LUCENE-4598
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: modules/facet
>Reporter: Michael McCandless
>Assignee: Shai Erera
> Fix For: 4.1, 5.0
>
> Attachments: LUCENE-4598.patch, LUCENE-4598.patch, LUCENE-4598.patch
>
>
> Currently the facet module uses MultiFields.* to pull the D&PEnum in 
> PayloadIterator, to access the payloads that store the facet ords.
> It then makes heavy use of .advance and .getPayload to visit all docIDs in 
> the result set.
> I think we should get some speedup if we go segment by segment instead ...




[jira] [Commented] (LUCENE-4591) Make StoredFieldsFormat more configurable

2012-12-10 Thread Commit Tag Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13527930#comment-13527930
 ] 

Commit Tag Bot commented on LUCENE-4591:


[trunk commit] Adrien Grand
http://svn.apache.org/viewvc?view=revision&revision=1419449

LUCENE-4591: Make CompressingStoredFields{Writer,Reader} accept a segment 
suffix as a constructor parameter.



> Make StoredFieldsFormat more configurable
> -
>
> Key: LUCENE-4591
> URL: https://issues.apache.org/jira/browse/LUCENE-4591
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: core/codecs
>Affects Versions: 4.1
>Reporter: Renaud Delbru
> Fix For: 4.1
>
> Attachments: LUCENE-4591.patch, LUCENE-4591.patch, LUCENE-4591.patch, 
> PerFieldStoredFieldsFormat.java, PerFieldStoredFieldsReader.java, 
> PerFieldStoredFieldsWriter.java
>
>
> The current StoredFieldsFormat is implemented with the assumption that only 
> one type of StoredFieldsFormat is used by the index.
> We would like to be able to configure a StoredFieldsFormat per field, 
> similarly to the PostingsFormat.
> There are a few issues that need to be solved to allow that:
> 1) allowing to configure a segment suffix for the StoredFieldsFormat
> 2) implement an SPI interface in StoredFieldsFormat 
> 3) create a PerFieldStoredFieldsFormat
> We are proposing to start with 1) by modifying the signatures of 
> StoredFieldsFormat#fieldsReader and StoredFieldsFormat#fieldsWriter so that 
> they use SegmentReadState and SegmentWriteState instead of the current set of 
> parameters.
> Let us know what you think about this idea. If this is of interest, we can 
> contribute a first patch for 1).




[jira] [Updated] (LUCENE-4604) Implement OrdinalPolicy.NO_PARENTS

2012-12-10 Thread Shai Erera (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-4604?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shai Erera updated LUCENE-4604:
---

Attachment: LUCENE-4604.patch

Patch removes DefaultOrdinalPolicy in favor of OrdinalPolicy.ALL_PARENTS. It 
also adds OrdinalPolicy.NO_PARENTS (related to LUCENE-4600).

In that spirit, I also removed DefaultPathPolicy in favor of 
PathPolicy.ALL_CATEGORIES.

In general, I chose specific names over e.g. DEFAULT_POLICY because I think 
that having 'default' in the name is bad. E.g., if, following the results in 
LUCENE-4600, we decide to change the default policy, the word 'default' in 
the name would be problematic.

I improved the javadocs to better explain what OrdinalPolicy and PathPolicy 
are, and the purpose of each of the new instances.

All tests pass; I think this is ready to commit.
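A hedged sketch of what the named singletons could look like. OrdinalPolicySketch is a simplified stand-in invented here; the real OrdinalPolicy also has an init(TaxonomyWriter) method, omitted for brevity:

```java
// Hypothetical simplified policy interface with the two named singletons
// described above. shouldAdd(ord) decides whether a parent ordinal is
// written for the document.
public interface OrdinalPolicySketch {
    boolean shouldAdd(int ordinal);

    // Index the full parent chain of every category ordinal.
    OrdinalPolicySketch ALL_PARENTS = ordinal -> true;

    // Index only the category ordinal itself, never its parents
    // (the NO_PARENTS behavior explored in LUCENE-4600).
    OrdinalPolicySketch NO_PARENTS = ordinal -> false;
}
```

Naming the constants after their behavior, rather than DEFAULT, keeps the API stable even if the default choice later changes.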

> Implement OrdinalPolicy.NO_PARENTS
> --
>
> Key: LUCENE-4604
> URL: https://issues.apache.org/jira/browse/LUCENE-4604
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: modules/facet
>Reporter: Shai Erera
>Assignee: Shai Erera
>Priority: Minor
> Fix For: 4.1, 5.0
>
> Attachments: LUCENE-4604.patch
>
>
> Over at LUCENE-4602, Mike explored the idea of writing just the leaf nodes in 
> the fulltree posting, rather than the full hierarchy. I wrote this simple 
> OrdinalPolicy which achieves that:
> {code}
> DefaultFacetIndexingParams indexingParams = new DefaultFacetIndexingParams() {
>   @Override
>   protected OrdinalPolicy fixedOrdinalPolicy() {
> return new OrdinalPolicy() {
>   public void init(TaxonomyWriter taxonomyWriter) {}
>   public boolean shouldAdd(int ordinal) { return false; }
> };
>   }
> };
> {code}
> I think that we should add it as a singleton class, 
> OrdinalPolicy.EXACT_CATEGORIES_ONLY, as well as make DefaultOrdinalPolicy a 
> singleton too, under the name FULL_HIERARCHY (feel free to suggest a better 
> name).




[jira] [Commented] (SOLR-4155) upgrade jetty 8.1.7 -> 8.1.8

2012-12-10 Thread Commit Tag Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13527939#comment-13527939
 ] 

Commit Tag Bot commented on SOLR-4155:
--

[trunk commit] Robert Muir
http://svn.apache.org/viewvc?view=revision&revision=1419466

SOLR-4155: upgrade jetty to 8.1.8


> upgrade jetty 8.1.7 -> 8.1.8
> 
>
> Key: SOLR-4155
> URL: https://issues.apache.org/jira/browse/SOLR-4155
> Project: Solr
>  Issue Type: Task
>  Components: Build
>Reporter: Robert Muir
> Attachments: SOLR-4155.patch
>
>
> I think we should upgrade to the latest bugfix version.




[jira] [Commented] (LUCENE-4591) Make StoredFieldsFormat more configurable

2012-12-10 Thread Commit Tag Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13527950#comment-13527950
 ] 

Commit Tag Bot commented on LUCENE-4591:


[branch_4x commit] Adrien Grand
http://svn.apache.org/viewvc?view=revision&revision=1419483

LUCENE-4591: Make CompressingStoredFields{Writer,Reader} accept a segment 
suffix as a constructor parameter (merged from r1419449).



> Make StoredFieldsFormat more configurable
> -
>
> Key: LUCENE-4591
> URL: https://issues.apache.org/jira/browse/LUCENE-4591
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: core/codecs
>Affects Versions: 4.1
>Reporter: Renaud Delbru
> Fix For: 4.1
>
> Attachments: LUCENE-4591.patch, LUCENE-4591.patch, LUCENE-4591.patch, 
> PerFieldStoredFieldsFormat.java, PerFieldStoredFieldsReader.java, 
> PerFieldStoredFieldsWriter.java
>
>
> The current StoredFieldsFormat is implemented with the assumption that only 
> one type of StoredFieldsFormat is used by the index.
> We would like to be able to configure a StoredFieldsFormat per field, 
> similarly to the PostingsFormat.
> There are a few issues that need to be solved to allow that:
> 1) allowing to configure a segment suffix for the StoredFieldsFormat
> 2) implement an SPI interface in StoredFieldsFormat 
> 3) create a PerFieldStoredFieldsFormat
> We are proposing to start with 1) by modifying the signatures of 
> StoredFieldsFormat#fieldsReader and StoredFieldsFormat#fieldsWriter so that 
> they use SegmentReadState and SegmentWriteState instead of the current set of 
> parameters.
> Let us know what you think about this idea. If this is of interest, we can 
> contribute a first patch for 1).




[jira] [Commented] (LUCENE-4602) Use DocValues to store per-doc facet ord

2012-12-10 Thread Shai Erera (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13527958#comment-13527958
 ] 

Shai Erera commented on LUCENE-4602:


I reviewed DocValuesCountingFacetsCollector, nice work !

See my last comment on LUCENE-4565 about taxoReader.getParent, vs. using the 
parents[] directly. Specifically, I wonder if we'll see any gain if we move to 
use the parents[] array directly, instead of getParent (in getFacetResults):

{code}
+  if (count != 0) {
+    int ordUp = taxoReader.getParent(ord); // HERE
+    while (ordUp != 0) {
+      //System.out.println("parent=" + ordUp + " cp=" + taxoReader.getPath(ordUp));
+      counts[ordUp] += count;
+      ordUp = taxoReader.getParent(ordUp); // AND HERE
+    }
+  }
{code}
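For comparison, a hedged sketch of the parents[]-based variant (ParentRollupSketch and its signature are made up for illustration): rolling direct counts up the ancestor chain with plain array lookups instead of getParent() calls. It visits ordinals in increasing order so each ordinal still holds only its direct count when it is pushed up, relying on the taxonomy property that a parent's ordinal is smaller than its children's.

```java
// Hypothetical sketch of rolling up facet counts via a cached parents[]
// array. counts[ord] holds per-ordinal direct counts on entry and
// aggregated counts on exit; parents[ord] is the parent ordinal (root = 0).
public class ParentRollupSketch {

    public static void rollup(int[] counts, int[] parents) {
        for (int ord = 1; ord < counts.length; ord++) {
            int count = counts[ord];          // still the direct count here
            if (count != 0) {
                int up = parents[ord];        // plain array lookup, no getParent() call
                while (up != 0) {             // push up to every ancestor below the root
                    counts[up] += count;
                    up = parents[up];
                }
            }
        }
    }
}
```

Whether this actually beats getParent() would need a benchmark, but it removes a virtual call per step of the ancestor walk.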

> Use DocValues to store per-doc facet ord
> 
>
> Key: LUCENE-4602
> URL: https://issues.apache.org/jira/browse/LUCENE-4602
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Michael McCandless
> Attachments: LUCENE-4602.patch
>
>
> Spinoff from LUCENE-4600
> DocValues can be used to hold the byte[] encoding all facet ords for
> the document, instead of payloads.  I made a hacked up approximation
> of in-RAM DV (see CachedCountingFacetsCollector in the patch) and the
> gains were somewhat surprisingly large:
> {noformat}
> Task        QPS base  StdDev    QPS comp  StdDev          Pct diff
> HighTerm        0.53  (0.9%)        1.00  (2.5%)     87.3% (83% - 91%)
> LowTerm         7.59  (0.6%)       26.75 (12.9%)   252.6% (237% - 267%)
> MedTerm         3.35  (0.7%)       12.71  (9.0%)   279.8% (268% - 291%)
> {noformat}
> I didn't think payloads were THAT slow; I think it must be the advance
> implementation?
> We need to separately test on-disk DV to make sure it's at least
> on-par with payloads (but hopefully faster) and if so ... we should
> cutover facets to using DV.




[jira] [Created] (LUCENE-4607) Add estimateDocCount to DocIdSetIterator

2012-12-10 Thread Simon Willnauer (JIRA)
Simon Willnauer created LUCENE-4607:
---

 Summary: Add estimateDocCount to DocIdSetIterator
 Key: LUCENE-4607
 URL: https://issues.apache.org/jira/browse/LUCENE-4607
 Project: Lucene - Core
  Issue Type: Bug
  Components: core/search
Affects Versions: 4.0
Reporter: Simon Willnauer
 Fix For: 4.1, 5.0


This is essentially a spinoff from LUCENE-4236.
We currently have no way to make any decision about how costly a DISI is, 
either when we apply filters or when we build conjunctions in BQ. Yet we 
already have most of the information and can easily expose it via a cost API, 
such that BS and FilteredQuery can apply optimizations on a per-segment basis.




[jira] [Updated] (LUCENE-4607) Add estimateDocCount to DocIdSetIterator

2012-12-10 Thread Simon Willnauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-4607?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Simon Willnauer updated LUCENE-4607:


Attachment: LUCENE-4607.patch

Here is a patch that adds #estimateDocCount to DISI. It still has some 
nocommits, mainly related to BitSets and cardinality, but I think it's fine as 
a start. I removed the TermConjunction specialization and changed the 
heuristic in FilteredQuery to use the estimated cost.

> Add estimateDocCount to DocIdSetIterator
> 
>
> Key: LUCENE-4607
> URL: https://issues.apache.org/jira/browse/LUCENE-4607
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: core/search
>Affects Versions: 4.0
>Reporter: Simon Willnauer
> Fix For: 4.1, 5.0
>
> Attachments: LUCENE-4607.patch
>
>
> This is essentially a spinoff from LUCENE-4236.
> We currently have no way to make any decision about how costly a DISI is, 
> either when we apply filters or when we build conjunctions in BQ. Yet we 
> already have most of the information and can easily expose it via a cost 
> API, such that BS and FilteredQuery can apply optimizations on a per-segment 
> basis.




[jira] [Commented] (LUCENE-4607) Add estimateDocCount to DocIdSetIterator

2012-12-10 Thread Uwe Schindler (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4607?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13527982#comment-13527982
 ] 

Uwe Schindler commented on LUCENE-4607:
---

Nice idea! At least the FilteredQuery code looks fine to me. I agree that 
FixedBitSet has a costly cardinality (the Bits interface does not have it at 
all...), so we should think about that. As far as I can see, it returns 
maxDoc as the cost.

> Add estimateDocCount to DocIdSetIterator
> 
>
> Key: LUCENE-4607
> URL: https://issues.apache.org/jira/browse/LUCENE-4607
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: core/search
>Affects Versions: 4.0
>Reporter: Simon Willnauer
> Fix For: 4.1, 5.0
>
> Attachments: LUCENE-4607.patch
>
>
> This is essentially a spinoff from LUCENE-4236.
> We currently have no way to make any decision about how costly a DISI is, 
> either when we apply filters or when we build conjunctions in BQ. Yet we 
> already have most of the information and can easily expose it via a cost 
> API, such that BS and FilteredQuery can apply optimizations on a per-segment 
> basis.




[jira] [Created] (LUCENE-4608) Handle large number of requested fragments better.

2012-12-10 Thread Martijn van Groningen (JIRA)
Martijn van Groningen created LUCENE-4608:
-

 Summary: Handle large number of requested fragments better.
 Key: LUCENE-4608
 URL: https://issues.apache.org/jira/browse/LUCENE-4608
 Project: Lucene - Core
  Issue Type: Improvement
  Components: modules/highlighter
Affects Versions: 4.0
Reporter: Martijn van Groningen
Assignee: Martijn van Groningen
Priority: Minor
 Fix For: 4.1







[jira] [Commented] (LUCENE-4607) Add estimateDocCount to DocIdSetIterator

2012-12-10 Thread Uwe Schindler (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4607?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13527986#comment-13527986
 ] 

Uwe Schindler commented on LUCENE-4607:
---

One thing: estimateDocCount returns a long, but docIDs in Lucene are still 
ints. This makes no sense, because the current scorer/DISI interface can never 
return anything > Integer.MAX_VALUE, so the estimated doc count can never need 
64 bits.

We should maybe rethink making docIDs in Lucene longs in the future, but until 
that happens we should not mix both datatypes in public APIs; it would cause 
confusion.





[jira] [Updated] (LUCENE-4608) Handle large number of requested fragments better.

2012-12-10 Thread Martijn van Groningen (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-4608?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Martijn van Groningen updated LUCENE-4608:
--

Attachment: LUCENE-4608.patch

Small improvement when requesting a high number of fragments from the FVH. For 
example, this allows users to specify Integer.MAX_VALUE.

> Handle large number of requested fragments better.
> --
>
> Key: LUCENE-4608
> URL: https://issues.apache.org/jira/browse/LUCENE-4608
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: modules/highlighter
>Affects Versions: 4.0
>Reporter: Martijn van Groningen
>Assignee: Martijn van Groningen
>Priority: Minor
> Fix For: 4.1
>
> Attachments: LUCENE-4608.patch
>
>





[jira] [Commented] (LUCENE-4607) Add estimateDocCount to DocIdSetIterator

2012-12-10 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4607?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13527995#comment-13527995
 ] 

Robert Muir commented on LUCENE-4607:
-

When I did the cost estimate patch on LUCENE-4236, I chose a long too, but 
there it was trying to estimate the number of documents visited,
e.g. the number of postings.

So the formula for a conjunction would be min(subscorer cost) * #subscorers, 
for a disjunction it's just the sum of all the subscorer costs, and so on.

I felt that for scoring purposes this is more useful than the number of 
documents, but that's just my opinion.
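
The formulas above can be sketched directly. This is a hypothetical 
illustration of the estimate, not Lucene's actual API; the helper names 
(conjunctionCost, disjunctionCost) are made up:

```java
import java.util.Arrays;

public class CostEstimate {
    // Conjunction (AND): the driver is the cheapest sub-scorer; each of its
    // postings may force an advance() on every other sub-scorer, so the
    // upper bound is min(sub costs) * number of sub-scorers.
    static long conjunctionCost(long[] subCosts) {
        return Arrays.stream(subCosts).min().orElse(0) * subCosts.length;
    }

    // Disjunction (OR): every posting of every sub-scorer may be visited,
    // so the estimate is simply the sum of the sub-scorer costs.
    static long disjunctionCost(long[] subCosts) {
        return Arrays.stream(subCosts).sum();
    }

    public static void main(String[] args) {
        long[] costs = {100, 2000, 50};
        System.out.println(conjunctionCost(costs)); // 50 * 3 = 150
        System.out.println(disjunctionCost(costs)); // 2150
    }
}
```

Note how a selective conjunction gets a much lower estimate than a disjunction 
over the same clauses, which is exactly what a per-segment optimizer would want 
to exploit.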





[jira] [Resolved] (LUCENE-4604) Implement OrdinalPolicy.NO_PARENTS

2012-12-10 Thread Shai Erera (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-4604?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shai Erera resolved LUCENE-4604.


   Resolution: Fixed
Lucene Fields: New,Patch Available  (was: New)

Committed changes to trunk and 4x.

> Implement OrdinalPolicy.NO_PARENTS
> --
>
> Key: LUCENE-4604
> URL: https://issues.apache.org/jira/browse/LUCENE-4604
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: modules/facet
>Reporter: Shai Erera
>Assignee: Shai Erera
>Priority: Minor
> Fix For: 4.1, 5.0
>
> Attachments: LUCENE-4604.patch
>
>
> Over at LUCENE-4602, Mike explored the idea of writing just the leaf nodes in 
> the fulltree posting, rather than the full hierarchy. I wrote this simple 
> OrdinalPolicy which achieves that:
> {code}
> DefaultFacetIndexingParams indexingParams = new DefaultFacetIndexingParams() {
>   @Override
>   protected OrdinalPolicy fixedOrdinalPolicy() {
> return new OrdinalPolicy() {
>   public void init(TaxonomyWriter taxonomyWriter) {}
>   public boolean shouldAdd(int ordinal) { return false; }
> };
>   }
> };
> {code}
> I think that we should add it as a singleton to 
> OrdinalPolicy.EXACT_CATEGORIES_ONLY, as well as make DefaultOrdPolicy a 
> singleton too, under the name FULL_HIERARCHY (feel free to suggest a better 
> name).




[jira] [Commented] (LUCENE-4604) Implement OrdinalPolicy.NO_PARENTS

2012-12-10 Thread Commit Tag Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13528001#comment-13528001
 ] 

Commit Tag Bot commented on LUCENE-4604:


[trunk commit] Shai Erera
http://svn.apache.org/viewvc?view=revision&revision=1419521

LUCENE-4604: Implement OrdinalPolicy.NO_PARENTS






[jira] [Commented] (LUCENE-4604) Implement OrdinalPolicy.NO_PARENTS

2012-12-10 Thread Commit Tag Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13528005#comment-13528005
 ] 

Commit Tag Bot commented on LUCENE-4604:


[branch_4x commit] Shai Erera
http://svn.apache.org/viewvc?view=revision&revision=1419524

LUCENE-4604: Implement OrdinalPolicy.NO_PARENTS






[jira] [Commented] (LUCENE-4607) Add estimateDocCount to DocIdSetIterator

2012-12-10 Thread Simon Willnauer (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4607?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13528013#comment-13528013
 ] 

Simon Willnauer commented on LUCENE-4607:
-

I tend to agree with Robert that using longs makes things a lot easier here 
too: we don't need to deal with int overflows. Maybe we should rename it to 
estimateDocsVisited?
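
The overflow point is easy to demonstrate. This toy snippet (not Lucene code) 
shows two int-sized per-clause costs summing past Integer.MAX_VALUE:

```java
public class CostOverflow {
    public static void main(String[] args) {
        int cost = Integer.MAX_VALUE / 2 + 1;   // 1,073,741,824 postings
        int intSum = cost + cost;               // overflows: wraps negative
        long longSum = (long) cost + cost;      // correct with 64 bits
        System.out.println(intSum);             // -2147483648
        System.out.println(longSum);            // 2147483648
    }
}
```

A long-valued cost sidesteps this even though each individual docID still fits 
in an int.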






[jira] [Commented] (SOLR-1028) Automatic core loading unloading for multicore

2012-12-10 Thread Erick Erickson (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13528011#comment-13528011
 ] 

Erick Erickson commented on SOLR-1028:
--

Has anyone with a machine on which testLazyCores used to fail seen failures 
since I committed my attempted fix? I committed SOLR-4149 last week; I don't 
quite know whether enough time has passed to really say it's fixed, unless 
we're seeing more failures.

FWIW, if it's what I think it was, it was a test artifact rather than the 
underlying code. I can hope, anyway.

> Automatic core loading unloading for multicore
> --
>
> Key: SOLR-1028
> URL: https://issues.apache.org/jira/browse/SOLR-1028
> Project: Solr
>  Issue Type: New Feature
>  Components: multicore
>Affects Versions: 4.0, 5.0
>Reporter: Noble Paul
>Assignee: Erick Erickson
> Fix For: 4.1, 5.0
>
> Attachments: jenkins.jpg, SOLR-1028.patch, SOLR-1028.patch, 
> SOLR-1028_testnoise.patch
>
>
> usecase: I have many small cores (say one per user) on a single Solr box. 
> All the cores are not always needed, but when I need one I should be able 
> to directly issue a search request, and the core must be STARTED automatically 
> and the request served.
> This also requires an upper limit on the number of cores that 
> should be loaded at any given point in time. If the limit is crossed, the 
> CoreContainer must unload a core (preferably the least recently used one).  
> There must be a choice of specifying some cores as fixed; these cores must 
> never be unloaded.




[jira] [Commented] (LUCENE-4607) Add estimateDocCount to DocIdSetIterator

2012-12-10 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4607?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13528017#comment-13528017
 ] 

Robert Muir commented on LUCENE-4607:
-

The other idea (just for discussion) would be "number of i/os".

So for example, PhraseQuery's formula would likely use totalTermFreq rather 
than docFreq.
This would reflect the fact that it's much more expensive than a conjunction.
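
To illustrate the "number of i/os" idea: a term scorer reads roughly one 
posting per matching document, while a phrase must also read every position, 
so totalTermFreq is the better proxy for the work done. This is a hypothetical 
sketch, not the LUCENE-4236 patch:

```java
public class IoCostSketch {
    // A single term: roughly one posting read per matching document.
    static long termCost(long docFreq) {
        return docFreq;
    }

    // A phrase: positions are read for every occurrence of every term,
    // so sum the totalTermFreq of the terms involved.
    static long phraseCost(long[] totalTermFreqs) {
        long cost = 0;
        for (long ttf : totalTermFreqs) {
            cost += ttf;
        }
        return cost;
    }

    public static void main(String[] args) {
        // two terms, each matching 1000 docs but occurring 5000 times total:
        System.out.println(termCost(1000));                     // 1000
        System.out.println(phraseCost(new long[]{5000, 5000})); // 10000
    }
}
```

With the same docFreq, the phrase is estimated an order of magnitude more 
expensive, capturing the extra position reads.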





RE: lost entries in trunk/lecene/CHANGES.txt

2012-12-10 Thread Dyer, James
I do apologize for causing problems.  But I do usually merge.  However, if it 
is a trivial change (say, just a small test fix), it is a ton faster to just 
make the change to both branches instead of doing a merge.  I guess I do not 
understand why this causes problems with seemingly unrelated code (I can be 
pretty sure the code involved with LUCENE-4585 is entirely separate from the 
code I've been modifying).  Is it really a bad thing to make a trivial change 
this way?

Perhaps the issue is that when I do a merge, if I notice directories that have 
property changes only, I omit them.  Should I be including these?  Often these 
are seemingly random directories and I never quite understand why they are 
being included.  (Maybe it's just my ignorance of svn.)  Perhaps this is the 
problem?

James Dyer
E-Commerce Systems
Ingram Content Group
(615) 213-4311

From: Uwe Schindler [mailto:u...@thetaphi.de]
Sent: Sunday, December 09, 2012 3:59 AM
To: dev@lucene.apache.org
Subject: RE: lost entries in trunk/lecene/CHANGES.txt

Hi,

I checked a little bit in the commit logs what was going on. From what I can 
reconstruct:

-James Dyer did not use SVN merging to 4.x; he copied the whole file 
into the 4.x folder. This explains why the 5.0 changes entries suddenly 
appeared in the 4.x branch (which I removed yesterday). James seems to never 
merge his changes between branches; he applies patches several times or just 
copies files.

-In the commit where the entries got lost, which Doron restored an hour 
ago, an older version of the CHANGES.txt file seems to have been copied over 
the newer version in SVN. This should be impossible with SVN, unless you “svn 
up” your working directory and resolve the conflicts by telling SVN to use the 
older modified (“your”) version instead of doing a 3-way merge. One should use 
a 3-way merge for this (e.g. with TortoiseSVN or Subclipse or by hand, arrgh 
☺). It looks like James created the patch with an older SVN checkout but failed 
to merge the changes.

James: Can you in the future please use “svn merge” (or the corresponding 
workflow in your GUI) to merge changes between branches? This merge adds 
special “properties” to the SVN log, so one can find out which patches were 
merged between branches. E.g. TortoiseSVN and Subclipse show those in a 
different color in the commit log, which helps immensely when you are about to 
merge some changes. If you need some help with merging correctly, read 
http://wiki.apache.org/lucene-java/SvnMerge or just ask me.

Uwe

-
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de

From: Uwe Schindler [mailto:u...@thetaphi.de]
Sent: Sunday, December 09, 2012 10:15 AM
To: dev@lucene.apache.org
Subject: RE: lost entries in trunk/lecene/CHANGES.txt

They were partly missing in 4.x as well (though in a different way). I synced 
the part from version 4.1 down to version 0 with trunk; 3 entries were 
missing. Trunk now only has 5.0 as an additional section; the remaining 
content is identical.

-
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de

From: Doron Cohen [mailto:cdor...@gmail.com]
Sent: Sunday, December 09, 2012 9:30 AM
To: dev@lucene.apache.org
Subject: lost entries in trunk/lecene/CHANGES.txt

Hi, it seems some entries were lost when committing LUCENE-4585 (Spatial 
PrefixTree based Strategies):
http://svn.apache.org/viewvc/lucene/dev/trunk/lucene/CHANGES.txt?r1=1418005&r2=1418006&pathrev=1418006&view=diff
I think I'll just add them back...
Doron


Re: lost entries in trunk/lecene/CHANGES.txt

2012-12-10 Thread Robert Muir
On Mon, Dec 10, 2012 at 10:53 AM, Dyer, James
 wrote:

> Perhaps the issue is that when I do a merge, if I notice directories that have
> property changes only, I omit them.  Should I be including these?  Often
> these are seemingly random directories and I never quite understand why they
> are being included.  (Maybe it's just my ignorance of svn.)  Perhaps this is
> the problem?
>

Are you using svn 1.7? I really recommend this!




[jira] [Resolved] (SOLR-4155) upgrade jetty 8.1.7 -> 8.1.8

2012-12-10 Thread Robert Muir (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-4155?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Muir resolved SOLR-4155.
---

   Resolution: Fixed
Fix Version/s: 5.0
   4.1

> upgrade jetty 8.1.7 -> 8.1.8
> 
>
> Key: SOLR-4155
> URL: https://issues.apache.org/jira/browse/SOLR-4155
> Project: Solr
>  Issue Type: Task
>  Components: Build
>Reporter: Robert Muir
> Fix For: 4.1, 5.0
>
> Attachments: SOLR-4155.patch
>
>
> I think we should upgrade to the latest bugfix version.




[jira] [Closed] (LUCENE-4591) Make StoredFieldsFormat more configurable

2012-12-10 Thread Adrien Grand (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-4591?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adrien Grand closed LUCENE-4591.


Resolution: Fixed
  Assignee: Adrien Grand

> Make StoredFieldsFormat more configurable
> -
>
> Key: LUCENE-4591
> URL: https://issues.apache.org/jira/browse/LUCENE-4591
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: core/codecs
>Affects Versions: 4.1
>Reporter: Renaud Delbru
>Assignee: Adrien Grand
> Fix For: 4.1
>
> Attachments: LUCENE-4591.patch, LUCENE-4591.patch, LUCENE-4591.patch, 
> PerFieldStoredFieldsFormat.java, PerFieldStoredFieldsReader.java, 
> PerFieldStoredFieldsWriter.java
>
>
> The current StoredFieldsFormat is implemented with the assumption that only 
> one type of StoredFieldsFormat is used by the index.
> We would like to be able to configure a StoredFieldsFormat per field, 
> similarly to the PostingsFormat.
> There are a few issues that need to be solved to allow that:
> 1) allow configuring a segment suffix for the StoredFieldsFormat
> 2) implement the SPI interface in StoredFieldsFormat 
> 3) create a PerFieldStoredFieldsFormat
> We are proposing to start with 1) by modifying the signatures of 
> StoredFieldsFormat#fieldsReader and StoredFieldsFormat#fieldsWriter so that 
> they use SegmentReadState and SegmentWriteState instead of the current set of 
> parameters.
> Let us know what you think about this idea. If this is of interest, we can 
> contribute a first patch for 1).




[jira] [Commented] (LUCENE-4607) Add estimateDocCount to DocIdSetIterator

2012-12-10 Thread Simon Willnauer (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4607?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13528032#comment-13528032
 ] 

Simon Willnauer commented on LUCENE-4607:
-

bq. The other idea (just for discussion) would be "number of i/os".
I like this, yet I think in that case we should go back to estimateCost rather 
than docId etc., since for a bitset this is way different than for a 
PhraseScorer. I agree it should be a unit of work that we estimate.

> Add estimateDocCount to DocIdSetIterator
> 
>
> Key: LUCENE-4607
> URL: https://issues.apache.org/jira/browse/LUCENE-4607
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: core/search
>Affects Versions: 4.0
>Reporter: Simon Willnauer
> Fix For: 4.1, 5.0
>
> Attachments: LUCENE-4607.patch
>
>
> this is essentially a spinnoff from LUCENE-4236
> We currently have no way to make any decsision on how costly a DISI is 
> neither when we apply filters nor when we build conjunctions in BQ. Yet we 
> have most of the information already and can easily expose them via a cost 
> API such that BS and FilteredQuery can apply optimizations on per segment 
> basis.




[jira] [Commented] (SOLR-4155) upgrade jetty 8.1.7 -> 8.1.8

2012-12-10 Thread Commit Tag Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13528033#comment-13528033
 ] 

Commit Tag Bot commented on SOLR-4155:
--

[branch_4x commit] Robert Muir
http://svn.apache.org/viewvc?view=revision&revision=1419547

SOLR-4155: upgrade jetty to 8.1.8






RE: svn commit: r1419551 - /lucene/dev/trunk/solr/example/ivy.xml

2012-12-10 Thread Uwe Schindler
Approved by the XML policeman :-)

-
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de


> -Original Message-
> From: rm...@apache.org [mailto:rm...@apache.org]
> Sent: Monday, December 10, 2012 5:12 PM
> To: comm...@lucene.apache.org
> Subject: svn commit: r1419551 - /lucene/dev/trunk/solr/example/ivy.xml
> 
> Author: rmuir
> Date: Mon Dec 10 16:12:25 2012
> New Revision: 1419551
> 
> URL: http://svn.apache.org/viewvc?rev=1419551&view=rev
> Log:
> declare jetty.version as entity to remove the 10x duplication (asking for
> trouble upgrading in the future)
> 
> Modified:
> lucene/dev/trunk/solr/example/ivy.xml
> 
> Modified: lucene/dev/trunk/solr/example/ivy.xml
> URL:
> http://svn.apache.org/viewvc/lucene/dev/trunk/solr/example/ivy.xml?rev=
> 1419551&r1=1419550&r2=1419551&view=diff
> ==
> 
> --- lucene/dev/trunk/solr/example/ivy.xml (original)
> +++ lucene/dev/trunk/solr/example/ivy.xml Mon Dec 10 16:12:25 2012
> [diff hunks mangled; the change adds a DOCTYPE declaring a jetty.version
> entity and replaces each hardcoded rev="8.1.8.v20121106" on the Jetty
> dependency entries with rev="&jetty.version;"]






RE: lost entries in trunk/lecene/CHANGES.txt

2012-12-10 Thread Dyer, James
I'm using Subclipse with JavaHL 1.7.7.  I am unclear whether JavaHL keeps its 
versioning equivalent to the official svn versions.  I do not have an official 
svn command line installed, and do not use Tortoise or other tools.

Reading Uwe's comment that I "never merge", I do wonder whether I should just 
let the directory property changes merge in also, even if I do not understand 
them.  I just don't like to commit stuff that seems unrelated to what I'm 
doing and that I don't understand.  This fits, because if it appears I "never 
merge", I also "always" omit seemingly unrelated property changes when 
committing a merge.  

I also would like an answer to my question: "Is it ok to make parallel changes 
instead of a merge if it's just a trivial change?"  Follow-up question: "Is it 
ok to make the same (trivial) change to 2 branches with 1 commit?"  It really 
is very slow for me to merge, and if the way I've handled trivial changes in 
the past breaks things for other people, I can change my ways, or just not fix 
tiny things if time doesn't allow.

Especially when I get an unexpected Jenkins test failure, I'm usually in the 
middle of something else and really want to fix Jenkins ASAP, but I can't 
always give it a lot of time (getting more coffee, as you might say to do, 
Robert) while waiting for svn, etc.

James Dyer
E-Commerce Systems
Ingram Content Group
(615) 213-4311


-Original Message-
From: Robert Muir [mailto:rcm...@gmail.com] 
Sent: Monday, December 10, 2012 9:57 AM
To: dev@lucene.apache.org
Subject: Re: lost entries in trunk/lecene/CHANGES.txt

On Mon, Dec 10, 2012 at 10:53 AM, Dyer, James
 wrote:

> Perhaps the issue is that when I do a merge, if I notice directories that have
> property changes only, I omit them.  Should I be including these?  Often
> these are seemingly random directories and I never quite understand why they
> are being included.  (Maybe it's just my ignorance of svn.)  Perhaps this is
> the problem?
>

Are you using svn 1.7? I really recommend this!




Re: lost entries in trunk/lecene/CHANGES.txt

2012-12-10 Thread Robert Muir
OK... I'm not too familiar with Subclipse; I use the svn command line
for all operations.

I just know that I can tell who is using svn 1.6 versus 1.7 by their
commit messages: the commits using 1.6 make a lot of noise when
merging, while with 1.7 I don't get unrelated merge properties.
Merge is also fast for me (faster and more convenient than using 'patch').

Maybe the problems you are having are related to Subclipse... if
someone else is using it they might have some advice. Sorry I can't
help more.

On Mon, Dec 10, 2012 at 11:28 AM, Dyer, James
 wrote:
> I'm using Subclipse with JavaHL 1.7.7.  I am unclear whether JavaHL keeps its 
> versioning equivalent to the official svn versions.  I do not have an official 
> svn command line installed, and do not use Tortoise or other tools, etc.
>
> Reading Uwe's comment that I "never merge", I do wonder if it's just that I 
> should let the directory property changes merge in also, even if I do not 
> understand them.  I just don't like to commit stuff that seems unrelated to 
> what I'm doing and that I don't understand.  This fits, because if it appears 
> I "never merge", I also "always" omit seemingly unrelated property changes 
> when committing a merge.
>
> I also would like an answer to my question: "Is it ok to make parallel 
> changes instead of a merge if it's just a trivial change?"  Follow-up 
> question: "Is it ok to make the same (trivial) change to 2 branches with 1 
> commit?"  Merging is really very slow for me, and if the way I've handled 
> trivial changes in the past breaks things for other people, I can change my 
> ways, or just not fix tiny things if time doesn't allow.
>
> Especially when I get an unexpected Jenkins test failure, I'm usually in the 
> middle of something else and really want to fix Jenkins ASAP, but can't always 
> give it a lot of time (getting more coffee, as you might suggest, Robert) 
> while waiting for svn, etc.
>
> James Dyer
> E-Commerce Systems
> Ingram Content Group
> (615) 213-4311
>
>
> -Original Message-
> From: Robert Muir [mailto:rcm...@gmail.com]
> Sent: Monday, December 10, 2012 9:57 AM
> To: dev@lucene.apache.org
> Subject: Re: lost entries in trunk/lucene/CHANGES.txt
>
> On Mon, Dec 10, 2012 at 10:53 AM, Dyer, James
>  wrote:
>
>> Perhaps the issue is when I do a merge, if I notice directories that have
>> property changes only I omit them.  Should I be including these?  Often
>> these are seemingly random directories and I never quite understand why these
>> are being included.  (Maybe it's just my ignorance of svn.)  Perhaps this is
>> the problem?
>>
>
> Are you using svn 1.7? I really recommend this!
>




RE: lost entries in trunk/lucene/CHANGES.txt

2012-12-10 Thread Uwe Schindler
> I'm using Subclipse with JavaHL 1.7.7.  I am unclear whether JavaHL keeps its
> versioning equivalent to the official svn versions.  I do not have an official
> svn command line installed, and do not use Tortoise or other tools, etc.
> 
> Reading Uwe's comment that I "never merge", I do wonder if it's just that I
> should let the directory property changes merge in also, even if I do not

I did not say that you "never merge"; it just appeared that way to me. The 
problems with CHANGES.txt are indeed strange to me, but you or your software 
restored an older version of CHANGES.txt. So it looked to me like Subclipse did 
not merge the changes correctly (during svn up).

> understand them.  I just don't like to commit stuff that seems unrelated to
> what I'm doing and that I don't understand.  This fits, because if it appears
> I "never merge", I also "always" omit seemingly unrelated property changes
> when committing a merge.

Keep them! The unrelated property changes are caused by the fact that svn 
properties cannot be "inherited". Once a directory has a merge property (caused 
by a previous commit), that merge property must be updated by later commits, 
causing unrelated property changes on merges. This gets worse when people 
commit merges on single files.

We sometimes clean up the merge properties, but you should keep all property 
changes done automatically during merging. If you omit them on unrelated 
directories/files, those appear as not merged to Subversion, causing confusion, 
especially when people merge later (e.g. the CHANGES.txt problems could be 
related to this, because CHANGES.txt has one of those extra properties).

We don't list property changes in commit messages on the ML, so it doesn't 
hurt to commit them.

> I also would like an answer to my question: "Is it ok to make parallel changes
> instead of a merge if it's just a trivial change?"  Follow-up question: "Is it
> ok to make the same (trivial) change to 2 branches with 1 commit?"  Merging is
> really very slow for me, and if the way I've handled trivial changes in the
> past breaks things for other people, I can change my ways, or just not fix
> tiny things if time doesn't allow.

> Especially when I get an unexpected Jenkins test failure, I'm usually in the
> middle of something else and really want to fix Jenkins ASAP, but can't always
> give it a lot of time (getting more coffee, as you might suggest, Robert)
> while waiting for svn, etc.
> 
> James Dyer
> E-Commerce Systems
> Ingram Content Group
> (615) 213-4311
> 
> 
> -Original Message-
> From: Robert Muir [mailto:rcm...@gmail.com]
> Sent: Monday, December 10, 2012 9:57 AM
> To: dev@lucene.apache.org
> Subject: Re: lost entries in trunk/lucene/CHANGES.txt
> 
> On Mon, Dec 10, 2012 at 10:53 AM, Dyer, James
>  wrote:
> 
> > Perhaps the issue is when I do a merge, if I notice directories that
> > have property changes only I omit them.  Should I be including these?
> > Often these are seemingly random directories and I never quite
> > understand why these are being included.  (Maybe it's just my ignorance
> > of svn.)  Perhaps this is the problem?
> >
> 
> Are you using svn 1.7? I really recommend this!
> 






[jira] [Created] (LUCENE-4609) Writer a PackedIntsEncoder/Decoder for facets

2012-12-10 Thread Shai Erera (JIRA)
Shai Erera created LUCENE-4609:
--

 Summary: Writer a PackedIntsEncoder/Decoder for facets
 Key: LUCENE-4609
 URL: https://issues.apache.org/jira/browse/LUCENE-4609
 Project: Lucene - Core
  Issue Type: New Feature
  Components: modules/facet
Reporter: Shai Erera
Priority: Minor


Today the facets API lets you write an IntEncoder/Decoder to encode/decode the 
category ordinals. We have several such encoders, including VInt (the default) 
and block encoders.

It would be interesting to implement and benchmark a PackedIntsEncoder/Decoder, 
with potentially two variants: (1) one that receives bitsPerValue up front, for 
when you e.g. know that you have a small taxonomy and the max value you can 
see, and (2) one that decides on the optimal bitsPerValue for each doc and 
writes it as a header in the byte[] or something.
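A bare-bones sketch of variant (1), assuming the caller supplies bitsPerValue up front. The class and method names here are hypothetical stand-ins, not the actual facet API:

```java
import java.util.Arrays;

// Hypothetical sketch: pack non-negative ordinals into a byte[],
// bitsPerValue bits each, MSB-first. Not the real Lucene encoder API.
public class PackedOrdinalEncoder {
    private final int bitsPerValue;

    public PackedOrdinalEncoder(int bitsPerValue) {
        if (bitsPerValue < 1 || bitsPerValue > 31) {
            throw new IllegalArgumentException("bitsPerValue out of range");
        }
        this.bitsPerValue = bitsPerValue;
    }

    /** Packs ordinals into a byte[], bitsPerValue bits per value. */
    public byte[] encode(int[] ordinals) {
        byte[] out = new byte[(ordinals.length * bitsPerValue + 7) / 8];
        int bitPos = 0;
        for (int ord : ordinals) {
            for (int b = bitsPerValue - 1; b >= 0; b--) {
                if (((ord >>> b) & 1) != 0) {
                    out[bitPos >> 3] |= (byte) (1 << (7 - (bitPos & 7)));
                }
                bitPos++;
            }
        }
        return out;
    }

    /** Decodes count ordinals previously written by encode. */
    public int[] decode(byte[] packed, int count) {
        int[] ords = new int[count];
        int bitPos = 0;
        for (int i = 0; i < count; i++) {
            int v = 0;
            for (int b = 0; b < bitsPerValue; b++) {
                v = (v << 1) | ((packed[bitPos >> 3] >>> (7 - (bitPos & 7))) & 1);
                bitPos++;
            }
            ords[i] = v;
        }
        return ords;
    }

    public static void main(String[] args) {
        PackedOrdinalEncoder enc = new PackedOrdinalEncoder(4); // max ordinal 15
        int[] ords = {3, 7, 15, 0, 9};
        int[] back = enc.decode(enc.encode(ords), ords.length);
        System.out.println(Arrays.equals(ords, back)); // round-trip check
    }
}
```

Variant (2) would simply prepend the per-doc bitsPerValue as a one-byte header before the packed payload.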

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Created] (LUCENE-4610) Implement a NoParentsAccumulator

2012-12-10 Thread Shai Erera (JIRA)
Shai Erera created LUCENE-4610:
--

 Summary: Implement a NoParentsAccumulator
 Key: LUCENE-4610
 URL: https://issues.apache.org/jira/browse/LUCENE-4610
 Project: Lucene - Core
  Issue Type: New Feature
  Components: modules/facet
Reporter: Shai Erera


Mike experimented with encoding just the exact category ordinals on 
LUCENE-4602, and I added OrdinalPolicy.NO_PARENTS, with a comment saying that 
this requires a special FacetsAccumulator.

The idea is to write only the exact categories for each document, and then at 
search time count up the parents chain to compute the requested facets (I say 
count, but it can be any weight).

One limitation of such an accumulator is that it cannot be used when e.g. a 
document is associated with two categories that share the same parent, because 
that may result in incorrect weights being computed (e.g. a document might have 
several Authors, so counting the Author facet may yield wrong counts). So it 
can be used only when the app knows it doesn't add such facets, or when it only 
asks to aggregate a 'root' under whose path no two categories of the same 
document share a parent.
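As a hedged illustration of the rollup idea (the names here are made up, and this ignores the actual FacetsAccumulator API): with only exact ordinals counted, ancestor weights can be accumulated in one reverse pass over the taxonomy's parents array:

```java
// Illustrative sketch, not Lucene code: roll exact-category counts up the
// parents chain at search time.
public class NoParentsRollup {
    /**
     * counts[ord] holds the count of documents directly associated with ord;
     * parents[ord] is the taxonomy parent of ord (the root has parent -1).
     * Returns totals where each ancestor accumulates its descendants' counts.
     * Assumes a parent's ordinal is always smaller than its children's
     * (which holds for Lucene's taxonomy writer, where parents are added first).
     */
    public static int[] rollup(int[] counts, int[] parents) {
        int[] totals = counts.clone();
        for (int ord = totals.length - 1; ord > 0; ord--) {
            int parent = parents[ord];
            if (parent >= 0) {
                totals[parent] += totals[ord]; // child contributes to its parent
            }
        }
        return totals;
    }

    public static void main(String[] args) {
        // Taxonomy: 0=root, 1=Author, 2=Author/Kafka, 3=Author/Mann
        int[] parents = {-1, 0, 1, 1};
        int[] counts  = { 0, 0, 5, 2}; // only exact categories were encoded
        int[] totals = rollup(counts, parents);
        System.out.println(totals[1]); // Author = 7
    }
}
```

Note the limitation described above: if a single document carried both Author/Kafka and Author/Mann, this rollup would count it twice under Author.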




[jira] [Commented] (LUCENE-4607) Add estimateDocCount to DocIdSetIterator

2012-12-10 Thread Uwe Schindler (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4607?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13528059#comment-13528059
 ] 

Uwe Schindler commented on LUCENE-4607:
---

I am fine with any solution. From my perspective as "API policeman", the mix of 
int and long in the same interface is not good. If estimateDocCount() returns 
long, then advance() must also take and return long, docId() must return long, 
FixedBitSet must take a long size, and finally numDocs and maxDoc must be long.

> Add estimateDocCount to DocIdSetIterator
> 
>
> Key: LUCENE-4607
> URL: https://issues.apache.org/jira/browse/LUCENE-4607
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: core/search
>Affects Versions: 4.0
>Reporter: Simon Willnauer
> Fix For: 4.1, 5.0
>
> Attachments: LUCENE-4607.patch
>
>
> this is essentially a spin-off from LUCENE-4236
> We currently have no way to make any decision on how costly a DISI is, 
> neither when we apply filters nor when we build conjunctions in BQ. Yet we 
> have most of the information already and can easily expose it via a cost 
> API such that BS and FilteredQuery can apply optimizations on a per-segment 
> basis.




[jira] [Commented] (LUCENE-4607) Add estimateDocCount to DocIdSetIterator

2012-12-10 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4607?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13528063#comment-13528063
 ] 

Robert Muir commented on LUCENE-4607:
-

Uwe, the current discussion is about not measuring the count of documents, but 
instead I/O operations.

So advance(), docId(), FixedBitSet and so on are totally unrelated to that.





[jira] [Commented] (LUCENE-4609) Writer a PackedIntsEncoder/Decoder for facets

2012-12-10 Thread Adrien Grand (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13528068#comment-13528068
 ] 

Adrien Grand commented on LUCENE-4609:
--

I just looked at Int{De,En}coder and wrapping a PackedInts 
ReaderIterator/Writer (or maybe directly a Decoder/Encoder) looks easy to 
implement. Don't hesitate to let me know if you have questions regarding the 
PackedInts API!





[jira] [Comment Edited] (LUCENE-4609) Writer a PackedIntsEncoder/Decoder for facets

2012-12-10 Thread Adrien Grand (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13528068#comment-13528068
 ] 

Adrien Grand edited comment on LUCENE-4609 at 12/10/12 5:04 PM:


I just looked at IntDecoder/IntEncoder and wrapping a PackedInts 
ReaderIterator/Writer (or maybe directly a Decoder/Encoder) looks easy to 
implement. Don't hesitate to let me know if you have questions regarding the 
PackedInts API!

  was (Author: jpountz):
I just looked at Int{De,En}coder and wrapping a PackedInts 
ReaderIterator/Writer (or maybe directly a Decoder/Encoder) looks easy to 
implement. Don't hesitate to let me know if you have questions regarding the 
PackedInts API!
  




[jira] [Commented] (LUCENE-4607) Add estimateDocCount to DocIdSetIterator

2012-12-10 Thread Uwe Schindler (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4607?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13528069#comment-13528069
 ] 

Uwe Schindler commented on LUCENE-4607:
---

bq. Uwe, the current discussion is about not measuring the count of documents, 
but instead I/O operations.

Man, I just wanted to make clear that the current patch and the current issue 
summary have this problem.





[jira] [Commented] (LUCENE-4607) Add estimateDocCount to DocIdSetIterator

2012-12-10 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4607?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13528076#comment-13528076
 ] 

Robert Muir commented on LUCENE-4607:
-

OK: I agree that if it's a count of documents, the type should be consistent.

But as I suggested, I don't think a count of documents is that great for how we 
will use this (picking the conjunction leader, filtering heuristics, maybe 
minimum-should-match disjunction scoring, maybe cleaning up the exact phrase 
scorer / adding its optimizations to the sloppy phrase scorer, maybe more BS 
versus BS2 heuristics in BooleanWeight, etc.)
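For example, "picking the conjunction leader" could use such a cost estimate roughly like this (a sketch with a stand-in interface, not Lucene's actual DocIdSetIterator):

```java
import java.util.Arrays;
import java.util.Comparator;

public class ConjunctionLeader {
    // Stand-in for a DocIdSetIterator carrying the proposed cost API.
    interface CostedIterator {
        long cost(); // estimated docs (or I/O operations) this iterator will visit
    }

    /**
     * Sorts the clauses by ascending estimated cost, so the cheapest
     * iterator leads the conjunction and the others only advance() to
     * the candidate docs it proposes.
     */
    static void orderByCost(CostedIterator[] clauses) {
        Arrays.sort(clauses, Comparator.comparingLong(CostedIterator::cost));
    }

    public static void main(String[] args) {
        CostedIterator rare = () -> 10, common = () -> 100_000;
        CostedIterator[] clauses = {common, rare};
        orderByCost(clauses);
        System.out.println(clauses[0].cost()); // the rare (cheap) clause leads
    }
}
```

The point is that the heuristic only needs a relative ordering, which is why a rough estimate (documents or I/O ops) is good enough.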





[jira] [Updated] (LUCENE-4609) Write a PackedIntsEncoder/Decoder for facets

2012-12-10 Thread Shai Erera (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-4609?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shai Erera updated LUCENE-4609:
---

Summary: Write a PackedIntsEncoder/Decoder for facets  (was: Writer a 
PackedIntsEncoder/Decoder for facets)





[jira] [Commented] (LUCENE-4609) Write a PackedIntsEncoder/Decoder for facets

2012-12-10 Thread Shai Erera (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13528077#comment-13528077
 ] 

Shai Erera commented on LUCENE-4609:


Thanks Adrien. I would like to implement a specialized Encoder/Decoder, rather 
than wrapping them w/ a PackedInts Reader/Writer. I sure would appreciate your 
review once I have a patch!





Re: commit message format for tag bot

2012-12-10 Thread Robert Muir
On Sun, Dec 9, 2012 at 11:04 AM, Doron Cohen  wrote:
> Thanks Mark, this bot is very helpful, and now even more so!
>
>

I am very happy to see the commit comments on the JIRA issues again!

Thanks for doing this, Mark.




[jira] [Updated] (LUCENE-4605) Add FLAGS_NONE to DocsEnum and DocsAndPositionsEnum

2012-12-10 Thread Shai Erera (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-4605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shai Erera updated LUCENE-4605:
---

Attachment: LUCENE-4605.patch

Patch adds DocsEnum.FLAG_NONE with proper javadocs. I also modified all the 
places in the code that I could find which passed 0 to pass the new constant 
instead.

Basically a trivial change. 'core' tests passed. If there are no objections, I 
will commit it later today.

> Add FLAGS_NONE to DocsEnum and DocsAndPositionsEnum
> ---
>
> Key: LUCENE-4605
> URL: https://issues.apache.org/jira/browse/LUCENE-4605
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: core/index
>Reporter: Shai Erera
>Assignee: Shai Erera
>Priority: Minor
> Fix For: 4.1, 5.0
>
> Attachments: LUCENE-4605.patch
>
>
> Add a convenience constant FLAGS_NONE to DocsEnum and DocsAndPositionsEnum. 
> Today, if someone e.g. wants to get the docs only, he needs to pass 0 as the 
> flags, but the value of 0 is not documented anywhere. I had to dig in the 
> code to verify that that's indeed the value.
> I'll attach a patch later.
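The motivation can be sketched as follows; the constants and helper here are illustrative stand-ins, not necessarily what the actual patch adds to DocsEnum:

```java
// Sketch of the named-constant flags pattern being proposed; the real
// constants live on DocsEnum, these are hypothetical stand-ins.
public class EnumFlags {
    public static final int FLAG_NONE = 0;        // no optional data requested
    public static final int FLAG_FREQS = 1 << 0;  // request term frequencies

    static boolean wantsFreqs(int flags) {
        return (flags & FLAG_FREQS) != 0;
    }

    public static void main(String[] args) {
        // Callers can pass the self-documenting constant instead of a bare 0:
        System.out.println(wantsFreqs(FLAG_NONE));  // false
        System.out.println(wantsFreqs(FLAG_FREQS)); // true
    }
}
```

A named zero constant costs nothing at runtime but documents the "no flags" case at every call site.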




[jira] [Updated] (LUCENE-4611) remove duplicate 3rd party versioning from build.xmls

2012-12-10 Thread Robert Muir (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-4611?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Muir updated LUCENE-4611:


Attachment: LUCENE-4611.patch

Here's the patch I am currently testing.

> remove duplicate 3rd party versioning from build.xmls
> -
>
> Key: LUCENE-4611
> URL: https://issues.apache.org/jira/browse/LUCENE-4611
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: general/build
>Reporter: Robert Muir
> Attachments: LUCENE-4611.patch
>
>
> At first we had lots of stuff in lib/, including .sha1 files and so on, so we 
> didn't want to have this in the classpath because it's just noisy.
> But nowadays we use ivy with sync=true, so the only things in lib are the 
> jars in the ivy.xml.
> So we can just put the lib/ dirs in the classpath and not have redundant jar 
> names also in build.xml: the redundancy makes it harder to upgrade and 
> maintain.




[jira] [Created] (LUCENE-4611) remove duplicate 3rd party versioning from build.xmls

2012-12-10 Thread Robert Muir (JIRA)
Robert Muir created LUCENE-4611:
---

 Summary: remove duplicate 3rd party versioning from build.xmls
 Key: LUCENE-4611
 URL: https://issues.apache.org/jira/browse/LUCENE-4611
 Project: Lucene - Core
  Issue Type: Bug
  Components: general/build
Reporter: Robert Muir
 Attachments: LUCENE-4611.patch

At first we had lots of stuff in lib/, including .sha1 files and so on, so we 
didn't want to have this in the classpath because it's just noisy.

But nowadays we use ivy with sync=true, so the only things in lib are the jars 
in the ivy.xml.

So we can just put the lib/ dirs in the classpath and not have redundant jar 
names also in build.xml: the redundancy makes it harder to upgrade and maintain.




[jira] [Updated] (SOLR-4028) When using ZK chroot, it would be nice if Solr would create the initial path when it doesn't exist.

2012-12-10 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/SOLR-4028?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tomás Fernández Löbbe updated SOLR-4028:


Attachment: SOLR-4028.patch

With this patch, the initial path is created only when bootstrap_conf or 
bootstrap_confdir is specified.
ZkCli also creates the initial path when the upconfig or bootstrap commands 
are used.

> When using ZK chroot, it would be nice if Solr would create the initial path 
> when it doesn't exist.
> ---
>
> Key: SOLR-4028
> URL: https://issues.apache.org/jira/browse/SOLR-4028
> Project: Solr
>  Issue Type: Improvement
>  Components: SolrCloud
>Reporter: Tomás Fernández Löbbe
>Priority: Minor
> Attachments: SOLR-4028.patch, SOLR-4028.patch
>
>
> I think this would make it easier to test and develop with SolrCloud. In 
> order to start with a fresh ZK directory, the current approach is to delete 
> the ZK data; with this improvement one could just add a chroot to the zkHost 
> like:
> java -DzkHost=localhost:2181/testXYZ -jar start.jar
> Right now this is possible, but you have to manually create the initial path.




[jira] [Commented] (LUCENE-4611) remove duplicate 3rd party versioning from build.xmls

2012-12-10 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13528105#comment-13528105
 ] 

Robert Muir commented on LUCENE-4611:
-

precommit+test passes on a clean checkout. I'll do a nightly-smoke just for 
kicks but I think this is ready.





[JENKINS] Lucene-Solr-4.x-Windows (32bit/jdk1.7.0_09) - Build # 2121 - Failure!

2012-12-10 Thread Policeman Jenkins Server
Build: http://jenkins.sd-datasolutions.de/job/Lucene-Solr-4.x-Windows/2121/
Java: 32bit/jdk1.7.0_09 -server -XX:+UseParallelGC

1 tests failed.
REGRESSION:  
org.apache.solr.spelling.SpellCheckCollatorTest.testContextSensitiveCollate

Error Message:
Exception during query

Stack Trace:
java.lang.RuntimeException: Exception during query
	at __randomizedtesting.SeedInfo.seed([3FACDC7EBD23CB80:3D65D783617F94F1]:0)
	at org.apache.solr.SolrTestCaseJ4.assertQ(SolrTestCaseJ4.java:513)
	at org.apache.solr.SolrTestCaseJ4.assertQ(SolrTestCaseJ4.java:480)
	at org.apache.solr.spelling.SpellCheckCollatorTest.testContextSensitiveCollate(SpellCheckCollatorTest.java:380)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:601)
	at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1559)
	at com.carrotsearch.randomizedtesting.RandomizedRunner.access$600(RandomizedRunner.java:79)
	at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:737)
	at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:773)
	at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:787)
	at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53)
	at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50)
	at org.apache.lucene.util.TestRuleFieldCacheSanity$1.evaluate(TestRuleFieldCacheSanity.java:51)
	at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
	at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
	at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:48)
	at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:70)
	at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
	at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
	at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:358)
	at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:782)
	at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:442)
	at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:746)
	at com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:648)
	at com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:682)
	at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:693)
	at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
	at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53)
	at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
	at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:42)
	at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
	at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
	at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
	at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
	at org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:43)
	at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
	at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:70)
	at org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:55)
	at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
	at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:358)
	at java.lang.Thread.run(Thread.java:722)
Caused by: java.lang.RuntimeException: REQUEST FAILED: xpath=//lst[@name=

[jira] [Updated] (SOLR-4118) fix replicationFactor to align with industry usage

2012-12-10 Thread Yonik Seeley (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-4118?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yonik Seeley updated SOLR-4118:
---

Attachment: SOLR-4118.patch

Patch attached.

replicationFactor=3 means that there will be a target total of 3 physical 
indexes for each slice/partition (or from a document perspective, there will be 
3 copies of each document in the cluster).
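The arithmetic above can be sketched in a few lines; `place_replicas` is a hypothetical round-robin illustration of the target layout, not Solr's actual assignment code.

```python
# Illustrative sketch (not Solr's placement logic): with replicationFactor=3,
# each slice/partition gets a target of 3 physical indexes, assigned here
# round-robin across the available nodes.
def place_replicas(nodes, num_slices, replication_factor):
    assignments = {}  # slice name -> nodes holding a physical index of it
    i = 0
    for s in range(num_slices):
        slice_name = "shard%d" % (s + 1)
        assignments[slice_name] = []
        for _ in range(replication_factor):
            assignments[slice_name].append(nodes[i % len(nodes)])
            i += 1
    return assignments

layout = place_replicas(["nodeA", "nodeB", "nodeC"], 2, 3)
# every slice carries replication_factor copies, i.e. replication_factor
# copies of each document exist somewhere in the cluster
```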

> fix replicationFactor to align with industry usage 
> ---
>
> Key: SOLR-4118
> URL: https://issues.apache.org/jira/browse/SOLR-4118
> Project: Solr
>  Issue Type: Bug
>Affects Versions: 4.0
>Reporter: Yonik Seeley
>Priority: Minor
> Fix For: 4.1, 5.0
>
> Attachments: SOLR-4118.patch
>
>
> replicationFactor should be the number of different nodes that have a 
> document.
> See discussion in SOLR-4114

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators.
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-4594) Spatial PrefixTreeStrategy shouldn't index center-points with shapes together

2012-12-10 Thread Commit Tag Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4594?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13528137#comment-13528137
 ] 

Commit Tag Bot commented on LUCENE-4594:


[trunk commit] David Wayne Smiley
http://svn.apache.org/viewvc?view=revision&revision=1419630

LUCENE-4594: PrefixTreeStrategy should not index center points


> Spatial PrefixTreeStrategy shouldn't index center-points with shapes together
> -
>
> Key: LUCENE-4594
> URL: https://issues.apache.org/jira/browse/LUCENE-4594
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: modules/spatial
>Affects Versions: 4.0, 5.0
>Reporter: David Smiley
>Assignee: David Smiley
> Fix For: 4.1, 5.0
>
> Attachments: 
> LUCENE-4594__PrefixTreeStrategy_should_not_index_center_points.patch
>
>
> The Spatial PrefixTreeStrategy will index the center-point of a non-point 
> shape it is given to index, in addition to the shape itself of course.  The 
> rationale was that this point could be picked up by 
> PointPrefixTreeFieldCacheProvider for distance/sorting.  However this 
> approach is buggy since the distinction of grid cells between the center 
> point and the shape itself is lost when the shape gets indexed down to 
> max-levels precision -- each grid cell therein appears to be another point 
> that needs to be brought into memory.  It's also possible that the shape is a 
> LineString or some other non-trivial shape in which its center point isn't 
> actually in the shape.
> Even if you knew this problem would never happen, I think you're better off 
> indexing center points into another spatial field if you want them.  Perhaps 
> arguably this strategy could do that internally?  Whether or not that ends up 
> happening, I just want to remove the problematic behavior now.
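The LineString caveat above can be seen with a few lines of geometry. This is a simplified sketch (plain vertex averaging, not Spatial4j's actual center computation): the "center" of an L-shaped line does not lie on the line itself.

```python
# Simplified illustration of why a shape's center point may not be in the
# shape: average the vertices of an L-shaped LineString and check whether
# the result falls on either of its (axis-aligned) segments.
def vertex_centroid(points):
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    return (sum(xs) / len(xs), sum(ys) / len(ys))

def on_axis_aligned_segments(pt, points):
    # crude containment test, valid only for axis-aligned segments
    for (x1, y1), (x2, y2) in zip(points, points[1:]):
        if x1 == x2 and pt[0] == x1 and min(y1, y2) <= pt[1] <= max(y1, y2):
            return True
        if y1 == y2 and pt[1] == y1 and min(x1, x2) <= pt[0] <= max(x1, x2):
            return True
    return False

l_shape = [(0.0, 0.0), (0.0, 2.0), (2.0, 2.0)]  # an "L" of two segments
center = vertex_centroid(l_shape)
print(on_axis_aligned_segments(center, l_shape))  # False: center is off the line
```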




[jira] [Commented] (SOLR-4028) When using ZK chroot, it would be nice if Solr would create the initial path when it doesn't exist.

2012-12-10 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13528140#comment-13528140
 ] 

Mark Miller commented on SOLR-4028:
---

Cool, thanks.

> When using ZK chroot, it would be nice if Solr would create the initial path 
> when it doesn't exist.
> ---
>
> Key: SOLR-4028
> URL: https://issues.apache.org/jira/browse/SOLR-4028
> Project: Solr
>  Issue Type: Improvement
>  Components: SolrCloud
>Reporter: Tomás Fernández Löbbe
>Priority: Minor
> Attachments: SOLR-4028.patch, SOLR-4028.patch
>
>
> I think this would make it easier to test and develop with SolrCloud, in 
> order to start with a fresh ZK directory now the approach is to delete ZK 
> data, with this improvement one could just add a chroot to the zkHost like:
> java -DzkHost=localhost:2181/testXYZ -jar start.jar
> Right now this is possible but you have to manually create the initial path. 
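The zkHost syntax in the example above packs both the server list and the chroot into one string. Below is an illustrative parser (a hypothetical helper, not Solr's actual implementation): everything after the first "/" is the chroot under which Solr keeps its data.

```python
# Hypothetical sketch of zkHost parsing, e.g. "localhost:2181/testXYZ":
# comma-separated servers, then an optional chroot after the first "/".
def parse_zk_host(zk_host):
    if "/" in zk_host:
        servers, chroot = zk_host.split("/", 1)
        return servers.split(","), "/" + chroot
    return zk_host.split(","), None

servers, chroot = parse_zk_host("localhost:2181/testXYZ")
print(servers, chroot)  # ['localhost:2181'] /testXYZ
```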




[jira] [Resolved] (LUCENE-4594) Spatial PrefixTreeStrategy shouldn't index center-points with shapes together

2012-12-10 Thread David Smiley (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-4594?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Smiley resolved LUCENE-4594.
--

Resolution: Fixed

> Spatial PrefixTreeStrategy shouldn't index center-points with shapes together
> -
>
> Key: LUCENE-4594
> URL: https://issues.apache.org/jira/browse/LUCENE-4594
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: modules/spatial
>Affects Versions: 4.0, 5.0
>Reporter: David Smiley
>Assignee: David Smiley
> Fix For: 4.1, 5.0
>
> Attachments: 
> LUCENE-4594__PrefixTreeStrategy_should_not_index_center_points.patch
>
>
> The Spatial PrefixTreeStrategy will index the center-point of a non-point 
> shape it is given to index, in addition to the shape itself of course.  The 
> rationale was that this point could be picked up by 
> PointPrefixTreeFieldCacheProvider for distance/sorting.  However this 
> approach is buggy since the distinction of grid cells between the center 
> point and the shape itself is lost when the shape gets indexed down to 
> max-levels precision -- each grid cell therein appears to be another point 
> that needs to be brought into memory.  It's also possible that the shape is a 
> LineString or some other non-trivial shape in which its center point isn't 
> actually in the shape.
> Even if you knew this problem would never happen, I think you're better off 
> indexing center points into another spatial field if you want them.  Perhaps 
> arguably this strategy could do that internally?  Whether or not that ends up 
> happening, I just want to remove the problematic behavior now.




[jira] [Commented] (LUCENE-4594) Spatial PrefixTreeStrategy shouldn't index center-points with shapes together

2012-12-10 Thread Commit Tag Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4594?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13528145#comment-13528145
 ] 

Commit Tag Bot commented on LUCENE-4594:


[branch_4x commit] David Wayne Smiley
http://svn.apache.org/viewvc?view=revision&revision=1419634

LUCENE-4594: PrefixTreeStrategy should not index center points


> Spatial PrefixTreeStrategy shouldn't index center-points with shapes together
> -
>
> Key: LUCENE-4594
> URL: https://issues.apache.org/jira/browse/LUCENE-4594
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: modules/spatial
>Affects Versions: 4.0, 5.0
>Reporter: David Smiley
>Assignee: David Smiley
> Fix For: 4.1, 5.0
>
> Attachments: 
> LUCENE-4594__PrefixTreeStrategy_should_not_index_center_points.patch
>
>
> The Spatial PrefixTreeStrategy will index the center-point of a non-point 
> shape it is given to index, in addition to the shape itself of course.  The 
> rationale was that this point could be picked up by 
> PointPrefixTreeFieldCacheProvider for distance/sorting.  However this 
> approach is buggy since the distinction of grid cells between the center 
> point and the shape itself is lost when the shape gets indexed down to 
> max-levels precision -- each grid cell therein appears to be another point 
> that needs to be brought into memory.  It's also possible that the shape is a 
> LineString or some other non-trivial shape in which its center point isn't 
> actually in the shape.
> Even if you knew this problem would never happen, I think you're better off 
> indexing center points into another spatial field if you want them.  Perhaps 
> arguably this strategy could do that internally?  Whether or not that ends up 
> happening, I just want to remove the problematic behavior now.




[jira] [Created] (LUCENE-4612) ant nightly-smoke leaves a dirty checkout

2012-12-10 Thread Robert Muir (JIRA)
Robert Muir created LUCENE-4612:
---

 Summary: ant nightly-smoke leaves a dirty checkout
 Key: LUCENE-4612
 URL: https://issues.apache.org/jira/browse/LUCENE-4612
 Project: Lucene - Core
  Issue Type: Bug
Reporter: Robert Muir


?   dev-tools/scripts/__pycache__

Can we not leave this around?




[jira] [Commented] (LUCENE-4612) ant nightly-smoke leaves a dirty checkout

2012-12-10 Thread Hoss Man (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4612?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13528152#comment-13528152
 ] 

Hoss Man commented on LUCENE-4612:
--

i've been meaning to ask this since the 4.0 vote ... should we add that dir to 
svn:ignore?

> ant nightly-smoke leaves a dirty checkout
> -
>
> Key: LUCENE-4612
> URL: https://issues.apache.org/jira/browse/LUCENE-4612
> Project: Lucene - Core
>  Issue Type: Bug
>Reporter: Robert Muir
>
> ?   dev-tools/scripts/__pycache__
> Can we not leave this around?




[jira] [Commented] (LUCENE-4611) remove duplicate 3rd party versioning from build.xmls

2012-12-10 Thread Commit Tag Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13528154#comment-13528154
 ] 

Commit Tag Bot commented on LUCENE-4611:


[trunk commit] Robert Muir
http://svn.apache.org/viewvc?view=revision&revision=1419644

LUCENE-4611: remove duplicate 3rd party versioning from build.xmls


> remove duplicate 3rd party versioning from build.xmls
> -
>
> Key: LUCENE-4611
> URL: https://issues.apache.org/jira/browse/LUCENE-4611
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: general/build
>Reporter: Robert Muir
> Attachments: LUCENE-4611.patch
>
>
> At first we had lots of stuff in lib/ including .sha1 files and so on. So we 
> didn't want to have this in the classpath because it's just noisy.
> But nowadays we use ivy sync=true, so the only things in lib are the jars in 
> the ivy.xml.
> So we can just put lib/ dirs in the classpath and not repeat the jar names 
> in build.xml: that redundancy makes them harder to upgrade and maintain.




[jira] [Commented] (LUCENE-4612) ant nightly-smoke leaves a dirty checkout

2012-12-10 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4612?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13528157#comment-13528157
 ] 

Robert Muir commented on LUCENE-4612:
-

I'm not happy with its location being in the source tree and not in a build/ 
directory.

ideally we would put this cache in the build/ directory. this way it's actually 
cleaned by 'ant clean' and so on.

> ant nightly-smoke leaves a dirty checkout
> -
>
> Key: LUCENE-4612
> URL: https://issues.apache.org/jira/browse/LUCENE-4612
> Project: Lucene - Core
>  Issue Type: Bug
>Reporter: Robert Muir
>
> ?   dev-tools/scripts/__pycache__
> Can we not leave this around?




[jira] [Resolved] (LUCENE-4611) remove duplicate 3rd party versioning from build.xmls

2012-12-10 Thread Robert Muir (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-4611?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Muir resolved LUCENE-4611.
-

   Resolution: Fixed
Fix Version/s: 5.0
   4.1

> remove duplicate 3rd party versioning from build.xmls
> -
>
> Key: LUCENE-4611
> URL: https://issues.apache.org/jira/browse/LUCENE-4611
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: general/build
>Reporter: Robert Muir
> Fix For: 4.1, 5.0
>
> Attachments: LUCENE-4611.patch
>
>
> At first we had lots of stuff in lib/ including .sha1 files and so on. So we 
> didn't want to have this in the classpath because it's just noisy.
> But nowadays we use ivy sync=true, so the only things in lib are the jars in 
> the ivy.xml.
> So we can just put lib/ dirs in the classpath and not repeat the jar names 
> in build.xml: that redundancy makes them harder to upgrade and maintain.




[jira] [Updated] (SOLR-4136) SolrCloud bugs when servlet context contains "/" or "_"

2012-12-10 Thread Hoss Man (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-4136?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man updated SOLR-4136:
---

Attachment: SOLR-4136.patch

patch updated to trunk

> SolrCloud bugs when servlet context contains "/" or "_"
> ---
>
> Key: SOLR-4136
> URL: https://issues.apache.org/jira/browse/SOLR-4136
> Project: Solr
>  Issue Type: Bug
>  Components: SolrCloud
>Affects Versions: 4.0
>Reporter: Hoss Man
>Assignee: Hoss Man
> Attachments: SOLR-4136.patch, SOLR-4136.patch, SOLR-4136.patch
>
>
> SolrCloud does not work properly with non-trivial values for "hostContext" 
> (ie: the servlet context path).  In particular...
> * Using a hostContext containing a "/" (ie: a servlet context with a subdir 
> path, semi-common among people who organize webapps hierarchically for load 
> balancer rules) is explicitly forbidden in ZkController because of how the 
> hostContext is used to build a ZK nodeName
> * Using a hostContext containing a "\_" causes problems in 
> OverseerCollectionProcessor, where it assumes all "\_" characters should be 
> converted to "/" to reconstitute a URL from the nodeName (NOTE: this code 
> specifically has a TODO to fix this, and then has a subsequent TODO about 
> assuming "http://", labeled "this sucks")
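The nodeName round-trip described above can be sketched as follows (hypothetical helpers, not the actual ZkController/OverseerCollectionProcessor code): a URL is flattened into a ZK nodeName by turning "/" into "_", so any literal "_" in the servlet context gets corrupted on the way back.

```python
# Sketch of the lossy nodeName encoding: "/" -> "_" on the way in, and the
# buggy assumption that every "_" was a "/" on the way out.
def to_node_name(host_port, context):
    return host_port + "_" + context.replace("/", "_")

def to_url(node_name):
    # assumes every "_" was originally a "/" -- the problematic assumption
    host_port, _, rest = node_name.partition("_")
    return "http://" + host_port + "/" + rest.replace("_", "/")

print(to_url(to_node_name("host:8080", "solr")))    # http://host:8080/solr
print(to_url(to_node_name("host:8080", "my_app")))  # http://host:8080/my/app (wrong!)
```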




[jira] [Commented] (LUCENE-4611) remove duplicate 3rd party versioning from build.xmls

2012-12-10 Thread Commit Tag Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13528170#comment-13528170
 ] 

Commit Tag Bot commented on LUCENE-4611:


[branch_4x commit] Robert Muir
http://svn.apache.org/viewvc?view=revision&revision=1419656

LUCENE-4611: remove duplicate 3rd party versioning from build.xmls


> remove duplicate 3rd party versioning from build.xmls
> -
>
> Key: LUCENE-4611
> URL: https://issues.apache.org/jira/browse/LUCENE-4611
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: general/build
>Reporter: Robert Muir
> Fix For: 4.1, 5.0
>
> Attachments: LUCENE-4611.patch
>
>
> At first we had lots of stuff in lib/ including .sha1 files and so on. So we 
> didn't want to have this in the classpath because it's just noisy.
> But nowadays we use ivy sync=true, so the only things in lib are the jars in 
> the ivy.xml.
> So we can just put lib/ dirs in the classpath and not repeat the jar names 
> in build.xml: that redundancy makes them harder to upgrade and maintain.




Re: svn commit: r1419551 - /lucene/dev/trunk/solr/example/ivy.xml

2012-12-10 Thread Alan Woodward
Ha, cunning, been trying to work out how to dry up my ivy files for ages...

On 10 Dec 2012, at 16:13, Uwe Schindler wrote:

> Approved by the XML policeman :-)
> 
> -
> Uwe Schindler
> H.-H.-Meier-Allee 63, D-28213 Bremen
> http://www.thetaphi.de
> eMail: u...@thetaphi.de
> 
> 
>> -Original Message-
>> From: rm...@apache.org [mailto:rm...@apache.org]
>> Sent: Monday, December 10, 2012 5:12 PM
>> To: comm...@lucene.apache.org
>> Subject: svn commit: r1419551 - /lucene/dev/trunk/solr/example/ivy.xml
>> 
>> Author: rmuir
>> Date: Mon Dec 10 16:12:25 2012
>> New Revision: 1419551
>> 
>> URL: http://svn.apache.org/viewvc?rev=1419551&view=rev
>> Log:
>> declare jetty.version as entity to remove the 10x duplication (asking for
>> trouble upgrading in the future)
>> 
>> Modified:
>>lucene/dev/trunk/solr/example/ivy.xml
>> 
>> Modified: lucene/dev/trunk/solr/example/ivy.xml
>> URL:
>> http://svn.apache.org/viewvc/lucene/dev/trunk/solr/example/ivy.xml?rev=
>> 1419551&r1=1419550&r2=1419551&view=diff
>> ==
>> 
>> --- lucene/dev/trunk/solr/example/ivy.xml (original)
>> +++ lucene/dev/trunk/solr/example/ivy.xml Mon Dec 10 16:12:25 2012
>> @@ -16,6 +16,9 @@
>>specific language governing permissions and limitations
>>under the License.
>> -->
>> +<!DOCTYPE ivy-module [
>> +  <!ENTITY jetty.version "8.1.8.v20121106">
>> +]>
>> 
>> 
>> 
>> @@ -25,18 +28,18 @@
>> 
>> 
>> 
>> -  > rev="8.1.8.v20121106" transitive="false" conf="jetty->default"/>
>> -  > rev="8.1.8.v20121106" transitive="false" conf="jetty->default"/>
>> -  > rev="8.1.8.v20121106" transitive="false" conf="jetty->default"/>
>> -  > rev="8.1.8.v20121106" transitive="false" conf="jetty->default"/>
>> -  > rev="8.1.8.v20121106" transitive="false" conf="jetty->default"/>
>> -  > rev="8.1.8.v20121106" transitive="false" conf="jetty->default"/>
>> -  > rev="8.1.8.v20121106" transitive="false" conf="jetty->default"/>
>> -  > rev="8.1.8.v20121106" transitive="false" conf="jetty->default"/>
>> -  > rev="8.1.8.v20121106" transitive="false" conf="jetty->default"/>
>> -  > rev="8.1.8.v20121106" transitive="false" conf="jetty->default"/>
>> -  > rev="8.1.8.v20121106" transitive="false" conf="jetty->default"/>
>> -  > rev="8.1.8.v20121106" transitive="false" conf="start->default"/>
>> +  > rev="&jetty.version;" transitive="false" conf="jetty->default"/>
>> +  > rev="&jetty.version;" transitive="false" conf="jetty->default"/>
>> +  > rev="&jetty.version;" transitive="false" conf="jetty->default"/>
>> +  > rev="&jetty.version;" transitive="false" conf="jetty->default"/>
>> +  > rev="&jetty.version;" transitive="false" conf="jetty->default"/>
>> +  > rev="&jetty.version;" transitive="false" conf="jetty->default"/>
>> +  > rev="&jetty.version;" transitive="false" conf="jetty->default"/>
>> +  > rev="&jetty.version;" transitive="false" conf="jetty->default"/>
>> +  > rev="&jetty.version;" transitive="false" conf="jetty->default"/>
>> +  > rev="&jetty.version;" transitive="false" conf="jetty->default"/>
>> +  > rev="&jetty.version;" transitive="false" conf="jetty->default"/>
>> +  > rev="&jetty.version;" transitive="false" conf="start->default"/>
>>   > rev="3.0.0.v201112011016" transitive="false" conf="servlet->default">
>> 
>>   
> 
> 
> 



[jira] [Commented] (LUCENE-4612) ant nightly-smoke leaves a dirty checkout

2012-12-10 Thread Hoss Man (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4612?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13528178#comment-13528178
 ] 

Hoss Man commented on LUCENE-4612:
--

my limited python understanding is that you can't tell python to use a 
different dir for this, so if you want it in build you'd have to use the build 
dir as the working dir when running the script? (or maybe copy the script 
there?)

or we could just tell python not to cache the bytecode...

http://docs.python.org/2/using/cmdline.html#cmdoption-B
http://docs.python.org/2/using/cmdline.html#envvar-PYTHONDONTWRITEBYTECODE

...but it still seems like maybe we should svn:ignore that dir to handle the 
case where people run those scripts manually

> ant nightly-smoke leaves a dirty checkout
> -
>
> Key: LUCENE-4612
> URL: https://issues.apache.org/jira/browse/LUCENE-4612
> Project: Lucene - Core
>  Issue Type: Bug
>Reporter: Robert Muir
>
> ?   dev-tools/scripts/__pycache__
> Can we not leave this around?




[jira] [Commented] (LUCENE-4612) ant nightly-smoke leaves a dirty checkout

2012-12-10 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4612?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13528181#comment-13528181
 ] 

Robert Muir commented on LUCENE-4612:
-

i don't like adding svn:ignores for things like this (in this case, really, it's 
just like having an output directory full of java classes).

for now i'm testing just running the py script with a CWD in the build 
directory:

{noformat}
Index: build.xml
===
--- build.xml   (revision 1419557)
+++ build.xml   (working copy)
@@ -276,7 +276,7 @@
  


-   
+   
  
  
  
{noformat}

Unfortunately it takes 45 minutes to know if it works :)

if this works i would recommend we just add an ant task for manual smoking too 
just to keep everything clean.

> ant nightly-smoke leaves a dirty checkout
> -
>
> Key: LUCENE-4612
> URL: https://issues.apache.org/jira/browse/LUCENE-4612
> Project: Lucene - Core
>  Issue Type: Bug
>Reporter: Robert Muir
>
> ?   dev-tools/scripts/__pycache__
> Can we not leave this around?




[jira] [Commented] (SOLR-4144) SolrCloud replication high heap consumption

2012-12-10 Thread Yonik Seeley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13528182#comment-13528182
 ] 

Yonik Seeley commented on SOLR-4144:


I bet this could be due to NRTCachingDirectory?  It makes the decision to cache 
a file or not up-front and can't change when it's part-way through the file.

If there's no mergeInfo or flushInfo in the context (and the file isn't the 
segments file) then it will choose to cache the file.
We need to pass something (like flushInfo) that will convince it not to cache.  
I'll work up a patch...
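The decision Yonik describes can be modeled in a few lines. This is a simplified sketch, not NRTCachingDirectory's actual source; the 48 MB default mirrors the maxCacheMB=48 setting seen in the log, and all names here are illustrative.

```python
# Simplified model of the up-front caching decision: with no mergeInfo or
# flushInfo in the IOContext, a non-segments file is cached by default --
# the wrong call for a large file arriving via replication.
def would_cache(file_name, merge_info=None, flush_info=None,
                estimated_bytes=None, max_cached_bytes=48 * 1024 * 1024):
    if file_name.startswith("segments"):
        return False  # never cache the segments file
    if merge_info is not None or flush_info is not None:
        # a size hint exists: only cache writes that fit the budget
        return estimated_bytes is not None and estimated_bytes <= max_cached_bytes
    return True  # no hint at all -> cached by default (the problem case)

print(would_cache("_0.fdt"))  # True: a multi-GB replicated file gets cached
print(would_cache("_0.fdt", flush_info=object(),
                  estimated_bytes=4 * 1024 ** 3))  # False once a hint is passed
```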

> SolrCloud replication high heap consumption
> ---
>
> Key: SOLR-4144
> URL: https://issues.apache.org/jira/browse/SOLR-4144
> Project: Solr
>  Issue Type: Bug
>  Components: replication (java), SolrCloud
>Affects Versions: 5.0
> Environment: 5.0-SNAPSHOT 1366361:1416494M - markus - 2012-12-03 
> 14:09:13
>Reporter: Markus Jelsma
>Priority: Critical
> Fix For: 5.0
>
>
> Recent versions of SolrCloud require a very high heap size vs. older 
> versions. Another cluster of 5.0.0.2012.10.09.19.29.59 (~4GB per core) can 
> restore an empty node without taking a lot of heap (xmx=256m). Recent 
> versions and current trunk fail miserably even with a higher heap (750m). 
> Both clusters have 10 nodes, 10 shards and 2 cores per node. One note to add 
> is that the cluster on which this fails has only about 1.5GB per core due to 
> changes in the Lucene codec, such as compression.
> After start up everything goes fine...
> {code}
> 2012-12-04 15:05:35,013 INFO [solr.cloud.RecoveryStrategy] - [RecoveryThread] 
> - : Begin buffering updates. core=shard_c
> 2012-12-04 15:05:35,013 INFO [solr.cloud.RecoveryStrategy] - [RecoveryThread] 
> - : Begin buffering updates. core=shard_b
> 2012-12-04 15:05:35,013 INFO [solr.update.UpdateLog] - [RecoveryThread] - : 
> Starting to buffer updates. FSUpdateLog{state=ACTIVE, tlog=null}
> 2012-12-04 15:05:35,013 INFO [solr.update.UpdateLog] - [RecoveryThread] - : 
> Starting to buffer updates. FSUpdateLog{state=ACTIVE, tlog=null}
> 2012-12-04 15:05:35,013 INFO [solr.cloud.RecoveryStrategy] - [RecoveryThread] 
> - : Attempting to replicate from http://178.21.118.190:8080/solr/shard_b/. 
> core=shard_b
> 2012-12-04 15:05:35,013 INFO [solr.cloud.RecoveryStrategy] - [RecoveryThread] 
> - : Attempting to replicate from http://178.21.118.192:8080/solr/shard_c/. 
> core=shard_c
> 2012-12-04 15:05:35,014 INFO [solrj.impl.HttpClientUtil] - [RecoveryThread] - 
> : Creating new http client, 
> config:maxConnections=128&maxConnectionsPerHost=32&followRedirects=false
> 2012-12-04 15:05:35,014 INFO [solrj.impl.HttpClientUtil] - [RecoveryThread] - 
> : Creating new http client, 
> config:maxConnections=128&maxConnectionsPerHost=32&followRedirects=false
> 2012-12-04 15:05:35,052 INFO [solr.handler.ReplicationHandler] - 
> [RecoveryThread] - : Commits will be reserved for  1
> 2012-12-04 15:05:35,052 INFO [solr.handler.ReplicationHandler] - 
> [RecoveryThread] - : Commits will be reserved for  1
> 2012-12-04 15:05:35,053 INFO [solrj.impl.HttpClientUtil] - [RecoveryThread] - 
> : Creating new http client, 
> config:connTimeout=5000&socketTimeout=2&allowCompression=false&maxConnections=1&maxConnectionsPerHost=1
> 2012-12-04 15:05:35,060 INFO [solr.handler.SnapPuller] - [RecoveryThread] - : 
>  No value set for 'pollInterval'. Timer Task not started.
> 2012-12-04 15:05:35,060 INFO [solr.handler.SnapPuller] - [RecoveryThread] - : 
>  No value set for 'pollInterval'. Timer Task not started.
> 2012-12-04 15:05:35,069 INFO [solr.handler.SnapPuller] - [RecoveryThread] - : 
> Master's generation: 48
> 2012-12-04 15:05:35,069 INFO [solr.handler.SnapPuller] - [RecoveryThread] - : 
> Slave's generation: 1
> 2012-12-04 15:05:35,069 INFO [solr.handler.SnapPuller] - [RecoveryThread] - : 
> Starting replication process
> 2012-12-04 15:05:35,069 INFO [solr.handler.SnapPuller] - [RecoveryThread] - : 
> Master's generation: 47
> 2012-12-04 15:05:35,069 INFO [solr.handler.SnapPuller] - [RecoveryThread] - : 
> Slave's generation: 1
> 2012-12-04 15:05:35,070 INFO [solr.handler.SnapPuller] - [RecoveryThread] - : 
> Starting replication process
> 2012-12-04 15:05:35,078 INFO [solr.handler.SnapPuller] - [RecoveryThread] - : 
> Number of files in latest index in master: 235
> 2012-12-04 15:05:35,079 INFO [solr.handler.SnapPuller] - [RecoveryThread] - : 
> Number of files in latest index in master: 287
> 2012-12-04 15:05:35,084 WARN [solr.core.CachingDirectoryFactory] - 
> [RecoveryThread] - : No lockType configured for 
> NRTCachingDirectory(org.apache.lucene.store.MMapDirectory@/opt/solr/cores/shard_c/data/index.20121204150535080
>  lockFactory=org.apache.lucene.store.NativeFSLockFactory@57530551; 
> maxCacheMB=48

[jira] [Created] (LUCENE-4613) CompressingStoredFieldsWriter ignores the segment suffix if writing aborted

2012-12-10 Thread Renaud Delbru (JIRA)
Renaud Delbru created LUCENE-4613:
-

 Summary: CompressingStoredFieldsWriter ignores the segment suffix 
if writing aborted
 Key: LUCENE-4613
 URL: https://issues.apache.org/jira/browse/LUCENE-4613
 Project: Lucene - Core
  Issue Type: Bug
  Components: core/codecs
Affects Versions: 4.1
Reporter: Renaud Delbru
 Fix For: 4.1


If the writing is aborted, CompressingStoredFieldsWriter does not remove 
partially-written files as the segment suffix is not taken into consideration.
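The failure mode can be illustrated without Lucene itself: if the abort path rebuilds file names without the segment suffix used at write time, the names never match the partially-written files, so nothing gets deleted. A minimal self-contained sketch (class, method, and file names here are hypothetical, not Lucene's actual API):

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the bug: cleanup on abort must rebuild file names
// with the same segment suffix used when the files were created, otherwise
// the names it tries to delete never match what is on disk.
public class SuffixAbortSketch {
    static String fileName(String segment, String suffix, String ext) {
        // Segment file names embed an optional suffix,
        // e.g. "_0_Suffix.fdt" vs "_0.fdt" (illustrative naming).
        return suffix.isEmpty() ? segment + "." + ext
                                : segment + "_" + suffix + "." + ext;
    }

    // Files actually written (suffix included).
    static List<String> written(String segment, String suffix) {
        List<String> files = new ArrayList<>();
        files.add(fileName(segment, suffix, "fdt"));
        files.add(fileName(segment, suffix, "fdx"));
        return files;
    }

    // Buggy cleanup: ignores the suffix, so the names don't match.
    static List<String> abortBuggy(String segment) {
        return written(segment, ""); // wrong: drops the suffix
    }

    // Fixed cleanup: rebuilds names with the original suffix.
    static List<String> abortFixed(String segment, String suffix) {
        return written(segment, suffix);
    }

    public static void main(String[] args) {
        System.out.println("on disk:    " + written("_0", "Suffix"));
        System.out.println("buggy del:  " + abortBuggy("_0"));
        System.out.println("fixed del:  " + abortFixed("_0", "Suffix"));
    }
}
```

The buggy variant computes names that don't exist on disk, which is why the partial files survive the abort.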

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (LUCENE-4613) CompressingStoredFieldsWriter ignores the segment suffix if writing aborted

2012-12-10 Thread Renaud Delbru (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-4613?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Renaud Delbru updated LUCENE-4613:
--

Attachment: LUCENE-4613.patch

Fix bug introduced by LUCENE-4591

> CompressingStoredFieldsWriter ignores the segment suffix if writing aborted
> ---
>
> Key: LUCENE-4613
> URL: https://issues.apache.org/jira/browse/LUCENE-4613
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: core/codecs
>Affects Versions: 4.1
>Reporter: Renaud Delbru
> Fix For: 4.1
>
> Attachments: LUCENE-4613.patch
>
>
> If the writing is aborted, CompressingStoredFieldsWriter does not remove 
> partially-written files as the segment suffix is not taken into consideration.




[jira] [Commented] (LUCENE-4612) ant nightly-smoke leaves a dirty checkout

2012-12-10 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4612?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13528191#comment-13528191
 ] 

Robert Muir commented on LUCENE-4612:
-

fakeReleaseTmp won't work, it doesn't yet exist at the time the script runs. If 
we make it, the script complains.
Trying another place in build/ 

> ant nightly-smoke leaves a dirty checkout
> -
>
> Key: LUCENE-4612
> URL: https://issues.apache.org/jira/browse/LUCENE-4612
> Project: Lucene - Core
>  Issue Type: Bug
>Reporter: Robert Muir
>
> ?   dev-tools/scripts/__pycache__
> Can we not leave this around?




[jira] [Created] (SOLR-4161) deadlock in TestReplicationHandler

2012-12-10 Thread Hoss Man (JIRA)
Hoss Man created SOLR-4161:
--

 Summary: deadlock in TestReplicationHandler
 Key: SOLR-4161
 URL: https://issues.apache.org/jira/browse/SOLR-4161
 Project: Solr
  Issue Type: Bug
Reporter: Hoss Man


while testing out another patch i noticed "stalled" heartbeat messages getting 
logged by TestReplicationHandler.test and started taking some stack traces to 
see if it was in the code i was working on.

it's not, so i suspect it's unrelated to the changes i'm looking at, but i did 
notice that there was a full on deadlock reported, so i wanted to make sure it 
got tracked.
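For reference, a JVM-level deadlock like the one in these dumps can also be confirmed programmatically rather than by eyeballing stack traces; this sketch uses only the standard ThreadMXBean API and is not Solr-specific:

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

// Detects monitor/ownable-synchronizer deadlocks in the running JVM --
// the same condition a "Found one Java-level deadlock" jstack report shows.
public class DeadlockCheck {
    public static long[] findDeadlockedThreads() {
        ThreadMXBean bean = ManagementFactory.getThreadMXBean();
        return bean.findDeadlockedThreads(); // null when no deadlock exists
    }

    public static void main(String[] args) {
        long[] ids = findDeadlockedThreads();
        if (ids == null) {
            System.out.println("no deadlock detected");
        } else {
            ThreadMXBean bean = ManagementFactory.getThreadMXBean();
            // Include locked monitors and synchronizers in the report.
            for (ThreadInfo info : bean.getThreadInfo(ids, true, true)) {
                System.out.println(info);
            }
        }
    }
}
```

Running this from a watchdog thread (or a test rule) turns a hung test into an immediate, attributable failure instead of a timeout.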




[jira] [Updated] (SOLR-4161) deadlock in TestReplicationHandler

2012-12-10 Thread Hoss Man (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-4161?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man updated SOLR-4161:
---

Attachment: dump5.txt
dump4.txt
dump3.txt
dump2.txt

multiple dump files taken a few seconds apart ... i haven't looked at them in a 
lot of depth, but given the deadlock state i suspect there aren't a lot of 
differences.

> deadlock in TestReplicationHandler
> --
>
> Key: SOLR-4161
> URL: https://issues.apache.org/jira/browse/SOLR-4161
> Project: Solr
>  Issue Type: Bug
>Reporter: Hoss Man
> Attachments: dump2.txt, dump3.txt, dump4.txt, dump5.txt
>
>
> while testing out another patch i noticed "stalled" heartbeat messages 
> getting logged by TestReplicationHandler.test and started taking some stack 
> traces to see if it was in the code i was working on.
> it's not, so i suspect it's unrelated to the changes i'm looking at, but i 
> did notice that there was a full on deadlock reported, so i wanted to make 
> sure it got tracked.




[jira] [Created] (LUCENE-4614) Create dev-tools/eclipse/dot.classpath automatically

2012-12-10 Thread Uwe Schindler (JIRA)
Uwe Schindler created LUCENE-4614:
-

 Summary: Create dev-tools/eclipse/dot.classpath automatically
 Key: LUCENE-4614
 URL: https://issues.apache.org/jira/browse/LUCENE-4614
 Project: Lucene - Core
  Issue Type: Improvement
  Components: general/build
Reporter: Uwe Schindler
Assignee: Uwe Schindler
 Fix For: 4.1, 5.0


It is a pain to keep the file up-to-date. As it is pure XML we can use a template 
to produce it automatically. The same trick as for creating index.html in the 
docs is used.

The patch will produce it automatically from filesets/dirsets in ant. It is 
still a pain with the duplicate JARs, but maybe we can fix that later.
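The templating idea can be sketched outside of ant: scan the lib directories and emit one classpathentry element per JAR, instead of maintaining the list by hand. The paths and output shape below are illustrative, not the actual patch:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Sketch: generate Eclipse classpath entries from the JARs found on disk,
// rather than keeping dot.classpath up to date manually.
public class ClasspathGen {
    public static List<String> entriesFor(Path root) throws IOException {
        try (Stream<Path> files = Files.walk(root)) {
            return files
                .filter(p -> p.toString().endsWith(".jar"))
                .sorted() // deterministic output, diff-friendly
                .map(p -> "  <classpathentry kind=\"lib\" path=\""
                          + root.relativize(p) + "\"/>")
                .collect(Collectors.toList());
        }
    }

    public static void main(String[] args) throws IOException {
        // Demo against a throwaway directory with one fake JAR.
        Path tmp = Files.createTempDirectory("cpgen");
        Files.createFile(tmp.resolve("foo.jar"));
        for (String entry : entriesFor(tmp)) {
            System.out.println(entry);
        }
    }
}
```

An ant fileset/dirset would feed the same kind of scan into the XML template; the duplicate-JAR problem mentioned above would still need de-duplication on top of this.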



