Re: Welcome Benson Margulies as Lucene/Solr committer!

2014-02-03 Thread David Smiley (@MITRE.org)
Awesome to have another committer, and in my neck of the woods too. Welcome!




-
 Author: http://www.packtpub.com/apache-solr-3-enterprise-search-server/book
--
View this message in context: 
http://lucene.472066.n3.nabble.com/Welcome-Benson-Margulies-as-Lucene-Solr-committer-tp4113502p4115165.html
Sent from the Lucene - Java Developer mailing list archive at Nabble.com.

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: Where to add custom mbean

2014-02-03 Thread Otis Gospodnetic
Hi Greg,

This sounds overly complex to me.  Extending a RHB to aid you in monitoring
doesn't feel right.  Have you considered using monitoring tools that can
provide you with aggregated views and such?  Have a look at
http://sematext.com/spm , which can do that for you and much more without
you having to hack Solr.

Otis
--
Performance Monitoring * Log Analytics * Search Analytics
Solr & Elasticsearch Support * http://sematext.com/


On Mon, Feb 3, 2014 at 3:19 PM, Greg W  wrote:

> I've written a custom MBean that aggregates data from all the
> RequestHandler MBeans in a JVM to provide aggregate statistics for easier
> monitoring. Currently I'm ensuring it gets run by extending
> RequestHandlerBase and including the class as a request handler in
> solrconfig.xml. I don't think this is the ideal way of getting this code to
> run, but as a quick hack it got the job done. If I wanted this class to run
> and register the MBean at a more appropriate place, earlier in Solr's
> initialization, where would that be?
>
> Thanks,
> Greg
>


[jira] [Commented] (LUCENE-5425) Make creation of FixedBitSet in FacetsCollector overridable

2014-02-03 Thread Lei Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5425?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13890040#comment-13890040
 ] 

Lei Wang commented on LUCENE-5425:
--

I tried luceneutil but ran into a problem: I cannot get the same numbers from 
two identical checkouts of trunk. Even when both sides are trunk, the numbers 
differ:
Report after iter 19:
            Task  QPS baseline   StdDev  QPS my_modified_version   StdDev  Pct diff
       OrHighMed         74.15   (7.1%)                    71.24   (8.3%)   -3.9% ( -18% -   12%)
         LowTerm        515.68  (15.1%)                   496.20  (12.3%)   -3.8% ( -27% -   27%)
    OrNotHighLow         72.22   (8.2%)                    70.36   (7.6%)   -2.6% ( -17% -   14%)
    OrNotHighMed         79.01   (7.3%)                    77.43   (8.4%)   -2.0% ( -16% -   14%)
   OrHighNotHigh         38.66   (4.5%)                    37.90   (6.4%)   -2.0% ( -12% -    9%)
         Respell         51.21   (7.1%)                    50.23   (6.5%)   -1.9% ( -14% -   12%)
       MedPhrase         69.67   (7.5%)                    68.35   (7.4%)   -1.9% ( -15% -   14%)
       OrHighLow         67.24   (7.8%)                    66.00   (9.0%)   -1.8% ( -17% -   16%)
          Fuzzy1         27.37   (5.7%)                    26.96   (5.5%)   -1.5% ( -11% -   10%)
          Fuzzy2         37.21   (3.8%)                    36.71   (5.6%)   -1.3% ( -10% -    8%)
 MedSloppyPhrase          9.94   (5.4%)                     9.83   (3.9%)   -1.1% (  -9% -    8%)
     LowSpanNear          8.60   (3.9%)                     8.54   (3.8%)   -0.7% (  -8% -    7%)
     AndHighHigh         40.23   (3.1%)                    40.03   (2.5%)   -0.5% (  -5% -    5%)
        HighTerm         76.07   (9.0%)                    75.96   (9.1%)   -0.2% ( -16% -   19%)
      OrHighHigh         11.62   (3.0%)                    11.62   (4.8%)   -0.1% (  -7% -    7%)
          IntNRQ          9.51   (3.9%)                     9.51   (8.3%)    0.0% ( -11% -   12%)
      HighPhrase         25.61   (7.0%)                    25.63   (7.7%)    0.1% ( -13% -   15%)
 LowSloppyPhrase         30.21   (5.2%)                    30.24   (4.3%)    0.1% (  -8% -   10%)
        PKLookup        212.03   (9.0%)                   212.25  (11.5%)    0.1% ( -18% -   22%)
   OrNotHighHigh         27.75   (3.5%)                    27.80   (6.5%)    0.2% (  -9% -   10%)
    OrHighNotMed         58.14   (5.9%)                    58.27   (8.3%)    0.2% ( -13% -   15%)
     MedSpanNear         22.73   (3.7%)                    22.80   (5.1%)    0.3% (  -8% -    9%)
        Wildcard         42.84   (5.0%)                    42.97   (5.4%)    0.3% (  -9% -   11%)
HighSloppyPhrase         23.99   (7.4%)                    24.08   (6.3%)    0.4% ( -12% -   15%)
      AndHighLow        625.62   (6.6%)                   629.52  (10.5%)    0.6% ( -15% -   18%)
         Prefix3         77.68   (7.2%)                    78.17   (6.2%)    0.6% ( -11% -   15%)
       LowPhrase         14.58   (4.7%)                    14.77   (5.0%)    1.3% (  -8% -   11%)
    HighSpanNear         11.84   (4.3%)                    11.99   (5.2%)    1.3% (  -7% -   11%)
    OrHighNotLow         66.04   (8.4%)                    67.28   (9.2%)    1.9% ( -14% -   21%)
      AndHighMed         66.55   (4.3%)                    67.91   (6.2%)    2.1% (  -8% -   13%)
         MedTerm        139.78   (9.5%)                   145.63  (10.3%)    4.2% ( -14% -   26%)

With the patch the numbers also differ, but by no more than the 
trunk-vs-trunk numbers:
Report after iter 19:
            Task  QPS baseline   StdDev  QPS my_modified_version   StdDev  Pct diff
      AndHighLow        730.30  (11.5%)                   700.95  (10.6%)   -4.0% ( -23% -   20%)
         LowTerm        520.94  (10.6%)                   504.25  (11.4%)   -3.2% ( -22% -   21%)
          Fuzzy1         57.55   (5.1%)                    56.26   (4.8%)   -2.2% ( -11% -    8%)
         Respell         35.85   (4.7%)                    35.18   (4.1%)   -1.9% ( -10% -    7%)
   OrHighNotHigh         37.77   (7.3%)                    37.19   (5.9%)   -1.5% ( -13% -   12%)
HighSloppyPhrase         12.30   (7.5%)                    12.17   (7.7%)   -1.1% ( -15% -   15%)
      HighPhrase         29.38   (5.2%)                    29.06   (4.3%)   -1.1% ( -10% -    8%)
    OrNotHighMed         25.93   (6.2%)                    25.68   (5.5%)   -1.0% ( -11% -   11%)
   OrNotHighHigh         19.72   (5.9%)                    19.53   (4.9%)   -0.9% ( -11% -   10%)
          Fuzzy2         11.30   (3.6%)                    11.24   (5.1%)   -0.6% (  -8% -    8%)
        PKLookup        218.16   (8.6%)                   217.53   (9.3%)   -0.3% ( -16% -   19%)
 LowSloppyPhrase         43.09   (5.6%)                    43.00   (3.5%)   -0.2% (  -8% -    9%)

[jira] [Commented] (LUCENE-5432) EliasFanoEncoder number of index entry bits is off by 1 for powers of 2 of max index entry

2014-02-03 Thread Paul Elschot (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5432?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13889916#comment-13889916
 ] 

Paul Elschot commented on LUCENE-5432:
--

See https://github.com/apache/lucene-solr/pull/28

> EliasFanoEncoder number of index entry bits is off by 1 for powers of 2 of 
> max index entry
> --
>
> Key: LUCENE-5432
> URL: https://issues.apache.org/jira/browse/LUCENE-5432
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: core/other
>Reporter: Paul Elschot
>Priority: Minor
> Fix For: 5.0
>
>




--
This message was sent by Atlassian JIRA
(v6.1.5#6160)




lucene-solr pull request: Correct number of bits for an index entry.

2014-02-03 Thread PaulElschot
GitHub user PaulElschot opened a pull request:

https://github.com/apache/lucene-solr/pull/28

Correct number of bits for an index entry.

This will only occur when the maximum index entry is a power of 2.
Currently there is no failing test, but it is easy to see from the code.
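
For readers without the patch at hand, here is a minimal sketch of how an off-by-one of this kind typically shows up only at powers of 2. It is not the EliasFanoEncoder code: the class and method names are hypothetical, and which expression the pull request actually changes is not shown in this message. The sketch compares the number of bits needed to store any value in [0, maxValue] with a ceil(log2(maxValue)) computation, which comes out one bit short exactly when maxValue is a power of 2.

```java
public class IndexEntryBits {

    /** Bits needed to represent every value in [0, maxValue], for maxValue >= 0. */
    public static int bitsRequired(long maxValue) {
        return maxValue == 0 ? 1 : 64 - Long.numberOfLeadingZeros(maxValue);
    }

    /** Hypothetical under-counting variant: ceil(log2(maxValue)) for maxValue >= 1. */
    public static int ceilLog2(long maxValue) {
        return maxValue <= 1 ? 0 : 64 - Long.numberOfLeadingZeros(maxValue - 1);
    }

    public static void main(String[] args) {
        // The two agree for 7 and 9 but differ for the powers of 2 (8 and 16).
        for (long max : new long[] {7, 8, 9, 16}) {
            System.out.println("max=" + max
                + " bitsRequired=" + bitsRequired(max)
                + " ceilLog2=" + ceilLog2(max));
        }
    }
}
```

Storing the value 8 (binary 1000) takes 4 bits, while ceil(log2(8)) is 3, which is why a test that never hits an exact power of 2 would not catch the problem.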

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/PaulElschot/lucene-solr ef-indexbits-bug

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/lucene-solr/pull/28.patch


commit a768fe071df7a62f5e4f087bac705fb4089bf013
Author: Paul Elschot 
Date:   2014-02-03T20:25:58Z

Correct number of bits for an index entry. Would fail for powers of 2 only.







[jira] [Created] (LUCENE-5432) EliasFanoEncoder number of index entry bits is off by 1 for powers of 2 of max index entry

2014-02-03 Thread Paul Elschot (JIRA)
Paul Elschot created LUCENE-5432:


 Summary: EliasFanoEncoder number of index entry bits is off by 1 
for powers of 2 of max index entry
 Key: LUCENE-5432
 URL: https://issues.apache.org/jira/browse/LUCENE-5432
 Project: Lucene - Core
  Issue Type: Bug
  Components: core/other
Reporter: Paul Elschot
Priority: Minor
 Fix For: 5.0









[jira] [Commented] (LUCENE-5416) Performance of a FixedBitSet variant that uses Long.numberOfTrailingZeros()

2014-02-03 Thread Paul Elschot (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13889910#comment-13889910
 ] 

Paul Elschot commented on LUCENE-5416:
--

I would normally expect that the performance is independent of the presence of 
the advanceToJustBefore() method of DocBlockIterator.
This method happens to be there because I needed it for LUCENE-5092, and I did 
not bother to remove it for the performance measurements.
I'm sorry for the confusion about this.

The FixedBitSetDBI here does not always make nextDoc() faster, in fact (for me) 
the Long.numberOfTrailingZeros() implementation of nextDoc() in FixedBitSetDBI 
here is up to 5 times slower for load factors above 0.25. Below that the 
nextDoc() here is up to 2.5 times faster.

The idea is that using Long.numberOfTrailingZeros() appears to be faster for 
advance(), and also for nextDoc() up to a load factor of about 0.25.

Wasn't OpenBitSetIterator made before Long.numberOfTrailingZeros() was 
available/intrinsified?
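
To make the technique concrete, here is a minimal sketch of a next-set-bit scan over a long[] that uses Long.numberOfTrailingZeros(). It is not the FixedBitSetDBI code from the pull request; the class and method names are hypothetical, and it assumes a non-negative start index. It only shows the general shape of the approach being benchmarked.

```java
public class TrailingZerosBitSet {

    /**
     * Returns the index of the first set bit at or after {@code from},
     * or -1 if there is none. Assumes {@code from >= 0}.
     */
    public static int nextSetBit(long[] words, int from) {
        int wordIndex = from >> 6;
        if (wordIndex >= words.length) {
            return -1;
        }
        // Mask off the bits below 'from'; Java uses only the low 6 bits of the shift count.
        long word = words[wordIndex] & (-1L << from);
        while (true) {
            if (word != 0) {
                return (wordIndex << 6) + Long.numberOfTrailingZeros(word);
            }
            if (++wordIndex == words.length) {
                return -1;
            }
            word = words[wordIndex];
        }
    }

    public static void main(String[] args) {
        long[] words = {0b1010L, 0L, 1L << 63};
        // Iterate all set bits, the way a nextDoc() loop would.
        for (int i = nextSetBit(words, 0); i >= 0; i = nextSetBit(words, i + 1)) {
            System.out.println(i);
        }
    }
}
```

With a sparse set, most calls skip whole zero words and pay for a single numberOfTrailingZeros() at the end; with a dense set, every call pays for the mask and the bit scan, which is one plausible reading of why the measured advantage disappears at higher load factors.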


> Performance of a FixedBitSet variant that uses Long.numberOfTrailingZeros()
> ---
>
> Key: LUCENE-5416
> URL: https://issues.apache.org/jira/browse/LUCENE-5416
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: core/search
>Affects Versions: 5.0
>Reporter: Paul Elschot
>Priority: Minor
> Fix For: 5.0
>
>
> On my machine the current byte index used in OpenBitSetIterator is slower 
> than Long.numberOfTrailingZeros() for advance().
> The pull request contains the code for benchmarking this taken from an early 
> stage of DocBlocksIterator.
> In case the benchmark shows improvements on more machines, well, we know what 
> to do...






Where to add custom mbean

2014-02-03 Thread Greg W
I've written a custom MBean that aggregates data from all the
RequestHandler MBeans in a JVM to provide aggregate statistics for easier
monitoring. Currently I'm ensuring it gets run by extending
RequestHandlerBase and including the class as a request handler in
solrconfig.xml. I don't think this is the ideal way of getting this code to
run, but as a quick hack it got the job done. If I wanted this class to run
and register the MBean at a more appropriate place, earlier in Solr's
initialization, where would that be?

Thanks,
Greg
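
For context, the kind of aggregating bean described above can be sketched with plain JMX, independent of where Solr would load it. Everything here is an assumption for illustration: the "demo.sketch" domain, the bean classes, and the "Requests" attribute are hypothetical stand-ins, not Solr's actual object names or statistics. The sketch registers two fake handler beans and one aggregate bean that sums an attribute over every bean matching an ObjectName pattern.

```java
import java.lang.management.ManagementFactory;
import javax.management.MBeanServer;
import javax.management.ObjectName;

public class AggregateMBeanSketch {

    /** Hypothetical per-handler bean; stands in for whatever the real handlers expose. */
    public interface HandlerStatsMXBean {
        long getRequests();
    }

    public static class HandlerStats implements HandlerStatsMXBean {
        private final long requests;
        public HandlerStats(long requests) { this.requests = requests; }
        @Override public long getRequests() { return requests; }
    }

    /** Aggregate bean: sums the "Requests" attribute of every bean matching a pattern. */
    public interface AggregateMXBean {
        long getTotalRequests();
    }

    public static class Aggregate implements AggregateMXBean {
        private final MBeanServer server;
        private final ObjectName pattern;

        public Aggregate(MBeanServer server, ObjectName pattern) {
            this.server = server;
            this.pattern = pattern;
        }

        @Override public long getTotalRequests() {
            long total = 0;
            try {
                // Resolve the pattern on every read so beans registered later are included.
                for (ObjectName name : server.queryNames(pattern, null)) {
                    total += ((Number) server.getAttribute(name, "Requests")).longValue();
                }
            } catch (Exception e) {
                throw new IllegalStateException(e);
            }
            return total;
        }
    }

    private static void register(MBeanServer server, Object bean, String name) throws Exception {
        ObjectName objectName = new ObjectName(name);
        if (server.isRegistered(objectName)) {
            server.unregisterMBean(objectName); // keep the demo repeatable
        }
        server.registerMBean(bean, objectName);
    }

    /** Registers two demo handler beans plus the aggregate, then reads the total back via JMX. */
    public static long demo() {
        try {
            MBeanServer server = ManagementFactory.getPlatformMBeanServer();
            register(server, new HandlerStats(3), "demo.sketch:type=handler,name=select");
            register(server, new HandlerStats(5), "demo.sketch:type=handler,name=update");
            register(server, new Aggregate(server, new ObjectName("demo.sketch:type=handler,*")),
                     "demo.sketch:type=aggregate");
            return (Long) server.getAttribute(
                new ObjectName("demo.sketch:type=aggregate"), "TotalRequests");
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("total requests = " + demo());
    }
}
```

Because the aggregate resolves its pattern at read time, it does not need to be registered before or after the handlers it summarizes, which may make the exact initialization hook less critical than the question assumes.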


[jira] [Updated] (SOLR-5647) The example in example-schemaless doesn't load libs properly

2014-02-03 Thread Shawn Heisey (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-5647?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shawn Heisey updated SOLR-5647:
---

Attachment: SOLR-5647.patch

Patch with proper attribution.  Will commit after checking tests and precommit.


> The example in example-schemaless doesn't load libs properly
> 
>
> Key: SOLR-5647
> URL: https://issues.apache.org/jira/browse/SOLR-5647
> Project: Solr
>  Issue Type: Bug
>Affects Versions: 4.6
>Reporter: Shawn Heisey
>Assignee: Shawn Heisey
>  Labels: patch
> Fix For: 5.0, 4.7
>
> Attachments: SOLR-5647.patch, SOLR-5647.patch, SOLR-5647.patch, 
> SOLR-5647.patch
>
>
> When starting the example with example-schemaless, all the "lib" directives 
> in the config aren't working, because they are missing one "../" instance.
> Noticed by IRC user Pilate.






[jira] [Assigned] (SOLR-2245) MailEntityProcessor Update

2014-02-03 Thread Shalin Shekhar Mangar (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-2245?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shalin Shekhar Mangar reassigned SOLR-2245:
---

Assignee: Shalin Shekhar Mangar

> MailEntityProcessor Update
> --
>
> Key: SOLR-2245
> URL: https://issues.apache.org/jira/browse/SOLR-2245
> Project: Solr
>  Issue Type: Improvement
>  Components: contrib - DataImportHandler
>Affects Versions: 1.4, 1.4.1
>Reporter: Peter Sturge
>Assignee: Shalin Shekhar Mangar
>Priority: Minor
> Fix For: 4.7
>
> Attachments: SOLR-2245.patch, SOLR-2245.patch, SOLR-2245.patch, 
> SOLR-2245.zip
>
>
> This patch addresses a number of issues in the MailEntityProcessor 
> contrib-extras module.
> The changes are outlined here:
> * Added an 'includeContent' entity attribute to allow specifying content to 
> be included independently of processing attachments
>  e.g.  
> would include message content, but not attachment content
> * Added a synonym called 'processAttachments', which is synonymous to the 
> mis-spelled (and singular) 'processAttachement' property. This property 
> functions the same as processAttachement. Default= 'true' - if either is 
> false, then attachments are not processed. Note that only one of these should 
> really be specified in a given  tag.
> * Added a FLAGS.NONE value, so that if an email has no flags (i.e. it is 
> unread, not deleted etc.), there is still a property value stored in the 
> 'flags' field (the value is the string "none")
> Note: there is a potential backward compat issue with FLAGS.NONE for clients 
> that expect the absence of the 'flags' field to mean 'Not read'. I reckon 
> this would be extremely rare, and it is inadvisable in any case as 
> user flags can be arbitrarily set, so fixing it up now will ensure future 
> client access will be consistent.
> * The folder name of an email is now included as a field called 'folder' 
> (e.g. folder=INBOX.Sent). This is quite handy in search/post-indexing 
> processing
> * The addPartToDocument() method that processes attachments is significantly 
> re-written, as there looked to be no real way the existing code would ever 
> actually process attachment content and add it to the row data
> Tested on the 3.x trunk with a number of popular imap servers.






[jira] [Updated] (SOLR-5688) Allow updating of soft and hard commit parameters using HTTP API

2014-02-03 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/SOLR-5688?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rafał Kuć updated SOLR-5688:


Attachment: SOLR-5688-single_api_call.patch

A new patch is attached. The handler now allows setting all four values at 
once, so there is no need for separate calls. 

> Allow updating of soft and hard commit parameters using HTTP API
> 
>
> Key: SOLR-5688
> URL: https://issues.apache.org/jira/browse/SOLR-5688
> Project: Solr
>  Issue Type: Improvement
>Affects Versions: 4.6.1
>Reporter: Rafał Kuć
> Fix For: 5.0
>
> Attachments: SOLR-5688-single_api_call.patch, SOLR-5688.patch
>
>
> Right now, to update the values (max time and max docs) for hard and soft 
> autocommits one has to alter the configuration and reload the core. I think 
> it may be nice, to expose an API to do that in a way, that the configuration 
> is not updated, so the change is not persistent. 
> There may be various reasons for doing that - for example one may know that 
> the application will send large amount of data and want to prepare for that. 






[jira] [Commented] (SOLR-5689) On reconnect, ZkController cancels election on first context rather than latest

2014-02-03 Thread Daniel Collins (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5689?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13889607#comment-13889607
 ] 

Daniel Collins commented on SOLR-5689:
--

DOH, my bad, missed that line, too used to expecting whitespace line between 
bracket and first code statement, must be a bug in my brain's Java parser.

> On reconnect, ZkController cancels election on first context rather than 
> latest
> ---
>
> Key: SOLR-5689
> URL: https://issues.apache.org/jira/browse/SOLR-5689
> Project: Solr
>  Issue Type: Bug
>Affects Versions: 4.6.1, 5.0, 4.7
>Reporter: Gregory Chanan
>
> I haven't tested this yet, so I could be wrong, but this is my reading of the 
> code:
> During init:
> {code}
> ElectionContext context = new OverseerElectionContext(zkClient, overseer, 
> getNodeName());
> overseerElector.setup(context);
> overseerElector.joinElection(context, false);
> {code}
> On reconnect:
> {code}
> ElectionContext context = new OverseerElectionContext(zkClient,overseer, 
> getNodeName());
>   
> ElectionContext prevContext = overseerElector.getContext();
> if (prevContext != null) {
>   prevContext.cancelElection();
> }
>   
> overseerElector.joinElection(context, true);
> {code}
> setup doesn't appear to be called on reconnect, so the new context is never 
> set and the first context gets cancelled over and over.
> A call to overseerElector.setup(context); before joinElection in the 
> reconnect case would address this.






[jira] [Commented] (LUCENE-5425) Make creation of FixedBitSet in FacetsCollector overridable

2014-02-03 Thread John Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5425?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13889604#comment-13889604
 ] 

John Wang commented on LUCENE-5425:
---

I named it newHitSet so that, if we do decide to move to an abstract doc set, 
we won't need to (or forget to) change the method name.
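
As an illustration of the pattern under discussion, here is a minimal sketch of an overridable factory method for the per-query hit set. It is not the attached patch: the classes are hypothetical and java.util.BitSet stands in for FixedBitSet. The base class keeps the current allocate-per-query behavior, and a subclass overrides newHitSet to reuse one instance.

```java
import java.util.BitSet;

public class HitSetFactorySketch {

    /** Stand-in for a collector; allocates a fresh bit set per query by default. */
    public static class Collector {
        protected BitSet newHitSet(int maxDoc) {
            return new BitSet(maxDoc);
        }

        public BitSet collect(int maxDoc, int... docs) {
            BitSet hits = newHitSet(maxDoc);
            for (int doc : docs) {
                hits.set(doc);
            }
            return hits;
        }
    }

    /** Hypothetical override that reuses a single bit set instead of allocating per query. */
    public static class ReusingCollector extends Collector {
        private BitSet cached;
        public int allocations;

        @Override protected BitSet newHitSet(int maxDoc) {
            if (cached == null || cached.size() < maxDoc) {
                cached = new BitSet(maxDoc);
                allocations++;
            } else {
                cached.clear(); // wipe the previous query's hits
            }
            return cached;
        }
    }

    public static void main(String[] args) {
        ReusingCollector collector = new ReusingCollector();
        collector.collect(100, 1, 2);
        BitSet second = collector.collect(100, 5);
        System.out.println("hits=" + second + " allocations=" + collector.allocations);
    }
}
```

Reuse like this is only safe when the previous query's result is no longer referenced, since the second call clears the same object the first call returned. That trade-off is the sort of decision an overridable hook leaves to the application.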

> Make creation of FixedBitSet in FacetsCollector overridable
> ---
>
> Key: LUCENE-5425
> URL: https://issues.apache.org/jira/browse/LUCENE-5425
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: modules/facet
>Affects Versions: 4.6
>Reporter: John Wang
> Attachments: facetscollector.patch, facetscollector.patch, 
> fixbitset.patch
>
>
> In FacetsCollector, the bits in MatchingDocs are allocated per query. 
> For large indexes where maxDoc is large, creating a bitset of maxDoc bits 
> is expensive and creates a lot of garbage.
> The attached patch makes this allocation customizable while maintaining 
> the current behavior.






[JENKINS] Lucene-Solr-4.x-Linux (32bit/jdk1.8.0-ea-b124) - Build # 9242 - Still Failing!

2014-02-03 Thread Policeman Jenkins Server
Build: http://jenkins.thetaphi.de/job/Lucene-Solr-4.x-Linux/9242/
Java: 32bit/jdk1.8.0-ea-b124 -server -XX:+UseSerialGC

All tests passed

Build Log:
[...truncated 49889 lines...]
BUILD FAILED
/mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux/build.xml:459: The following 
error occurred while executing this line:
/mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux/build.xml:398: The following 
error occurred while executing this line:
/mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux/extra-targets.xml:87: The 
following error occurred while executing this line:
/mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux/extra-targets.xml:185: The 
following files are missing svn:eol-style (or binary svn:mime-type):
* 
./lucene/core/src/test/org/apache/lucene/index/TestDocInverterPerFieldErrorInfo.java

Total time: 51 minutes 0 seconds
Build step 'Invoke Ant' marked build as failure
Description set: Java: 32bit/jdk1.8.0-ea-b124 -server -XX:+UseSerialGC
Archiving artifacts
Recording test results
Email was triggered for: Failure
Sending email for trigger: Failure




[jira] [Commented] (LUCENE-5416) Performance of a FixedBitSet variant that uses Long.numberOfTrailingZeros()

2014-02-03 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13889572#comment-13889572
 ] 

Michael McCandless commented on LUCENE-5416:


bq. Would it be correct to conclude from that and from the measurements here 
that faceting involves nextDoc() on high bit densities?

Yes, this is typically the "hardest" case for faceting, so if we have some 
ideas on how to make that iteration faster, that would be great.  But I can't 
tell here what the idea is?  Can we somehow separate it out from the 
DocBlockIterator?

> Performance of a FixedBitSet variant that uses Long.numberOfTrailingZeros()
> ---
>
> Key: LUCENE-5416
> URL: https://issues.apache.org/jira/browse/LUCENE-5416
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: core/search
>Affects Versions: 5.0
>Reporter: Paul Elschot
>Priority: Minor
> Fix For: 5.0
>
>
> On my machine the current byte index used in OpenBitSetIterator is slower 
> than Long.numberOfTrailingZeros() for advance().
> The pull request contains the code for benchmarking this taken from an early 
> stage of DocBlocksIterator.
> In case the benchmark shows improvements on more machines, well, we know what 
> to do...






[jira] [Commented] (SOLR-5689) On reconnect, ZkController cancels election on first context rather than latest

2014-02-03 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5689?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13889551#comment-13889551
 ] 

Mark Miller commented on SOLR-5689:
---

It also sets the latest context on the elector though - which we want to make 
sure is always the latest so that if for some reason we are asked to join the 
election again and are already participating, we cancel our participation first.
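
The two readings of the code in this thread can be compared with a toy model. This is not ZkController or the real elector: every class below is hypothetical, and the boolean flag simply switches between "joinElection does not record the context" (the reading in the issue description) and "joinElection records the latest context" (the behavior described in the comment above).

```java
import java.util.ArrayList;
import java.util.List;

public class ElectionContextSketch {

    static class Context {
        final int id;
        Context(int id) { this.id = id; }
    }

    static class Elector {
        private final boolean joinRecordsContext;
        private Context context;

        Elector(boolean joinRecordsContext) { this.joinRecordsContext = joinRecordsContext; }

        void setup(Context c) { context = c; }

        Context getContext() { return context; }

        void joinElection(Context c) {
            if (joinRecordsContext) {
                context = c; // remember the latest context so the next reconnect cancels it
            }
        }
    }

    /** Runs init plus a number of reconnects; returns the id of each context that got cancelled. */
    public static List<Integer> run(boolean joinRecordsContext, int reconnects) {
        Elector elector = new Elector(joinRecordsContext);
        List<Integer> cancelled = new ArrayList<>();

        Context first = new Context(0);
        elector.setup(first);
        elector.joinElection(first);

        for (int i = 1; i <= reconnects; i++) {
            Context next = new Context(i);
            Context prev = elector.getContext();
            if (prev != null) {
                cancelled.add(prev.id); // stands in for prevContext.cancelElection()
            }
            elector.joinElection(next);
        }
        return cancelled;
    }

    public static void main(String[] args) {
        System.out.println("join ignores context: " + run(false, 3));
        System.out.println("join records context: " + run(true, 3));
    }
}
```

In the first mode the initial context is cancelled on every reconnect; in the second, each reconnect cancels the context created by the previous one, with no extra setup call needed.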

> On reconnect, ZkController cancels election on first context rather than 
> latest
> ---
>
> Key: SOLR-5689
> URL: https://issues.apache.org/jira/browse/SOLR-5689
> Project: Solr
>  Issue Type: Bug
>Affects Versions: 4.6.1, 5.0, 4.7
>Reporter: Gregory Chanan
>
> I haven't tested this yet, so I could be wrong, but this is my reading of the 
> code:
> During init:
> {code}
> ElectionContext context = new OverseerElectionContext(zkClient, overseer, 
> getNodeName());
> overseerElector.setup(context);
> overseerElector.joinElection(context, false);
> {code}
> On reconnect:
> {code}
> ElectionContext context = new OverseerElectionContext(zkClient,overseer, 
> getNodeName());
>   
> ElectionContext prevContext = overseerElector.getContext();
> if (prevContext != null) {
>   prevContext.cancelElection();
> }
>   
> overseerElector.joinElection(context, true);
> {code}
> setup doesn't appear to be called on reconnect, so the new context is never 
> set and the first context gets cancelled over and over.
> A call to overseerElector.setup(context); before joinElection in the 
> reconnect case would address this.






[jira] [Commented] (LUCENE-5431) Add FSLockFactory.toString()

2014-02-03 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13889547#comment-13889547
 ] 

Michael McCandless commented on LUCENE-5431:


+1 for patch and to just use .getSimpleName() in the toString.
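
A minimal sketch of what such a toString could look like follows. The class name and output format are hypothetical, since the format the patch actually uses is not shown in this message; the sketch only demonstrates getSimpleName() combined with the lock directory.

```java
import java.io.File;

public class LockFactoryToStringSketch {

    /** Hypothetical stand-in for a file-system lock factory. */
    public static class DemoLockFactory {
        private final File lockDir;

        public DemoLockFactory(File lockDir) {
            this.lockDir = lockDir;
        }

        @Override public String toString() {
            // getSimpleName() drops the package (and enclosing class) prefix.
            return getClass().getSimpleName() + "(lockDir=" + lockDir + ")";
        }
    }

    public static void main(String[] args) {
        System.out.println(new DemoLockFactory(new File("locks")));
    }
}
```

Compared with the default Object.toString(), this keeps the output stable across JVM runs, which matters if anything derives an identifier from it.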

> Add FSLockFactory.toString()
> 
>
> Key: LUCENE-5431
> URL: https://issues.apache.org/jira/browse/LUCENE-5431
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: core/store
>Reporter: Shai Erera
>Assignee: Shai Erera
>Priority: Minor
> Attachments: LUCENE-5431.patch
>
>
> FSLockFactory doesn't override toString, which causes Dir.toString() to print 
> the class.name@instance. I think it would be better if it printed e.g. the 
> lockDir.
> I added it but TestCrashCausesCorruptIndex failed because it declares a 
> Directory which doesn't override getLockID(), which returns toString(). I 
> changed that Directory to extend FilterDirectory, and fixed FilterDirectory 
> to override getLockID() to call in.getLockID().
> Will attach a patch shortly.






[JENKINS] Lucene-Solr-4.x-Linux (32bit/jdk1.7.0_60-ea-b03) - Build # 9241 - Failure!

2014-02-03 Thread Policeman Jenkins Server
Build: http://jenkins.thetaphi.de/job/Lucene-Solr-4.x-Linux/9241/
Java: 32bit/jdk1.7.0_60-ea-b03 -client -XX:+UseParallelGC

All tests passed

Build Log:
[...truncated 50554 lines...]
BUILD FAILED
/mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux/build.xml:459: The following 
error occurred while executing this line:
/mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux/build.xml:398: The following 
error occurred while executing this line:
/mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux/extra-targets.xml:87: The 
following error occurred while executing this line:
/mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux/extra-targets.xml:185: The 
following files are missing svn:eol-style (or binary svn:mime-type):
* 
./lucene/core/src/test/org/apache/lucene/index/TestDocInverterPerFieldErrorInfo.java

Total time: 59 minutes 44 seconds
Build step 'Invoke Ant' marked build as failure
Description set: Java: 32bit/jdk1.7.0_60-ea-b03 -client -XX:+UseParallelGC
Archiving artifacts
Recording test results
Email was triggered for: Failure
Sending email for trigger: Failure




[jira] [Commented] (SOLR-5610) Support cluster-wide properties with an API called CLUSTERPROP

2014-02-03 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13889469#comment-13889469
 ] 

ASF subversion and git services commented on SOLR-5610:
---

Commit 1563886 from [~noble.paul] in branch 'dev/branches/branch_4x'
[ https://svn.apache.org/r1563886 ]

SOLR-5610 New Collection API called CLUSTERPROP

> Support cluster-wide properties with an API called CLUSTERPROP
> --
>
> Key: SOLR-5610
> URL: https://issues.apache.org/jira/browse/SOLR-5610
> Project: Solr
>  Issue Type: Sub-task
>Reporter: Noble Paul
>Assignee: Noble Paul
> Attachments: SOLR-5610.patch
>
>
> Add a collection admin API for cluster wide property management
> the new API would create an entry in the root as 
> /cluster-props.json
> {code:javascript}
> {
> "prop":"val"
> }
> {code}
> The API would work as
> /command=clusterprop&name=propName&value=propVal
> there will be a set of well-known properties which can be set or unset with 
> this command






[jira] [Updated] (SOLR-5423) CSV output doesn't include function field

2014-02-03 Thread Arun Kumar (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-5423?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arun Kumar updated SOLR-5423:
-

Attachment: CSVResponseWriter.java.patch

Fix for CSV output not working with function fields.

> CSV output doesn't include function field
> -
>
> Key: SOLR-5423
> URL: https://issues.apache.org/jira/browse/SOLR-5423
> Project: Solr
>  Issue Type: Bug
>Affects Versions: 4.4
>Reporter: James Wilson
> Attachments: CSVResponseWriter.java.patch
>
>
> Given a schema with 
>
>
>   
> the following query returns no rows:
> http://localhost:8983/solr/collection1/select?q=*%3A*&rows=30&fl=div(price%2Cnumpages)&wt=csv&indent=true
> However, setting wt=json or wt=xml, it works.






[jira] [Commented] (SOLR-5423) CSV output doesn't include function field

2014-02-03 Thread Arun Kumar (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13889471#comment-13889471
 ] 

Arun Kumar commented on SOLR-5423:
--

I have fixed this issue. Attached the patch file for the fix.

> CSV output doesn't include function field
> -
>
> Key: SOLR-5423
> URL: https://issues.apache.org/jira/browse/SOLR-5423
> Project: Solr
>  Issue Type: Bug
>Affects Versions: 4.4
>Reporter: James Wilson
> Attachments: CSVResponseWriter.java.patch
>
>
> Given a schema with 
>
>
>   
> the following query returns no rows:
> http://localhost:8983/solr/collection1/select?q=*%3A*&rows=30&fl=div(price%2Cnumpages)&wt=csv&indent=true
> However, setting wt=json or wt=xml, it works.






[jira] [Resolved] (SOLR-5610) Support cluster-wide properties with an API called CLUSTERPROP

2014-02-03 Thread Noble Paul (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-5610?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Noble Paul resolved SOLR-5610.
--

Resolution: Fixed

> Support cluster-wide properties with an API called CLUSTERPROP
> --
>
> Key: SOLR-5610
> URL: https://issues.apache.org/jira/browse/SOLR-5610
> Project: Solr
>  Issue Type: Sub-task
>Reporter: Noble Paul
>Assignee: Noble Paul
> Attachments: SOLR-5610.patch
>
>
> Add a collection admin API for cluster wide property management
> the new API would create an entry in the root as 
> /cluster-props.json
> {code:javascript}
> {
> "prop":"val"
> }
> {code}
> The API would work as
> /command=clusterprop&name=propName&value=propVal
> there will be a set of well-known properties which can be set or unset with 
> this command






[jira] [Commented] (SOLR-5610) Support cluster-wide properties with an API called CLUSTERPROP

2014-02-03 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13889460#comment-13889460
 ] 

ASF subversion and git services commented on SOLR-5610:
---

Commit 1563876 from [~noble.paul] in branch 'dev/trunk'
[ https://svn.apache.org/r1563876 ]

SOLR-5610 New Collectio API called CLUSTERPROP

> Support cluster-wide properties with an API called CLUSTERPROP
> --
>
> Key: SOLR-5610
> URL: https://issues.apache.org/jira/browse/SOLR-5610
> Project: Solr
>  Issue Type: Sub-task
>Reporter: Noble Paul
>Assignee: Noble Paul
> Attachments: SOLR-5610.patch
>
>
> Add a collection admin API for cluster wide property management
> the new API would create an entry in the root as 
> /cluster-props.json
> {code:javascript}
> {
> "prop":"val"
> }
> {code}
> The API would work as
> /command=clusterprop&name=propName&value=propVal
> there will be a set of well-known properties which can be set or unset with 
> this command






[jira] [Commented] (SOLR-5690) Null pointerException in AbstractStatsValues.accumulate

2014-02-03 Thread Elran Dvir (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13889459#comment-13889459
 ] 

Elran Dvir commented on SOLR-5690:
--

Patch resolving the issue is attached.

> Null pointerException in AbstractStatsValues.accumulate
> ---
>
> Key: SOLR-5690
> URL: https://issues.apache.org/jira/browse/SOLR-5690
> Project: Solr
>  Issue Type: Bug
>Affects Versions: 5.0, 4.7
>Reporter: Elran Dvir
>Priority: Minor
> Attachments: SOLR-5690.patch
>
>
> This happens when a string field is defined with docValues="true" and default="".
> For documents whose value in that field is the empty string,
> values.exists(docID) is true but values.strVal(docID) is null, so a
> NullPointerException is thrown when the value is added to the distinctValues set.
> The Solr query is stats=true&stats.field=X&stats.calcdistinct=true
> stack trace:
> java.lang.NullPointerException
>     at java.util.TreeMap.put(TreeMap.java:567)
>     at java.util.TreeSet.add(TreeSet.java:266)
>     at org.apache.solr.handler.component.AbstractStatsValues.accumulate(StatsValuesFactory.java:164)
>     at org.apache.solr.handler.component.StringStatsValues.accumulate(StatsValuesFactory.java:535)
>     at org.apache.solr.handler.component.SimpleStats.getFieldCacheStats(StatsComponent.java:274)
>     at org.apache.solr.handler.component.SimpleStats.getStatsFields(StatsComponent.java:225)
>     at org.apache.solr.handler.component.SimpleStats.getStatsCounts(StatsComponent.java:200)
>     at org.apache.solr.handler.component.StatsComponent.process(StatsComponent.java:68)
>     at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:208)
>     at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
>     at org.apache.solr.core.SolrCore.execute(SolrCore.java:1904)
>     at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:659)
>     at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:362)
>     at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:158)
>     at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1474)
>     at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:499)
>     at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:137)
>     at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:557)
>     at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:231)
>     at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1086)
>     at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:428)
>     at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:193)
>     at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1020)
>     at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:135)
>     at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:255)
>     at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:154)
>     at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:116)
>     at org.eclipse.jetty.server.Server.handle(Server.java:370)
>     at org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:489)
>     at org.eclipse.jetty.server.AbstractHttpConnection.headerComplete(AbstractHttpConnection.java:949)
>     at org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.headerComplete(AbstractHttpConnection.java:1011)
>     at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:644)
>     at org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:235)
>     at org.eclipse.jetty.server.BlockingHttpConnection.handle(BlockingHttpConnection.java:72)
>     at org.eclipse.jetty.server.bio.SocketConnector$ConnectorEndPoint.run(SocketConnector.java:264)
>     at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:608)
>     at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:543)
>     at java.lang.Thread.run(Thread.java:804)
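
The top two frames of the trace show the failure mode: a TreeSet with natural ordering rejects a null element. The Java sketch below reproduces that in isolation and shows one possible guard. It is not the attached SOLR-5690.patch, whose approach may differ (for example, treating null as an empty string); the class and method names are hypothetical.

```java
import java.util.Set;
import java.util.TreeSet;

final class DistinctValuesSketch {
    // Adds the value unless it is null; returns true only if the set changed.
    // Hypothetical helper, not code from StatsValuesFactory.
    static boolean addIfPresent(Set<String> distinctValues, String value) {
        return value != null && distinctValues.add(value);
    }

    public static void main(String[] args) {
        Set<String> distinctValues = new TreeSet<>();
        try {
            // Natural ordering cannot compare null, so this throws.
            distinctValues.add(null);
        } catch (NullPointerException e) {
            System.out.println("TreeSet.add(null) threw NullPointerException");
        }
        addIfPresent(distinctValues, null); // skipped by the guard
        addIfPresent(distinctValues, "a");  // added
        System.out.println(distinctValues);
    }
}
```

Skipping nulls means such documents do not contribute to the distinct count; whether that is the desired semantics for an empty default value is a question for the issue itself.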






[jira] [Updated] (SOLR-5690) Null pointerException in AbstractStatsValues.accumulate

2014-02-03 Thread Elran Dvir (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-5690?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Elran Dvir updated SOLR-5690:
-

Attachment: SOLR-5690.patch




Re: Welcome Benson Margulies as Lucene/Solr committer!

2014-02-03 Thread Adrien Grand
Welcome, Benson!

On Sat, Jan 25, 2014 at 11:10 PM, Benson Margulies
 wrote:
> Hello Lucene development community, it's a pleasure to be welcomed aboard.
>
> In my view, the significant aspect of my bio is that I've been
> implementing things that go into or around Lucene for many years now.
> During the 'day', I'm the CTO of a company that works in the area of
> text analytics. We build Tokenizers and TokenFilters to allow our
> users to integrate our components into Lucene, and we've used Lucene
> and Solr as components of NLP devices that search on a large scale. So
> I have an abiding interest in the analysis chain and in the
> intersection of NLP and search.
>
> Elsewhere in Apache, I'm an active Maven dev, a semi-retired CXF dev,
> and a sort of uncle of several other projects. So I'm prone to be
> helpful or annoying with issues of Maven and Web Services.
>
> Thanks again, benson
>
> p.s. I think Uwe has already added me to the necessary wiring; would
> some kind soul please point me to the explanation of how the web site
> is maintained so I can add myself? Is it just the ASF CMS?
>
> On Sat, Jan 25, 2014 at 4:40 PM, Michael McCandless
>  wrote:
>> I'm pleased to announce that Benson Margulies has accepted to join our
>> ranks as a committer.
>>
>> Benson has been involved in a number of Lucene/Solr issues over time
>> (see 
>> http://jirasearch.mikemccandless.com/search.py?index=jira&chg=dds&a1=allUsers&a2=Benson+Margulies
>> ), most recently on debugging tricky analysis issues.
>>
>> Benson, it is tradition that you introduce yourself with a brief bio.
>> I know you're heavily involved in other Apache projects already...
>>
>> Once your account is set up, you should then be able to add yourself
>> to the who we are page on the website as well.
>>
>> Congratulations and welcome!
>>
>> Mike McCandless
>>
>> http://blog.mikemccandless.com



-- 
Adrien




[jira] [Updated] (SOLR-4654) Integrate Lucene's sorting and early query termination capabilities into Solr

2014-02-03 Thread Adrien Grand (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-4654?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adrien Grand updated SOLR-4654:
---

Labels: gsoc2014  (was: )

> Integrate Lucene's sorting and early query termination capabilities into Solr
> -
>
> Key: SOLR-4654
> URL: https://issues.apache.org/jira/browse/SOLR-4654
> Project: Solr
>  Issue Type: Improvement
>Reporter: Adrien Grand
>Priority: Trivial
>  Labels: gsoc2014
>
> I think there would be some interesting work to do to integrate Lucene's 
> sorting and early query termination capabilities into Solr, in particular 
> (just ideas, maybe they're not all interesting/useful):
>  - configuring a SortingMergePolicy,
>  - figuring out when the sort order of queries matches the sort order of the 
> index segments,
>  - giving the ability to get approximated results when the query is not 
> sorted but only boosted by the sort order of the index,
>  - integration with TimeLimitingCollector: maybe it's better to collect only 
> half of all segments than to fully collect half of the segments,
>  - approximation of the number of matches based on the ratio of collected 
> documents,
>  - ...
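
The last concrete idea in the list (approximating the number of matches from the ratio of collected documents) comes down to a simple extrapolation. A minimal Java sketch follows, assuming matches are spread evenly across the index; the class, method, and parameter names are hypothetical and nothing here is existing Lucene or Solr API.

```java
final class MatchCountEstimateSketch {
    // Extrapolates the total hit count from a partial collection.
    // Assumption: matching documents are evenly distributed over the index.
    static long estimateTotalHits(long hitsCollected, long docsVisited, long maxDoc) {
        if (docsVisited <= 0) {
            return 0; // nothing was visited, so there is nothing to extrapolate from
        }
        if (docsVisited >= maxDoc) {
            return hitsCollected; // the whole index was visited, so the count is exact
        }
        return Math.round((double) hitsCollected * maxDoc / docsVisited);
    }

    public static void main(String[] args) {
        // 50 hits seen after visiting 1,000 of 10,000 documents.
        System.out.println(estimateTotalHits(50, 1_000, 10_000)); // prints 500
    }
}
```

With an index sorted by the same key that the query sorts on, the even-distribution assumption may not hold, so an estimate like this is only a rough guide.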






[jira] [Created] (SOLR-5690) Null pointerException in AbstractStatsValues.accumulate

2014-02-03 Thread Elran Dvir (JIRA)
Elran Dvir created SOLR-5690:


 Summary: Null pointerException in AbstractStatsValues.accumulate
 Key: SOLR-5690
 URL: https://issues.apache.org/jira/browse/SOLR-5690
 Project: Solr
  Issue Type: Bug
Affects Versions: 5.0, 4.7
Reporter: Elran Dvir
Priority: Minor


This happens when a string field is defined with docValues="true" and default="".
For documents whose value in that field is the empty string,
values.exists(docID) is true but values.strVal(docID) is null, so a
NullPointerException is thrown when the value is added to the distinctValues set.
The Solr query is stats=true&stats.field=X&stats.calcdistinct=true

stack trace:
java.lang.NullPointerException
    at java.util.TreeMap.put(TreeMap.java:567)
    at java.util.TreeSet.add(TreeSet.java:266)
    at org.apache.solr.handler.component.AbstractStatsValues.accumulate(StatsValuesFactory.java:164)
    at org.apache.solr.handler.component.StringStatsValues.accumulate(StatsValuesFactory.java:535)
    at org.apache.solr.handler.component.SimpleStats.getFieldCacheStats(StatsComponent.java:274)
    at org.apache.solr.handler.component.SimpleStats.getStatsFields(StatsComponent.java:225)
    at org.apache.solr.handler.component.SimpleStats.getStatsCounts(StatsComponent.java:200)
    at org.apache.solr.handler.component.StatsComponent.process(StatsComponent.java:68)
    at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:208)
    at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
    at org.apache.solr.core.SolrCore.execute(SolrCore.java:1904)
    at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:659)
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:362)
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:158)
    at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1474)
    at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:499)
    at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:137)
    at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:557)
    at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:231)
    at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1086)
    at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:428)
    at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:193)
    at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1020)
    at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:135)
    at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:255)
    at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:154)
    at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:116)
    at org.eclipse.jetty.server.Server.handle(Server.java:370)
    at org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:489)
    at org.eclipse.jetty.server.AbstractHttpConnection.headerComplete(AbstractHttpConnection.java:949)
    at org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.headerComplete(AbstractHttpConnection.java:1011)
    at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:644)
    at org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:235)
    at org.eclipse.jetty.server.BlockingHttpConnection.handle(BlockingHttpConnection.java:72)
    at org.eclipse.jetty.server.bio.SocketConnector$ConnectorEndPoint.run(SocketConnector.java:264)
    at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:608)
    at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:543)
    at java.lang.Thread.run(Thread.java:804)






[jira] [Updated] (SOLR-4654) Integrate Lucene's sorting and early query termination capabilities into Solr

2014-02-03 Thread Adrien Grand (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-4654?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adrien Grand updated SOLR-4654:
---

Labels:   (was: gsoc2013)

> Integrate Lucene's sorting and early query termination capabilities into Solr
> -
>
> Key: SOLR-4654
> URL: https://issues.apache.org/jira/browse/SOLR-4654
> Project: Solr
>  Issue Type: Improvement
>Reporter: Adrien Grand
>Priority: Trivial
>



[jira] [Commented] (LUCENE-5405) Exception strategy for analysis improved

2014-02-03 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13889449#comment-13889449
 ] 

ASF subversion and git services commented on LUCENE-5405:
-

Commit 1563868 from [~bmargulies] in branch 'dev/trunk'
[ https://svn.apache.org/r1563868 ]

LUCENE-5405, LUCENE-5406: move changes entries to 4.7.

> Exception strategy for analysis improved
> 
>
> Key: LUCENE-5405
> URL: https://issues.apache.org/jira/browse/LUCENE-5405
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Benson Margulies
>Assignee: Benson Margulies
> Fix For: 5.0, 4.7
>
> Attachments: LUCENE-5405-4.x.patch
>
>
> SOLR-5623 included some conversation about the dilemmas of exception 
> management and reporting in the analysis chain. 
> I've belatedly become educated about the infostream, and this situation is a 
> job for it. The DocInverterPerField can note exceptions in the analysis 
> chain, log out to the infostream, and then rethrow them as before. No 
> wrapping, no muss, no fuss.
> There are comments on this JIRA from a more complex prior idea that readers 
> might want to ignore.
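
The "note the exception, log to the infostream, rethrow as before" approach described above can be sketched in plain Java. This is an illustration under assumptions, not the committed change: MessageSink is a hypothetical stand-in for the info stream, FieldAnalysis stands in for running the analysis chain on one field, and the "analysis" component tag is made up.

```java
import java.util.ArrayList;
import java.util.List;

final class LogAndRethrowSketch {
    // Hypothetical stand-in for an info stream; not Lucene's own class.
    interface MessageSink {
        void message(String component, String text);
    }

    // Hypothetical stand-in for analyzing one field of a document.
    interface FieldAnalysis {
        void run() throws Exception;
    }

    // Runs the analysis; on failure, records the field name and rethrows
    // the very same exception, with no wrapping.
    static void analyzeField(String fieldName, FieldAnalysis analysis, MessageSink sink)
            throws Exception {
        try {
            analysis.run();
        } catch (Exception e) {
            sink.message("analysis", "exception analyzing field '" + fieldName + "': " + e);
            throw e;
        }
    }

    public static void main(String[] args) {
        List<String> log = new ArrayList<>();
        try {
            analyzeField("title",
                    () -> { throw new IllegalStateException("bad token"); },
                    (component, text) -> log.add(component + ": " + text));
        } catch (Exception e) {
            System.out.println("rethrown: " + e.getMessage());
        }
        System.out.println(log);
    }
}
```

Because the original exception is rethrown unchanged, callers that already catch specific exception types keep working, and the field name ends up only in the diagnostic log.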






[jira] [Commented] (LUCENE-5406) ShingleAnalyzerWrapper should expose the delegated analyzer as a public final

2014-02-03 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13889450#comment-13889450
 ] 

ASF subversion and git services commented on LUCENE-5406:
-

Commit 1563868 from [~bmargulies] in branch 'dev/trunk'
[ https://svn.apache.org/r1563868 ]

LUCENE-5405, LUCENE-5406: move changes entries to 4.7.

> ShingleAnalyzerWrapper should expose the delegated analyzer as a public final
> -
>
> Key: LUCENE-5406
> URL: https://issues.apache.org/jira/browse/LUCENE-5406
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Grant Ingersoll
>Assignee: Grant Ingersoll
> Fix For: 5.0, 4.7
>
>
> I'm sometimes given a ShingleAnalyzerWrapper that I would like to change the 
> shingle size on, so I need to create a new instance.  However, I don't always 
> know what the underlying analyzer is and I can't access it b/c it is a 
> protected method on a final class.  
> The solution here is to make the getAnalyzer method public final for the 
> ShingleAnalyzerWrapper.
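
To illustrate why a public accessor helps, here is a self-contained Java sketch with hypothetical stand-in types (Analyzer and ShingleWrapper are not the Lucene classes). The accessor name getWrappedAnalyzer() follows the CHANGES.txt entry for this issue; the description above calls it getAnalyzer.

```java
final class ShingleWrapperSketch {
    // Hypothetical minimal analyzer type, for illustration only.
    interface Analyzer {
        String name();
    }

    // Hypothetical wrapper loosely modeled on the situation in the issue:
    // a final class whose delegate callers need to read back.
    static final class ShingleWrapper implements Analyzer {
        private final Analyzer delegate;
        private final int maxShingleSize;

        ShingleWrapper(Analyzer delegate, int maxShingleSize) {
            this.delegate = delegate;
            this.maxShingleSize = maxShingleSize;
        }

        // Public accessor: callers can recover the delegate they were not told about.
        public final Analyzer getWrappedAnalyzer() {
            return delegate;
        }

        @Override
        public String name() {
            return "shingle(" + delegate.name() + "," + maxShingleSize + ")";
        }
    }

    // Builds a new wrapper around the same delegate with a different shingle size.
    static ShingleWrapper withShingleSize(ShingleWrapper original, int newSize) {
        return new ShingleWrapper(original.getWrappedAnalyzer(), newSize);
    }

    public static void main(String[] args) {
        ShingleWrapper two = new ShingleWrapper(() -> "standard", 2);
        ShingleWrapper three = withShingleSize(two, 3);
        System.out.println(three.name()); // prints shingle(standard,3)
    }
}
```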






Re: svn commit: r1563855 - /lucene/dev/trunk/lucene/CHANGES.txt

2014-02-03 Thread Benson Margulies
OK, I see the idea. Can Do.

On Mon, Feb 3, 2014 at 7:11 AM, Michael McCandless
 wrote:
> Hmm, I think you need to move trunk's LUCENE-4505's entry down under
> 4.7's section?
>
> Ie, it should be in the same position that it is in on the 4.x branch.
>
> Hmm, LUCENE-5406 should be down in 4.7 as well; it looks like Grant
> back-ported to 4.x.
>
> Basically, very few entries should be under 5.0 :)  We try to backport
> most things except major changes ...
>
> Mike McCandless
>
> http://blog.mikemccandless.com
>
>
> On Mon, Feb 3, 2014 at 7:04 AM,   wrote:
>> Author: bimargulies
>> Date: Mon Feb  3 12:04:33 2014
>> New Revision: 1563855
>>
>> URL: http://svn.apache.org/r1563855
>> Log:
>> LUCENE-5405: changes.txt; and fix a typo of Grant's for LUCENE-5406.
>>
>> Modified:
>> lucene/dev/trunk/lucene/CHANGES.txt
>>
>> Modified: lucene/dev/trunk/lucene/CHANGES.txt
>> URL: 
>> http://svn.apache.org/viewvc/lucene/dev/trunk/lucene/CHANGES.txt?rev=1563855&r1=1563854&r2=1563855&view=diff
>> ==
>> --- lucene/dev/trunk/lucene/CHANGES.txt (original)
>> +++ lucene/dev/trunk/lucene/CHANGES.txt Mon Feb  3 12:04:33 2014
>> @@ -48,10 +48,17 @@ API Changes
>>this term index, pass it directly in your codec, where it can also be 
>> configured
>>per-field. (Robert Muir)
>>
>> -* LUCENE-5388: Remove Reader from Tokenizer's constructor.
>> +* LUCENE-5388: Remove Reader from Tokenizer's constructor and from
>> +  Analyzer's createComponents. TokenStreams now always get their input
>> +  via setReader.
>>(Benson Margulies via Robert Muir - pull request #16)
>>
>> -* LUCENE-5405: Make ShingleAnalzyerWrapper.getWrappedAnalyzer() public 
>> final (gsingers)
>> +* LUCENE-5405: If an analysis component throws an exception, Lucene
>> +  logs the field name to the info stream to assist in
>> +  diagnosis. (Benson Margulies)
>> +
>> +* LUCENE-5406: Make ShingleAnalzyerWrapper.getWrappedAnalyzer() public
>> +  final (gsingers)
>>
>>  Documentation
>>
>>
>>



Re: svn commit: r1563855 - /lucene/dev/trunk/lucene/CHANGES.txt

2014-02-03 Thread Michael McCandless
Hmm, I think you need to move trunk's LUCENE-4505's entry down under
4.7's section?

Ie, it should be in the same position that it is in on the 4.x branch.

Hmm, LUCENE-5406 should be down in 4.7 as well; it looks like Grant
back-ported to 4.x.

Basically, very few entries should be under 5.0 :)  We try to backport
most things except major changes ...

Mike McCandless

http://blog.mikemccandless.com


On Mon, Feb 3, 2014 at 7:04 AM,   wrote:
> Author: bimargulies
> Date: Mon Feb  3 12:04:33 2014
> New Revision: 1563855
>
> URL: http://svn.apache.org/r1563855
> Log:
> LUCENE-5405: changes.txt; and fix a typo of Grant's for LUCENE-5406.
>
> Modified:
> lucene/dev/trunk/lucene/CHANGES.txt
>
> Modified: lucene/dev/trunk/lucene/CHANGES.txt
> URL: 
> http://svn.apache.org/viewvc/lucene/dev/trunk/lucene/CHANGES.txt?rev=1563855&r1=1563854&r2=1563855&view=diff
> ==
> --- lucene/dev/trunk/lucene/CHANGES.txt (original)
> +++ lucene/dev/trunk/lucene/CHANGES.txt Mon Feb  3 12:04:33 2014
> @@ -48,10 +48,17 @@ API Changes
>this term index, pass it directly in your codec, where it can also be 
> configured
>per-field. (Robert Muir)
>
> -* LUCENE-5388: Remove Reader from Tokenizer's constructor.
> +* LUCENE-5388: Remove Reader from Tokenizer's constructor and from
> +  Analyzer's createComponents. TokenStreams now always get their input
> +  via setReader.
>(Benson Margulies via Robert Muir - pull request #16)
>
> -* LUCENE-5405: Make ShingleAnalzyerWrapper.getWrappedAnalyzer() public final 
> (gsingers)
> +* LUCENE-5405: If an analysis component throws an exception, Lucene
> +  logs the field name to the info stream to assist in
> +  diagnosis. (Benson Margulies)
> +
> +* LUCENE-5406: Make ShingleAnalzyerWrapper.getWrappedAnalyzer() public
> +  final (gsingers)
>
>  Documentation
>
>
>




[jira] [Resolved] (LUCENE-5405) Exception strategy for analysis improved

2014-02-03 Thread Benson Margulies (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-5405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benson Margulies resolved LUCENE-5405.
--

Resolution: Fixed

backported, CHANGES.txt filled in. 'this time for sure'




[jira] [Commented] (LUCENE-5405) Exception strategy for analysis improved

2014-02-03 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13889430#comment-13889430
 ] 

ASF subversion and git services commented on LUCENE-5405:
-

Commit 1563855 from [~bmargulies] in branch 'dev/trunk'
[ https://svn.apache.org/r1563855 ]

LUCENE-5405: changes.txt; and fix a typo of Grant's for LUCENE-5406.




[jira] [Commented] (LUCENE-5405) Exception strategy for analysis improved

2014-02-03 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13889432#comment-13889432
 ] 

ASF subversion and git services commented on LUCENE-5405:
-

Commit 1563857 from [~bmargulies] in branch 'dev/branches/branch_4x'
[ https://svn.apache.org/r1563857 ]

LUCENE-5405: some leftover merge info.




[jira] [Commented] (LUCENE-5406) ShingleAnalyzerWrapper should expose the delegated analyzer as a public final

2014-02-03 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13889431#comment-13889431
 ] 

ASF subversion and git services commented on LUCENE-5406:
-

Commit 1563855 from [~bmargulies] in branch 'dev/trunk'
[ https://svn.apache.org/r1563855 ]

LUCENE-5405: changes.txt; and fix a typo of Grant's for LUCENE-5406.




[jira] [Commented] (LUCENE-5405) Exception strategy for analysis improved

2014-02-03 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13889427#comment-13889427
 ] 

ASF subversion and git services commented on LUCENE-5405:
-

Commit 1563853 from [~bmargulies] in branch 'dev/branches/branch_4x'
[ https://svn.apache.org/r1563853 ]

LUCENE-5405: CHANGES.txt

> Exception strategy for analysis improved
> 
>
> Key: LUCENE-5405
> URL: https://issues.apache.org/jira/browse/LUCENE-5405
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Benson Margulies
>Assignee: Benson Margulies
> Fix For: 5.0, 4.7
>
> Attachments: LUCENE-5405-4.x.patch
>
>
> SOLR-5623 included some conversation about the dilemmas of exception 
> management and reporting in the analysis chain. 
> I've belatedly become educated about the infostream, and this situation is a 
> job for it. The DocInverterPerField can note exceptions in the analysis 
> chain, log out to the infostream, and then rethrow them as before. No 
> wrapping, no muss, no fuss.
> There are comments on this JIRA from a more complex prior idea that readers 
> might want to ignore.






[jira] [Commented] (LUCENE-5405) Exception strategy for analysis improved

2014-02-03 Thread Benson Margulies (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13889424#comment-13889424
 ] 

Benson Margulies commented on LUCENE-5405:
--

rev 1563850 provides the backport.

> Exception strategy for analysis improved
> 
>
> Key: LUCENE-5405
> URL: https://issues.apache.org/jira/browse/LUCENE-5405
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Benson Margulies
>Assignee: Benson Margulies
> Fix For: 5.0, 4.7
>
> Attachments: LUCENE-5405-4.x.patch
>
>
> SOLR-5623 included some conversation about the dilemmas of exception 
> management and reporting in the analysis chain. 
> I've belatedly become educated about the infostream, and this situation is a 
> job for it. The DocInverterPerField can note exceptions in the analysis 
> chain, log out to the infostream, and then rethrow them as before. No 
> wrapping, no muss, no fuss.
> There are comments on this JIRA from a more complex prior idea that readers 
> might want to ignore.






[jira] [Updated] (LUCENE-5405) Exception strategy for analysis improved

2014-02-03 Thread Benson Margulies (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-5405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benson Margulies updated LUCENE-5405:
-

Fix Version/s: 4.7

> Exception strategy for analysis improved
> 
>
> Key: LUCENE-5405
> URL: https://issues.apache.org/jira/browse/LUCENE-5405
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Benson Margulies
>Assignee: Benson Margulies
> Fix For: 5.0, 4.7
>
> Attachments: LUCENE-5405-4.x.patch
>
>
> SOLR-5623 included some conversation about the dilemmas of exception 
> management and reporting in the analysis chain. 
> I've belatedly become educated about the infostream, and this situation is a 
> job for it. The DocInverterPerField can note exceptions in the analysis 
> chain, log out to the infostream, and then rethrow them as before. No 
> wrapping, no muss, no fuss.
> There are comments on this JIRA from a more complex prior idea that readers 
> might want to ignore.






[jira] [Commented] (LUCENE-5405) Exception strategy for analysis improved

2014-02-03 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13889423#comment-13889423
 ] 

ASF subversion and git services commented on LUCENE-5405:
-

Commit 1563850 from [~bmargulies] in branch 'dev/branches/branch_4x'
[ https://svn.apache.org/r1563850 ]

LUCENE-5405: backport to 4.x.

> Exception strategy for analysis improved
> 
>
> Key: LUCENE-5405
> URL: https://issues.apache.org/jira/browse/LUCENE-5405
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Benson Margulies
>Assignee: Benson Margulies
> Fix For: 5.0, 4.7
>
> Attachments: LUCENE-5405-4.x.patch
>
>
> SOLR-5623 included some conversation about the dilemmas of exception 
> management and reporting in the analysis chain. 
> I've belatedly become educated about the infostream, and this situation is a 
> job for it. The DocInverterPerField can note exceptions in the analysis 
> chain, log out to the infostream, and then rethrow them as before. No 
> wrapping, no muss, no fuss.
> There are comments on this JIRA from a more complex prior idea that readers 
> might want to ignore.






[jira] [Updated] (LUCENE-5431) Add FSLockFactory.toString()

2014-02-03 Thread Shai Erera (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-5431?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shai Erera updated LUCENE-5431:
---

Attachment: LUCENE-5431.patch

* Add toString to FSLockFactory
* Fix FilterDirectory to override getLockID
* Fix TestCrashCausesCorruptIndex to extend FilterDirectory

Does anyone see a problem with changing toString(), e.g. an app that could be 
affected by it?

Also, could we simplify the toString() impls of all Directory and LockFactory 
classes to use class.getSimpleName(), to shorten the string?

> Add FSLockFactory.toString()
> 
>
> Key: LUCENE-5431
> URL: https://issues.apache.org/jira/browse/LUCENE-5431
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: core/store
>Reporter: Shai Erera
>Assignee: Shai Erera
>Priority: Minor
> Attachments: LUCENE-5431.patch
>
>
> FSLockFactory doesn't override toString, which causes Dir.toString() to print 
> the class.name@instance. I think it would be better if it printed e.g. the 
> lockDir.
> I added it but TestCrashCausesCorruptIndex failed because it declares a 
> Directory which doesn't override getLockID(), which returns toString(). I 
> changed that Directory to extend FilterDirectory, and fixed FilterDirectory 
> to override getLockID() to call in.getLockID().
> Will attach a patch shortly.






[jira] [Commented] (SOLR-5593) shard leader loss due to ZK session expiry

2014-02-03 Thread Christine Poerschke (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13889416#comment-13889416
 ] 

Christine Poerschke commented on SOLR-5593:
---

Uploaded https://github.com/apache/lucene-solr/pull/27 which, rather than 
relaxing the error handling for the getLeaderRetry call, tries to avoid the 
call altogether when circumstances seem to permit it, i.e. when the request 
says it came from the leader, we don't think we are the leader, and we could 
not be a sub-shard leader.

> shard leader loss due to ZK session expiry
> --
>
> Key: SOLR-5593
> URL: https://issues.apache.org/jira/browse/SOLR-5593
> Project: Solr
>  Issue Type: Bug
>  Components: SolrCloud
>Reporter: Christine Poerschke
>Assignee: Mark Miller
> Fix For: 5.0, 4.7
>
> Attachments: CoreAdminHandler.patch
>
>
> The problem we saw was that the shard leader ceased to be shard leader (in 
> our case due to its zookeeper session expiring). The followers thus rejected 
> update requests (DistributedUpdateProcessor setupRequest's call to 
> ZkStateReader getLeaderRetry) and the leader asked them to recover 
> (DistributedUpdateProcessor doFinish). The followers published themselves as 
> recovering (CoreAdminHandler handleRequestRecoveryAction) and the shard 
> leader loss triggered an election in which none of the followers became the 
> leader due to their recovering state (ShardLeaderElectionContext 
> shouldIBeLeader). The former shard leader also did not become shard leader 
> because its new seq number placed it after the existing replicas 
> (LeaderElector checkIfIamLeader seq <= intSeqs.get(0)).






[jira] [Created] (LUCENE-5431) Add FSLockFactory.toString()

2014-02-03 Thread Shai Erera (JIRA)
Shai Erera created LUCENE-5431:
--

 Summary: Add FSLockFactory.toString()
 Key: LUCENE-5431
 URL: https://issues.apache.org/jira/browse/LUCENE-5431
 Project: Lucene - Core
  Issue Type: Improvement
  Components: core/store
Reporter: Shai Erera
Assignee: Shai Erera
Priority: Minor


FSLockFactory doesn't override toString, which causes Dir.toString() to print 
the class.name@instance. I think it would be better if it printed e.g. the 
lockDir.

I added it but TestCrashCausesCorruptIndex failed because it declares a 
Directory which doesn't override getLockID(), which returns toString(). I 
changed that Directory to extend FilterDirectory, and fixed FilterDirectory to 
override getLockID() to call in.getLockID().

Will attach a patch shortly.
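
A minimal sketch of the two changes described above, for illustration. All class names are hypothetical and heavily simplified stand-ins for the Lucene classes; the only behaviour assumed from the description is that getLockID() returns toString() by default.

```java
// Hypothetical, simplified stand-ins; the real Lucene classes have many more members.
class SketchLockFactory {
    private final String lockDir;

    SketchLockFactory(String lockDir) {
        this.lockDir = lockDir;
    }

    // Print the lock directory instead of the default class.name@instance.
    @Override
    public String toString() {
        return getClass().getSimpleName() + "(lockDir=" + lockDir + ")";
    }
}

abstract class SketchDirectory {
    /** By default the lock ID is derived from toString(), as described above. */
    String getLockID() {
        return toString();
    }
}

class SketchFSDirectory extends SketchDirectory {
    private final String path;
    private final SketchLockFactory lockFactory;

    SketchFSDirectory(String path, SketchLockFactory lockFactory) {
        this.path = path;
        this.lockFactory = lockFactory;
    }

    @Override
    public String toString() {
        return getClass().getSimpleName() + "@" + path + " lockFactory=" + lockFactory;
    }
}

class SketchFilterDirectory extends SketchDirectory {
    protected final SketchDirectory in;

    SketchFilterDirectory(SketchDirectory in) {
        this.in = in;
    }

    // Delegate, so wrapping a directory does not change its lock ID.
    @Override
    String getLockID() {
        return in.getLockID();
    }

    @Override
    public String toString() {
        return getClass().getSimpleName() + "(" + in + ")";
    }
}
```

Without the getLockID() override, the wrapper's lock ID would follow its own toString() and so differ from the wrapped directory's.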






lucene-solr pull request: skip bulk of setupRequest processing if request i...

2014-02-03 Thread cpoerschke
GitHub user cpoerschke opened a pull request:

https://github.com/apache/lucene-solr/pull/27

skip bulk of setupRequest processing if request is FROMLEADER

For https://issues.apache.org/jira/i#browse/SOLR-5593.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/bloomberg/lucene-solr branch_4x-solr-5593

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/lucene-solr/pull/27.patch


commit 3e70be67f7c2934d3bf8e707fa99f8eaea803781
Author: Christine Poerschke 
Date:   2014-01-20T19:32:53Z

skip bulk of setupRequest processing if request is FROMLEADER and we 
couldn't be sub-shard leader







[jira] [Commented] (LUCENE-5416) Performance of a FixedBitSet variant that uses Long.numberOfTrailingZeros()

2014-02-03 Thread Paul Elschot (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13889410#comment-13889410
 ] 

Paul Elschot commented on LUCENE-5416:
--

Recent comments at LUCENE-5425 indicate that performance of OpenBitSetIterator 
is critical for facets.
Would it be correct to conclude from that and from the measurements here that 
faceting involves nextDoc() on high bit densities?

That would also mean that the FixedBitSetDBI here could be better as a 
general filter with a lower bit density,
and that could have implications for LUCENE-5293.
For even lower bit densities a more compressed version is preferable, and we 
already have WAH8DocIdSet for that.

For the DocBlockIterator at LUCENE-5092 I'll stick to the FixedBitSetDBI for 
now, but since it does a prevDoc() under the hood,
it might be a good idea to use the same technique there as in 
OpenBitSetIterator, only backwards.

Would someone have an idea how to merge the Long.numberOfTrailingZeros() used 
here for advance() into OpenBitSetIterator?
Or would it be better to always choose a DocIdSet implementation based on bit 
density?


> Performance of a FixedBitSet variant that uses Long.numberOfTrailingZeros()
> ---
>
> Key: LUCENE-5416
> URL: https://issues.apache.org/jira/browse/LUCENE-5416
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: core/search
>Affects Versions: 5.0
>Reporter: Paul Elschot
>Priority: Minor
> Fix For: 5.0
>
>
> On my machine the current byte index used in OpenBitSetIterator is slower 
> than Long.numberOfTrailingZeros() for advance().
> The pull request contains the code for benchmarking this taken from an early 
> stage of DocBlocksIterator.
> In case the benchmark shows improvements on more machines, well, we know what 
> to do...
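
For readers unfamiliar with the technique, a minimal sketch of finding the next set bit with Long.numberOfTrailingZeros(). TinyBitSet and nextSetBit are hypothetical names; this is not the code from the pull request, nor FixedBitSet or OpenBitSetIterator.

```java
// Hypothetical minimal bit set for illustration only.
final class TinyBitSet {
    private final long[] words;
    private final int numBits;

    TinyBitSet(int numBits) {
        this.numBits = numBits;
        this.words = new long[(numBits + 63) >>> 6];
    }

    void set(int index) {
        // For long operands Java uses only the low 6 bits of the shift count.
        words[index >>> 6] |= 1L << index;
    }

    /** Returns the first set bit at or after from, or -1 if there is none. */
    int nextSetBit(int from) {
        if (from >= numBits) {
            return -1;
        }
        int wordIndex = from >>> 6;
        // Drop the bits below 'from' in the current word (shift count is modulo 64).
        long word = words[wordIndex] >>> from;
        if (word != 0) {
            return from + Long.numberOfTrailingZeros(word);
        }
        // Scan the following words for the first non-zero one.
        while (++wordIndex < words.length) {
            word = words[wordIndex];
            if (word != 0) {
                return (wordIndex << 6) + Long.numberOfTrailingZeros(word);
            }
        }
        return -1;
    }
}
```

Whether this beats a byte-indexed lookup depends on the machine and on bit density, which is what the benchmark in the pull request is meant to show.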






[jira] [Commented] (LUCENE-5404) Add support to get number of entries a Suggester Lookup was built with and minor refactorings

2014-02-03 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5404?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13889408#comment-13889408
 ] 

Michael McCandless commented on LUCENE-5404:


Thanks Areek.

{quote}
Regarding .store/.load, I was thinking if we could use something more general 
so that we can read/write directories (for AnalyzingInfixSuggester and co), 
along with files (for other suggesters)? I think that will let all suggester 
impls respect the Lookup API. Any thoughts on this?
{quote}

I'm not really sure what to do w/ the store/load APIs.  It may be too much to 
ask that all suggesters use a common API for it.  E.g, it's sort of weird to 1) 
create a new suggester class, and 2) call its load API; it's more natural to 
create the suggester, passing in a Dir/File where it should load its state 
from.  I.e., you are either loading a previously built suggester or creating a 
new one.  Today the API allows you to load a previously built one and then 
also .build() a new one over it, which is strange.  I think LUCENE-4492 is 
getting at this too ...

E.g., AnalyzingInfixSuggester cannot do this: it loads itself based on what you 
passed to the ctor, so its .load does nothing.

> Add support to get number of entries a Suggester Lookup was built with and 
> minor refactorings
> -
>
> Key: LUCENE-5404
> URL: https://issues.apache.org/jira/browse/LUCENE-5404
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: core/search
>Reporter: Areek Zillur
> Fix For: 5.0, 4.7
>
> Attachments: LUCENE-5404.patch, LUCENE-5404.patch, LUCENE-5404.patch
>
>
> It would be nice to be able to tell the number of entries a suggester lookup 
> was built with. This would let components using lookups keep some stats 
> regarding how many entries were used to build a lookup.
> Additionally, Dictionary could use InputIterator rather than the 
> BytesRefIterator, as most of the implementations now use it.






[jira] [Commented] (LUCENE-5405) Exception strategy for analysis improved

2014-02-03 Thread Benson Margulies (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13889406#comment-13889406
 ] 

Benson Margulies commented on LUCENE-5405:
--

Will do. Thanks, this is exactly the sort of feedback I was looking for.

> Exception strategy for analysis improved
> 
>
> Key: LUCENE-5405
> URL: https://issues.apache.org/jira/browse/LUCENE-5405
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Benson Margulies
>Assignee: Benson Margulies
> Fix For: 5.0
>
> Attachments: LUCENE-5405-4.x.patch
>
>
> SOLR-5623 included some conversation about the dilemmas of exception 
> management and reporting in the analysis chain. 
> I've belatedly become educated about the infostream, and this situation is a 
> job for it. The DocInverterPerField can note exceptions in the analysis 
> chain, log out to the infostream, and then rethrow them as before. No 
> wrapping, no muss, no fuss.
> There are comments on this JIRA from a more complex prior idea that readers 
> might want to ignore.






[jira] [Commented] (LUCENE-5405) Exception strategy for analysis improved

2014-02-03 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13889401#comment-13889401
 ] 

Michael McCandless commented on LUCENE-5405:


Thanks Benson, I didn't realize this would get tricky!

Since success2 seems to be equivalent to succeededInProcessingField, maybe just 
rename success2 instead of adding a new variable?

Also, could you move the infoStream output to the end of that finally clause 
(so we're sure to call stream.close())?  Paranoia ...

Otherwise it looks great.  Thanks!
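
The ordering suggested above, sketched under assumptions. InvertSketch, Source and processField are hypothetical names, and a StringBuilder stands in for the infoStream; only succeededInProcessingField comes from the discussion. This is not the patch itself.

```java
final class InvertSketch {
    /** Hypothetical stand-in for a token stream that may fail while being consumed. */
    interface Source {
        void consume() throws java.io.IOException;

        void close() throws java.io.IOException;
    }

    static void processField(String field, Source stream, StringBuilder infoStream)
            throws java.io.IOException {
        boolean succeededInProcessingField = false;
        try {
            stream.consume();
            succeededInProcessingField = true;
        } finally {
            // Close first, so that a failure while logging cannot skip close().
            stream.close();
            if (!succeededInProcessingField) {
                infoStream.append("exception while analyzing field ")
                          .append(field)
                          .append('\n');
            }
        }
        // Nothing is caught or wrapped: the original exception propagates as before.
    }
}
```

One caveat of this simple form: if close() itself throws on the failure path, its exception replaces the original one and the log line is skipped.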

> Exception strategy for analysis improved
> 
>
> Key: LUCENE-5405
> URL: https://issues.apache.org/jira/browse/LUCENE-5405
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Benson Margulies
>Assignee: Benson Margulies
> Fix For: 5.0
>
> Attachments: LUCENE-5405-4.x.patch
>
>
> SOLR-5623 included some conversation about the dilemmas of exception 
> management and reporting in the analysis chain. 
> I've belatedly become educated about the infostream, and this situation is a 
> job for it. The DocInverterPerField can note exceptions in the analysis 
> chain, log out to the infostream, and then rethrow them as before. No 
> wrapping, no muss, no fuss.
> There are comments on this JIRA from a more complex prior idea that readers 
> might want to ignore.






[jira] [Commented] (SOLR-5689) On reconnect, ZkController cancels election on first context rather than latest

2014-02-03 Thread Daniel Collins (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5689?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13889390#comment-13889390
 ] 

Daniel Collins commented on SOLR-5689:
--

My understanding of what `LeaderElector.setup()` does is that it just creates 
the `/overseer_elect/election` "directory" in ZK.  This isn't ephemeral, so in 
reality should only be a one-off job?  Unless ZK has been wiped whilst the node 
was disconnected from ZK, that directory should still be there.  It shouldn't 
hurt to add in the call to setup in reconnect, but I don't believe it is 
necessary.

cancelElection() removes the `leaderSeqPath` which is the ephemeral node(s) 
under that "directory", e.g. 
"19127283862405127-xxx:y_solr-n_000368" in my case.

> On reconnect, ZkController cancels election on first context rather than 
> latest
> ---
>
> Key: SOLR-5689
> URL: https://issues.apache.org/jira/browse/SOLR-5689
> Project: Solr
>  Issue Type: Bug
>Affects Versions: 4.6.1, 5.0, 4.7
>Reporter: Gregory Chanan
>
> I haven't tested this yet, so I could be wrong, but this is my reading of the 
> code:
> During init:
> {code}
> ElectionContext context = new OverseerElectionContext(zkClient, overseer, 
> getNodeName());
> overseerElector.setup(context);
> overseerElector.joinElection(context, false);
> {code}
> On reconnect:
> {code}
> ElectionContext context = new OverseerElectionContext(zkClient,overseer, 
> getNodeName());
>   
> ElectionContext prevContext = overseerElector.getContext();
> if (prevContext != null) {
>   prevContext.cancelElection();
> }
>   
> overseerElector.joinElection(context, true);
> {code}
> setup doesn't appear to be called on reconnect, so the new context is never 
> set and the first context gets cancelled over and over.
> A call to overseerElector.setup(context); before joinElection in the 
> reconnect case would address this.
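
To make the reading in the issue description concrete, a small model of the bookkeeping it describes. SketchContext, SketchElector and ReconnectSketch are hypothetical stand-ins, not the Solr classes, and no ZooKeeper is involved. The model assumes, as the description does, that only setup() records the context; the reporter notes this reading is untested.

```java
// Hypothetical stand-ins; only the context bookkeeping from the issue is modelled.
final class SketchContext {
    final String id;
    boolean cancelled;

    SketchContext(String id) {
        this.id = id;
    }

    void cancelElection() {
        cancelled = true;
    }
}

final class SketchElector {
    private SketchContext context;

    // Assumption taken from the issue text: only setup() records the context.
    void setup(SketchContext c) {
        this.context = c;
    }

    void joinElection(SketchContext c) {
        // No bookkeeping here in this model.
    }

    SketchContext getContext() {
        return context;
    }
}

final class ReconnectSketch {
    /** Reconnect as quoted in the issue; callSetup toggles the proposed fix. */
    static SketchContext reconnect(SketchElector elector, String id, boolean callSetup) {
        SketchContext next = new SketchContext(id);
        SketchContext prev = elector.getContext();
        if (prev != null) {
            prev.cancelElection();
        }
        if (callSetup) {
            elector.setup(next);
        }
        elector.joinElection(next);
        return next;
    }
}
```

In this model, two reconnects without setup() cancel the first context twice and never the second; with setup() each reconnect cancels the context that immediately preceded it.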






[jira] [Commented] (SOLR-5688) Allow updating of soft and hard commit parameters using HTTP API

2014-02-03 Thread Rafał Kuć (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5688?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13889341#comment-13889341
 ] 

Rafał Kuć commented on SOLR-5688:
-

Thanks for the comments, I'll provide the updated patch later today :)

> Allow updating of soft and hard commit parameters using HTTP API
> 
>
> Key: SOLR-5688
> URL: https://issues.apache.org/jira/browse/SOLR-5688
> Project: Solr
>  Issue Type: Improvement
>Affects Versions: 4.6.1
>Reporter: Rafał Kuć
> Fix For: 5.0
>
> Attachments: SOLR-5688.patch
>
>
> Right now, to update the values (max time and max docs) for hard and soft 
> autocommits, one has to alter the configuration and reload the core. It would 
> be nice to expose an API that changes them without updating the 
> configuration, so that the change is not persistent. 
> There may be various reasons for doing that; for example, one may know that 
> the application will send a large amount of data and want to prepare for that. 

