Re: [Lucene.Net] Blog Setup

2012-02-16 Thread Stefan Bodewig
On 2012-02-15, Christopher Currens wrote:

> That's similar to a suggestion Stefan made in another email:
>
>> The only alternative would be [...] running a
>> dynamic server on a dedicated VM.  The latter would
>> be easier to negotiate for a top level project.
>
> Though, his response seems to imply that it would need to stay hosted on
> Apache servers?

That's not what I meant.  It's more an if it stays on Apache
infrastructure then 

Personally I'd prefer to keep our stuff together in a single place, but
there is no hard requirement.

Stefan


[Lucene.Net] Re: trouble getting cms content to work correctly

2012-02-16 Thread Joe Schaefer
Modulo this particular bug affecting only your publication
requests, massive documentation commits merely require
some follow-through (to publication) as I've written about
today here:

http://www.apache.org/dev/cmsref.html#mass-change

So the regularity with which you do this won't present any
particular problems other than increasing the frequency of
the subsequent pain you need to endure to walk mass-changes
through to the live site.  Small changes will normally
induce small amounts of pain (modulo this particular bug).


HTH




 From: Prescott Nasser geobmx...@hotmail.com
To: lucene-net-dev@lucene.apache.org 
Cc: joe_schae...@yahoo.com 
Sent: Wednesday, February 15, 2012 6:30 PM
Subject: FW: trouble getting cms content to work correctly
 

 
Took all day, but Joe was there babysitting and correcting things for us.
 
Basically there is a bug in svn 1.6.17 that the CMS is based on, which is 
making our commits a pain at the moment. Once that gets upgraded it should be 
relatively smooth sailing.
 
It won't help us though if we are still planning on updating massive amounts 
of documentation on a regular basis.
 
 
 
Thanks Joe, I can't thank you enough for the help today.
 
~Prescott



Date: Wed, 15 Feb 2012 14:49:48 -0800
From: joe_schae...@yahoo.com
Subject: Re: trouble getting cms content to work correctly
To: geobmx...@hotmail.com


After some testing it appears that this performance
bug is fixed in svn 1.7, but the CMS is currently
running 1.6.17.  I hope to have the host upgraded
within the next 30 days or so, but for now I still
recommend using the script.





 From: Prescott Nasser geobmx...@hotmail.com
To: joe_schae...@yahoo.com 
Sent: Wednesday, February 15, 2012 5:28 PM
Subject: RE: trouble getting cms content to work correctly
 

 
Alright - sounds good
 
Thanks again!
 
~P
 



Date: Wed, 15 Feb 2012 14:25:45 -0800
From: joe_schae...@yahoo.com
Subject: Re: trouble getting cms content to work correctly
To: geobmx...@hotmail.com


I'm having some svn people look at the merge issues.
Right now all I can suggest is that you publish using
the publish.pl script on people.apache.org.  It's taking
me about 10 min total to carry that out, which is certainly
too long given the nature of the changes it's merging,
but I'll let you know what I find out.





 From: Prescott Nasser geobmx...@hotmail.com
To: joe_schae...@yahoo.com 
Sent: Wednesday, February 15, 2012 5:13 PM
Subject: RE: trouble getting cms content to work correctly
 

 
It's butt ugly - all in one directory, 8206 files. I'd prefer a more natural 
docs structure, but that's how it gets generated
 
~P
 



Date: Wed, 15 Feb 2012 14:10:47 -0800
From: joe_schae...@yahoo.com
Subject: Re: trouble getting cms content to work correctly
To: geobmx...@hotmail.com


Ok lemme kill it and use the publish.pl script
on people to see if I can get it to work right.
Just curious tho- about how many files do you
have within that docs dir- all in one dir I presume?



 From: Prescott Nasser geobmx...@hotmail.com
To: joe_schae...@yahoo.com 
Sent: Wednesday, February 15, 2012 5:08 PM
Subject: RE: trouble getting cms content to work correctly
 

 
I'm thinking still merge funk
 



Date: Wed, 15 Feb 2012 14:05:21 -0800
From: joe_schae...@yahoo.com
Subject: Re: trouble getting cms content to work correctly
To: geobmx...@hotmail.com


Looks like it just completed.  Hmm, go
ahead and publish and let's try this one
more time.





 From: Joe Schaefer joe_schae...@yahoo.com
To: Prescott Nasser geobmx...@hotmail.com 
Sent: Wednesday, February 15, 2012 5:02 PM
Subject: Re: trouble getting cms content to work correctly
 

Yeah more merge funk. Leave it run for now,
but don't take any further action until you
hear from me.





 From: Prescott Nasser geobmx...@hotmail.com
To: joe_schae...@yahoo.com 
Sent: Wednesday, February 15, 2012 4:59 PM
Subject: RE: trouble getting cms content to work correctly
 

 
I hate to be the bearer of bad news... still taking days to publish (I'm 
not sure if there is a merge error or not) let me know I'll kill this 
quick
 



Date: Wed, 15 Feb 2012 13:54:52 -0800
From: joe_schae...@yahoo.com
Subject: Re: trouble getting cms content to work correctly
To: geobmx...@hotmail.com


Yeah try out the webgui and edit/commit/publish
a minor change.  It should take you no more than
a minute or so total.





 From: Prescott Nasser geobmx...@hotmail.com
To: joe_schae...@yahoo.com 
Sent: Wednesday, February 15, 2012 4:52 PM
Subject: RE: trouble getting cms content to work correctly
 

 
Man that sounds like a tool full of awesome!
 
Ok - so for the moment no new docs, a 

RE: Welcome James Dyer

2012-02-16 Thread David Smiley (@MITRE.org)
A belated welcome!  (I'm new too)

-
 Author: http://www.packtpub.com/apache-solr-3-enterprise-search-server/book
--
View this message in context: 
http://lucene.472066.n3.nabble.com/Welcome-James-Dyer-tp3732469p3749495.html
Sent from the Lucene - Java Developer mailing list archive at Nabble.com.

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-1279) ApostropheTokenizer

2012-02-16 Thread Mauro Asprea (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13209231#comment-13209231
 ] 

Mauro Asprea commented on SOLR-1279:


I confirm this is working using the WordDelimiterFilterFactory like Robert said:

{code}
<filter class="solr.WordDelimiterFilterFactory"
        stemEnglishPossessive="0"
        preserveOriginal="1"
        catenateAll="1"/>
{code}

Then using the Solr Admin Analysis page I get the following:
Value: McDonald's

||Indexed Term||
|McDonald's|
|Mc|
|Donald|
|s|
|McDonalds|

One thing: You have to be sure that no previous filters remove the trailing 
's. In my case I had the StandardFilterFactory, which does remove trailing 
apostrophes.
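
To see why those five terms show up, here is a toy Java sketch of the three settings involved. It is only an approximation for illustration, not Solr's WordDelimiterFilter code: the class and method names (WordDelimiterSketch, splitParts, expand) are made up, and the order of the emitted terms is an assumption chosen to match the table above.

```java
import java.util.ArrayList;
import java.util.List;

// Toy approximation only: NOT Solr's WordDelimiterFilter implementation.
class WordDelimiterSketch {

    // Split on non-alphanumeric characters and on lower-to-upper case changes.
    static List<String> splitParts(String token) {
        List<String> parts = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < token.length(); i++) {
            char c = token.charAt(i);
            if (!Character.isLetterOrDigit(c)) {
                flush(current, parts);   // delimiter such as ' ends the current part
                continue;
            }
            boolean caseChange = current.length() > 0
                    && Character.isLowerCase(token.charAt(i - 1))
                    && Character.isUpperCase(c);
            if (caseChange) {
                flush(current, parts);   // "cD" in McDonald starts a new part
            }
            current.append(c);
        }
        flush(current, parts);
        return parts;
    }

    private static void flush(StringBuilder current, List<String> parts) {
        if (current.length() > 0) {
            parts.add(current.toString());
            current.setLength(0);
        }
    }

    // Original token first (like preserveOriginal=1), then the parts, with the
    // trailing "s" kept as its own part (like stemEnglishPossessive=0), then
    // all parts joined together (like catenateAll=1).
    static List<String> expand(String token) {
        List<String> parts = splitParts(token);
        List<String> out = new ArrayList<>();
        out.add(token);
        out.addAll(parts);
        String joined = String.join("", parts);
        if (parts.size() > 1 && !out.contains(joined)) {
            out.add(joined);
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(expand("McDonald's"));
    }
}
```

If an earlier filter has already stripped the 's, the sketch only sees McDonald, so neither the s part nor McDonalds is produced, which is the caveat mentioned above.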

 ApostropheTokenizer
 ---

 Key: SOLR-1279
 URL: https://issues.apache.org/jira/browse/SOLR-1279
 Project: Solr
  Issue Type: New Feature
  Components: Schema and Analysis
Reporter: Sergey Borisov
Priority: Minor
 Fix For: 3.6, 4.0

 Attachments: ApostropheTokenizer.zip


 ApostropheTokenizer creates extra tokens during the analysis stage for 
 fields containing apostrophes. The reason for adding this is to ensure that 
 documents that differ only by an apostrophe have the same relevancy score. 
 For example, if the document contains the string McDonald's, it will be 
 tokenized as McDonald's McDonalds. This way, a search performed 
 against McDonald's or McDonalds will produce a similar score.
 This code handles up to two apostrophes in a token.
 To use this tokenizer, add the following lines to schema.xml:
 <analyzer type="index">
   <filter class="org.apache.lucene.analysis.ApostropheTokenFactory"/>
 ...
 </analyzer>
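
A rough illustration of the idea in the description, as a self-contained Java sketch. It is not the code from ApostropheTokenizer.zip and does not use the Lucene analysis API. The names ApostropheVariantsSketch, variants and analyze are made up for this example, and unlike the attached code it simply strips every apostrophe rather than handling at most two.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: emits each token plus an apostrophe-free variant.
class ApostropheVariantsSketch {

    // Return the token itself, followed by its apostrophe-free form if that differs.
    static List<String> variants(String token) {
        List<String> out = new ArrayList<>();
        out.add(token);
        String stripped = token.replace("'", "");
        if (!stripped.equals(token) && !stripped.isEmpty()) {
            out.add(stripped);
        }
        return out;
    }

    // Whitespace tokenization only; a real analyzer chain would do much more.
    static List<String> analyze(String text) {
        List<String> tokens = new ArrayList<>();
        for (String raw : text.trim().split("\\s+")) {
            if (!raw.isEmpty()) {
                tokens.addAll(variants(raw));
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(analyze("Eat at McDonald's"));
    }
}
```

Because both McDonald's and McDonalds end up as terms for the same document, a query for either spelling can hit it, which is the effect the description is after.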

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira






[jira] [Issue Comment Edited] (SOLR-1279) ApostropheTokenizer

2012-02-16 Thread Mauro Asprea (Issue Comment Edited) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13209231#comment-13209231
 ] 

Mauro Asprea edited comment on SOLR-1279 at 2/16/12 9:02 AM:
-

I confirm this is working using the WordDelimiterFilterFactory like Robert said:

{code}
<filter class="solr.WordDelimiterFilterFactory"
        stemEnglishPossessive="0"
        preserveOriginal="1"
        catenateAll="1"/>
{code}

Then using the Solr Admin Analysis page I get the following:
Value: McDonald's

||Indexed Term||
|McDonald's|
|Mc|
|Donald|
|s|
|McDonalds|

One thing: You have to be sure that no previous filters remove the trailing 
's. In my case I had the StandardFilterFactory, which does remove trailing 
apostrophes.

  was (Author: brutuscat):
I confirm this is working using the WordDelimiterFilterFactory like Robert 
said:

{code}
<filter class="solr.WordDelimiterFilterFactory"
        stemEnglishPossessive="0"
        preserveOriginal="1"
        catenateAll="1"/>
{code}

The using Solr Admin Analysis page I get the following:
Value: McDonald's

||Indexed Term||
|McDonald's|
|Mc|
|Donald|
|s|
|McDonalds|

One thing: You have to be sure that no previous filters remove the trailing 
's. In my case I had the StandardFilterFactory, which does remove trailing 
apostrophes.
  



[jira] [Issue Comment Edited] (SOLR-1279) ApostropheTokenizer

2012-02-16 Thread Mauro Asprea (Issue Comment Edited) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13209231#comment-13209231
 ] 

Mauro Asprea edited comment on SOLR-1279 at 2/16/12 9:02 AM:
-

I confirm this is working using the WordDelimiterFilterFactory like Robert said:

{code}
<filter class="solr.WordDelimiterFilterFactory"
        stemEnglishPossessive="0"
        preserveOriginal="1"
        catenateAll="1"/>
{code}

Then using the Solr Admin Analysis page I get the following:
Value: McDonald's

||Indexed Term||
|McDonald's|
|Mc|
|Donald|
|s|
|McDonalds|

One thing: You have to be sure that no previous filters remove the trailing 
's. In my case I had the StandardFilterFactory, which does remove trailing 
apostrophes.

  was (Author: brutuscat):
I confirm this is working using the WordDelimiterFilterFactory like Robert 
said:

{code}
<filter class="solr.WordDelimiterFilterFactory"
        stemEnglishPossessive="0"
        preserveOriginal="1"
        catenateAll="1"/>
{code}

The using Solr Admin Analysis page I get the following:
Value: McDonal's

||Indexed Term||
|McDonald's|
|Mc|
|Donald|
|s|
|McDonalds|

One thing: You have to be sure that no previous filters remove the trailing 
's. In my case I had the StandardFilterFactory, which does remove trailing 
apostrophes.
  



[jira] [Commented] (LUCENE-3731) Create a analysis/uima module for UIMA based tokenizers/analyzers

2012-02-16 Thread Tommaso Teofili (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13209247#comment-13209247
 ] 

Tommaso Teofili commented on LUCENE-3731:
-

Right, everything seems ok now.
I also tried commenting out the 
{noformat}
<property name="tests.threadspercpu" value="0" />
{noformat}
line in build.xml in order to execute tests in parallel.
Multiple parallel test runs, also with -Dtests.multiplier=100, passed 
flawlessly on Java 6; I will see if that is the case for Java 7 too.

 Create a analysis/uima module for UIMA based tokenizers/analyzers
 -

 Key: LUCENE-3731
 URL: https://issues.apache.org/jira/browse/LUCENE-3731
 Project: Lucene - Java
  Issue Type: Improvement
  Components: modules/analysis
Reporter: Tommaso Teofili
Assignee: Tommaso Teofili
 Fix For: 3.6, 4.0

 Attachments: LUCENE-3731.patch, LUCENE-3731_2.patch, 
 LUCENE-3731_3.patch, LUCENE-3731_4.patch, LUCENE-3731_speed.patch, 
 LUCENE-3731_speed.patch, LUCENE-3731_speed.patch


 As discussed in SOLR-3013 the UIMA Tokenizers/Analyzer should be refactored 
 out in a separate module (modules/analysis/uima) as they can be used in plain 
 Lucene. Then the solr/contrib/uima will contain only the related factories.




[jira] [Created] (LUCENE-3792) Remove StringField

2012-02-16 Thread Robert Muir (Created) (JIRA)
Remove StringField
--

 Key: LUCENE-3792
 URL: https://issues.apache.org/jira/browse/LUCENE-3792
 Project: Lucene - Java
  Issue Type: Task
Reporter: Robert Muir


Often on the mailing list there is confusion about NOT_ANALYZED.

Besides being useless (just use KeywordAnalyzer instead), people trip up on this
not being consistent at query time (you really need to configure 
KeywordAnalyzer for the field 
on your PerFieldAnalyzerWrapper so it will do the same thing at query time... 
oh wait,
once you've done that, you don't need NOT_ANALYZED).

So I think StringField is a trap too for the same reasons, just under a 
different name. Let's remove it.
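
The trap described above can be modelled in a few lines of plain Java. This is a toy model and an assumption-laden sketch: none of the types below are Lucene classes. ToyAnalyzer, LOWERCASE, KEYWORD, perField, addVerbatim and matches are hypothetical stand-ins for an analyzer, a lowercasing analyzer, KeywordAnalyzer, PerFieldAnalyzerWrapper, NOT_ANALYZED indexing and a parsed query respectively.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

// Toy model only: none of these types are Lucene classes.
class AnalysisMismatchSketch {

    // Stand-in for an analyzer: maps a field value to a single term.
    interface ToyAnalyzer {
        String analyze(String field, String text);
    }

    static final ToyAnalyzer LOWERCASE = (field, text) -> text.toLowerCase(Locale.ROOT);
    static final ToyAnalyzer KEYWORD = (field, text) -> text; // identity, like a keyword analyzer

    // Stand-in for a per-field wrapper: use the override for a field if present.
    static ToyAnalyzer perField(Map<String, ToyAnalyzer> overrides, ToyAnalyzer fallback) {
        return (field, text) -> overrides.getOrDefault(field, fallback).analyze(field, text);
    }

    private final Set<String> postings = new HashSet<>(); // "field:term" entries

    // Index-time "not analyzed": the value becomes one term, bypassing any analyzer.
    void addVerbatim(String field, String value) {
        postings.add(field + ":" + value);
    }

    // A query parser runs the query text through the analyzer before looking up the term.
    boolean matches(String field, String queryText, ToyAnalyzer queryAnalyzer) {
        return postings.contains(field + ":" + queryAnalyzer.analyze(field, queryText));
    }

    public static void main(String[] args) {
        AnalysisMismatchSketch index = new AnalysisMismatchSketch();
        index.addVerbatim("id", "Doc-42");

        // Same "analyzer" the user thinks they configured everywhere: the lookup misses.
        System.out.println(index.matches("id", "Doc-42", LOWERCASE));

        // Keyword analysis configured for the field at query time: the lookup hits.
        Map<String, ToyAnalyzer> overrides = new HashMap<>();
        overrides.put("id", KEYWORD);
        System.out.println(index.matches("id", "Doc-42", perField(overrides, LOWERCASE)));
    }
}
```

The first lookup misses because the query side lowercases Doc-42 while the index holds the verbatim value. Once the per-field wrapper applies identity analysis to that field, the same wrapper could have been used at index time as well, which is the "oh wait" in the description.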





[jira] [Updated] (LUCENE-3792) Remove StringField

2012-02-16 Thread Robert Muir (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-3792?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Muir updated LUCENE-3792:


 Priority: Blocker  (was: Major)
Affects Version/s: 4.0
Fix Version/s: 4.0

Setting this as blocker (sorry).

It's a huge trap when someone sets the same Analyzer on IndexWriter and 
QueryParser
but the analysis isn't actually the same.




[jira] [Commented] (LUCENE-3792) Remove StringField

2012-02-16 Thread Robert Muir (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13209254#comment-13209254
 ] 

Robert Muir commented on LUCENE-3792:
-

On 3.x, I'd like to deprecate NOT_ANALYZED for the same reasons. This at 
least discourages people from running into that trap there and steers them 
toward KeywordAnalyzer instead.





[jira] [Commented] (LUCENE-3792) Remove StringField

2012-02-16 Thread Uwe Schindler (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13209256#comment-13209256
 ] 

Uwe Schindler commented on LUCENE-3792:
---

The downside of this is that you now need to explicitly use a KeywordAnalyzer 
for primary key fields. If you don't run those through a query analyzer 
(e.g. you generally produce TermQuery directly) then you have lots of additional 
work. For simple lookup queries and indexing a PK, this is a no-go.

-1 on removing that completely; it should simply be called something different. 
We should maybe add PKQuery and PKField to have symmetry.




[jira] [Issue Comment Edited] (LUCENE-3792) Remove StringField

2012-02-16 Thread Uwe Schindler (Issue Comment Edited) (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13209256#comment-13209256
 ] 

Uwe Schindler edited comment on LUCENE-3792 at 2/16/12 10:46 AM:
-

The downside of this is that you now need to explicitly use a KeywordAnalyzer 
for primary key fields. If you don't run those through a query analyzer 
(e.g. you generally produce TermQuery directly) then you have lots of additional 
work. For simple lookup queries and indexing a PK, this is a no-go.

-1 on removing that completely; it should simply be called something different. 
We should maybe add PKQuery (a ConstantScore TermQuery) and PKField to have symmetry.

  was (Author: thetaphi):
The downside of this is that you now need to explicitly use a 
KeywordAnalyzer for primary key fields. If you don't run those through a 
query analyzer (e.g. you generally produce TermQuery directly) then you have lots 
of additional work. For simple lookup queries and indexing a PK, this is 
a no-go.

-1 on removing that completely; it should simply be called something different. 
We should maybe add PKQuery and PKField to have symmetry.
  



[jira] [Commented] (LUCENE-3792) Remove StringField

2012-02-16 Thread Robert Muir (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13209258#comment-13209258
 ] 

Robert Muir commented on LUCENE-3792:
-

Well, we are at a standstill. We constantly get these problems on the users list 
from NOT_ANALYZED
and I don't want to reintroduce the trap.

So I'm -1 to StringField in 4.0




[jira] [Commented] (LUCENE-3792) Remove StringField

2012-02-16 Thread Uwe Schindler (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13209260#comment-13209260
 ] 

Uwe Schindler commented on LUCENE-3792:
---

I said: call it something different.




[jira] [Commented] (LUCENE-3792) Remove StringField

2012-02-16 Thread Robert Muir (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13209267#comment-13209267
 ] 

Robert Muir commented on LUCENE-3792:
-

{quote}
If you don't run those through a query analyzer (e.g. generally produce 
TermQuery directly) then you have lots of additional work. 
{quote}

That's not true: because KeywordAnalyzer does nothing to the terms, you can 
continue to produce TermQuery directly and it will work.
So expert users are fine.

This issue isn't about expert users; it's about how our API traps people who are 
not expert users.





[jira] [Updated] (LUCENE-3792) Remove StringField

2012-02-16 Thread Uwe Schindler (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-3792?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uwe Schindler updated LUCENE-3792:
--

Priority: Major  (was: Blocker)




[jira] [Commented] (LUCENE-3792) Remove StringField

2012-02-16 Thread Uwe Schindler (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13209269#comment-13209269
 ] 

Uwe Schindler commented on LUCENE-3792:
---

bq. Well we are at a standstill. We constantly get these problems on the users 
list from NOT_ANALYZED

You cannot prevent users from doing the wrong thing. If you remove StringField 
completely, please also remove NumericField and force users to use 
PerFieldAnalyzerWrapper with a NumericTokenStream. If you add a numeric field 
you cannot ask for it with query parser. If you add a StringField, you cannot ask 
with QueryParser. Simple rule. It must just be clearly documented. And possibly 
StringField renamed.

People using primary keys or other untokenized values should simply not use 
QueryParser. Use a ConstantScore TermQuery and you are fine.

This is all just a documentation problem, so I am completely against removing 
that. Not everybody is using Lucene purely as a full-text engine.




[jira] [Commented] (LUCENE-3792) Remove StringField

2012-02-16 Thread Robert Muir (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13209272#comment-13209272
 ] 

Robert Muir commented on LUCENE-3792:
-

{quote}
If you remove StringField completely, please also remove NumericField and 
force users to use PerFieldAnalyzerWrapper with a NumericTokenStream.
{quote}

I actually am not sure this is such a bad idea?

If we were to enforce such a thing, it would also be possible to add a 
modification to the queryparser (instanceof NumericTokenStream)
so that numeric fields then work out of the box with the query parser nicely?

 Remove StringField
 --

 Key: LUCENE-3792
 URL: https://issues.apache.org/jira/browse/LUCENE-3792
 Project: Lucene - Java
  Issue Type: Task
Affects Versions: 4.0
Reporter: Robert Muir
 Fix For: 4.0





[jira] [Commented] (LUCENE-3792) Remove StringField

2012-02-16 Thread Robert Muir (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13209273#comment-13209273
 ] 

Robert Muir commented on LUCENE-3792:
-

{quote}
Not everybody is using Lucene purely as a full-text engine.
{quote}

But we cannot let non-fulltext uses break the design for the intended use 
case (full-text).

 Remove StringField
 --

 Key: LUCENE-3792
 URL: https://issues.apache.org/jira/browse/LUCENE-3792
 Project: Lucene - Java
  Issue Type: Task
Affects Versions: 4.0
Reporter: Robert Muir
 Fix For: 4.0





[jira] [Created] (LUCENE-3793) Use ReferenceManager in DirectoryTaxonomyReader

2012-02-16 Thread Shai Erera (Created) (JIRA)
Use ReferenceManager in DirectoryTaxonomyReader
---

 Key: LUCENE-3793
 URL: https://issues.apache.org/jira/browse/LUCENE-3793
 Project: Lucene - Java
  Issue Type: Improvement
  Components: modules/facet
Reporter: Shai Erera
Assignee: Shai Erera
Priority: Minor
 Fix For: 3.6, 4.0


DirTaxoReader uses hairy code to protect its indexReader instance from 
being modified while threads use it. It maintains a ReentrantLock 
(indexReaderLock) which is obtained on every 'read' access, while 
refresh() locks it for 'write' operations (refreshing the IndexReader). 

Instead of all that, now that we have ReferenceManager in place, I think 
that we can write a ReaderManager<IndexReader> which will be used by 
DirTR. Every method that requires access to the indexReader will 
acquire/release (not too different from obtaining/releasing the read 
lock), and refresh() will call ReaderManager.maybeRefresh(). It will 
simplify the code and remove some rather long comments that go to 
great lengths explaining why the code looks the way it does. 

This ReaderManager cannot be used for every IndexReader, because DirTR's
refresh() logic is special -- it reopens the indexReader, and then
verifies that the createTime still matches on the reopened reader as
well. Otherwise, it closes the reopened reader and fails with an exception.
Therefore, this ReaderManager.refreshIfNeeded will need to take the
createTime into consideration and fail if they do not match.
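
As a rough illustration of the acquire/release idea with a createTime check on refresh, here is a JDK-only sketch. ToyReader and ToyReaderManager are hypothetical stand-ins, not the Lucene classes, and the reopened reader is passed in by the caller instead of being opened from a directory.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for an index reader; not the Lucene class.
class ToyReader {
    final String createTime;   // identifies the taxonomy this reader belongs to
    final long version;
    private final AtomicInteger refCount = new AtomicInteger(1);

    ToyReader(String createTime, long version) {
        this.createTime = createTime;
        this.version = version;
    }

    // Take a reference unless the reader is already closed.
    boolean tryIncRef() {
        int c;
        while ((c = refCount.get()) > 0) {
            if (refCount.compareAndSet(c, c + 1)) {
                return true;
            }
        }
        return false;
    }

    void decRef() {
        refCount.decrementAndGet();
    }

    boolean isClosed() {
        return refCount.get() <= 0;
    }
}

// Sketch of the acquire/release pattern with a createTime check on refresh.
class ToyReaderManager {
    private volatile ToyReader current;

    ToyReaderManager(ToyReader initial) {
        this.current = initial;
    }

    ToyReader acquire() {
        ToyReader r;
        do {
            r = current;
        } while (!r.tryIncRef());   // retry if it was swapped and closed meanwhile
        return r;
    }

    void release(ToyReader r) {
        r.decRef();
    }

    // 'reopened' is the candidate reader; null means nothing changed.
    synchronized boolean maybeRefresh(ToyReader reopened) {
        if (reopened == null) {
            return false;
        }
        if (!reopened.createTime.equals(current.createTime)) {
            reopened.decRef();      // close the candidate, keep the old reader
            throw new IllegalStateException("taxonomy was recreated: createTime mismatch");
        }
        ToyReader old = current;
        current = reopened;
        old.decRef();               // drop the manager's own reference
        return true;
    }
}
```

A thread that acquired the old reader keeps it alive until it calls release(), so no lock is held across reads. The unchecked exception is a simplification for the sketch; the real code may well prefer a checked one.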

And while we're at it ... I wonder if we should have a manager for an
IndexReader/ParentArray pair? I think that it makes sense because we
don't want DirTR to use a ParentArray that does not match the IndexReader.
Today this can happen in refresh() if e.g. after the indexReader instance
has been replaced, parentArray.refresh(indexReader) fails. DirTR will be
left with a newer IndexReader instance, but old (or worse, corrupt?)
ParentArray ... I think it'll be good if we introduce clone() on ParentArray,
or a new ctor which takes an int[].
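
One way to keep the reader and its parent array from drifting apart is to publish them as a single immutable pair. The sketch below uses only the JDK; TaxoSnapshot and SnapshotHolder are hypothetical names, and the reader is reduced to a version number.

```java
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Supplier;

// Immutable snapshot tying a reader version to the parent array built from it.
final class TaxoSnapshot {
    final long readerVersion;
    private final int[] parents;

    TaxoSnapshot(long readerVersion, int[] parents) {
        this.readerVersion = readerVersion;
        this.parents = parents.clone();   // defensive copy: later edits cannot corrupt it
    }

    int parentOf(int ordinal) {
        return parents[ordinal];
    }

    int size() {
        return parents.length;
    }
}

class SnapshotHolder {
    private final AtomicReference<TaxoSnapshot> current;

    SnapshotHolder(TaxoSnapshot initial) {
        current = new AtomicReference<>(initial);
    }

    TaxoSnapshot get() {
        return current.get();
    }

    // Build the new pair completely first; publish only if building succeeded.
    void refresh(long newVersion, Supplier<int[]> buildParents) {
        int[] parents = buildParents.get();   // if this throws, 'current' is untouched
        current.set(new TaxoSnapshot(newVersion, parents));
    }
}
```

If building the new parent array fails, readers keep seeing the previous, still consistent pair, which is the failure mode the paragraph above worries about.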

I'll work on a patch once I finish with LUCENE-3786.




[jira] [Created] (LUCENE-3794) DirectoryTaxonomyWriter can lose the INDEX_CREATE_TIME property, causing DirTaxoReader.refresh() to falsely succeed (or fail)

2012-02-16 Thread Shai Erera (Created) (JIRA)
DirectoryTaxonomyWriter can lose the INDEX_CREATE_TIME property, causing 
DirTaxoReader.refresh() to falsely succeed (or fail)
-

 Key: LUCENE-3794
 URL: https://issues.apache.org/jira/browse/LUCENE-3794
 Project: Lucene - Java
  Issue Type: Bug
  Components: modules/facet
Reporter: Shai Erera
Assignee: Shai Erera
 Fix For: 3.6, 4.0


DirTaxoWriter sets createTime to null after it puts it in the commit data once. 
But that's wrong because if one calls commit(Map) twice, the second time 
doesn't record the creation time. Also, in the ctor, if an index exists and 
OpenMode is not CREATE, the creation time property is not read.

I wrote a couple of unit tests that assert this, and modified DirTaxoWriter to 
always record the creation time (in every commit) -- that's the only safe way.
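
The effect of dropping the property after the first commit can be reproduced in a small model. This is a sketch only: ToyTaxoWriter is a hypothetical class, and the key string "index.create.time" is an assumed placeholder, not necessarily the real property name.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy writer modeling the commit-data handling; not the real DirectoryTaxonomyWriter.
class ToyTaxoWriter {
    static final String INDEX_CREATE_TIME = "index.create.time"; // assumed key name

    private final String createTime;
    private final boolean clearAfterFirstCommit;   // true reproduces the reported bug
    private boolean recordedOnce = false;
    private final List<Map<String, String>> commits = new ArrayList<>();

    ToyTaxoWriter(String createTime, boolean clearAfterFirstCommit) {
        this.createTime = createTime;
        this.clearAfterFirstCommit = clearAfterFirstCommit;
    }

    void commit(Map<String, String> userData) {
        Map<String, String> data = new HashMap<>(userData);
        // Buggy mode records the property once; fixed mode records it every time.
        if (!(clearAfterFirstCommit && recordedOnce)) {
            data.put(INDEX_CREATE_TIME, createTime);
            recordedOnce = true;
        }
        commits.add(data);
    }

    // A reader only sees the data of the latest commit.
    String latestCreateTime() {
        return commits.get(commits.size() - 1).get(INDEX_CREATE_TIME);
    }
}
```

In the buggy mode the second commit carries no creation time, so a reader comparing creation times across a refresh has nothing reliable to compare. Recording it on every commit keeps the latest commit self-describing.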

Will upload a patch shortly.




[jira] [Updated] (LUCENE-3794) DirectoryTaxonomyWriter can lose the INDEX_CREATE_TIME property, causing DirTaxoReader.refresh() to falsely succeed (or fail)

2012-02-16 Thread Shai Erera (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-3794?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shai Erera updated LUCENE-3794:
---

Attachment: LUCENE-3794.patch

Patch fixes the bug + adds a couple of test cases to ensure the correct 
behavior.

 DirectoryTaxonomyWriter can lose the INDEX_CREATE_TIME property, causing 
 DirTaxoReader.refresh() to falsely succeed (or fail)
 -

 Key: LUCENE-3794
 URL: https://issues.apache.org/jira/browse/LUCENE-3794
 Project: Lucene - Java
  Issue Type: Bug
  Components: modules/facet
Reporter: Shai Erera
Assignee: Shai Erera
 Fix For: 3.6, 4.0

 Attachments: LUCENE-3794.patch





[jira] [Updated] (LUCENE-3792) Remove StringField

2012-02-16 Thread Robert Muir (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-3792?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Muir updated LUCENE-3792:


Attachment: LUCENE-3792_javadocs_3x.patch

It's obvious Uwe and I aren't going to agree here immediately, so here is a 
patch adding a big warning to 3.x javadocs.

For now I'd like to apply the same warning to StringField in trunk (I just made 
the patch against 3.x).

 Remove StringField
 --

 Key: LUCENE-3792
 URL: https://issues.apache.org/jira/browse/LUCENE-3792
 Project: Lucene - Java
  Issue Type: Task
Affects Versions: 4.0
Reporter: Robert Muir
 Fix For: 4.0

 Attachments: LUCENE-3792_javadocs_3x.patch, 
 LUCENE-3792_javadocs_3x.patch





[jira] [Updated] (LUCENE-3792) Remove StringField

2012-02-16 Thread Robert Muir (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-3792?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Muir updated LUCENE-3792:


Attachment: LUCENE-3792_javadocs_3x.patch

Sorry, incomplete wording (I forgot to save before svn diff).


 Remove StringField
 --

 Key: LUCENE-3792
 URL: https://issues.apache.org/jira/browse/LUCENE-3792
 Project: Lucene - Java
  Issue Type: Task
Affects Versions: 4.0
Reporter: Robert Muir
 Fix For: 4.0

 Attachments: LUCENE-3792_javadocs_3x.patch, 
 LUCENE-3792_javadocs_3x.patch





[jira] [Commented] (LUCENE-3731) Create a analysis/uima module for UIMA based tokenizers/analyzers

2012-02-16 Thread Tommaso Teofili (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13209301#comment-13209301
 ] 

Tommaso Teofili commented on LUCENE-3731:
-

Some performance improvement came from releasing the CAS and AE in the close() 
call:

{noformat}
  @Override
  public void close() throws IOException {
    super.close();
    // release UIMA resources
    cas.release();
    ae.destroy();
  }
{noformat}

Now investigating the use of CASPool to improve throughput in high-usage 
scenarios.

 Create a analysis/uima module for UIMA based tokenizers/analyzers
 -

 Key: LUCENE-3731
 URL: https://issues.apache.org/jira/browse/LUCENE-3731
 Project: Lucene - Java
  Issue Type: Improvement
  Components: modules/analysis
Reporter: Tommaso Teofili
Assignee: Tommaso Teofili
 Fix For: 3.6, 4.0

 Attachments: LUCENE-3731.patch, LUCENE-3731_2.patch, 
 LUCENE-3731_3.patch, LUCENE-3731_4.patch, LUCENE-3731_speed.patch, 
 LUCENE-3731_speed.patch, LUCENE-3731_speed.patch


 As discussed in SOLR-3013 the UIMA Tokenizers/Analyzer should be refactored 
 out in a separate module (modules/analysis/uima) as they can be used in plain 
 Lucene. Then the solr/contrib/uima will contain only the related factories.




[jira] [Commented] (LUCENE-3731) Create a analysis/uima module for UIMA based tokenizers/analyzers

2012-02-16 Thread Robert Muir (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13209304#comment-13209304
 ] 

Robert Muir commented on LUCENE-3731:
-

Is that safe to do in Tokenizer.close()?

Tokenizer.close() is misleading/confusing: the instance is still reused 
after this for subsequent documents. In other words, Tokenizer.close() only 
closes resources like the Reader itself. It just happens that CAS/AE don't 
complain about you continuing to use them after they are 
release()'ed/destroy()'ed :)
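
The lifecycle concern can be made concrete with a toy model that uses only the JDK. ToyEngine and ReusableTokenizer are hypothetical names, not Lucene or UIMA API, and the separate dispose() step is an assumption about where heavyweight cleanup could live. Unlike CAS/AE as described above, the toy engine fails loudly when used after destroy(), which makes the hazard visible.

```java
import java.util.Locale;

// Hypothetical heavyweight resource that refuses to work after destroy().
class ToyEngine {
    private boolean destroyed = false;

    void destroy() {
        destroyed = true;
    }

    String annotate(String text) {
        if (destroyed) {
            throw new IllegalStateException("engine used after destroy()");
        }
        return text.toUpperCase(Locale.ROOT);
    }
}

// Toy tokenizer that is reused across documents.
class ReusableTokenizer {
    private final ToyEngine engine = new ToyEngine();
    private String input;

    // Called once per document before consuming.
    void setInput(String text) {
        input = text;
    }

    String consume() {
        if (input == null) {
            throw new IllegalStateException("no input set");
        }
        return engine.annotate(input);
    }

    // Per-document close: only the input goes away, the instance lives on.
    void close() {
        input = null;
    }

    // End of life: the place to free heavyweight resources.
    void dispose() {
        engine.destroy();
    }
}
```

Because close() runs once per document while the instance is reused, destroying the engine there would break the second document; the sketch defers that to dispose(), which runs once.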


 Create a analysis/uima module for UIMA based tokenizers/analyzers
 -

 Key: LUCENE-3731
 URL: https://issues.apache.org/jira/browse/LUCENE-3731
 Project: Lucene - Java
  Issue Type: Improvement
  Components: modules/analysis
Reporter: Tommaso Teofili
Assignee: Tommaso Teofili
 Fix For: 3.6, 4.0

 Attachments: LUCENE-3731.patch, LUCENE-3731_2.patch, 
 LUCENE-3731_3.patch, LUCENE-3731_4.patch, LUCENE-3731_speed.patch, 
 LUCENE-3731_speed.patch, LUCENE-3731_speed.patch





[jira] [Commented] (LUCENE-3109) Rename FieldsConsumer to InvertedFieldsConsumer

2012-02-16 Thread Iulius Curt (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13209305#comment-13209305
 ] 

Iulius Curt commented on LUCENE-3109:
-

Is this still valid? (It looks like a good place for me to enter the community)

Should the *FieldsReader/Writer classes that derive from 
FieldsProducer/Consumer also become *InvertedFieldsReader/Writer?

 Rename FieldsConsumer to InvertedFieldsConsumer
 ---

 Key: LUCENE-3109
 URL: https://issues.apache.org/jira/browse/LUCENE-3109
 Project: Lucene - Java
  Issue Type: Task
  Components: core/codecs
Affects Versions: 4.0
Reporter: Simon Willnauer
Priority: Minor
 Fix For: 4.0


 The name FieldsConsumer is misleading here; it really is an 
 InvertedFieldsConsumer, and since we are extending codecs to consume 
 non-inverted Fields we should be clear here. The same applies to Fields.java 
 as well as FieldsProducer.




[jira] [Commented] (LUCENE-3109) Rename FieldsConsumer to InvertedFieldsConsumer

2012-02-16 Thread Simon Willnauer (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13209309#comment-13209309
 ] 

Simon Willnauer commented on LUCENE-3109:
-

bq. Is this still valid? (It looks like a good place for me to enter the 
community)

I think so. There should also be an InvertedFieldsProducer.

 Rename FieldsConsumer to InvertedFieldsConsumer
 ---

 Key: LUCENE-3109
 URL: https://issues.apache.org/jira/browse/LUCENE-3109
 Project: Lucene - Java
  Issue Type: Task
  Components: core/codecs
Affects Versions: 4.0
Reporter: Simon Willnauer
Priority: Minor
 Fix For: 4.0





[jira] [Resolved] (LUCENE-3794) DirectoryTaxonomyWriter can lose the INDEX_CREATE_TIME property, causing DirTaxoReader.refresh() to falsely succeed (or fail)

2012-02-16 Thread Shai Erera (Resolved) (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-3794?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shai Erera resolved LUCENE-3794.


Resolution: Fixed

Committed revision 1244960 (3x).
Committed revision 1244964 (trunk).

 DirectoryTaxonomyWriter can lose the INDEX_CREATE_TIME property, causing 
 DirTaxoReader.refresh() to falsely succeed (or fail)
 -

 Key: LUCENE-3794
 URL: https://issues.apache.org/jira/browse/LUCENE-3794
 Project: Lucene - Java
  Issue Type: Bug
  Components: modules/facet
Reporter: Shai Erera
Assignee: Shai Erera
 Fix For: 3.6, 4.0

 Attachments: LUCENE-3794.patch





[jira] [Resolved] (LUCENE-3760) Cleanup DR.getCurrentVersion/DR.getUserData/DR.getIndexCommit().getUserData()

2012-02-16 Thread Shai Erera (Resolved) (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-3760?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shai Erera resolved LUCENE-3760.


   Resolution: Fixed
Lucene Fields: New,Patch Available  (was: New)

Resolving back ... looks like I'm the only one that it bothers.

 Cleanup DR.getCurrentVersion/DR.getUserData/DR.getIndexCommit().getUserData()
 -

 Key: LUCENE-3760
 URL: https://issues.apache.org/jira/browse/LUCENE-3760
 Project: Lucene - Java
  Issue Type: Improvement
Reporter: Michael McCandless
Assignee: Michael McCandless
 Fix For: 3.6, 4.0

 Attachments: LUCENE-3760.patch, LUCENE-3760.patch


 Spinoff from Ryan's dev thread "DR.getCommitUserData() vs 
 DR.getIndexCommit().getUserData()"... these methods are confusing/dups right 
 now.




[jira] [Assigned] (LUCENE-3776) NRTManager shouldn't expose its private SearcherManager

2012-02-16 Thread Michael McCandless (Assigned) (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-3776?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael McCandless reassigned LUCENE-3776:
--

Assignee: Michael McCandless

 NRTManager shouldn't expose its private SearcherManager
 ---

 Key: LUCENE-3776
 URL: https://issues.apache.org/jira/browse/LUCENE-3776
 Project: Lucene - Java
  Issue Type: Improvement
Reporter: Michael McCandless
Assignee: Michael McCandless
Priority: Blocker
 Fix For: 3.6, 4.0


 Spinoff from LUCENE-3769.
 To actually obtain an IndexSearcher from NRTManager, it's a 2-step process 
 now.
 You must .getSearcherManager(), then .acquire() from the returned 
 SearcherManager.
 This is very trappy... because if the app incorrectly calls maybeReopen on 
 that private SearcherManager (instead of NRTManager.maybeReopen) then it can 
 unexpectedly cause threads to block forever, waiting for the necessary gen to 
 become visible.  This will be hard to debug... I don't like creating trappy 
 APIs.
 Hopefully once LUCENE-3761 is in, we can fix NRTManager to no longer expose 
 its private SM, instead subclassing ReferenceManager.
 Or alternatively, or in addition, maybe we factor out a new interface 
 (SearcherProvider or something...) that only has acquire and release methods, 
 and both NRTManager and ReferenceManager/SM impl that, and we keep 
 NRTManager's SM private.




[jira] [Updated] (LUCENE-3776) NRTManager shouldn't expose its private SearcherManager

2012-02-16 Thread Michael McCandless (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-3776?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael McCandless updated LUCENE-3776:
---

Attachment: LUCENE-3776.patch

Patch, cutting over NRTManager to subclass ReferenceManager, and also
some minor cleanups to ReferenceManager/SearcherManager.

I added a method, afterRefresh, to ReferenceManager, which it calls
after a refresh; NRTManager needs this so it can
notify any waiting threads that the new gen is now live.
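
A minimal sketch of how such a hook could wake waiting threads, using only the JDK. ToyReferenceManager and ToyNrtManager are hypothetical classes modeled on the description above, not the patch itself, and the generation bookkeeping is heavily simplified.

```java
// Base class: a template method that calls the hook after a successful refresh.
abstract class ToyReferenceManager {
    final synchronized boolean maybeRefresh() {
        boolean refreshed = refreshIfNeeded();
        if (refreshed) {
            afterRefresh();
        }
        return refreshed;
    }

    // Subclasses decide whether anything changed and swap in the new state.
    protected abstract boolean refreshIfNeeded();

    // Hook invoked after a refresh; a no-op by default.
    protected void afterRefresh() {
    }
}

class ToyNrtManager extends ToyReferenceManager {
    private final Object genLock = new Object();
    private long indexingGen = 0;    // bumped on every write
    private long searchingGen = 0;   // newest generation visible to searches

    long addDocument() {
        synchronized (genLock) {
            return ++indexingGen;
        }
    }

    @Override
    protected boolean refreshIfNeeded() {
        synchronized (genLock) {
            if (searchingGen == indexingGen) {
                return false;
            }
            searchingGen = indexingGen;
            return true;
        }
    }

    @Override
    protected void afterRefresh() {
        synchronized (genLock) {
            genLock.notifyAll();     // wake threads waiting for a generation
        }
    }

    // Block until 'gen' is searchable or the timeout elapses.
    boolean waitForGeneration(long gen, long timeoutMillis) {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        synchronized (genLock) {
            while (searchingGen < gen) {
                long remainingMs = (deadline - System.nanoTime()) / 1_000_000L;
                if (remainingMs <= 0) {
                    return false;
                }
                try {
                    genLock.wait(remainingMs);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return false;
                }
            }
            return true;
        }
    }
}
```

Since the only refresh entry point is the manager's own maybeRefresh(), every refresh reaches the hook, so a waiter cannot be stranded by a refresh that bypasses the notification.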


 NRTManager shouldn't expose its private SearcherManager
 ---

 Key: LUCENE-3776
 URL: https://issues.apache.org/jira/browse/LUCENE-3776
 Project: Lucene - Java
  Issue Type: Improvement
Reporter: Michael McCandless
Assignee: Michael McCandless
Priority: Blocker
 Fix For: 3.6, 4.0

 Attachments: LUCENE-3776.patch





[JENKINS] Lucene-Solr-tests-only-trunk - Build # 12433 - Failure

2012-02-16 Thread Apache Jenkins Server
Build: https://builds.apache.org/job/Lucene-Solr-tests-only-trunk/12433/

1 tests failed.
REGRESSION:  org.apache.solr.cloud.RecoveryZkTest.testDistribSearch

Error Message:
expected:<501> but was:<432>

Stack Trace:
junit.framework.AssertionFailedError: expected:<501> but was:<432>
at org.apache.lucene.util.LuceneTestCaseRunner.runChild(LuceneTestCaseRunner.java:165)
at org.apache.lucene.util.LuceneTestCaseRunner.runChild(LuceneTestCaseRunner.java:57)
at org.apache.solr.cloud.RecoveryZkTest.doTest(RecoveryZkTest.java:105)
at org.apache.solr.BaseDistributedSearchTestCase.testDistribSearch(BaseDistributedSearchTestCase.java:664)
at org.apache.lucene.util.LuceneTestCase$SubclassSetupTeardownRule$1.evaluate(LuceneTestCase.java:700)
at org.apache.lucene.util.LuceneTestCase$InternalSetupTeardownRule$1.evaluate(LuceneTestCase.java:599)
at org.apache.lucene.util.LuceneTestCase$TestResultInterceptorRule$1.evaluate(LuceneTestCase.java:504)
at org.apache.lucene.util.LuceneTestCase$RememberThreadRule$1.evaluate(LuceneTestCase.java:562)




Build Log (for compile errors):
[...truncated 8153 lines...]




[jira] [Commented] (LUCENE-3776) NRTManager shouldn't expose its private SearcherManager

2012-02-16 Thread Shai Erera (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13209392#comment-13209392
 ] 

Shai Erera commented on LUCENE-3776:


Patch looks good!

*SearcherManager*
With these changes, if the app passes an IndexReader that is not a 
DirectoryReader, it will get a ClassCastException (if asserts are disabled). Is 
that ok? Perhaps it'd be better if you check that in SM's ctor and throw 
IllegalArgumentException? The problem is that the app cannot pass DirReader in 
3x, so this will apply to trunk only. In fact, I think that for trunk it will be 
better if SM declared it expects a DirectoryReader up front?

We cannot avoid the cast in refreshIfNeeded because IR is obtained from IS, but 
at least the app won't hit ClassCastExceptions after it created SM?

That kinda makes SearcherManager a DirReader only impl which is unfortunate 
IMO. But I'm not sure if any IR can openIfChanged() anymore, so perhaps that's 
unavoidable.

*ReferenceManager*
About close() -- do you think it'll be better to keep close() final, and 
introduce a new protected closeResource()/closeInternal() that NRTManager can 
override? That way, RefManagers won't accidentally override close() and forget 
to call super.close()?

About afterRefresh() -- I'll admit that at first I didn't understand why you need 
it. Previously, it was used to warm an IndexSearcher, but now we say it's the 
responsibility of SearcherFactory. I can see why it's useful for NRTManager, 
and it might even help me in LUCENE-3793! Do you think that we should declare 
that it can throw IOE? I know that if I use it in LUCENE-3793, I'll need 
that and I'd hate to throw RuntimeException. NRTManager can still override and 
not declare that. I'm just thinking that since almost all methods declare 
throwing IOE, it won't be odd if we declare it too on afterRefresh(), and it's 
not unlikely that afterRefresh() will do something that throws exceptions.
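
Both suggestions in this section rest on plain Java rules, sketched here with hypothetical classes (ToyManager and CountingManager are not the real ReferenceManager/NRTManager). A final close() delegating to a protected hook means subclasses cannot skip the base cleanup, and an overriding method is free to drop a checked exception that the base method declares.

```java
import java.io.Closeable;
import java.io.IOException;

abstract class ToyManager implements Closeable {
    private boolean closed = false;

    // final: subclasses cannot override close() and forget super.close().
    @Override
    public final synchronized void close() throws IOException {
        if (closed) {
            return;             // closing twice is harmless
        }
        closed = true;
        closeInternal();
    }

    // Subclass-specific cleanup goes here instead.
    protected void closeInternal() throws IOException {
    }

    // Declared with IOException so implementations that do I/O need no wrapping.
    protected void afterRefresh() throws IOException {
    }

    final synchronized void refresh() throws IOException {
        if (closed) {
            throw new IllegalStateException("manager is closed");
        }
        afterRefresh();
    }
}

class CountingManager extends ToyManager {
    int closeCalls = 0;
    int refreshes = 0;

    // Overrides may narrow the throws clause: no IOException declared here.
    @Override
    protected void closeInternal() {
        closeCalls++;
    }

    @Override
    protected void afterRefresh() {
        refreshes++;
    }
}
```

Callers going through the base type still have to handle IOException, which matches the observation that most methods in this area already declare it.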

*NRTManager*
About openIfNeeded:
# Can you cast to DirectoryReader once? I don't know if the assert is better 
than a ClassCastException ... with how the code is written, ClassCastException 
is better than assert because at least it will tell the user what went wrong?
# How critical is it to declare newSearcher final? If you didn't, you could 
init it to null, and only change it if newReader != null. That saves 4 lines of 
code (improves readability IMO -- something that I know you care about :)).


 NRTManager shouldn't expose its private SearcherManager
 ---

 Key: LUCENE-3776
 URL: https://issues.apache.org/jira/browse/LUCENE-3776
 Project: Lucene - Java
  Issue Type: Improvement
Reporter: Michael McCandless
Assignee: Michael McCandless
Priority: Blocker
 Fix For: 3.6, 4.0

 Attachments: LUCENE-3776.patch





[jira] [Updated] (SOLR-3079) Backport of Solr-1431 (CommComponent abstracted)

2012-02-16 Thread Erick Erickson (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-3079?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erick Erickson updated SOLR-3079:
-

Attachment: SOLR-3079.patch

The patch isn't in SVN format, looks like you made it with Git? The git repo is 
a shadow repository, not used for released code as far as I know.

Through the magic of IntelliJ, I managed to apply the patch and I'm uploading 
that version. Can you take a look and see if it made it through the 
transformations OK?

And any Git people out there: is there magic to make Git produce an 
SVN-compatible patch? Seems like a good addition to the "How to contribute" 
page, since lots of people seem to be using Git...

Beyond that, I'll run the tests with it and report back if there's a problem. 
I'd really like someone who knows what this is all about to take a look before 
committing.

Meanwhile, keep prompting G

 Backport of Solr-1431 (CommComponent abstracted)
 

 Key: SOLR-3079
 URL: https://issues.apache.org/jira/browse/SOLR-3079
 Project: Solr
  Issue Type: New Feature
  Components: search
Affects Versions: 3.5
Reporter: Greg Bowyer
 Attachments: 0001-Initial-backport-of-solr-cloud-ShardHandler.patch, 
 SOLR-3079.patch


 Initial attempt at backporting the work done for Solr-1431 into the 3.x series




[jira] [Commented] (SOLR-3079) Backport of Solr-1431 (CommComponent abstracted)

2012-02-16 Thread Robert Muir (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-3079?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13209399#comment-13209399
 ] 

Robert Muir commented on SOLR-3079:
---

{quote}
And any Git people out there: is there magic to make Git produce an 
SVN-compatible patch? Seems like a good addition to the "How to contribute" 
page, since lots of people seem to be using Git...
{quote}

I just use patch -p1 when I want to apply git patches... (eclipse has a 
checkbox or some other gui-toggle for -p if you prefer guis)





[jira] [Commented] (LUCENE-3792) Remove StringField

2012-02-16 Thread Robert Muir (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13209430#comment-13209430
 ] 

Robert Muir commented on LUCENE-3792:
-

OK, I think it would seriously take major work to do something here that would 
make everyone happy.

I still don't like the situation, but unless there are serious objections, I'd 
like to commit the javadocs,
just to hopefully reduce the number of times this trap gets answered on the user 
list.

 Remove StringField
 --

 Key: LUCENE-3792
 URL: https://issues.apache.org/jira/browse/LUCENE-3792
 Project: Lucene - Java
  Issue Type: Task
Affects Versions: 4.0
Reporter: Robert Muir
 Fix For: 4.0

 Attachments: LUCENE-3792_javadocs_3x.patch, 
 LUCENE-3792_javadocs_3x.patch


 Often on the mailing list there is confusion about NOT_ANALYZED.
 Besides being useless (just use KeywordAnalyzer instead), people trip up on 
 this
 not being consistent at query time (you really need to configure 
 KeywordAnalyzer for the field 
 on your PerFieldAnalyzerWrapper so it will do the same thing at query time... 
 oh wait,
 once you've done that, you don't need NOT_ANALYZED).
 So I think StringField is a trap too for the same reasons, just under a 
 different name; let's remove it.
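
To make the query-time inconsistency concrete, here is a toy model of it. It deliberately does not use the Lucene API: KEYWORD, LOWERCASE_WHITESPACE, analyze and matches are hypothetical stand-ins, and the per-field map only plays the role that PerFieldAnalyzerWrapper plays in the description. It is a sketch under those assumptions, nothing more.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.function.Function;

final class PerFieldAnalysisSketch {
  /** Toy "keyword" analysis: the whole value is one term. */
  static final Function<String, List<String>> KEYWORD =
      text -> Collections.singletonList(text);

  /** Toy default analysis: lowercase, then split on whitespace. */
  static final Function<String, List<String>> LOWERCASE_WHITESPACE =
      text -> Arrays.asList(text.toLowerCase(Locale.ROOT).trim().split("\\s+"));

  /** Picks the per-field analyzer if one is registered, else the default. */
  static List<String> analyze(Map<String, Function<String, List<String>>> perField,
                              String field, String text) {
    return perField.getOrDefault(field, LOWERCASE_WHITESPACE).apply(text);
  }

  /** A query matches only if every query term was indexed verbatim. */
  static boolean matches(Set<String> indexedTerms, List<String> queryTerms) {
    return indexedTerms.containsAll(queryTerms);
  }

  public static void main(String[] args) {
    // Index time: the "id" value is stored as a single, unanalyzed term.
    Set<String> indexed = new HashSet<>(KEYWORD.apply("AB-123 X"));

    // Query time with no per-field setup: the default analysis rewrites the text.
    Map<String, Function<String, List<String>>> none = new HashMap<>();
    System.out.println(matches(indexed, analyze(none, "id", "AB-123 X")));

    // Query time with keyword analysis registered for "id": same term as indexed.
    Map<String, Function<String, List<String>>> perField = new HashMap<>();
    perField.put("id", KEYWORD);
    System.out.println(matches(indexed, analyze(perField, "id", "AB-123 X")));
  }
}
```

In this toy the first lookup misses and the second hits. Once the per-field entry exists it could just as well be used at index time, which is the "oh wait" in the description.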




[jira] [Updated] (LUCENE-3731) Create a analysis/uima module for UIMA based tokenizers/analyzers

2012-02-16 Thread Tommaso Teofili (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-3731?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tommaso Teofili updated LUCENE-3731:


Attachment: LUCENE-3731_rsrel.patch

bq. Because Tokenizer.close() is misleading/confusing, the instance is still 
reused after 
this for subsequent documents.

After calling close(), it looks like the correct way to reuse that Tokenizer 
instance is to call reset(someOtherInput) before doing anything else. So, 
after adding 

{code}
assert reader != null : "input has been closed, please reset it";
{code}

as the first line inside the toString(Reader reader) method in BaseUIMATokenizer, I 
tried this test:
{code}

  @Test
  public void testSetReaderAndClose() throws Exception {
    StringReader input = new StringReader("the big brown fox jumped on the wood");
    Tokenizer t = new UIMAAnnotationsTokenizer("/uima/AggregateSentenceAE.xml",
        "org.apache.uima.TokenAnnotation", input);
    assertTokenStreamContents(t, new String[]{"the", "big", "brown", "fox",
        "jumped", "on", "the", "wood"});
    t.close();
    try {
      t.incrementToken();
      // NOTE: fail() reports failure by throwing AssertionError, so the catch below swallows it too
      fail("should've been failed as reader is not set");
    } catch (AssertionError error) {
      // ok
    }
    input = new StringReader("hi oh my");
    t = new UIMAAnnotationsTokenizer("/uima/TestAggregateSentenceAE.xml",
        "org.apache.lucene.uima.ts.TokenAnnotation", input);
    assertTrue("should've been incremented ", t.incrementToken());
    t.close();
    try {
      t.incrementToken();
      fail("should've been failed as reader is not set");
    } catch (AssertionError error) {
      // ok
    }
    t.reset(new StringReader("hey what do you say"));
    assertTrue("should've been incremented ", t.incrementToken());
  }

{code}

and it looks to me like it's behaving correctly.
Still working on improving it and trying to catch possible corner cases.


 Create a analysis/uima module for UIMA based tokenizers/analyzers
 -

 Key: LUCENE-3731
 URL: https://issues.apache.org/jira/browse/LUCENE-3731
 Project: Lucene - Java
  Issue Type: Improvement
  Components: modules/analysis
Reporter: Tommaso Teofili
Assignee: Tommaso Teofili
 Fix For: 3.6, 4.0

 Attachments: LUCENE-3731.patch, LUCENE-3731_2.patch, 
 LUCENE-3731_3.patch, LUCENE-3731_4.patch, LUCENE-3731_rsrel.patch, 
 LUCENE-3731_speed.patch, LUCENE-3731_speed.patch, LUCENE-3731_speed.patch


 As discussed in SOLR-3013 the UIMA Tokenizers/Analyzer should be refactored 
 out in a separate module (modules/analysis/uima) as they can be used in plain 
 Lucene. Then the solr/contrib/uima will contain only the related factories.




[jira] [Issue Comment Edited] (LUCENE-3731) Create a analysis/uima module for UIMA based tokenizers/analyzers

2012-02-16 Thread Tommaso Teofili (Issue Comment Edited) (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13209439#comment-13209439
 ] 

Tommaso Teofili edited comment on LUCENE-3731 at 2/16/12 3:40 PM:
--

bq. Because Tokenizer.close() is misleading/confusing, the instance is still 
reused after 
this for subsequent documents.

After calling close(), it looks like the correct way to reuse that Tokenizer 
instance is to call reset(someOtherInput) before doing anything else. So, 
after adding 

{code}
assert reader != null : "input has been closed, please reset it";
{code}

as the first line inside the toString(Reader reader) method in BaseUIMATokenizer, I 
tried this test:
{code}

  @Test
  public void testSetReaderAndClose() throws Exception {
    StringReader input = new StringReader("the big brown fox jumped on the wood");
    Tokenizer t = new UIMAAnnotationsTokenizer("/uima/AggregateSentenceAE.xml",
        "org.apache.uima.TokenAnnotation", input);
    assertTokenStreamContents(t, new String[]{"the", "big", "brown", "fox",
        "jumped", "on", "the", "wood"});
    t.close();
    try {
      t.incrementToken();
      // NOTE: fail() reports failure by throwing AssertionError, so the catch below swallows it too
      fail("should've been failing as reader is not set");
    } catch (AssertionError error) {
      // ok
    }
    input = new StringReader("hi oh my");
    t = new UIMAAnnotationsTokenizer("/uima/TestAggregateSentenceAE.xml",
        "org.apache.lucene.uima.ts.TokenAnnotation", input);
    assertTrue("should've been incremented ", t.incrementToken());
    t.close();
    try {
      t.incrementToken();
      fail("should've been failing as reader is not set");
    } catch (AssertionError error) {
      // ok
    }
    t.reset(new StringReader("hey what do you say"));
    assertTrue("should've been incremented ", t.incrementToken());
  }

{code}

and it looks to me like it's behaving correctly.
Still working on improving it and trying to catch possible corner cases.




[jira] [Resolved] (SOLR-2947) DIH caching bug - EntityRunner destroys child entity processor

2012-02-16 Thread James Dyer (Resolved) (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-2947?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

James Dyer resolved SOLR-2947.
--

Resolution: Fixed

Trunk Only:  r1245014  r1245018.  Thank you Mikhail (now to the next one :) ).

 DIH caching bug - EntityRunner destroys child entity processor
 --

 Key: SOLR-2947
 URL: https://issues.apache.org/jira/browse/SOLR-2947
 Project: Solr
  Issue Type: Sub-task
  Components: contrib - DataImportHandler
Affects Versions: 4.0
Reporter: Mikhail Khludnev
Assignee: James Dyer
  Labels: noob
 Fix For: 4.0

 Attachments: SOLR-2947.patch, SOLR-2947.patch, SOLR-2947.patch, 
 SOLR-2947.patch, SOLR-2947.patch, SOLR-2947.patch, 
 dih-cache-destroy-on-threads-fix.patch, dih-cache-threads-enabling-bug.patch


 My intention is to fix multithreaded import with the SQL cache. Here is the 2nd stage. 
 If I enable the DocBuilder.EntityRunner flow even for a single thread, it breaks 
 pretty basic functionality: the parent-child join.
 The reason is that [line 473 
 entityProcessor.destroy();|http://svn.apache.org/viewvc/lucene/dev/trunk/solr/contrib/dataimporthandler/src/java/org/apache/solr/handler/dataimport/DocBuilder.java?revision=1201659&view=markup]
  breaks the children's entityProcessor.
 See the attachment comments for more details. 




[JENKINS] Lucene-Solr-tests-only-trunk - Build # 12434 - Still Failing

2012-02-16 Thread Apache Jenkins Server
Build: https://builds.apache.org/job/Lucene-Solr-tests-only-trunk/12434/

1 tests failed.
FAILED:  org.apache.solr.cloud.ChaosMonkeySafeLeaderTest.testDistribSearch

Error Message:
shard2 is not consistent, expected:52 and got:51

Stack Trace:
junit.framework.AssertionFailedError: shard2 is not consistent, expected:52 and got:51
    at org.apache.lucene.util.LuceneTestCaseRunner.runChild(LuceneTestCaseRunner.java:165)
    at org.apache.lucene.util.LuceneTestCaseRunner.runChild(LuceneTestCaseRunner.java:57)
    at org.apache.solr.cloud.FullSolrCloudTest.checkShardConsistency(FullSolrCloudTest.java:1062)
    at org.apache.solr.cloud.ChaosMonkeySafeLeaderTest.doTest(ChaosMonkeySafeLeaderTest.java:114)
    at org.apache.solr.BaseDistributedSearchTestCase.testDistribSearch(BaseDistributedSearchTestCase.java:664)
    at org.apache.lucene.util.LuceneTestCase$SubclassSetupTeardownRule$1.evaluate(LuceneTestCase.java:700)
    at org.apache.lucene.util.LuceneTestCase$InternalSetupTeardownRule$1.evaluate(LuceneTestCase.java:599)
    at org.apache.lucene.util.LuceneTestCase$TestResultInterceptorRule$1.evaluate(LuceneTestCase.java:504)
    at org.apache.lucene.util.LuceneTestCase$RememberThreadRule$1.evaluate(LuceneTestCase.java:562)




Build Log (for compile errors):
[...truncated 7501 lines...]




[jira] [Assigned] (SOLR-2933) DIHCacheSupport ignores left side of where=xid=x.id attribute

2012-02-16 Thread James Dyer (Assigned) (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-2933?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

James Dyer reassigned SOLR-2933:


Assignee: James Dyer

 DIHCacheSupport ignores left side of where=xid=x.id attribute
 ---

 Key: SOLR-2933
 URL: https://issues.apache.org/jira/browse/SOLR-2933
 Project: Solr
  Issue Type: Sub-task
  Components: contrib - DataImportHandler
Affects Versions: 4.0
Reporter: Mikhail Khludnev
Assignee: James Dyer
Priority: Minor
  Labels: noob, random
 Attachments: 
 AbstractDataImportHandlerTestCase.java-choose-map-randomly.patch, 
 SOLR-2933.patch

   Original Estimate: 1h
  Remaining Estimate: 1h

 DIHCacheSupport, introduced in SOLR-2382, uses the new config attributes cachePk 
 and cacheLookup. But support for the old one, where="xid=x.id", is broken by 
 [DIHCacheSupport.init|http://svn.apache.org/viewvc/lucene/dev/trunk/solr/contrib/dataimporthandler/src/java/org/apache/solr/handler/dataimport/DIHCacheSupport.java?view=markup]
  - it never puts the where= sides into the context. This is masked by 
 [SortedMapBackedCache.init|http://svn.apache.org/viewvc/lucene/dev/trunk/solr/contrib/dataimporthandler/src/java/org/apache/solr/handler/dataimport/SortedMapBackedCache.java?view=markup],
  which just takes the first column as the primary key. That's why all tests are 
 green.
 To reproduce the issue I just need to reorder the entries at [line 
 219|http://svn.apache.org/viewvc/lucene/dev/trunk/solr/contrib/dataimporthandler/src/test/org/apache/solr/handler/dataimport/TestCachedSqlEntityProcessor.java?revision=1201659&view=markup]
  so that desc comes first and is picked up as the primary key. 
 To do that I propose choosing the concrete map class randomly for all DIH test 
 cases at 
 [createMap()|http://svn.apache.org/viewvc/lucene/dev/trunk/solr/contrib/dataimporthandler/src/test/org/apache/solr/handler/dataimport/AbstractDataImportHandlerTestCase.java?revision=1149600&view=markup].
  
 I'm attaching a test-breaking patch and seed.
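
A small sketch of why the choice of map class matters here, as I read the description. FirstColumnSketch, firstColumn and fill are hypothetical names, and this is not the DIH code: it only mimics a cache that treats whichever column it sees first as the primary key, using the id and desc column names from the test.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.TreeMap;

final class FirstColumnSketch {
  /** Mimics a cache that treats whichever column comes first as the primary key. */
  static String firstColumn(Map<String, Object> row) {
    return row.keySet().iterator().next();
  }

  /** Fills a row in a fixed order: "id" first, then "desc". */
  static Map<String, Object> fill(Map<String, Object> row) {
    row.put("id", 1);
    row.put("desc", "one");
    return row;
  }

  public static void main(String[] args) {
    // Insertion-ordered map: "id" happens to come first, so the shortcut looks correct.
    System.out.println(firstColumn(fill(new LinkedHashMap<>())));
    // Sorted map: "desc" sorts before "id", so the wrong column becomes the key.
    System.out.println(firstColumn(fill(new TreeMap<>())));
  }
}
```

If test rows are always built with one map type, the first-column shortcut never gets exercised against a different ordering, which would explain green tests; picking the map class at random is one way to shake that out.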




[jira] [Updated] (SOLR-2933) DIHCacheSupport ignores left side of where=xid=x.id attribute

2012-02-16 Thread James Dyer (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-2933?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

James Dyer updated SOLR-2933:
-

Fix Version/s: 4.0
   3.6

for 3.6, we should backport the test improvement (only). 

 DIHCacheSupport ignores left side of where=xid=x.id attribute
 ---

 Key: SOLR-2933
 URL: https://issues.apache.org/jira/browse/SOLR-2933
 Project: Solr
  Issue Type: Sub-task
  Components: contrib - DataImportHandler
Affects Versions: 4.0
Reporter: Mikhail Khludnev
Assignee: James Dyer
Priority: Minor
  Labels: noob, random
 Fix For: 3.6, 4.0

 Attachments: 
 AbstractDataImportHandlerTestCase.java-choose-map-randomly.patch, 
 SOLR-2933.patch

   Original Estimate: 1h
  Remaining Estimate: 1h





[jira] [Commented] (SOLR-2933) DIHCacheSupport ignores left side of where=xid=x.id attribute

2012-02-16 Thread James Dyer (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13209462#comment-13209462
 ] 

James Dyer commented on SOLR-2933:
--

I will commit this one shortly.





[jira] [Commented] (LUCENE-3731) Create a analysis/uima module for UIMA based tokenizers/analyzers

2012-02-16 Thread Robert Muir (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13209474#comment-13209474
 ] 

Robert Muir commented on LUCENE-3731:
-

Right, after you reset(Reader) you set a new reader.

But the question is: is it safe to use CAS/AE after you call 
release()/destroy() on them?

Because close() is called on tokenstreams after each invocation, in other words:
{noformat}
Tokenizer t = new Tokenizer(reader);
... stuff ...
t.close();
t.reset(someOtherReader);
.. stuff ...
t.close();
{noformat}

So what does CAS.release() really mean? If it means you should not use the CAS 
again afterwards,
then we cannot have it in TokenStream.close(), and the same goes for AE.destroy().
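
One way to picture the lifecycle question is the toy below. It is a sketch under stated assumptions, not the UIMA or Lucene API: ToyEngine stands in for the CAS/AE role, ToyReusableTokenizer for the tokenizer, and dispose() is a made-up name for an end-of-life hook. close() only drops the per-document input, so the same instance can be reset and reused, while the expensive resource is torn down separately.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

/** Toy stand-in for an expensive analysis engine; unusable once destroyed. */
final class ToyEngine {
  private boolean destroyed;

  String process(String text) {
    if (destroyed) {
      throw new IllegalStateException("engine already destroyed");
    }
    return text.trim();
  }

  void destroy() {
    destroyed = true;
  }
}

/** Toy tokenizer whose close() is per-document and whose dispose() is end-of-life. */
final class ToyReusableTokenizer {
  private final ToyEngine engine = new ToyEngine();
  private Reader input;

  ToyReusableTokenizer(Reader input) {
    this.input = input;
  }

  /** Reads all of the current input and hands it to the engine. */
  String consume() {
    if (input == null) {
      throw new IllegalStateException("input has been closed, please reset it");
    }
    try {
      StringBuilder text = new StringBuilder();
      for (int c = input.read(); c != -1; c = input.read()) {
        text.append((char) c);
      }
      return engine.process(text.toString());
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  /** Per-document cleanup: drops the input but keeps the engine alive. */
  void close() {
    if (input == null) {
      return;
    }
    try {
      input.close();
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    } finally {
      input = null;
    }
  }

  /** Reuse for the next document. */
  void reset(Reader nextInput) {
    input = nextInput;
  }

  /** End-of-life cleanup, deliberately separate from close(). */
  void dispose() {
    close();
    engine.destroy();
  }

  public static void main(String[] args) {
    ToyReusableTokenizer t = new ToyReusableTokenizer(new StringReader(" first doc "));
    System.out.println(t.consume());
    t.close();
    t.reset(new StringReader("second doc"));
    System.out.println(t.consume()); // still works: close() did not touch the engine
    t.close();
    t.dispose();
  }
}
```

Had close() called engine.destroy(), the second consume() in main would throw, which is the failure mode being asked about.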






[jira] [Commented] (SOLR-2947) DIH caching bug - EntityRunner destroys child entity processor

2012-02-16 Thread Robert Muir (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2947?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13209477#comment-13209477
 ] 

Robert Muir commented on SOLR-2947:
---

Hi James: I think we should add a CHANGES.txt entry for this fix?





[jira] [Created] (LUCENE-3795) Replace spatial contrib module with LSP's spatial-lucene module

2012-02-16 Thread David Smiley (Created) (JIRA)
Replace spatial contrib module with LSP's spatial-lucene module
---

 Key: LUCENE-3795
 URL: https://issues.apache.org/jira/browse/LUCENE-3795
 Project: Lucene - Java
  Issue Type: New Feature
  Components: modules/spatial
Reporter: David Smiley
Assignee: David Smiley
 Fix For: 4.0


I propose that Lucene's spatial contrib module be replaced with the 
spatial-lucene module within Lucene Spatial Playground (LSP).  LSP has been in 
development for approximately 1 year by David Smiley, Ryan McKinley, and Chris 
Male and we feel it is ready.  LSP is here: 
http://code.google.com/p/lucene-spatial-playground/  and the spatial-lucene 
module is intuitively in svn/trunk/spatial-lucene/.

I'll add more comments to prevent the issue description from being too long.




[jira] [Commented] (SOLR-2947) DIH caching bug - EntityRunner destroys child entity processor

2012-02-16 Thread James Dyer (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2947?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13209489#comment-13209489
 ] 

James Dyer commented on SOLR-2947:
--

This bug was caused by SOLR-2382 which is trunk-only.  Do we need a CHANGES.txt 
entry for that?





[jira] [Commented] (LUCENE-3731) Create a analysis/uima module for UIMA based tokenizers/analyzers

2012-02-16 Thread Tommaso Teofili (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13209490#comment-13209490
 ] 

Tommaso Teofili commented on LUCENE-3731:
-

bq. But the question is: is it safe to use CAS/AE after you call 
release()/destroy() on them?

No, it isn't, so you're right: those methods should not be inside the close() 
method.








[jira] [Commented] (SOLR-2947) DIH caching bug - EntityRunner destroys child entity processor

2012-02-16 Thread Robert Muir (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2947?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13209495#comment-13209495
 ] 

Robert Muir commented on SOLR-2947:
---

Sorry James, my mistake, I didn't realize that! 

 DIH caching bug - EntityRunner destroys child entity processor
 --

 Key: SOLR-2947
 URL: https://issues.apache.org/jira/browse/SOLR-2947
 Project: Solr
  Issue Type: Sub-task
  Components: contrib - DataImportHandler
Affects Versions: 4.0
Reporter: Mikhail Khludnev
Assignee: James Dyer
  Labels: noob
 Fix For: 4.0

 Attachments: SOLR-2947.patch, SOLR-2947.patch, SOLR-2947.patch, 
 SOLR-2947.patch, SOLR-2947.patch, SOLR-2947.patch, 
 dih-cache-destroy-on-threads-fix.patch, dih-cache-threads-enabling-bug.patch


 My intention is to fix multithreaded import with the SQL cache. Here is the 2nd stage. 
 If I enable the DocBuilder.EntityRunner flow even for a single thread, it breaks 
 pretty basic functionality: the parent-child join.
 The reason is that [line 473 
 entityProcessor.destroy();|http://svn.apache.org/viewvc/lucene/dev/trunk/solr/contrib/dataimporthandler/src/java/org/apache/solr/handler/dataimport/DocBuilder.java?revision=1201659&view=markup]
  breaks the child entityProcessor.
 See the attachment comments for more details. 




[jira] [Commented] (LUCENE-3795) Replace spatial contrib module with LSP's spatial-lucene module

2012-02-16 Thread David Smiley (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13209497#comment-13209497
 ] 

David Smiley commented on LUCENE-3795:
--

LSP is comprised of several modules:
* spatial-lucene: The heart of the project.
* spatial-solr: Solr support, notably field types using spatial-lucene.
* spatial-extras: An extension of spatial-lucene that uses JTS (LGPL licensed) 
for polygon support.
* spatial-demo: A demonstration web UI using OpenLayers, Solr, Wicket, and the 
other LSP modules.

The spatial-solr module of LSP can be considered in another issue following the 
conclusion of this one.  The other modules aren't being considered for 
incorporation into Lucene/Solr.

LSP is largely new code although some of it originated using chunks of the 
existing Lucene spatial contrib module and SOLR-2155 (A recursive 
PrefixTree/Trie algorithm using geohashes).  It's fair to say this is a 
superset and descendant of SOLR-2155 but with a real framework around it and 
plenty of refactorings and tests.

I ran Atlassian's Clover code coverage to get some statistics of this 
spatial-lucene module of LSP:
* LOC: 6,605, NCLOC: 3,959
* Packages: 18, Classes: 70
* Code coverage: 53%

The code coverage surprises me a little... perhaps the number is higher when 
the spatial-solr module gets involved, which uses more of the classes than the 
tests do here alone.

 Replace spatial contrib module with LSP's spatial-lucene module
 ---

 Key: LUCENE-3795
 URL: https://issues.apache.org/jira/browse/LUCENE-3795
 Project: Lucene - Java
  Issue Type: New Feature
  Components: modules/spatial
Reporter: David Smiley
Assignee: David Smiley
 Fix For: 4.0


 I propose that Lucene's spatial contrib module be replaced with the 
 spatial-lucene module within Lucene Spatial Playground (LSP).  LSP has been 
 in development for approximately 1 year by David Smiley, Ryan McKinley, and 
 Chris Male and we feel it is ready.  LSP is here: 
 http://code.google.com/p/lucene-spatial-playground/  and the spatial-lucene 
 module is intuitively in svn/trunk/spatial-lucene/.
 I'll add more comments to prevent the issue description from being too long.




[jira] [Updated] (SOLR-3137) When solr.xml is persisted, you lose all system property substitution that was used.

2012-02-16 Thread Mark Miller (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-3137?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Miller updated SOLR-3137:
--

Attachment: SOLR-3137.patch

Updated the patch; close to done, I think. I don't handle properties because of 
some oddity I have not figured out: they appear to be stored un-sys-subbed, but 
then when written out they are subbed? I'm not sure they are that important to 
handle anyway.

 When solr.xml is persisted, you lose all system property substitution that 
 was used. 
 -

 Key: SOLR-3137
 URL: https://issues.apache.org/jira/browse/SOLR-3137
 Project: Solr
  Issue Type: Bug
Reporter: Mark Miller
Assignee: Mark Miller
 Fix For: 4.0

 Attachments: SOLR-3137.patch, SOLR-3137.patch


 A lesser issue is that we also write out properties that were not originally 
 in the file with the defaults they picked up.
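
 To illustrate the problem described in the issue title, here is a minimal
 sketch in Python (not Solr code). It assumes placeholders of the form
 ${name:default}; the names substitute, Attr, persist_lossy and
 persist_preserving are hypothetical and do not come from the patch. The idea
 is only that writing back the resolved value drops the placeholder, while
 keeping the raw text alongside it does not.

```python
import re

# Assumed placeholder form: ${name} or ${name:default}. Illustrative only.
_PLACEHOLDER = re.compile(r"\$\{([^}:]+)(?::([^}]*))?\}")


def substitute(raw, props):
    """Resolve every ${name:default} placeholder in raw against props."""
    def repl(match):
        name, default = match.group(1), match.group(2)
        if name in props:
            return props[name]
        if default is not None:
            return default
        raise KeyError(name)
    return _PLACEHOLDER.sub(repl, raw)


class Attr:
    """Holds the raw text next to the resolved value."""
    def __init__(self, raw, props):
        self.raw = raw
        self.value = substitute(raw, props)


def persist_lossy(attr):
    # Writes the resolved value: the placeholder is gone after a round trip.
    return attr.value


def persist_preserving(attr):
    # Writes the original text: the placeholder survives a round trip.
    return attr.raw


port = Attr("${hostPort:8983}", {"hostPort": "7574"})
print(persist_lossy(port))       # 7574
print(persist_preserving(port))  # ${hostPort:8983}
```

 Keeping the raw string is only one possible design; it also avoids writing
 out defaults that were never in the file, since nothing resolved is persisted.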




[jira] [Commented] (LUCENE-3767) Explore streaming Viterbi search in Kuromoji

2012-02-16 Thread Michael McCandless (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3767?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13209540#comment-13209540
 ] 

Michael McCandless commented on LUCENE-3767:


I think the branch is ready to land... I'll post an applyable patch
soon.

In Mode.SEARCH the tokenizer produces the same tokens as current
trunk.

The only real end-user visible change is the addition of
Mode.SEARCH_WITH_COMPOUNDS, which can produce two paths (compound
token + its segmentation).  This mode uses the new
PositionLengthAttribute to record how long the compound token is.

In this mode, the Viterbi search first runs without penalties, but
then, if a too-long token (a token where the penalty would have
been > 0) is in the best path, we effectively re-run the Viterbi under that
compound token, this time with penalties included.  If this results in
a different backtrace, we add that into the output tokens as well.

Note that this will not produce results congruent with Mode.SEARCH,
because the 2nd segmentation runs in context of the best path,
meaning the chosen best wordID before and after the compound token are
enforced in the 2nd segmentation.  Sometimes this results in still
picking only the compound token where trunk today would have split it
up.  From TestQuality, the total number of edits was 4418 vs trunk's
4828.

I didn't explore this, but, we may want to use harsher penalties in
SEARCH_WITH_COMPOUNDS mode, ie, since we're going to output the
compound as well we may as well try harder to produce the 2nd best
segmentation.

I left the default mode as Mode.SEARCH... maybe if we can somehow
run some relevance tests we can make the default SEARCH_WITH_COMPOUNDS.
But it'd also be tricky at query time...

It looks like the rolling Viterbi is a bit faster (~16%: 1460
bytes/msec vs 1700 bytes/msec on TestQuality.testSingleText).
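
The ~16% figure can be checked with a couple of lines of Python. This is only a
worked calculation, assuming 1460 bytes/msec is the trunk rate and 1700
bytes/msec is the rolling Viterbi rate (the message does not say which is
which, but only that reading gives ~16%).

```python
old_rate = 1460.0  # bytes/msec, assumed to be current trunk
new_rate = 1700.0  # bytes/msec, assumed to be the rolling Viterbi branch

# Throughput gain relative to the old rate.
speedup = new_rate / old_rate - 1.0
# Equivalent reduction in time spent per byte.
time_saved = 1.0 - old_rate / new_rate

print(f"throughput gain: {speedup:.1%}")    # throughput gain: 16.4%
print(f"time reduction: {time_saved:.1%}")  # time reduction: 14.1%
```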


 Explore streaming Viterbi search in Kuromoji
 

 Key: LUCENE-3767
 URL: https://issues.apache.org/jira/browse/LUCENE-3767
 Project: Lucene - Java
  Issue Type: Improvement
  Components: modules/analysis
Reporter: Michael McCandless
Assignee: Michael McCandless
 Fix For: 3.6, 4.0

 Attachments: LUCENE-3767.patch, LUCENE-3767.patch


 I've been playing with the idea of changing the Kuromoji viterbi
 search to be 2 passes (intersect, backtrace) instead of 4 passes
 (break into sentences, intersect, score, backtrace)... this is very
 much a work in progress, so I'm just getting my current state up.
 It's got tons of nocommits, doesn't properly handle the user dict nor
 extended modes yet, etc.
 One thing I'm playing with is to add a double backtrace for the long
 compound tokens, ie, instead of penalizing these tokens so that
 shorter tokens are picked, leave the scores unchanged but on backtrace
 take that penalty and use it as a threshold for a 2nd best
 segmentation...




[JENKINS] Lucene-Solr-tests-only-trunk-java7 - Build # 1743 - Failure

2012-02-16 Thread Apache Jenkins Server
Build: https://builds.apache.org/job/Lucene-Solr-tests-only-trunk-java7/1743/

1 tests failed.
FAILED:  org.apache.solr.cloud.ChaosMonkeySafeLeaderTest.testDistribSearch

Error Message:
shard1 is not consistent, expected:62 and got:63

Stack Trace:
junit.framework.AssertionFailedError: shard1 is not consistent, expected:62 and 
got:63
at 
org.apache.lucene.util.LuceneTestCaseRunner.runChild(LuceneTestCaseRunner.java:165)
at 
org.apache.lucene.util.LuceneTestCaseRunner.runChild(LuceneTestCaseRunner.java:57)
at 
org.apache.solr.cloud.FullSolrCloudTest.checkShardConsistency(FullSolrCloudTest.java:1062)
at 
org.apache.solr.cloud.ChaosMonkeySafeLeaderTest.doTest(ChaosMonkeySafeLeaderTest.java:114)
at 
org.apache.solr.BaseDistributedSearchTestCase.testDistribSearch(BaseDistributedSearchTestCase.java:664)
at 
org.apache.lucene.util.LuceneTestCase$SubclassSetupTeardownRule$1.evaluate(LuceneTestCase.java:700)
at 
org.apache.lucene.util.LuceneTestCase$InternalSetupTeardownRule$1.evaluate(LuceneTestCase.java:599)
at 
org.apache.lucene.util.LuceneTestCase$TestResultInterceptorRule$1.evaluate(LuceneTestCase.java:504)
at 
org.apache.lucene.util.LuceneTestCase$RememberThreadRule$1.evaluate(LuceneTestCase.java:562)




Build Log (for compile errors):
[...truncated 10291 lines...]




[JENKINS] Lucene-Solr-tests-only-trunk - Build # 12436 - Failure

2012-02-16 Thread Apache Jenkins Server
Build: https://builds.apache.org/job/Lucene-Solr-tests-only-trunk/12436/

1 tests failed.
REGRESSION:  org.apache.solr.cloud.ChaosMonkeySafeLeaderTest.testDistribSearch

Error Message:
shard3 is not consistent, expected:59 and got:58

Stack Trace:
junit.framework.AssertionFailedError: shard3 is not consistent, expected:59 and 
got:58
at 
org.apache.lucene.util.LuceneTestCaseRunner.runChild(LuceneTestCaseRunner.java:165)
at 
org.apache.lucene.util.LuceneTestCaseRunner.runChild(LuceneTestCaseRunner.java:57)
at 
org.apache.solr.cloud.FullSolrCloudTest.checkShardConsistency(FullSolrCloudTest.java:1062)
at 
org.apache.solr.cloud.ChaosMonkeySafeLeaderTest.doTest(ChaosMonkeySafeLeaderTest.java:114)
at 
org.apache.solr.BaseDistributedSearchTestCase.testDistribSearch(BaseDistributedSearchTestCase.java:664)
at 
org.apache.lucene.util.LuceneTestCase$SubclassSetupTeardownRule$1.evaluate(LuceneTestCase.java:700)
at 
org.apache.lucene.util.LuceneTestCase$InternalSetupTeardownRule$1.evaluate(LuceneTestCase.java:599)
at 
org.apache.lucene.util.LuceneTestCase$TestResultInterceptorRule$1.evaluate(LuceneTestCase.java:504)
at 
org.apache.lucene.util.LuceneTestCase$RememberThreadRule$1.evaluate(LuceneTestCase.java:562)




Build Log (for compile errors):
[...truncated 7674 lines...]




[jira] [Issue Comment Edited] (LUCENE-3795) Replace spatial contrib module with LSP's spatial-lucene module

2012-02-16 Thread David Smiley (Issue Comment Edited) (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13209588#comment-13209588
 ] 

David Smiley edited comment on LUCENE-3795 at 2/16/12 6:38 PM:
---

h3. Features
The main goal of LSP is to be a great framework to plug in spatial search 
algorithms and shape implementations.  It of course includes good 
implementations of these key abstractions.  Here are some key features, most of 
which relate to using RecursivePrefixTreeStrategy with geohashes:

* Multi-valued fields
* Index shapes that have area (e.g. not just points)
  Tests have yet to be added for this.
* No special RAM caches for filtering, just standard term index
  Unlike Solr's LatLonType which needs to cache all points in RAM if the query 
shape is a circle
* Fast filtering
  Although SOLR-2155 has been proven, technically LSP hasn't.  3rd party 
anecdotes reinforce this claim.
* Multi-value sort
  Based on closest index point to center of query shape.  Distances are 
returned via the score of an LSP query.
* Specify precision of query shape and index shape
  Thereby allowing for faster filtering with tunable precision
* Multiple distance algorithms:
** Spherical: Law of Cosines, Haversine, Vincenty
** Cartesian: Pythagorean Theorem
* Cartesian (2d flat) & Geospatial sphere models
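
For readers unfamiliar with the distance algorithms named in the list above,
here is a minimal Python sketch of three of them. It is not LSP code; the
function names and the mean Earth radius constant are assumptions made for
illustration, and Vincenty's formula is left out because it is considerably
longer.

```python
import math

# Mean Earth radius in km; an assumed constant, not taken from LSP.
EARTH_RADIUS_KM = 6371.0088


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance on a sphere using the haversine formula."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlam / 2) ** 2
    # Clamp to 1.0 to guard against rounding just above the asin domain.
    return 2 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(a)))


def law_of_cosines_km(lat1, lon1, lat2, lon2):
    """Great-circle distance on a sphere using the spherical law of cosines."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dlam = math.radians(lon2 - lon1)
    c = math.sin(p1) * math.sin(p2) + math.cos(p1) * math.cos(p2) * math.cos(dlam)
    # Clamp to [-1, 1] to guard against rounding outside the acos domain.
    return EARTH_RADIUS_KM * math.acos(max(-1.0, min(1.0, c)))


def pythagorean(x1, y1, x2, y2):
    """Straight-line distance on a flat 2d (Cartesian) model."""
    return math.hypot(x2 - x1, y2 - y1)


# A quarter of the equator: both spherical formulas should agree.
print(haversine_km(0, 0, 0, 90))
print(law_of_cosines_km(0, 0, 0, 90))
print(pythagorean(0, 0, 3, 4))  # 5.0
```

The two spherical formulas give the same result in exact arithmetic; haversine
is usually preferred for very short distances because it is better conditioned
there.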

h3. Todo
There are many things I want to improve and add but in my view there isn't 
anything truly making this non-committable.  Chris has raised concerns that the 
other committers will want to see benchmark results before accepting this.  
I'll leave that for you (the other committers) to decide.

And I also heard that some committers are unsure whether Lucene should have a 
spatial module at all.  However there is certainly demand for it, at least at 
the Solr level.  Furthermore, there are some non-spatial use cases of the 
spatial module.  One interesting use-case is RecursivePrefixTreeStrategy's 
(RPTS) unique ability to index shapes with area.  If you had a requirement to 
index a variable number of time durations, then unlike Lucene's trie numeric 
support in which only discrete numbers are supported, RPTS could be used with x 
being time and y being unused. By the way, PrefixTree and Trie are synonymous 
words.

  was (Author: dsmiley):
h3. Features
The main goals of LSP is to be a great framework to plug in spatial search 
algorithms and shape implementations.  It of course includes good 
implementations of these key abstractions.  Here are some key features, most of 
which related to using RecursivePrefixTreeStrategy with geohashes:

* Multi-valued fields
* Index shapes that have area (e.g. not just points)
  Tests have yet to be added for this.
* No special RAM caches for filtering, just standard term index
  Unlike Solr's LatLonType which needs to cache all points in RAM if the query 
shape is a circle
* Fast filtering
  Although SOLR-2155 has been proven, technically LSP hasn't.  3rd party 
anecodes re-inforce this claim.
* Multi-value sort
  Based on closest index point to center of query shape.  Distances are 
returned via the score of an LSP query.
* Specify precision of query shape and index shape
  Thereby allowing for faster filtering tunable precision
* Multiple distance algorithms:
** Spherical: Law of Cosines, Haversine, Vincency
** Cartesian: Pythagorean Theorem
* Cartesian (2d flat)  Geospatial sphere models

h3. Todo
There are many things I want to improve and add but in my view there isn't 
anything truly making this non-committable.  Chris has raised concerns that the 
other committers will want to see benchmark results before accepting this.  
I'll leave that for you (the other committers) to decide.

And I also heard that some committers are unsure wether Lucene should have a 
spatial module at all.  However there is certainly demand for it, at least at 
the Solr level.  Furthermore, there are some non-spatial use cases of the 
spatial module.  One interesting use-case is RecursivePrefixTreeStrategy's 
(RPTS) unique ability to index shapes with area.  If you had a requirement to 
index a variable number of time durations, then unlike Lucene's trie numeric 
support in which only discrete numbers are supported, RPTS could be used with x 
being time and y being unused. Buy the way, PrefixTree and Trie are synonymous 
words.
  
 Replace spatial contrib module with LSP's spatial-lucene module
 ---

 Key: LUCENE-3795
 URL: https://issues.apache.org/jira/browse/LUCENE-3795
 Project: Lucene - Java
  Issue Type: New Feature
  Components: modules/spatial
Reporter: David Smiley
Assignee: David Smiley
 Fix For: 4.0


 I propose that Lucene's spatial contrib module be replaced with the 
 spatial-lucene module 

LUCENE-3795 Replace spatial contrib module with LSP's spatial-lucene module

2012-02-16 Thread Smiley, David W.
I made a major proposal to Lucene to replace its spatial contrib module with 
one in LSP -- a project that Chris Male, Ryan McKinley and I have been working 
on.  In case you guys missed the JIRA issue, here it is:
https://issues.apache.org/jira/browse/LUCENE-3795
I ask for any input, assuming you have an opinion.

~ David



[jira] [Commented] (LUCENE-3778) Create a grouping convenience class

2012-02-16 Thread Michael McCandless (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13209594#comment-13209594
 ] 

Michael McCandless commented on LUCENE-3778:


{quote}
bq. Would you also handle block (single pass) grouping with the same class...?

I think we can do this. The block grouping returns TopGroups as result.
{quote}

Nice.

{quote}
bq. I guess you'd then .getAllGroups(), .getAllGroupHeads() after .search(...)?

Yes, we need that. In the case of getAllGroups() the TopGroups#totalGroupCount 
field can be used when the user is only interested in the number of matching 
groups.
{quote}

OK.

{quote}
bq. Hmm would we try to handle Term/BytesRef and Function/MutableValue with the 
same class?

With generics?
{quote}

I think so... but I think it may get tricky.  Like, I think you should
specify up front (to GroupingSearch ctor) the required things about
your request (block join OR group field OR field + DV type OR function
VS/ctx map), setters for the numerous optional things (sort,
groupSort, getScores, getMaxScores, maxDocsPerGroup) and maybe params
to search for the per-requesty things (topNGroups, groupOffset,
withinGroupOffset).

But then the T will depend on which ctor you used...?  Not sure how
it'd work...

bq. Maybe distributed grouping needs its own class? Since the usage is 
different from a non distributed grouping.

Yeah...

Maybe we can do this for join module too!


 Create a grouping convenience class
 ---

 Key: LUCENE-3778
 URL: https://issues.apache.org/jira/browse/LUCENE-3778
 Project: Lucene - Java
  Issue Type: Improvement
  Components: modules/grouping
Reporter: Martijn van Groningen

 Currently the grouping module has many collector classes with a lot of 
 different options per class. I think it would be a good idea to have a 
 GroupUtil (Or another name?) convenience class. I think this could be a 
 builder, because of the many options 
 (sort,sortWithinGroup,groupOffset,groupCount and more) and implementations 
 (term/dv/function) grouping has.




[jira] [Commented] (LUCENE-3750) Convert Versioned docs to Markdown/New CMS

2012-02-16 Thread Yonik Seeley (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3750?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13209602#comment-13209602
 ] 

Yonik Seeley commented on LUCENE-3750:
--

bq. if 1 out of N committers who have tried doing local www site builds can't 
get it to work

+1 more

I've spent the last few hours trying to get it to work on my OS-X (lion) box...
I figured out how to install the cpan perl modules (not being a perl person 
myself), and the python modules installed fine, but now the daemon just won't 
run:

{code}/opt/code/cms/build$ python --version
Python 2.7.1
/opt/code/cms/build$ export MARKDOWN_SOCKET=`pwd`/markdown.socket 
PYTHONPATH=`pwd`
/opt/code/cms/build$ python markdownd.py
/opt/code/cms/build$ No handlers could be found for logger "MARKDOWN"
Traceback (most recent call last):
  File "markdownd.py", line 41, in <module>
    'codehilite', 'elementid', 'footnotes', 'abbr'])
  File "build/bdist.macosx-10.7-intel/egg/markdown/__init__.py", line 395, in markdown
  File "build/bdist.macosx-10.7-intel/egg/markdown/__init__.py", line 134, in __init__
  File "build/bdist.macosx-10.7-intel/egg/markdown/__init__.py", line 166, in registerExtensions
ValueError: Extension __builtin__.NoneType must be of type: markdown.Extension.
{code}

 Convert Versioned docs to Markdown/New CMS
 --

 Key: LUCENE-3750
 URL: https://issues.apache.org/jira/browse/LUCENE-3750
 Project: Lucene - Java
  Issue Type: Improvement
Reporter: Grant Ingersoll
Priority: Minor

 Since we are moving our main site to the ASF CMS (LUCENE-2748), we should 
 bring in any new versioned Lucene docs into the same format so that we don't 
 have to deal w/ Forrest anymore.




[jira] [Commented] (LUCENE-3795) Replace spatial contrib module with LSP's spatial-lucene module

2012-02-16 Thread Simon Willnauer (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13209605#comment-13209605
 ] 

Simon Willnauer commented on LUCENE-3795:
-

wow this is a lot of stuff. we certainly need a code donation for this. without 
getting into details +1 from my side. I think lucene desperately needs spatial 
support... it should be a module IMO. we should drop the stuff we have and get 
this in shape ie. into a module. I am not sure about the LGPL stuff I guess we 
should try to integrate everything else and if we really want or if there is a 
way to integrate the LGPL stuff we can take care of this later!

 Replace spatial contrib module with LSP's spatial-lucene module
 ---

 Key: LUCENE-3795
 URL: https://issues.apache.org/jira/browse/LUCENE-3795
 Project: Lucene - Java
  Issue Type: New Feature
  Components: modules/spatial
Reporter: David Smiley
Assignee: David Smiley
 Fix For: 4.0


 I propose that Lucene's spatial contrib module be replaced with the 
 spatial-lucene module within Lucene Spatial Playground (LSP).  LSP has been 
 in development for approximately 1 year by David Smiley, Ryan McKinley, and 
 Chris Male and we feel it is ready.  LSP is here: 
 http://code.google.com/p/lucene-spatial-playground/  and the spatial-lucene 
 module is intuitively in svn/trunk/spatial-lucene/.
 I'll add more comments to prevent the issue description from being too long.




[jira] [Commented] (SOLR-2155) Geospatial search using geohash prefixes

2012-02-16 Thread David Smiley (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13209613#comment-13209613
 ] 

David Smiley commented on SOLR-2155:


If someone watching this issue has an interest in this capability winding its 
way into Solr out of the box, then I suggest you vote (and maybe watch) 
LUCENE-3795.  That issue is the first step, the subsequent step is a follow-on 
issue that will bring LSP's spatial-solr module which uses spatial-lucene 
(LUCENE-3795). I don't intend or support committing SOLR-2155 as is.  Spatial 
done-right should involve a good framework; SOLR-2155 isn't a framework and 
Lucene's existing defunct spatial-contrib module isn't good.  That's where LSP 
comes in, and LUCENE-3795 is the first step to get it incorporated into 
Lucene/Solr.

 Geospatial search using geohash prefixes
 

 Key: SOLR-2155
 URL: https://issues.apache.org/jira/browse/SOLR-2155
 Project: Solr
  Issue Type: Improvement
Reporter: David Smiley
 Attachments: GeoHashPrefixFilter.patch, GeoHashPrefixFilter.patch, 
 GeoHashPrefixFilter.patch, 
 SOLR-2155_GeoHashPrefixFilter_with_sorting_no_poly.patch, SOLR.2155.p3.patch, 
 SOLR.2155.p3tests.patch, Solr2155-1.0.2-project.zip, 
 Solr2155-1.0.3-project.zip, Solr2155-for-1.0.2-3.x-port.patch


 There currently isn't a solution in Solr for doing geospatial filtering on 
 documents that have a variable number of points.  This scenario occurs when 
 there is location extraction (i.e. via a gazetteer) occurring on free text.  
 None, one, or many geospatial locations might be extracted from any given 
 document and users want to limit their search results to those occurring in a 
 user-specified area.
 I've implemented this by furthering the GeoHash based work in Lucene/Solr 
 with a geohash prefix based filter.  A geohash refers to a lat-lon box on the 
 earth.  Each successive character added further subdivides the box into a 4x8 
 (or 8x4 depending on the even/odd length of the geohash) grid.  The first 
 step in this scheme is figuring out which geohash grid squares cover the 
 user's search query.  I've added various extra methods to GeoHashUtils (and 
 added tests) to assist in this purpose.  The next step is an actual Lucene 
 Filter, GeoHashPrefixFilter, that uses these geohash prefixes in 
 TermsEnum.seek() to skip to relevant grid squares in the index.  Once a 
 matching geohash grid is found, the points therein are compared against the 
 user's query to see if it matches.  I created an abstraction GeoShape 
 extended by subclasses named PointDistance... and CartesianBox to support 
 different queried shapes so that the filter need not care about these details.
 This work was presented at LuceneRevolution in Boston on October 8th.
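
 The geohash subdivision described above can be sketched in a few lines of
 Python. This is not the GeoHashUtils or GeoHashPrefixFilter code; the names
 geohash_encode and grid_shape are hypothetical. It uses the standard geohash
 scheme (5 bits per character, bits alternating between longitude and
 latitude, longitude first), which is where the alternating 8x4 / 4x8 grid
 comes from, and it shows why a shorter geohash is always a string prefix of a
 longer one for the same point.

```python
_BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"


def geohash_encode(lat, lon, length):
    """Encode a point as a geohash with the given number of characters."""
    lat_lo, lat_hi = -90.0, 90.0
    lon_lo, lon_hi = -180.0, 180.0
    chars = []
    bits = 0
    nbits = 0
    use_lon = True  # bits alternate lon, lat, lon, ... starting with lon
    while len(chars) < length:
        if use_lon:
            mid = (lon_lo + lon_hi) / 2
            if lon >= mid:
                bits = (bits << 1) | 1
                lon_lo = mid
            else:
                bits <<= 1
                lon_hi = mid
        else:
            mid = (lat_lo + lat_hi) / 2
            if lat >= mid:
                bits = (bits << 1) | 1
                lat_lo = mid
            else:
                bits <<= 1
                lat_hi = mid
        use_lon = not use_lon
        nbits += 1
        if nbits == 5:  # every 5 bits become one base-32 character
            chars.append(_BASE32[bits])
            bits = 0
            nbits = 0
    return "".join(chars)


def grid_shape(prefix_len):
    """(columns, rows) that the next character splits a cell into."""
    # An even-length prefix is followed by 3 lon bits and 2 lat bits.
    return (8, 4) if prefix_len % 2 == 0 else (4, 8)


point = (57.64911, 10.40744)
print(geohash_encode(*point, 5))  # u4pru
print(geohash_encode(*point, 8).startswith(geohash_encode(*point, 5)))  # True
print(grid_shape(0), grid_shape(1))  # (8, 4) (4, 8)
```

 Because every point inside a grid cell shares that cell's geohash as a
 prefix, seeking a sorted term index to a prefix lands on all points in that
 cell, which is the property the filter described above relies on.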




Re: Welcome David Smiley

2012-02-16 Thread Smiley, David W.
Thanks Mikhail.  Here's why:
https://issues.apache.org/jira/browse/SOLR-2155?focusedCommentId=13209613&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13209613

~ David

On Feb 10, 2012, at 1:20 PM, Mikhail Khludnev [via Lucene] wrote:

I join in all the congratulations above!
Btw, now that you have the password, why not commit SOLR-2155?

Regards

On Mon, Feb 6, 2012 at 10:54 AM, David Smiley (@MITRE.org) [hidden email] wrote:
Wow! It is truly an honor to be selected by the Lucene PMC to join the
committer ranks.  You are a top notch team of coders working on one of the
most important open-source projects.

About me:

My technical background is all tiers of web development with a focus on the
middle tier and Java.  Of course I have expertise in Lucene and Solr but I
also focus on geospatial matters as well as threading / concurrency.  I like
solving hard interesting problems.

I am employed full time by The MITRE Corporation, a US federally funded
non-profit organization in which I mostly work in the defense sector. I've
been with MITRE for ~14 years. I've been fortunate lately to work on
projects that fund my open-source geospatial work.  I conduct Solr training
at MITRE (1 day and 2-day classes), and I'm sort of a search consultant
within MITRE, advising MITRE and its government clients.  For 6 months, I
have also been working part-time for OpenSource Connections as a search
consultant.

At home, I'm married with two kids: Adeline who is 10 months old (she's in
my arms sleeping as I write this) and Camille who is 2 years 10 months old.
I don't know how I found the time to write a book, but now that it's done,
I'm on full parental duty when at home.  For fun, I like to follow Starcraft
2 professional e-sports.  It's conveniently something I can do while I hold
a baby; playing the game isn't, unfortunately.

I look forward to meeting you all at Lucene Revolution in May!  I live close
by in Lowell.

Cheers,
 David Smiley

-
 Author: http://www.packtpub.com/apache-solr-3-enterprise-search-server/book



--
Sincerely yours
Mikhail Khludnev
Lucid Certified
Apache Lucene/Solr Developer
Grid Dynamics

http://www.griddynamics.com/







[jira] [Commented] (LUCENE-3795) Replace spatial contrib module with LSP's spatial-lucene module

2012-02-16 Thread David Smiley (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13209622#comment-13209622
 ] 

David Smiley commented on LUCENE-3795:
--

What constitutes a code donation?  By the way, I've gone through the proper 
channels with my employer with regard to SOLR-2155 and LSP.  MITRE has no 
copyright on this code; I've marked it all as ASF.

 Replace spatial contrib module with LSP's spatial-lucene module
 ---

 Key: LUCENE-3795
 URL: https://issues.apache.org/jira/browse/LUCENE-3795
 Project: Lucene - Java
  Issue Type: New Feature
  Components: modules/spatial
Reporter: David Smiley
Assignee: David Smiley
 Fix For: 4.0


 I propose that Lucene's spatial contrib module be replaced with the 
 spatial-lucene module within Lucene Spatial Playground (LSP).  LSP has been 
 in development for approximately 1 year by David Smiley, Ryan McKinley, and 
 Chris Male and we feel it is ready.  LSP is here: 
 http://code.google.com/p/lucene-spatial-playground/  and the spatial-lucene 
 module is intuitively in svn/trunk/spatial-lucene/.
 I'll add more comments to prevent the issue description from being too long.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-3792) Remove StringField

2012-02-16 Thread Hoss Man (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13209630#comment-13209630
 ] 

Hoss Man commented on LUCENE-3792:
--

StrawMan suggestion off the top of my head:

* rename NOT_ANALYZED to something like KEYWORD_ANALYZED
* document KEYWORD_ANALYZED as being a convenience flag (and/or optimization?) 
for achieving equivalent behavior as using PerFieldAnalyzer with 
KeywordAnalyzer for this field, and keep using / re-word rmuir's patch warning 
to make it clear that if you use this at index time, any attempts to construct 
queries against it using the QueryParser will need KeywordAnalyzer.

...would that flag name == analyzer name equivalence help people remember not 
to get trapped by this?

 Remove StringField
 --

 Key: LUCENE-3792
 URL: https://issues.apache.org/jira/browse/LUCENE-3792
 Project: Lucene - Java
  Issue Type: Task
Affects Versions: 4.0
Reporter: Robert Muir
 Fix For: 4.0

 Attachments: LUCENE-3792_javadocs_3x.patch, 
 LUCENE-3792_javadocs_3x.patch


 Often on the mailing list there is confusion about NOT_ANALYZED.
 Besides being useless (Just use KeywordAnalyzer instead), people trip up on 
 this
 not being consistent at query time (you really need to configure 
 KeywordAnalyzer for the field 
 on your PerFieldAnalyzerWrapper so it will do the same thing at query time... 
 oh wait
 once you've done that, you dont need NOT_ANALYZED).
 So I think StringField is a trap too for the same reasons, just under a 
 different name, lets remove it.
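
The mismatch described above can be sketched with a toy model. This is an illustration under stated assumptions, not Lucene code: `keywordTerms`, `tokenizedTerms` and `matches` are hypothetical helpers standing in for a keyword-style (NOT_ANALYZED) field at index time and a tokenizing analyzer at query time.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Toy model only: none of these names are Lucene classes or methods.
class NotAnalyzedTrapSketch {
    // Keyword-style behaviour: the whole value becomes a single term.
    static List<String> keywordTerms(String value) {
        return Arrays.asList(value);
    }

    // Stand-in for a tokenizing analyzer: lowercase, then split on whitespace.
    static List<String> tokenizedTerms(String value) {
        return Arrays.asList(value.toLowerCase().split("\\s+"));
    }

    // A query "matches" when every query term is present in the index.
    static boolean matches(Set<String> indexedTerms, List<String> queryTerms) {
        return indexedTerms.containsAll(queryTerms);
    }

    public static void main(String[] args) {
        // Indexed without analysis: the index holds the single term "New York".
        Set<String> index = new HashSet<>(keywordTerms("New York"));

        // Query analyzed by the tokenizing stand-in: terms "new" and "york".
        System.out.println(matches(index, tokenizedTerms("New York"))); // false

        // Query analyzed the same way as the index: one term "New York".
        System.out.println(matches(index, keywordTerms("New York")));   // true
    }
}
```

In this model the user gets no results and no error unless the query side is configured to treat the field the same way as the index side, which is the trap the issue describes.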




[Lucene.Net] Lucene.Net Blog

2012-02-16 Thread Prescott Nasser




Hey All,

We've got a blog up and running: https://blogs.apache.org/lucenenet/. Right now
we are taking the latest 3 articles and those are being posted onto our main
Lucene.Net site as news. But I'd like to try and get more regular content up on
the blog. If you happen to write an article (or want to write an article) about
Lucene.Net, we'd like to have it on the blog (or at least a slug for your
article) - if anyone is interested just shoot us an email here.

~Prescott

[jira] [Assigned] (SOLR-3033) numberToKeep on replication handler does not work with backupAfter

2012-02-16 Thread James Dyer (Assigned) (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-3033?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

James Dyer reassigned SOLR-3033:


Assignee: James Dyer

 numberToKeep on replication handler does not work with backupAfter
 --

 Key: SOLR-3033
 URL: https://issues.apache.org/jira/browse/SOLR-3033
 Project: Solr
  Issue Type: Bug
  Components: replication (java)
Affects Versions: 3.5
 Environment: openjdk 1.6, linux 3.x
Reporter: Torsten Krah
Assignee: James Dyer
 Attachments: SOLR-3033.patch


 Configured my replication handler like this:
 <requestHandler name="/replication" class="solr.ReplicationHandler">
   <lst name="master">
     <str name="replicateAfter">startup</str>
     <str name="replicateAfter">commit</str>
     <str name="replicateAfter">optimize</str>
     <str name="confFiles">elevate.xml,schema.xml,spellings.txt,stopwords.txt,stopwords_de.txt,stopwords_en.txt,synonyms_de.txt,synonyms.txt</str>
     <str name="backupAfter">optimize</str>
     <str name="numberToKeep">1</str>
   </lst>
 </requestHandler>
 So after optimize a snapshot should be taken; this works. But numberToKeep is 
 ignored: snapshots accumulate with each call to optimize and are kept 
 forever. It seems this setting has no effect.
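
The retention behaviour the reporter expects can be sketched as follows. This is a minimal sketch of the expected outcome only, not Solr's ReplicationHandler code: `SnapshotRetentionSketch` and `toDelete` are hypothetical names, and the snapshot names are assumed to carry a timestamp suffix that sorts chronologically.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper; not part of Solr.
class SnapshotRetentionSketch {
    // Returns the snapshots that should be deleted so that only the
    // numberToKeep newest remain. Assumes names sort chronologically.
    static List<String> toDelete(List<String> snapshotNames, int numberToKeep) {
        List<String> newestFirst = new ArrayList<>(snapshotNames);
        Collections.sort(newestFirst, Collections.reverseOrder());
        int keep = Math.max(0, Math.min(numberToKeep, newestFirst.size()));
        return new ArrayList<>(newestFirst.subList(keep, newestFirst.size()));
    }
}
```

With numberToKeep set to 1 and three snapshots on disk, the two older ones would be returned for deletion after each backup.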




[jira] [Commented] (SOLR-2933) DIHCacheSupport ignores left side of where=xid=x.id attribute

2012-02-16 Thread Mikhail Khludnev (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13209648#comment-13209648
 ] 

Mikhail Khludnev commented on SOLR-2933:


Great, James! Thank you. 

Let me refresh the SOLR-3011 patch next week. I would also like to think 
about the same thread-proof paging for the plain JDBC EntityProcessor (w/o caches).




 DIHCacheSupport ignores left side of where=xid=x.id attribute
 ---

 Key: SOLR-2933
 URL: https://issues.apache.org/jira/browse/SOLR-2933
 Project: Solr
  Issue Type: Sub-task
  Components: contrib - DataImportHandler
Affects Versions: 4.0
Reporter: Mikhail Khludnev
Assignee: James Dyer
Priority: Minor
  Labels: noob, random
 Fix For: 3.6, 4.0

 Attachments: 
 AbstractDataImportHandlerTestCase.java-choose-map-randomly.patch, 
 SOLR-2933.patch

   Original Estimate: 1h
  Remaining Estimate: 1h

 DIHCacheSupport introduced at SOLR-2382 uses new config attributes cachePk 
 and cacheLookup. But support old one where=xid=x.id is broken by 
 [DIHCacheSupport.init|http://svn.apache.org/viewvc/lucene/dev/trunk/solr/contrib/dataimporthandler/src/java/org/apache/solr/handler/dataimport/DIHCacheSupport.java?view=markup]
  - it never put where= sides into the context, but it revealed by 
 [SortedMapBackedCache.init|http://svn.apache.org/viewvc/lucene/dev/trunk/solr/contrib/dataimporthandler/src/java/org/apache/solr/handler/dataimport/SortedMapBackedCache.java?view=markup],
  which takes just first column as a primary key. That's why all tests are 
 green.
 To reproduce the issue I just need to reorder the entries at [line 
 219|http://svn.apache.org/viewvc/lucene/dev/trunk/solr/contrib/dataimporthandler/src/test/org/apache/solr/handler/dataimport/TestCachedSqlEntityProcessor.java?revision=1201659&view=markup]
  so that desc comes first and is picked up as the primary key. 
 To do that I propose to choose the concrete map class randomly for all DIH test 
 cases at 
 [createMap()|http://svn.apache.org/viewvc/lucene/dev/trunk/solr/contrib/dataimporthandler/src/test/org/apache/solr/handler/dataimport/AbstractDataImportHandlerTestCase.java?revision=1149600&view=markup].
  
 I'm attaching test breaking patch and seed.




[jira] [Updated] (LUCENE-3109) Rename FieldsConsumer to InvertedFieldsConsumer

2012-02-16 Thread Iulius Curt (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-3109?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Iulius Curt updated LUCENE-3109:


Attachment: LUCENE-3109.patch

Attached a patch with the refactoring of Fields, FieldsProducer, FieldsConsumer 
and any other related classes.
It turned out to be pretty ample (also affected Solr)

Please give some feedback if something is wrong.

 Rename FieldsConsumer to InvertedFieldsConsumer
 ---

 Key: LUCENE-3109
 URL: https://issues.apache.org/jira/browse/LUCENE-3109
 Project: Lucene - Java
  Issue Type: Task
  Components: core/codecs
Affects Versions: 4.0
Reporter: Simon Willnauer
Priority: Minor
 Fix For: 4.0

 Attachments: LUCENE-3109.patch


 The name FieldsConsumer is misleading here; it really is an 
 InvertedFieldsConsumer, and since we are extending codecs to consume 
 non-inverted Fields we should be clear here. The same applies to Fields.java as 
 well as FieldsProducer.




[jira] [Commented] (LUCENE-3750) Convert Versioned docs to Markdown/New CMS

2012-02-16 Thread Yonik Seeley (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3750?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13209655#comment-13209655
 ] 

Yonik Seeley commented on LUCENE-3750:
--

Heh - and I just tried the bookmarklet, and it crashed Chrome as soon as I tried 
to do an edit.

 Convert Versioned docs to Markdown/New CMS
 --

 Key: LUCENE-3750
 URL: https://issues.apache.org/jira/browse/LUCENE-3750
 Project: Lucene - Java
  Issue Type: Improvement
Reporter: Grant Ingersoll
Priority: Minor

 Since we are moving our main site to the ASF CMS (LUCENE-2748), we should 
 bring in any new versioned Lucene docs into the same format so that we don't 
 have to deal w/ Forrest anymore.




[jira] [Updated] (SOLR-3138) Add node roles to core admin handler 'create core' and solrj.

2012-02-16 Thread Mark Miller (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-3138?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Miller updated SOLR-3138:
--

Attachment: SOLR-3138.patch

Simple patch - I'll commit it shortly.

 Add node roles to core admin handler 'create core' and solrj.
 -

 Key: SOLR-3138
 URL: https://issues.apache.org/jira/browse/SOLR-3138
 Project: Solr
  Issue Type: Improvement
  Components: SolrCloud
Reporter: Mark Miller
Assignee: Mark Miller
Priority: Minor
 Fix For: 4.0

 Attachments: SOLR-3138.patch







[JENKINS] Lucene-Solr-tests-only-trunk - Build # 12437 - Still Failing

2012-02-16 Thread Apache Jenkins Server
Build: https://builds.apache.org/job/Lucene-Solr-tests-only-trunk/12437/

5 tests failed.
FAILED:  
junit.framework.TestSuite.org.apache.solr.cloud.ChaosMonkeySafeLeaderTest

Error Message:
ERROR: SolrIndexSearcher opens=39 closes=11

Stack Trace:
junit.framework.AssertionFailedError: ERROR: SolrIndexSearcher opens=39 closes=11
    at org.apache.solr.SolrTestCaseJ4.endTrackingSearchers(SolrTestCaseJ4.java:152)
    at org.apache.solr.SolrTestCaseJ4.afterClassSolrTestCase(SolrTestCaseJ4.java:76)


REGRESSION:  
org.apache.solr.cloud.FullSolrCloudDistribCmdsTest.testDistribSearch

Error Message:
null

Stack Trace:
org.apache.solr.common.cloud.ZooKeeperException: 
    at org.apache.solr.client.solrj.impl.CloudSolrServer.connect(CloudSolrServer.java:123)
    at org.apache.solr.client.solrj.impl.CloudSolrServer.request(CloudSolrServer.java:133)
    at org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:104)
    at org.apache.solr.cloud.FullSolrCloudTest.indexDoc(FullSolrCloudTest.java:464)
    at org.apache.solr.BaseDistributedSearchTestCase.indexr(BaseDistributedSearchTestCase.java:283)
    at org.apache.solr.cloud.FullSolrCloudDistribCmdsTest.addUpdateDelete(FullSolrCloudDistribCmdsTest.java:200)
    at org.apache.solr.cloud.FullSolrCloudDistribCmdsTest.doTest(FullSolrCloudDistribCmdsTest.java:74)
    at org.apache.solr.BaseDistributedSearchTestCase.testDistribSearch(BaseDistributedSearchTestCase.java:664)
    at org.apache.lucene.util.LuceneTestCase$SubclassSetupTeardownRule$1.evaluate(LuceneTestCase.java:700)
    at org.apache.lucene.util.LuceneTestCase$InternalSetupTeardownRule$1.evaluate(LuceneTestCase.java:599)
    at org.apache.lucene.util.LuceneTestCase$TestResultInterceptorRule$1.evaluate(LuceneTestCase.java:504)
    at org.apache.lucene.util.LuceneTestCase$RememberThreadRule$1.evaluate(LuceneTestCase.java:562)
    at org.apache.lucene.util.LuceneTestCaseRunner.runChild(LuceneTestCaseRunner.java:165)
    at org.apache.lucene.util.LuceneTestCaseRunner.runChild(LuceneTestCaseRunner.java:57)
Caused by: java.util.concurrent.TimeoutException: Could not connect to ZooKeeper 127.0.0.1:27995/solr within 1 ms
    at org.apache.solr.common.cloud.ConnectionManager.waitForConnected(ConnectionManager.java:129)
    at org.apache.solr.common.cloud.SolrZkClient.<init>(SolrZkClient.java:142)
    at org.apache.solr.common.cloud.SolrZkClient.<init>(SolrZkClient.java:90)
    at org.apache.solr.common.cloud.ZkStateReader.<init>(ZkStateReader.java:137)
    at org.apache.solr.client.solrj.impl.CloudSolrServer.connect(CloudSolrServer.java:108)


REGRESSION:  org.apache.solr.cloud.OverseerTest.testDoubleAssignment

Error Message:
KeeperErrorCode = NoNode for /solr/node_states/localhost:89825_solr

Stack Trace:
org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = NoNode for /solr/node_states/localhost:89825_solr
    at org.apache.zookeeper.KeeperException.create(KeeperException.java:102)
    at org.apache.zookeeper.KeeperException.create(KeeperException.java:42)
    at org.apache.zookeeper.ZooKeeper.delete(ZooKeeper.java:734)
    at org.apache.solr.common.cloud.SolrZkClient$2.execute(SolrZkClient.java:166)
    at org.apache.solr.common.cloud.SolrZkClient$2.execute(SolrZkClient.java:163)
    at org.apache.solr.common.cloud.ZkCmdExecutor.retryOperation(ZkCmdExecutor.java:65)
    at org.apache.solr.common.cloud.SolrZkClient.delete(SolrZkClient.java:163)
    at org.apache.solr.cloud.AbstractZkTestCase.tryCleanPath(AbstractZkTestCase.java:152)
    at org.apache.solr.cloud.AbstractZkTestCase.tryCleanPath(AbstractZkTestCase.java:150)
    at org.apache.solr.cloud.AbstractZkTestCase.tryCleanPath(AbstractZkTestCase.java:150)
    at org.apache.solr.cloud.AbstractZkTestCase.tryCleanSolrZkNode(AbstractZkTestCase.java:142)
    at org.apache.solr.cloud.OverseerTest.testDoubleAssignment(OverseerTest.java:554)
    at org.apache.lucene.util.LuceneTestCase$SubclassSetupTeardownRule$1.evaluate(LuceneTestCase.java:700)
    at org.apache.lucene.util.LuceneTestCase$InternalSetupTeardownRule$1.evaluate(LuceneTestCase.java:599)
    at org.apache.lucene.util.LuceneTestCase$TestResultInterceptorRule$1.evaluate(LuceneTestCase.java:504)
    at org.apache.lucene.util.LuceneTestCase$RememberThreadRule$1.evaluate(LuceneTestCase.java:562)
    at org.apache.lucene.util.LuceneTestCaseRunner.runChild(LuceneTestCaseRunner.java:165)
    at org.apache.lucene.util.LuceneTestCaseRunner.runChild(LuceneTestCaseRunner.java:57)


FAILED:  junit.framework.TestSuite.org.apache.solr.cloud.OverseerTest

Error Message:
ERROR: SolrZkClient opens=209 closes=206

Stack Trace:
junit.framework.AssertionFailedError: ERROR: SolrZkClient opens=209 closes=206
at 

[jira] [Commented] (LUCENE-3776) NRTManager shouldn't expose its private SearcherManager

2012-02-16 Thread Michael McCandless (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13209693#comment-13209693
 ] 

Michael McCandless commented on LUCENE-3776:


Thanks Shai.

bq. with these changes, if the app passes an IndexReader that is not 
DirectoryReader, it will get ClassCastException (if asserts are disabled).

Hang on -- SM now takes either IW or Directory, from which we always
pull a DirectoryReader, right (we call DR.open ourselves)?  Won't we
always have a DR in SM...?

Hmm, or do you mean the SearcherFactory could make some other
reader...?  Hmm maybe we should have a hard check for that
(SearcherFactory shouldn't do that...?)

bq. About close() – do you think it'll be better to keep close() final, and 
introduce a new protected closeResource()/closeInternal() that NRTManager can 
override? That way, RefManagers won't accidentally override close() and forget 
to call super.close()?

Good idea... I'll add afterClose (matches afterRefresh);
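
The final close() plus hook idea quoted above can be sketched roughly like this. It is an illustration only, not the attached patch: `RefManagerSketch`, `NrtManagerSketch` and `afterCloseCalls` are hypothetical names, and the real classes may differ in signatures and exception handling.

```java
// Hypothetical names throughout; this is not the LUCENE-3776 patch.
abstract class RefManagerSketch {
    private boolean closed;

    // final: subclasses cannot replace (or forget to call) the shared shutdown logic.
    public final synchronized void close() {
        if (closed) {
            return; // closing twice is a no-op
        }
        closed = true;
        // ... the currently held reference would be released here ...
        afterClose();
    }

    public final synchronized boolean isClosed() {
        return closed;
    }

    // Hook for subclasses; the default does nothing.
    protected void afterClose() {
    }
}

class NrtManagerSketch extends RefManagerSketch {
    int afterCloseCalls;

    @Override
    protected void afterClose() {
        afterCloseCalls++; // e.g. wake up threads waiting on a generation
    }
}
```

Because close() is final, a subclass can only add behaviour through afterClose(), so there is no super.close() call to forget.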

bq. About afterRefresh() – I'll admit that first I didn't understand why you 
need it. Previously, it was used to warm an IndexSearcher, but now we say it's 
the responsibility of SearcherFactory. I can see why it's useful for 
NRTManager, and it might even help me in LUCENE-3793 ! Do you think that we 
should declare that it can throw IOE? I know that if I'll use it in 
LUCENE-3793, I'll need that and I'd hate to throw RuntimeException. NRTManager 
can still override and not declare that. I'm just thinking that since almost 
all methods declare throwing IOE, it won't be odd if we declare it too on 
afterRefresh(), and it's not unlikely that afterRefresh() will do something 
that throws exceptions.

Good, I'll add throws IOE.

{quote}
 About openIfNeeded:
Can you cast to DirectoryReader once?
{quote}

Will do.

bq. I don't know if the assert is better than a ClassCastException ... with how 
the code is written, ClassCastException is better than assert because at least 
it will tell the user what went wrong?

I *think* there's no way a non-DirReader can get into NRTManager (like
SM), except for SearcherFactory.

bq. How critical it is to declare newSearcher final? If you didn't, you could 
init it to null, and only change if newReader != null. Saving 4 lines of code 
(improves readability IMO – something that I know you care about ).

Not critical!  Good idea...


 NRTManager shouldn't expose its private SearcherManager
 ---

 Key: LUCENE-3776
 URL: https://issues.apache.org/jira/browse/LUCENE-3776
 Project: Lucene - Java
  Issue Type: Improvement
Reporter: Michael McCandless
Assignee: Michael McCandless
Priority: Blocker
 Fix For: 3.6, 4.0

 Attachments: LUCENE-3776.patch


 Spinoff from LUCENE-3769.
 To actually obtain an IndexSearcher from NRTManager, it's a 2-step process 
 now.
 You must .getSearcherManager(), then .acquire() from the returned 
 SearcherManager.
 This is very trappy... because if the app incorrectly calls maybeReopen on 
 that private SearcherManager (instead of NRTManager.maybeReopen) then it can 
 unexpectedly cause threads to block forever, waiting for the necessary gen to 
 become visible.  This will be hard to debug... I don't like creating trappy 
 APIs.
 Hopefully once LUCENE-3761 is in, we can fix NRTManager to no longer expose 
 its private SM, instead subclassing ReferenceManager.
 Or alternatively, or in addition, maybe we factor out a new interface 
 (SearcherProvider or something...) that only has acquire and release methods, 
 and both NRTManager and ReferenceManager/SM impl that, and we keep 
 NRTManager's SM private.
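
The narrow-interface alternative can be sketched as below. This is a hedged illustration, not a proposed implementation: `SearcherProviderSketch`, `InnerManagerSketch` and `NrtFacadeSketch` are hypothetical names, and a String stands in for an IndexSearcher.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical interface exposing only acquire and release.
interface SearcherProviderSketch<S> {
    S acquire();

    void release(S searcher);
}

// Stand-in for the inner manager that should stay private.
class InnerManagerSketch implements SearcherProviderSketch<String> {
    private final AtomicInteger refs = new AtomicInteger();

    public String acquire() {
        refs.incrementAndGet();
        return "searcher";
    }

    public void release(String searcher) {
        refs.decrementAndGet();
    }

    int refCount() {
        return refs.get();
    }

    void maybeReopen() {
        // reopening logic would live here
    }
}

// The outer manager delegates acquire/release but never hands out the inner
// manager, so callers cannot invoke the inner maybeReopen() by mistake.
class NrtFacadeSketch implements SearcherProviderSketch<String> {
    private final InnerManagerSketch inner = new InnerManagerSketch();

    public String acquire() {
        return inner.acquire();
    }

    public void release(String searcher) {
        inner.release(searcher);
    }

    void maybeReopen() {
        // generation bookkeeping would happen here before delegating
        inner.maybeReopen();
    }

    int outstanding() {
        return inner.refCount();
    }
}
```

Callers see a one-step acquire() and the only maybeReopen() reachable is the outer one.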




[jira] [Commented] (LUCENE-3792) Remove StringField

2012-02-16 Thread Robert Muir (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13209697#comment-13209697
 ] 

Robert Muir commented on LUCENE-3792:
-

Hossman, I think KEYWORD_ANALYZED is the ideal name for 3.x actually. I think in 
combination with the javadocs it would be more clear.

This still leaves the question for trunk (currently StringField):
Positives are that it's actually a nice name, concise and to the point.
Another positive is that StringField omits things like positions, and in trunk 
we don't silently fail if you form a phrase from this.

One negative is that both StringField and TextField confusingly take String in 
their ctors (I've chosen the wrong one myself before by accident).

Basically to me, this is a combination of traps. Trunk is somewhat better 
because it throws exceptions for positional queries if
you actually excluded positions...

In all cases in 3.x, the wrong 'configuration' here creates a situation where 
the user just 'does not get results' and they have
no idea why... despite the fact they used the same Analyzer at query-time and 
index-time like a good user. That's what I find so frustrating.

 Remove StringField
 --

 Key: LUCENE-3792
 URL: https://issues.apache.org/jira/browse/LUCENE-3792
 Project: Lucene - Java
  Issue Type: Task
Affects Versions: 4.0
Reporter: Robert Muir
 Fix For: 4.0

 Attachments: LUCENE-3792_javadocs_3x.patch, 
 LUCENE-3792_javadocs_3x.patch


 Often on the mailing list there is confusion about NOT_ANALYZED.
 Besides being useless (Just use KeywordAnalyzer instead), people trip up on 
 this
 not being consistent at query time (you really need to configure 
 KeywordAnalyzer for the field 
 on your PerFieldAnalyzerWrapper so it will do the same thing at query time... 
 oh wait
 once you've done that, you dont need NOT_ANALYZED).
 So I think StringField is a trap too for the same reasons, just under a 
 different name, lets remove it.




[jira] [Updated] (LUCENE-3776) NRTManager shouldn't expose its private SearcherManager

2012-02-16 Thread Michael McCandless (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-3776?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael McCandless updated LUCENE-3776:
---

Attachment: LUCENE-3776.patch

New patch folding in Shai's suggestions (thanks!).

I didn't yet add a hard check for an evil SearcherFactory...

 NRTManager shouldn't expose its private SearcherManager
 ---

 Key: LUCENE-3776
 URL: https://issues.apache.org/jira/browse/LUCENE-3776
 Project: Lucene - Java
  Issue Type: Improvement
Reporter: Michael McCandless
Assignee: Michael McCandless
Priority: Blocker
 Fix For: 3.6, 4.0

 Attachments: LUCENE-3776.patch, LUCENE-3776.patch


 Spinoff from LUCENE-3769.
 To actually obtain an IndexSearcher from NRTManager, it's a 2-step process 
 now.
 You must .getSearcherManager(), then .acquire() from the returned 
 SearcherManager.
 This is very trappy... because if the app incorrectly calls maybeReopen on 
 that private SearcherManager (instead of NRTManager.maybeReopen) then it can 
 unexpectedly cause threads to block forever, waiting for the necessary gen to 
 become visible.  This will be hard to debug... I don't like creating trappy 
 APIs.
 Hopefully once LUCENE-3761 is in, we can fix NRTManager to no longer expose 
 its private SM, instead subclassing ReferenceManager.
 Or alternatively, or in addition, maybe we factor out a new interface 
 (SearcherProvider or something...) that only has acquire and release methods, 
 and both NRTManager and ReferenceManager/SM impl that, and we keep 
 NRTManager's SM private.




[jira] [Commented] (LUCENE-3795) Replace spatial contrib module with LSP's spatial-lucene module

2012-02-16 Thread Robert Muir (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13209707#comment-13209707
 ] 

Robert Muir commented on LUCENE-3795:
-

Simon, do we really need a code grant here? 

It's my understanding (correct me if I am wrong) that the developers involved 
(David, Ryan, Chris) are all committers 
with an iCLA on file, so is it really any different than any other patch from that 
perspective?

As far as LGPL goes, according to David's description and the title of this jira 
issue (it's possible I did not interpret it correctly;
correct me if so), he wants to replace lucene/contrib/spatial with the 
spatial-lucene project, and
it has no LGPL ties at all (only spatial-extras does).

Without looking at any code myself, if that's really the case I'm +1 on 
principle because it means we basically
have an improved spatial module for lucene core with no catch at all. The 
current code has not seen much maintenance.

(And I agree, we should be shooting for a proper module/ here, not a contrib.)


 Replace spatial contrib module with LSP's spatial-lucene module
 ---

 Key: LUCENE-3795
 URL: https://issues.apache.org/jira/browse/LUCENE-3795
 Project: Lucene - Java
  Issue Type: New Feature
  Components: modules/spatial
Reporter: David Smiley
Assignee: David Smiley
 Fix For: 4.0


 I propose that Lucene's spatial contrib module be replaced with the 
 spatial-lucene module within Lucene Spatial Playground (LSP).  LSP has been 
 in development for approximately 1 year by David Smiley, Ryan McKinley, and 
 Chris Male and we feel it is ready.  LSP is here: 
 http://code.google.com/p/lucene-spatial-playground/  and the spatial-lucene 
 module is intuitively in svn/trunk/spatial-lucene/.
 I'll add more comments to prevent the issue description from being too long.




[jira] [Updated] (LUCENE-3777) trapping overloaded ctors/setters in Field/NumericField/DocValuesField

2012-02-16 Thread Michael McCandless (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-3777?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael McCandless updated LUCENE-3777:
---

Attachment: LUCENE-3777.patch

Patch, splitting NF into Int/Long/Float/DoubleField, and changing
Field.setValue(T value) -> Field.setTValue(T value).

Tests pass... I'd like to commit this first (big, rote patch, will
conflict soon) and then do DocValuesField separately...


 trapping overloaded ctors/setters in Field/NumericField/DocValuesField
 --

 Key: LUCENE-3777
 URL: https://issues.apache.org/jira/browse/LUCENE-3777
 Project: Lucene - Java
  Issue Type: Bug
Affects Versions: 4.0
Reporter: Robert Muir
Assignee: Michael McCandless
Priority: Blocker
 Attachments: LUCENE-3777.patch


 In trunk, these apis let you easily create a field, but my concern is this:
 {code}
 public NumericField(String name, int value)
 public NumericField(String name, long value)
 ..
 public Field(String name, int value, FieldType type)
 public Field(String name, long value, FieldType type)
 ..
 public void setValue(int value)
 public void setValue(long value)
 ..
 public DocValuesField(String name, int value, DocValues.Type docValueType)
 public DocValuesField(String name, long value, DocValues.Type docValueType)
 {code}
 I really don't like overloaded ctors/setters where the compiler can 
 type-promote you,
 I think it makes the apis hard to use.
 Instead for the setters I think we should have setIntValue, setLongValue, ...
 For the ctors, I see two other options:
 # factories like DocValuesField.newIntField()
 # subclasses like IntField
 I don't have any patch for this, but I think we should discuss and fix before 
 these apis are released.
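
The type-promotion concern above can be shown with a small self-contained sketch. The class `OverloadTrapSketch` and its static methods are hypothetical stand-ins for the overloaded setters, not the Lucene Field API; each method simply reports which overload the compiler picked.

```java
// Hypothetical stand-in for overloaded setters; not the Lucene Field API.
class OverloadTrapSketch {
    static String setValue(int value) {
        return "int";
    }

    static String setValue(long value) {
        return "long";
    }

    // Explicitly named alternatives, as suggested in the issue.
    static String setIntValue(int value) {
        return "int";
    }

    static String setLongValue(long value) {
        return "long";
    }

    public static void main(String[] args) {
        int count = 42;
        System.out.println(setValue(count));      // int
        // One long operand silently switches the overload:
        System.out.println(setValue(count * 2L)); // long
        // A char is widened to int without any warning:
        System.out.println(setValue('x'));        // int
        // With explicit names the intent is visible at the call site, and
        // setIntValue(count * 2L) would not compile at all.
        System.out.println(setLongValue(count));  // long
    }
}
```

With the overloads, a small change to an argument expression changes which method runs; with explicit names the same change either keeps the intended method or fails at compile time.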




[jira] [Resolved] (SOLR-3138) Add node roles to core admin handler 'create core' and solrj.

2012-02-16 Thread Mark Miller (Resolved) (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-3138?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Miller resolved SOLR-3138.
---

Resolution: Fixed

 Add node roles to core admin handler 'create core' and solrj.
 -

 Key: SOLR-3138
 URL: https://issues.apache.org/jira/browse/SOLR-3138
 Project: Solr
  Issue Type: Improvement
  Components: SolrCloud
Reporter: Mark Miller
Assignee: Mark Miller
Priority: Minor
 Fix For: 4.0

 Attachments: SOLR-3138.patch







[jira] [Commented] (LUCENE-3750) Convert Versioned docs to Markdown/New CMS

2012-02-16 Thread Mark Miller (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3750?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13209794#comment-13209794
 ] 

Mark Miller commented on LUCENE-3750:
-

I'm in a similar boat with site updates at the moment - while I struggled 
through the setup in the past, I had things working smoothly at one point - but 
since I've updated recently, I can no longer build the site - I get two errors 
about the JIRA additions in the sidebar.

{noformat}

. at /Users/markrmiller/Workspaces/lucid/cms/lib/view.pm line 46
File content/solr/index.mdtext had processing errors: Error while rendering 
output to string
 get http://s.apache.org/solrjira failed.

. at /Users/markrmiller/Workspaces/lucid/cms/lib/view.pm line 46
File content/core/index.mdtext had processing errors: Error while rendering 
output to string
 get http://s.apache.org/corejira failed.
{noformat}

 Convert Versioned docs to Markdown/New CMS
 --

 Key: LUCENE-3750
 URL: https://issues.apache.org/jira/browse/LUCENE-3750
 Project: Lucene - Java
  Issue Type: Improvement
Reporter: Grant Ingersoll
Priority: Minor

 Since we are moving our main site to the ASF CMS (LUCENE-2748), we should 
 bring in any new versioned Lucene docs into the same format so that we don't 
 have to deal w/ Forrest anymore.




[jira] [Commented] (SOLR-2907) java.lang.IllegalArgumentException: deltaQuery has no column to resolve to declared primary key pk='ITEM_ID, CATEGORY_ID'

2012-02-16 Thread Adam Lane (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13209845#comment-13209845
 ] 

Adam Lane commented on SOLR-2907:
-

Upgraded to 3.5 and confirmed same problem.

 java.lang.IllegalArgumentException: deltaQuery has no column to resolve to 
 declared primary key pk='ITEM_ID, CATEGORY_ID'
 -

 Key: SOLR-2907
 URL: https://issues.apache.org/jira/browse/SOLR-2907
 Project: Solr
  Issue Type: Bug
  Components: contrib - DataImportHandler, Schema and Analysis
Affects Versions: 3.4
Reporter: Alan Baker

 We are using Solr for our site and ran into this error in our own schema; 
 I was able to reproduce it using the dataimport example code in the Solr 
 project.  We do not get this error in Solr 1.4; we only started seeing it while 
 working to upgrade to 3.4.0.  It fails when delta-importing linked tables.
 Complete trace:
 Nov 18, 2011 5:21:02 PM org.apache.solr.handler.dataimport.DataImporter 
 doDeltaImport
 SEVERE: Delta Import Failed
 java.lang.IllegalArgumentException: deltaQuery has no column to resolve to 
 declared primary key pk='ITEM_ID, CATEGORY_ID'
   at 
 org.apache.solr.handler.dataimport.DocBuilder.findMatchingPkColumn(DocBuilder.java:849)
   at 
 org.apache.solr.handler.dataimport.DocBuilder.collectDelta(DocBuilder.java:900)
   at 
 org.apache.solr.handler.dataimport.DocBuilder.collectDelta(DocBuilder.java:879)
   at 
 org.apache.solr.handler.dataimport.DocBuilder.doDelta(DocBuilder.java:285)
   at 
 org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:179)
   at 
 org.apache.solr.handler.dataimport.DataImporter.doDeltaImport(DataImporter.java:390)
   at 
 org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:429)
   at 
 org.apache.solr.handler.dataimport.DataImporter$1.run(DataImporter.java:408)
 I used this dataConfig from the wiki on the data import:
 <dataConfig>
   <dataSource driver="org.hsqldb.jdbcDriver" 
               url="jdbc:hsqldb:./example-DIH/hsqldb/ex" user="sa" />
   <document>
     <entity name="item" pk="ID" 
             query="select * from item" 
             deltaImportQuery="select * from item where ID=='${dataimporter.delta.id}'"
             deltaQuery="select id from item where last_modified &gt; '${dataimporter.last_index_time}'">
       <entity name="item_category" pk="ITEM_ID, CATEGORY_ID"
               query="select CATEGORY_ID from item_category where ITEM_ID='${item.ID}'"
               deltaQuery="select ITEM_ID, CATEGORY_ID from item_category where last_modified > '${dataimporter.last_index_time}'"
               parentDeltaQuery="select ID from item where ID=${item_category.ITEM_ID}">
         <entity name="category" pk="ID"
                 query="select DESCRIPTION as cat from category where ID = '${item_category.CATEGORY_ID}'"
                 deltaQuery="select ID from category where last_modified &gt; '${dataimporter.last_index_time}'"
                 parentDeltaQuery="select ITEM_ID, CATEGORY_ID from item_category where CATEGORY_ID=${category.ID}"/>
       </entity>
     </entity>
   </document>
 </dataConfig>
 To reproduce, use the data config from above and set the dataimport.properties 
 last update times to before the last_modified date in the example data.  In my 
 case I had to set the year to 1969.  Then run a delta-import and the 
 exception occurs.  Thanks.




[jira] [Resolved] (SOLR-3137) When solr.xml is persisted, you lose all system property substitution that was used.

2012-02-16 Thread Mark Miller (Resolved) (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-3137?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Miller resolved SOLR-3137.
---

Resolution: Fixed

 When solr.xml is persisted, you lose all system property substitution that 
 was used. 
 -

 Key: SOLR-3137
 URL: https://issues.apache.org/jira/browse/SOLR-3137
 Project: Solr
  Issue Type: Bug
Reporter: Mark Miller
Assignee: Mark Miller
 Fix For: 4.0

 Attachments: SOLR-3137.patch, SOLR-3137.patch


 A lesser issue is that we also write out properties that were not originally 
 in the file with the defaults they picked up.




[jira] [Resolved] (SOLR-2354) CoreAdminRequest#createCore should allow you to specify the data dir

2012-02-16 Thread Mark Miller (Resolved) (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-2354?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Miller resolved SOLR-2354.
---

Resolution: Invalid

 CoreAdminRequest#createCore should allow you to specify the data dir
 

 Key: SOLR-2354
 URL: https://issues.apache.org/jira/browse/SOLR-2354
 Project: Solr
  Issue Type: Improvement
  Components: clients - java
Reporter: Mark Miller
Assignee: Mark Miller
Priority: Minor
 Fix For: 4.0







[jira] [Commented] (SOLR-3131) details command fails when a replication is forced with a fetchIndex command on a non-slave server

2012-02-16 Thread Mark Miller (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-3131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13209847#comment-13209847
 ] 

Mark Miller commented on SOLR-3131:
---

committed to trunk - I'll add changes and back port to 3.6 as well.

 details command fails when a replication is forced with a fetchIndex 
 command on a non-slave server
 --

 Key: SOLR-3131
 URL: https://issues.apache.org/jira/browse/SOLR-3131
 Project: Solr
  Issue Type: Bug
  Components: replication (java)
Affects Versions: 3.5
Reporter: Tomás Fernández Löbbe
Assignee: Mark Miller
Priority: Minor
 Fix For: 3.6, 4.0

 Attachments: SOLR-3131.patch


 Steps to reproduce the problem:
 1) Start a master Solr instance (called A)
 2) Start a Solr instance with replication handler configured, but with no 
 slave configuration. (called B)
 3) Issue the request 
 http://B:port/solr/replication?command=fetchindex&masterUrl=http://A:port/solr/replication
 4) While B is fetching the index, issue the request: 
 http://B:port/solr/replication?command=details
 Expected behavior: See the replication details as usual.
 Getting an exception instead:
 java.lang.NullPointerException
   at 
 org.apache.solr.handler.ReplicationHandler.isPollingDisabled(ReplicationHandler.java:447)
   at 
 org.apache.solr.handler.ReplicationHandler.getReplicationDetails(ReplicationHandler.java:611)
   at 
 org.apache.solr.handler.ReplicationHandler.handleRequestBody(ReplicationHandler.java:211)
   at 
 org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129)
   at org.apache.solr.core.SolrCore.execute(SolrCore.java:1523)
   at 
 org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:339)
   at 
 org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:234)
   at 
 org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
   at 
 org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:399)
   at 
 org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
   at 
 org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182)
   at 
 org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:766)
   at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:450)
   at 
 org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:230)
   at 
 org.mortbay.jetty.handler.HandlerCollection.handle(HandlerCollection.java:114)
   at 
 org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
   at org.mortbay.jetty.Server.handle(Server.java:326)
   at 
 org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:542)
   at 
 org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:928)
   at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:549)
   at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:212)
   at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:404)
   at 
 org.mortbay.jetty.bio.SocketConnector$Connection.run(SocketConnector.java:228)
   at 
 org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:582)




[jira] [Updated] (SOLR-3131) details command fails when a replication is forced with a fetchIndex command on a non-slave server

2012-02-16 Thread Mark Miller (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-3131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Miller updated SOLR-3131:
--

Affects Version/s: (was: 4.0)
   3.5
Fix Version/s: 4.0
   3.6

 details command fails when a replication is forced with a fetchIndex 
 command on a non-slave server
 --

 Key: SOLR-3131
 URL: https://issues.apache.org/jira/browse/SOLR-3131
 Project: Solr
  Issue Type: Bug
  Components: replication (java)
Affects Versions: 3.5
Reporter: Tomás Fernández Löbbe
Assignee: Mark Miller
Priority: Minor
 Fix For: 3.6, 4.0

 Attachments: SOLR-3131.patch


 Steps to reproduce the problem:
 1) Start a master Solr instance (called A)
 2) Start a Solr instance with replication handler configured, but with no 
 slave configuration. (called B)
 3) Issue the request 
 http://B:port/solr/replication?command=fetchindex&masterUrl=http://A:port/solr/replication
 4) While B is fetching the index, issue the request: 
 http://B:port/solr/replication?command=details
 Expected behavior: See the replication details as usual.
 Getting an exception instead:
 java.lang.NullPointerException
   at 
 org.apache.solr.handler.ReplicationHandler.isPollingDisabled(ReplicationHandler.java:447)
   at 
 org.apache.solr.handler.ReplicationHandler.getReplicationDetails(ReplicationHandler.java:611)
   at 
 org.apache.solr.handler.ReplicationHandler.handleRequestBody(ReplicationHandler.java:211)
   at 
 org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129)
   at org.apache.solr.core.SolrCore.execute(SolrCore.java:1523)
   at 
 org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:339)
   at 
 org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:234)
   at 
 org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
   at 
 org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:399)
   at 
 org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
   at 
 org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182)
   at 
 org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:766)
   at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:450)
   at 
 org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:230)
   at 
 org.mortbay.jetty.handler.HandlerCollection.handle(HandlerCollection.java:114)
   at 
 org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
   at org.mortbay.jetty.Server.handle(Server.java:326)
   at 
 org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:542)
   at 
 org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:928)
   at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:549)
   at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:212)
   at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:404)
   at 
 org.mortbay.jetty.bio.SocketConnector$Connection.run(SocketConnector.java:228)
   at 
 org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:582)




[jira] [Commented] (LUCENE-3795) Replace spatial contrib module with LSP's spatial-lucene module

2012-02-16 Thread Jan Høydahl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13209862#comment-13209862
 ] 

Jan Høydahl commented on LUCENE-3795:
-

Impressive piece of work! Given license stuff is ok, here is my
+1

 Replace spatial contrib module with LSP's spatial-lucene module
 ---

 Key: LUCENE-3795
 URL: https://issues.apache.org/jira/browse/LUCENE-3795
 Project: Lucene - Java
  Issue Type: New Feature
  Components: modules/spatial
Reporter: David Smiley
Assignee: David Smiley
 Fix For: 4.0


 I propose that Lucene's spatial contrib module be replaced with the 
 spatial-lucene module within Lucene Spatial Playground (LSP).  LSP has been 
 in development for approximately 1 year by David Smiley, Ryan McKinley, and 
 Chris Male and we feel it is ready.  LSP is here: 
 http://code.google.com/p/lucene-spatial-playground/  and the spatial-lucene 
 module is intuitively in svn/trunk/spatial-lucene/.
 I'll add more comments to prevent the issue description from being too long.




[jira] [Commented] (LUCENE-3795) Replace spatial contrib module with LSP's spatial-lucene module

2012-02-16 Thread Uwe Schindler (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13209888#comment-13209888
 ] 

Uwe Schindler commented on LUCENE-3795:
---

Cool work! I scanned the code quickly and it seems to fit much better than the 
current spatial!

I have some suggestions regarding performance; BooleanQuery usage and related 
inconsistency with BQ scoring (with coord) in the different strategies; also 
found some caching problems (AtomicReader is key to cache not 
AtomicReader.getCoreCacheKey, so new deleted docs after reopen invalidate the 
cache), but I would prefer to discuss that here once the patch is provided on 
Lucene's JIRA.
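
The caching point can be illustrated with a small, self-contained sketch. It assumes nothing about the actual LSP code: `SegmentView` is a hypothetical stand-in for a per-segment reader that exposes `getCoreCacheKey()`, and `PerSegmentCache` is an invented name. The idea shown is only that keying the cache on the core key lets a reopened reader (same segment core, new deletes) reuse the cached entry instead of rebuilding it.

```java
import java.util.Map;
import java.util.WeakHashMap;

class CoreKeyCacheDemo {
    // Hypothetical stand-in for a per-segment reader; not Lucene's AtomicReader.
    static final class SegmentView {
        private final Object coreKey;

        SegmentView(Object coreKey) { this.coreKey = coreKey; }

        // A reopen that only picks up new deletes yields a new view over the same core.
        SegmentView reopenWithNewDeletes() { return new SegmentView(coreKey); }

        Object getCoreCacheKey() { return coreKey; }
    }

    // Cache keyed by the core key, so views sharing a core share one entry.
    static final class PerSegmentCache {
        private final Map<Object, double[]> cache = new WeakHashMap<>();
        private int loads = 0;

        synchronized double[] get(SegmentView view) {
            return cache.computeIfAbsent(view.getCoreCacheKey(), key -> {
                loads++;                    // count how often the expensive load runs
                return new double[] {0.0};  // placeholder for expensive per-segment data
            });
        }

        synchronized int loads() { return loads; }
    }

    public static void main(String[] args) {
        SegmentView first = new SegmentView(new Object());
        SegmentView reopened = first.reopenWithNewDeletes();

        PerSegmentCache cache = new PerSegmentCache();
        cache.get(first);
        cache.get(reopened);
        System.out.println("loads = " + cache.loads());
    }
}
```

Had the map been keyed on the `SegmentView` instance itself, the second lookup would have missed and triggered another load. Keying on the core key is only safe for data that does not depend on deletions, so deleted documents would still need to be filtered at search time.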

 Replace spatial contrib module with LSP's spatial-lucene module
 ---

 Key: LUCENE-3795
 URL: https://issues.apache.org/jira/browse/LUCENE-3795
 Project: Lucene - Java
  Issue Type: New Feature
  Components: modules/spatial
Reporter: David Smiley
Assignee: David Smiley
 Fix For: 4.0


 I propose that Lucene's spatial contrib module be replaced with the 
 spatial-lucene module within Lucene Spatial Playground (LSP).  LSP has been 
 in development for approximately 1 year by David Smiley, Ryan McKinley, and 
 Chris Male and we feel it is ready.  LSP is here: 
 http://code.google.com/p/lucene-spatial-playground/  and the spatial-lucene 
 module is intuitively in svn/trunk/spatial-lucene/.
 I'll add more comments to prevent the issue description from being too long.




[jira] [Commented] (LUCENE-3795) Replace spatial contrib module with LSP's spatial-lucene module

2012-02-16 Thread Chris Male (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13209966#comment-13209966
 ] 

Chris Male commented on LUCENE-3795:


Huge +1

Thanks so much David for opening this issue and getting the code to a point 
where it can be contributed.  I'm really excited to see this brought into the 
fold and glad to see support from others.

{quote}
As far as LGPL, according to David's description and the title of this jira 
issue (possible i did not interpret it correctly,
correct me if so), the he wants to replace lucene/contrib/spatial with the 
spatial-lucene project, and that
it has no LGPL ties at all, (only spatial-extras does).
{quote}

Absolutely.  The portion of the codebase which uses LGPL code is entirely 
optional and decoupled from the rest of the code.  From a functional 
perspective, as David says, it's only really related to polygon support, which is 
hugely powerful but can exist somewhere else if need be.

 Replace spatial contrib module with LSP's spatial-lucene module
 ---

 Key: LUCENE-3795
 URL: https://issues.apache.org/jira/browse/LUCENE-3795
 Project: Lucene - Java
  Issue Type: New Feature
  Components: modules/spatial
Reporter: David Smiley
Assignee: David Smiley
 Fix For: 4.0


 I propose that Lucene's spatial contrib module be replaced with the 
 spatial-lucene module within Lucene Spatial Playground (LSP).  LSP has been 
 in development for approximately 1 year by David Smiley, Ryan McKinley, and 
 Chris Male and we feel it is ready.  LSP is here: 
 http://code.google.com/p/lucene-spatial-playground/  and the spatial-lucene 
 module is intuitively in svn/trunk/spatial-lucene/.
 I'll add more comments to prevent the issue description from being too long.




[jira] [Commented] (LUCENE-3795) Replace spatial contrib module with LSP's spatial-lucene module

2012-02-16 Thread Chris Male (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13209971#comment-13209971
 ] 

Chris Male commented on LUCENE-3795:


{quote}
but I would prefer to discuss that here once the patch is provided on Lucene's 
JIRA.
{quote}

Is it best to create a patch here and iterate on any problems, or create a 
branch and work through them there?

 Replace spatial contrib module with LSP's spatial-lucene module
 ---

 Key: LUCENE-3795
 URL: https://issues.apache.org/jira/browse/LUCENE-3795
 Project: Lucene - Java
  Issue Type: New Feature
  Components: modules/spatial
Reporter: David Smiley
Assignee: David Smiley
 Fix For: 4.0


 I propose that Lucene's spatial contrib module be replaced with the 
 spatial-lucene module within Lucene Spatial Playground (LSP).  LSP has been 
 in development for approximately 1 year by David Smiley, Ryan McKinley, and 
 Chris Male and we feel it is ready.  LSP is here: 
 http://code.google.com/p/lucene-spatial-playground/  and the spatial-lucene 
 module is intuitively in svn/trunk/spatial-lucene/.
 I'll add more comments to prevent the issue description from being too long.




[jira] [Commented] (SOLR-2332) TikaEntityProcessor retrieves only File Names from Zip extraction

2012-02-16 Thread Lance Norskog (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13210033#comment-13210033
 ] 

Lance Norskog commented on SOLR-2332:
-

Unpacking a zip file is a very narrow, focused operation. This could also be 
done with a separate UpdateRequestHandler that does nothing but unpack zip 
files. It would use the basic JDK zip file code, not Tika. You configure the 
Tika handler beneath it. 

Another use case is a ZIP file full of Solr update XML files, which Tika does 
not know about. To do this, you want an UpdateRequestHandler stack like this: 
zip unpacker -> XmlUpdateRequestHandler
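
The "basic JDK zip file code" part of this idea might look roughly like the sketch below. It uses only `java.util.zip` and is not tied to any Solr handler API; the class name `ZipUnpacker` and its `unpack` method are hypothetical, and a real handler would hand each entry to the downstream handler rather than collect everything in memory.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

class ZipUnpacker {
    // Reads every regular entry of a zip stream into memory, keyed by entry name.
    static Map<String, byte[]> unpack(InputStream in) throws IOException {
        Map<String, byte[]> entries = new LinkedHashMap<>();
        try (ZipInputStream zip = new ZipInputStream(in)) {
            byte[] buffer = new byte[8192];
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                if (entry.isDirectory()) {
                    continue; // directories carry no content
                }
                ByteArrayOutputStream out = new ByteArrayOutputStream();
                int n;
                while ((n = zip.read(buffer)) != -1) { // -1 marks the end of this entry
                    out.write(buffer, 0, n);
                }
                entries.put(entry.getName(), out.toByteArray());
            }
        }
        return entries;
    }

    public static void main(String[] args) throws IOException {
        // Build a tiny zip in memory so the example needs no files on disk.
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(bytes)) {
            zip.putNextEntry(new ZipEntry("doc1.xml"));
            zip.write("<add><doc/></add>".getBytes(StandardCharsets.UTF_8));
            zip.closeEntry();
        }
        Map<String, byte[]> unpacked = unpack(new ByteArrayInputStream(bytes.toByteArray()));
        for (Map.Entry<String, byte[]> e : unpacked.entrySet()) {
            System.out.println(e.getKey() + ": " + e.getValue().length + " bytes");
        }
    }
}
```

Each unpacked entry could then be passed to whichever handler sits beneath the unpacker, for example a Tika-based one for documents or an XML update handler for Solr update files.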


 TikaEntityProcessor retrieves only File Names from Zip extraction
 -

 Key: SOLR-2332
 URL: https://issues.apache.org/jira/browse/SOLR-2332
 Project: Solr
  Issue Type: Bug
  Components: contrib - DataImportHandler
Reporter: Jayendra Patil
 Fix For: 3.6, 4.0

 Attachments: SOLR-2332.patch, solr-word.zip


 Extraction of Zip files using TikaEntityProcessor returns only the names of 
 the files.
 It does not extract the contents of the files in the Zip.




[jira] [Commented] (LUCENE-3776) NRTManager shouldn't expose its private SearcherManager

2012-02-16 Thread Shai Erera (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13210064#comment-13210064
 ] 

Shai Erera commented on LUCENE-3776:


bq. Hang on – SM now takes either IW or Director

You're right, I missed that. For some reason I had the impression it takes an 
IR, which is obviously wrong, since it won't be allowed to close it.

bq. do you mean the SearcherFactory could make some other reader

I'm less worried about that. We give SF an IndexReader, I can only expect that 
it will return an IndexSearcher on top of it. Maybe we can assert that 
IndexSearcher.getIndexReader == newReader in refreshIfNeeded?

bq. I think there's no way a non-DirReader can get into NRTManager 

You're right. If you keep the assert, maybe add a nice msg to it?

bq. I didn't yet add a hard check for an evil SearcherFactory...

I think that's ok to assume that SearcherFactory is not evil. Maybe the assert 
I suggested above would be enough?
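
The assert suggested above, with a descriptive message, could look something like this sketch. The types here (`Reader`, `Searcher`, `SearcherFactory`) are simplified, hypothetical stand-ins rather than the Lucene classes, and `refresh` only imitates the shape of a refreshIfNeeded-style method.

```java
class RefreshCheckDemo {
    // Hypothetical stand-in for IndexReader.
    static final class Reader {
        private final String name;
        Reader(String name) { this.name = name; }
        @Override public String toString() { return name; }
    }

    // Hypothetical stand-in for IndexSearcher.
    static final class Searcher {
        private final Reader reader;
        Searcher(Reader reader) { this.reader = reader; }
        Reader getIndexReader() { return reader; }
    }

    // Hypothetical stand-in for SearcherFactory.
    interface SearcherFactory {
        Searcher newSearcher(Reader reader);
    }

    // True when the searcher was built directly on top of the given reader.
    static boolean wrapsReader(Searcher searcher, Reader reader) {
        return searcher.getIndexReader() == reader;
    }

    // Imitates a refresh step: ask the factory for a searcher over the new reader.
    static Searcher refresh(SearcherFactory factory, Reader newReader) {
        Searcher searcher = factory.newSearcher(newReader);
        assert wrapsReader(searcher, newReader)
            : "SearcherFactory must return a searcher over the provided reader; expected="
              + newReader + " but got=" + searcher.getIndexReader();
        return searcher;
    }

    public static void main(String[] args) {
        Reader reader = new Reader("segments_2");
        Searcher searcher = refresh(Searcher::new, reader);
        System.out.println(wrapsReader(searcher, reader));
    }
}
```

Since this is a plain `assert`, it only fires when the JVM runs with assertions enabled (`-ea`), which matches the idea of trusting the factory in production while still catching mistakes in tests.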

 NRTManager shouldn't expose its private SearcherManager
 ---

 Key: LUCENE-3776
 URL: https://issues.apache.org/jira/browse/LUCENE-3776
 Project: Lucene - Java
  Issue Type: Improvement
Reporter: Michael McCandless
Assignee: Michael McCandless
Priority: Blocker
 Fix For: 3.6, 4.0

 Attachments: LUCENE-3776.patch, LUCENE-3776.patch


 Spinoff from LUCENE-3769.
 To actually obtain an IndexSearcher from NRTManager, it's a 2-step process 
 now.
 You must .getSearcherManager(), then .acquire() from the returned 
 SearcherManager.
 This is very trappy... because if the app incorrectly calls maybeReopen on 
 that private SearcherManager (instead of NRTManager.maybeReopen) then it can 
 unexpectedly cause threads to block forever, waiting for the necessary gen to 
 become visible.  This will be hard to debug... I don't like creating trappy 
 APIs.
 Hopefully once LUCENE-3761 is in, we can fix NRTManager to no longer expose 
 its private SM, instead subclassing ReferenceManager.
 Or alternatively, or in addition, maybe we factor out a new interface 
 (SearcherProvider or something...) that only has acquire and release methods, 
 and both NRTManager and ReferenceManager/SM impl that, and we keep 
 NRTManager's SM private.
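
The narrow-interface alternative can be sketched as follows. Everything here is hypothetical: `SearcherProvider` is the name floated above, while `CountingProvider` and `describe` are invented for illustration and do not mirror the real NRTManager or SearcherManager implementations.

```java
import java.util.concurrent.atomic.AtomicInteger;

class SearcherProviderDemo {
    // Hypothetical narrow interface: only acquire and release, no reopen methods.
    interface SearcherProvider<S> {
        S acquire();
        void release(S searcher);
    }

    // Toy provider that tracks how many acquired references are still outstanding.
    static final class CountingProvider implements SearcherProvider<String> {
        private final AtomicInteger outstanding = new AtomicInteger();

        @Override public String acquire() {
            outstanding.incrementAndGet();
            return "searcher";
        }

        @Override public void release(String searcher) {
            outstanding.decrementAndGet();
        }

        int outstanding() { return outstanding.get(); }
    }

    // Callers depend only on SearcherProvider, so there is no reopen method to misuse.
    static <S> String describe(SearcherProvider<S> provider) {
        S searcher = provider.acquire();
        try {
            return "searched with " + searcher;
        } finally {
            provider.release(searcher); // always pair acquire with release
        }
    }

    public static void main(String[] args) {
        CountingProvider provider = new CountingProvider();
        System.out.println(describe(provider));
        System.out.println("outstanding = " + provider.outstanding());
    }
}
```

Because application code only ever sees the two-method interface, the manager that owns reopening can stay private, which is the property the issue is after.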




[jira] [Commented] (SOLR-3096) Add book information to the new website

2012-02-16 Thread David Smiley (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-3096?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13210066#comment-13210066
 ] 

David Smiley commented on SOLR-3096:


I committed this just now.  It'd be nice to have fellow committers comment on 
the results before it winds up getting published from staging.

http://lucene.staging.apache.org/solr/books.html

 Add book information to the new website
 ---

 Key: SOLR-3096
 URL: https://issues.apache.org/jira/browse/SOLR-3096
 Project: Solr
  Issue Type: Task
Reporter: David Smiley
Assignee: David Smiley
 Attachments: website_books.patch


 The attached patch modifies the new website design to incorporate the book 
 information.  It adds a header mantle slideshow entry with both book images 
 (just the 2 current books), and it adds a book page with the 3 books 
 published (this includes the 1st edition that is out of date now).  The image 
 files referenced are the same actual binary images on the current website, but 
 I chose a more consistent naming convention.




[jira] [Commented] (SOLR-3096) Add book information to the new website

2012-02-16 Thread Chris Male (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-3096?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13210068#comment-13210068
 ] 

Chris Male commented on SOLR-3096:
--

Did a quick glance, looks great, +1

 Add book information to the new website
 ---

 Key: SOLR-3096
 URL: https://issues.apache.org/jira/browse/SOLR-3096
 Project: Solr
  Issue Type: Task
Reporter: David Smiley
Assignee: David Smiley
 Attachments: website_books.patch


 The attached patch modifies the new website design to incorporate the book 
 information.  It adds a header mantle slideshow entry with both book images 
 (just the 2 current books), and it adds a book page with the 3 books 
 published (this includes the 1st edition that is out of date now).  The image 
 files referenced are the same actual binary images on the current website, but 
 I chose a more consistent naming convention.




[jira] [Commented] (LUCENE-3795) Replace spatial contrib module with LSP's spatial-lucene module

2012-02-16 Thread David Smiley (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13210070#comment-13210070
 ] 

David Smiley commented on LUCENE-3795:
--

FYI the code coverage figure is erroneous; Clover didn't recognize some inner 
classes extending other tests as tests.  Using IntelliJ IDEA Ultimate's 
built-in coverage, it's 63% (as counted per line), and I believe it's higher 
once the spatial-solr module, which has a bunch of tests, is brought into the mix.

Uwe, I'm very interested in your input on anything to make the code better.

Given the volume of code, I believe a feature branch makes the most sense 
instead of a humungous patch file.

 Replace spatial contrib module with LSP's spatial-lucene module
 ---

 Key: LUCENE-3795
 URL: https://issues.apache.org/jira/browse/LUCENE-3795
 Project: Lucene - Java
  Issue Type: New Feature
  Components: modules/spatial
Reporter: David Smiley
Assignee: David Smiley
 Fix For: 4.0


 I propose that Lucene's spatial contrib module be replaced with the 
 spatial-lucene module within Lucene Spatial Playground (LSP).  LSP has been 
 in development for approximately 1 year by David Smiley, Ryan McKinley, and 
 Chris Male and we feel it is ready.  LSP is here: 
 http://code.google.com/p/lucene-spatial-playground/  and the spatial-lucene 
 module is intuitively in svn/trunk/spatial-lucene/.
 I'll add more comments to prevent the issue description from being too long.




[jira] [Commented] (LUCENE-3792) Remove StringField

2012-02-16 Thread Robert Muir (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13210105#comment-13210105
 ] 

Robert Muir commented on LUCENE-3792:
-

{quote}
NOT_ANALYZED has two variants - with and without norms.
{quote}

You are right, I forgot about this. For NOT_ANALYZED with norms, we should 
probably just throw CoderMalfunctionError()

 Remove StringField
 --

 Key: LUCENE-3792
 URL: https://issues.apache.org/jira/browse/LUCENE-3792
 Project: Lucene - Java
  Issue Type: Task
Affects Versions: 4.0
Reporter: Robert Muir
 Fix For: 4.0

 Attachments: LUCENE-3792_javadocs_3x.patch, 
 LUCENE-3792_javadocs_3x.patch


 Often on the mailing list there is confusion about NOT_ANALYZED.
 Besides being useless (just use KeywordAnalyzer instead), people trip up on 
 this not being consistent at query time (you really need to configure 
 KeywordAnalyzer for the field on your PerFieldAnalyzerWrapper so it will do 
 the same thing at query time... oh wait, once you've done that, you don't 
 need NOT_ANALYZED).
 So I think StringField is a trap too for the same reasons, just under a 
 different name; let's remove it.
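
To illustrate the query-time consistency point above, here is a minimal, self-contained Java sketch. It does not use Lucene: PerFieldAnalysisSketch, TOKENIZING and KEYWORD are hypothetical stand-ins for PerFieldAnalyzerWrapper, a typical tokenizing analyzer, and KeywordAnalyzer. The only point is that one per-field configuration, shared by indexing and querying, gives the same terms on both sides.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.function.Function;

// Hypothetical stand-in for a per-field analyzer wrapper; not Lucene's API.
final class PerFieldAnalysisSketch {
    // Splits on whitespace and lowercases: a stand-in for a tokenizing analyzer.
    static final Function<String, List<String>> TOKENIZING = text -> {
        List<String> out = new ArrayList<>();
        for (String t : text.trim().split("\\s+")) {
            if (!t.isEmpty()) {
                out.add(t.toLowerCase(Locale.ROOT));
            }
        }
        return out;
    };

    // Emits the whole value as a single token: the keyword-style behaviour.
    static final Function<String, List<String>> KEYWORD =
        text -> Collections.singletonList(text);

    private final Map<String, Function<String, List<String>>> perField = new HashMap<>();
    private final Function<String, List<String>> defaultAnalyzer;

    PerFieldAnalysisSketch(Function<String, List<String>> defaultAnalyzer) {
        this.defaultAnalyzer = defaultAnalyzer;
    }

    void setAnalyzer(String field, Function<String, List<String>> analyzer) {
        perField.put(field, analyzer);
    }

    // Used for both indexing and querying, so both sides see the same terms.
    List<String> analyze(String field, String text) {
        return perField.getOrDefault(field, defaultAnalyzer).apply(text);
    }
}
```

With "sku" mapped to KEYWORD, the value "AB 12" stays one term at index time and at query time, while any other field is still tokenized by the default.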




[jira] [Created] (LUCENE-3796) Disallow setBoost() on StringField, throw exception if boosts are set if norms are omitted

2012-02-16 Thread Robert Muir (Created) (JIRA)

Disallow setBoost() on StringField, throw exception if boosts are set if norms 
are omitted
--

 Key: LUCENE-3796
 URL: https://issues.apache.org/jira/browse/LUCENE-3796
 Project: Lucene - Java
  Issue Type: Bug
Reporter: Robert Muir
Priority: Blocker
 Fix For: 4.0


Occasionally users are confused about why index-time boosts are not applied to 
their norms-omitted fields.

This is because we silently discard the boost: there is no reason for this!

The most absurd part: in 4.0 you can make a StringField and call setBoost and 
nothing complains... (more reasons to remove StringField totally in my opinion)
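
A rough sketch of the fail-fast behaviour proposed here, in plain Java. SketchField is a hypothetical class, not Lucene's Field or StringField, and the choice of IllegalArgumentException is an assumption; the issue only says an exception should be thrown.

```java
// Hypothetical stand-in for a field; not Lucene's actual API.
final class SketchField {
    private final String name;
    private final boolean omitNorms;
    private float boost = 1.0f;

    SketchField(String name, boolean omitNorms) {
        this.name = name;
        this.omitNorms = omitNorms;
    }

    // Rejects a non-default boost when norms are omitted, instead of
    // silently discarding it. The exception type is an assumption.
    void setBoost(float boost) {
        if (omitNorms && boost != 1.0f) {
            throw new IllegalArgumentException(
                "cannot set an index-time boost on field '" + name
                    + "': norms are omitted");
        }
        this.boost = boost;
    }

    float getBoost() {
        return boost;
    }
}
```

A field that keeps norms accepts setBoost(2.0f) as before; a norms-omitted field throws immediately, so the user learns about the problem at indexing time rather than from unexpected scores.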
