[jira] [Commented] (PYLUCENE-9) QueryParser replacing stop words with wildcards

2011-05-17 Thread Christopher Currens (JIRA)

[ https://issues.apache.org/jira/browse/PYLUCENE-9?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13034961#comment-13034961 ]

Christopher Currens commented on PYLUCENE-9:


We can close it.  Thanks for the help.

 QueryParser replacing stop words with wildcards
 ---

 Key: PYLUCENE-9
 URL: https://issues.apache.org/jira/browse/PYLUCENE-9
 Project: PyLucene
 Issue Type: Bug
 Environment: Windows XP 32-bit SP3, Ubuntu 10.04.2 LTS i686 GNU/Linux, jdk1.6.0_23
 Reporter: Christopher Currens

 I was using the query parser to build a query.  In Java Lucene (as well as
 Lucene.Net), the query "Calendar Item as Msg" (quotes included) is parsed
 properly as FullText:"calendar item msg".  In pylucene, it is parsed as
 FullText:"calendar item ? msg".  This causes obvious problems when comparing
 search results from Python, Java and .NET.
 Initially, I thought it was the Analyzer I was using, but I've tried the
 StandardAnalyzer and StopAnalyzer, which work properly in Java and .NET, but
 not in pylucene.
 Here is the code I've used to reproduce the issue:
  >>> from lucene import initVM, StandardAnalyzer, StopAnalyzer, QueryParser, Version
  >>> env = initVM()  # the JVM must be started before any Lucene class is used
  >>> analyzer = StandardAnalyzer(Version.LUCENE_30)
  >>> query = QueryParser(Version.LUCENE_30, "FullText", analyzer)
  >>> parsedQuery = query.parse("\"Calendar Item as Msg\"")
  >>> parsedQuery
  <Query: FullText:"calendar item ? msg">
  >>> analyzer = StopAnalyzer(Version.LUCENE_30)
  >>> query = QueryParser(Version.LUCENE_30, "FullText", analyzer)
  >>> parsedQuery = query.parse("\"Calendar Item as Msg\"")
  >>> parsedQuery
  <Query: FullText:"calendar item ? msg">
 I've noticed this in pylucene 2.9.4, 2.9.3, and 3.0.3.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (PYLUCENE-9) QueryParser replacing stop words with wildcards

2011-05-15 Thread Andi Vajda (JIRA)

[ https://issues.apache.org/jira/browse/PYLUCENE-9?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13033877#comment-13033877 ]

Andi Vajda commented on PYLUCENE-9:
---

Hi Christopher,
Have you figured this out yet?
Can this bug be closed, or is there still something to be done for it?



[jira] [Commented] (PYLUCENE-9) QueryParser replacing stop words with wildcards

2011-05-10 Thread Christopher Currens (JIRA)

[ https://issues.apache.org/jira/browse/PYLUCENE-9?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13031259#comment-13031259 ]

Christopher Currens commented on PYLUCENE-9:


I've posted a question to the java-lucene list; however, I'm sure it won't help
at all.  The simple fact is that the Lucene 3.0 jar parses the query as
ft:"calendar item msg".  The *same* Lucene 3.0 jar, when invoked from
pylucene, produces ft:"calendar item ? msg" for me, on both Windows and
Ubuntu boxes.

I suppose this might just be an issue with JCC?  I've been able to reproduce this
on my boxes at work and on my box at home; all of them produce the incorrect
output.  I'm most curious whether any pylucene developer can reproduce this,
or whether it's just some crazy environment issue happening on my boxes
and those of everyone else I know.



[jira] [Commented] (PYLUCENE-9) QueryParser replacing stop words with wildcards

2011-05-10 Thread Christopher Currens (JIRA)

[ https://issues.apache.org/jira/browse/PYLUCENE-9?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13031284#comment-13031284 ]

Christopher Currens commented on PYLUCENE-9:


Hmm, the code I have is nearly identical, and when I pull it out of the
containing code, it behaves as it should.  I can't post the whole code, but the
issue must be that there's a lingering Version.LUCENE_24 somewhere, I suppose.
I'll try figuring it out on my own; I'm glad to see it's something idiotic I've
done. :)
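
One way to hunt for a lingering Version constant is to scan the project's Python sources for every Version.LUCENE_NN reference and see whether more than one version turns up. The following is only a rough sketch: the function name, the regular expression, and the restriction to .py files are illustrative assumptions, not part of PyLucene.

```python
import os
import re
from collections import defaultdict

# Matches references such as Version.LUCENE_24 or Version.LUCENE_30.
VERSION_RE = re.compile(r"Version\.LUCENE_(\d+)")


def find_version_constants(root):
    """Map each LUCENE_NN constant to the (path, line number) places it appears.

    Hypothetical helper: only files ending in .py under `root` are scanned.
    """
    hits = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if not name.endswith(".py"):
                continue
            path = os.path.join(dirpath, name)
            with open(path, encoding="utf-8", errors="replace") as handle:
                for lineno, line in enumerate(handle, start=1):
                    for match in VERSION_RE.finditer(line):
                        hits["LUCENE_" + match.group(1)].append((path, lineno))
    return dict(hits)
```

Calling find_version_constants(".") from the project root and checking whether the result has more than one key is a quick way to spot a parser and an analyzer that were built with different versions.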



[jira] [Commented] (PYLUCENE-9) QueryParser replacing stop words with wildcards

2011-05-05 Thread Andi Vajda (JIRA)

[ https://issues.apache.org/jira/browse/PYLUCENE-9?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13029666#comment-13029666 ]

Andi Vajda commented on PYLUCENE-9:
---

Are you sure you're comparing the right versions?

Lucene.Net is quite a bit behind Java Lucene, and lots of things changed in
more recent versions.
For instance, different Version instances give different results; notably,
LUCENE_24 works as you seem to expect:
   >>> qp = QueryParser(Version.LUCENE_29, "ft", StandardAnalyzer(Version.LUCENE_29))
   >>> qp.parse('"Calendar Item as Msg"')
   <Query: ft:"calendar item ? msg">   <-- the 'as' stop word gets replaced by a hole, as expected in that version

   >>> qp = QueryParser(Version.LUCENE_24, "ft", StandardAnalyzer(Version.LUCENE_24))
   >>> qp.parse('"Calendar Item as Msg"')
   <Query: ft:"calendar item msg">   <-- works as in Lucene.Net (probably, as I've never run it)

I'm inclined to resolve this bug as INVALID unless I'm missing something here.
Please let me know.
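
The two outputs above can be mimicked with a toy model in plain Python. This is not Lucene code: the stop-word set, the function names, and the keep_positions flag are illustrative assumptions, and the sketch assumes the difference comes down to whether a removed stop word still occupies a position in the phrase.

```python
# Illustrative subset of English stop words; not Lucene's actual list.
STOP_WORDS = {"a", "an", "as", "is", "the"}


def analyze(text, keep_positions):
    """Lowercase, split on whitespace, drop stop words.

    Returns (term, position) pairs. When keep_positions is True, a removed
    stop word still consumes a position, leaving a hole in the sequence.
    """
    tokens = []
    position = -1
    for word in text.lower().split():
        if word in STOP_WORDS:
            if keep_positions:
                position += 1  # the removed word still occupies a slot
            continue
        position += 1
        tokens.append((word, position))
    return tokens


def render_phrase(field, tokens):
    """Render a phrase the way the thread shows it, with '?' for each hole."""
    last = tokens[-1][1] if tokens else -1
    slots = ["?"] * (last + 1)
    for term, pos in tokens:
        slots[pos] = term
    return '%s:"%s"' % (field, " ".join(slots))


with_holes = render_phrase("ft", analyze("Calendar Item as Msg", True))
without_holes = render_phrase("ft", analyze("Calendar Item as Msg", False))
```

With keep_positions=True the phrase keeps a slot for 'as' and renders with a '?', like the LUCENE_29 output; with keep_positions=False the remaining terms close up, like the LUCENE_24 output.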



[jira] [Commented] (PYLUCENE-9) QueryParser replacing stop words with wildcards

2011-05-05 Thread Christopher Currens (JIRA)

[ https://issues.apache.org/jira/browse/PYLUCENE-9?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13029674#comment-13029674 ]

Christopher Currens commented on PYLUCENE-9:


I was very hesitant to report this as a bug, since pylucene isn't a port, but
rather just recompiled.  I am positive I am comparing the correct versions (I'm
a committer on Lucene.Net).  Here are all the configurations I've tested:

Lucene.Net 2.9.2 - Valid
Lucene.Net 2.9.4 - Valid
Java Lucene (via Luke 1.0.1 (uses Lucene 2.9.4)) - Valid
Java Lucene (via Luke 3.1.0 (uses Lucene 3.0)) - Valid
pyLucene (Lucene 2.9.2) - Invalid: stop word replaced by a single wildcard ('?')
pyLucene (Lucene 2.9.4) - Invalid: stop word replaced by a single wildcard ('?')
pyLucene (Lucene 3.0.3) - Invalid: stop word replaced by a single wildcard ('?')

Those tests were all run on the 32-bit Windows XP box.  The Ubuntu box I've used
was running pyLucene with Lucene 2.9.2.

One thing I hadn't considered, though, was whether it can be replicated
outside of the many machines I've used myself to test, specifically whether there's
an issue with our building of it via JCC, or something in our environment.  But
considering I've tried it at work and at home, there's no real other place I
can test it.



[jira] [Commented] (PYLUCENE-9) QueryParser replacing stop words with wildcards

2011-05-05 Thread Andi Vajda (JIRA)

[ https://issues.apache.org/jira/browse/PYLUCENE-9?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13029691#comment-13029691 ]

Andi Vajda commented on PYLUCENE-9:
---

Could you please ask on the java-u...@lucene.apache.org list what the expected
behavior actually is, from Java Lucene's point of view, when Version.LUCENE_24,
LUCENE_29 and LUCENE_30 are passed to both the QueryParser and
StandardAnalyzer constructors?
I remember this changing at some point, but I'm not sure when. Nor do I see,
without further investigation, how PyLucene could be different there, as it just
invokes the embedded Java Lucene jar. Thanks!
