[jira] Updated: (SOLR-2096) DIH should be able read data directly from HDFS for indexing

2010-08-30 Thread Amit Nithian (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-2096?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Amit Nithian updated SOLR-2096:
---

Attachment: hdfs_reader.tar

> DIH should be able read data directly from HDFS for indexing
> 
>
> Key: SOLR-2096
> URL: https://issues.apache.org/jira/browse/SOLR-2096
> Project: Solr
>  Issue Type: New Feature
>  Components: contrib - DataImportHandler
>Affects Versions: 1.4.1
>Reporter: Amit Nithian
> Fix For: 1.4.2
>
> Attachments: hdfs_reader.tar
>
>
> DIH doesn't support reading from the hdfs:// protocol, which makes it hard to 
> index data generated by an M/R job. This tarball contains a subclass of 
> URLDataSource along with an HDFSReader that allows for this. The data is 
> assumed to be in text format and able to be processed by the 
> LineEntityProcessor.
> Here is an example DIH-Config snippet:
>type="org.apache.solr.handler.dataimport.hdfs.HDFSDataSource" 
>   baseUrl="hdfs://:9000/" encoding="UTF-8" 
>   connectionTimeout="5000" readTimeout="1"/>
>   
>  url="/part*" dataSource="queryData">
> 
>   
>   
> 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] Created: (SOLR-2096) DIH should be able read data directly from HDFS for indexing

2010-08-30 Thread Amit Nithian (JIRA)
DIH should be able read data directly from HDFS for indexing


 Key: SOLR-2096
 URL: https://issues.apache.org/jira/browse/SOLR-2096
 Project: Solr
  Issue Type: New Feature
  Components: contrib - DataImportHandler
Affects Versions: 1.4.1
Reporter: Amit Nithian
 Fix For: 1.4.2
 Attachments: hdfs_reader.tar

DIH doesn't support reading from the hdfs:// protocol, which makes it hard to 
index data generated by an M/R job. This tarball contains a subclass of 
URLDataSource along with an HDFSReader that allows for this. The data is 
assumed to be in text format and able to be processed by the 
LineEntityProcessor.

Here is an example DIH-Config snippet:
  










Hudson build is back to normal : Solr-3.x #89

2010-08-30 Thread Apache Hudson Server
See 






Re: Build failed in Hudson: Lucene-trunk #1274

2010-08-30 Thread Simon Willnauer
Seems like a clover problem - the first unit test run was successful though.


Build failed in Hudson: Lucene-trunk #1274

2010-08-30 Thread Apache Hudson Server
See 

Changes:

[rmuir] LUCENE-2629: fix analysis/icu's gennorm2 task

[rmuir] LUCENE-2629: fix analysis/icu's gennorm2 task

[gsingers] LUCENE-2272: fix payload near scoring/explain problem

[mikemccand] fix false random test failure

[simonw] LUCENE-2604: Added RegexpQuery support to QueryParser. Regular 
expressions are now directly supported by the standard QueryParser.

[mikemccand] javadoc fixes

[rmuir] remove javadocs warnings

[rmuir] remove dead code

[rmuir] add basic tests for some untested TFC classes

[rmuir] LUCENE-2627: tone down test for slowlaris

--
[...truncated 13292 lines...]
  [javadoc] Standard Doclet version 1.5.0_22
  [javadoc] Building tree for all the packages and classes...
  [javadoc] 
:43:
 warning - Tag @link: reference not found: IndexWriter#addIndexes(IndexReader[])
  [javadoc] 
:44:
 warning - Tag @link: reference not found: Directory
  [javadoc] 
:63:
 warning - Tag @link: reference not found: NativeFSLockFactory
  [javadoc] 
:44:
 warning - Tag @link: reference not found: Directory
  [javadoc] Building index for all the packages and classes...
  [javadoc] 
:44:
 warning - Tag @link: reference not found: Directory
  [javadoc] Building index for all classes...
  [javadoc] Generating 

  [javadoc] Note: Custom tags that were not seen:  @lucene.internal
  [javadoc] 5 warnings
  [jar] Building jar: 

 [echo] Building queries...

javadocs:
[mkdir] Created dir: 

  [javadoc] Generating Javadoc
  [javadoc] Javadoc execution
  [javadoc] Loading source files for package org.apache.lucene.search...
  [javadoc] Loading source files for package org.apache.lucene.search.regex...
  [javadoc] Loading source files for package org.apache.lucene.search.similar...
  [javadoc] Constructing Javadoc information...
  [javadoc] Standard Doclet version 1.5.0_22
  [javadoc] Building tree for all the packages and classes...
  [javadoc] 
:35:
 warning - Tag @link: can't find prefix in 
org.apache.lucene.search.regex.JakartaRegexpCapabilities
  [javadoc] 
:36:
 warning - Tag @link: reference not found: RegexTermEnum
  [javadoc] 
:36:
 warning - Tag @link: reference not found: RegexTermEnum
  [javadoc] 
:33:
 warning - Tag @link: can't find prefix in 
org.apache.lucene.search.regex.JavaUtilRegexCapabilities
  [javadoc] 
:33:
 warning - Tag @link: can't find match in 
org.apache.lucene.search.regex.JavaUtilRegexCapabilities
  [javadoc] 
:36:
 warning - Tag @link: reference not found: RegexTermEnum
  [javadoc] 
:36:
 warning - Tag @link: reference not found: RegexTermEnum
  [javadoc] 
:36:
 warning - Tag @link: reference not found: RegexTermEnum
  [javadoc] 


[jira] Resolved: (LUCENE-2629) In modules/analysys/icu, ant gennorm2 does not work

2010-08-30 Thread Robert Muir (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-2629?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Muir resolved LUCENE-2629.
-

Fix Version/s: 3.1
   4.0
   Resolution: Fixed

Committed revision 991053 (trunk) 991055 (3x)

Thanks David!

> In modules/analysys/icu, ant gennorm2 does not work
> ---
>
> Key: LUCENE-2629
> URL: https://issues.apache.org/jira/browse/LUCENE-2629
> Project: Lucene - Java
>  Issue Type: Bug
>  Components: Analysis
>Reporter: David Bowen
>Assignee: Robert Muir
> Fix For: 3.1, 4.0
>
> Attachments: gennorm2.patch, gennorm2.patch, LUCENE-2629.patch
>
>
> Command to run gennorm2 does not work at present.  Also, icupkg needs to be 
> called to convert the binary file to big-endian.
> I will attach a patch.




[jira] Created: (SOLR-2095) Hihlighter fragement/formatter for returning just the matching terms

2010-08-30 Thread Hoss Man (JIRA)
Hihlighter fragement/formatter for returning just the matching terms


 Key: SOLR-2095
 URL: https://issues.apache.org/jira/browse/SOLR-2095
 Project: Solr
  Issue Type: New Feature
  Components: highlighter
Reporter: Hoss Man


A somewhat frequent request is to have the highlighter return just the terms 
that actually match, in the fields where they match.

This is not easy, if possible at all, with the current Highlighter fragmenter 
and formatter options, so we should look into adding fragmenters and 
formatters specific to this use case.
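
To make the requested output concrete, here is a rough Java sketch of "just the matching terms, per field". It does not use the Solr or Lucene Highlighter API at all; the class name MatchingTermsSketch, the method matchingTerms, and the naive lowercase-and-split analysis are assumptions made purely for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration only; not a Solr fragmenter or formatter.
class MatchingTermsSketch {

    // For each field of a document, list the query terms that occur in that
    // field's text. Fields without any matching term are left out.
    static Map<String, List<String>> matchingTerms(Map<String, String> doc,
                                                   Set<String> queryTerms) {
        Map<String, List<String>> result = new LinkedHashMap<>();
        for (Map.Entry<String, String> field : doc.entrySet()) {
            List<String> hits = new ArrayList<>();
            // Naive stand-in for real analysis: lowercase, then split on
            // anything that is not a letter or a digit.
            String[] tokens = field.getValue()
                    .toLowerCase(Locale.ROOT)
                    .split("[^\\p{L}\\p{N}]+");
            for (String token : tokens) {
                if (queryTerms.contains(token) && !hits.contains(token)) {
                    hits.add(token);
                }
            }
            if (!hits.isEmpty()) {
                result.put(field.getKey(), hits);
            }
        }
        return result;
    }
}
```

A real implementation would have to reuse each field's analyzer and the query's actual term matches rather than string comparison; the sketch only shows the shape of the response being asked for.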




[jira] Commented: (LUCENE-2629) In modules/analysys/icu, ant gennorm2 does not work

2010-08-30 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12904473#action_12904473
 ] 

Robert Muir commented on LUCENE-2629:
-

bq. I find it a strange concept to have two binary file formats, one for 
big-endian and one for little-endian, only one of which is usable. I would have 
thought that the gennorm2 program should generate the file format that works, 
no matter what machine it is run on.

I could be wrong, but I think the reason ICU's data files are endian-dependent 
is that they are designed to be mapped into memory very quickly
(e.g. the speed at which the underlying character property data can be mapped 
into memory, so that java.lang.Character becomes useful, is performance-sensitive)
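
As a small illustration of why byte order matters for a memory-mapped binary format, the Java sketch below reads the same four bytes under both byte orders using the standard java.nio.ByteBuffer API. It is not ICU or Lucene code; the class name EndianDemo and the method readInt are made up for this example.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical demo class, not part of ICU or Lucene.
class EndianDemo {

    // Interpret the first four bytes of data as an int in the given byte order.
    static int readInt(byte[] data, ByteOrder order) {
        return ByteBuffer.wrap(data).order(order).getInt();
    }

    public static void main(String[] args) {
        // The value 1 written as a big-endian 32-bit integer.
        byte[] bigEndianOne = {0x00, 0x00, 0x00, 0x01};

        // A reader that assumes big-endian sees the intended value.
        System.out.println(readInt(bigEndianOne, ByteOrder.BIG_ENDIAN));

        // A reader that assumes little-endian sees a different number, so the
        // bytes have to be converted before such a reader can use them.
        System.out.println(readInt(bigEndianOne, ByteOrder.LITTLE_ENDIAN));
    }
}
```

A format that is mapped straight into memory cannot afford a per-value swap on every read, which is consistent with converting the file once up front, as icupkg is described as doing in this issue.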


> In modules/analysys/icu, ant gennorm2 does not work
> ---
>
> Key: LUCENE-2629
> URL: https://issues.apache.org/jira/browse/LUCENE-2629
> Project: Lucene - Java
>  Issue Type: Bug
>  Components: Analysis
>Reporter: David Bowen
>Assignee: Robert Muir
> Attachments: gennorm2.patch, gennorm2.patch, LUCENE-2629.patch
>
>
> Command to run gennorm2 does not work at present.  Also, icupkg needs to be 
> called to convert the binary file to big-endian.
> I will attach a patch.




[jira] Updated: (LUCENE-2629) In modules/analysys/icu, ant gennorm2 does not work

2010-08-30 Thread Robert Muir (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-2629?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Muir updated LUCENE-2629:


Attachment: LUCENE-2629.patch

Thanks David, that did the trick!

I made one small change: just in case something goes wrong it uses ${build.dir} 
for the temp file.

I'd like to commit this soon to trunk and 3x.

> In modules/analysys/icu, ant gennorm2 does not work
> ---
>
> Key: LUCENE-2629
> URL: https://issues.apache.org/jira/browse/LUCENE-2629
> Project: Lucene - Java
>  Issue Type: Bug
>  Components: Analysis
>Reporter: David Bowen
>Assignee: Robert Muir
> Attachments: gennorm2.patch, gennorm2.patch, LUCENE-2629.patch
>
>
> Command to run gennorm2 does not work at present.  Also, icupkg needs to be 
> called to convert the binary file to big-endian.
> I will attach a patch.




[jira] Commented: (LUCENE-2629) In modules/analysys/icu, ant gennorm2 does not work

2010-08-30 Thread David Bowen (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12904469#action_12904469
 ] 

David Bowen commented on LUCENE-2629:
-

And by the way, I tested that it is OK to run icupkg on the file even if it is 
already big-endian.

I find it a strange concept to have two binary file formats, one for big-endian 
and one for little-endian, only one of which is usable.  I would have thought 
that the gennorm2 program should generate the file format that works, no matter 
what machine it is run on.

No doubt there are complex reasons for this design weirdness.  I know that, 
sadly, some people still have to deal with EBCDIC.



> In modules/analysys/icu, ant gennorm2 does not work
> ---
>
> Key: LUCENE-2629
> URL: https://issues.apache.org/jira/browse/LUCENE-2629
> Project: Lucene - Java
>  Issue Type: Bug
>  Components: Analysis
>Reporter: David Bowen
>Assignee: Robert Muir
> Attachments: gennorm2.patch, gennorm2.patch
>
>
> Command to run gennorm2 does not work at present.  Also, icupkg needs to be 
> called to convert the binary file to big-endian.
> I will attach a patch.




[jira] Updated: (LUCENE-2629) In modules/analysys/icu, ant gennorm2 does not work

2010-08-30 Thread David Bowen (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-2629?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Bowen updated LUCENE-2629:


Attachment: gennorm2.patch

Oops, I also just noticed that the temp file was not getting deleted: a stupid 
typo (${gennorm.tmp} instead of ${gennorm2.tmp}).  Here's a fixed patch.  

> In modules/analysys/icu, ant gennorm2 does not work
> ---
>
> Key: LUCENE-2629
> URL: https://issues.apache.org/jira/browse/LUCENE-2629
> Project: Lucene - Java
>  Issue Type: Bug
>  Components: Analysis
>Reporter: David Bowen
>Assignee: Robert Muir
> Attachments: gennorm2.patch, gennorm2.patch
>
>
> Command to run gennorm2 does not work at present.  Also, icupkg needs to be 
> called to convert the binary file to big-endian.
> I will attach a patch.




Hudson build is back to normal : Lucene-3.x #101

2010-08-30 Thread Apache Hudson Server
See 






[jira] Commented: (LUCENE-2629) In modules/analysys/icu, ant gennorm2 does not work

2010-08-30 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12904441#action_12904441
 ] 

Robert Muir commented on LUCENE-2629:
-

perfect, now the file can be easily regenerated... i just tested.

(i noticed for whatever strange reason the  didn't delete the utr30.tmp, 
but i'll figure it out)

Thanks a lot!

> In modules/analysys/icu, ant gennorm2 does not work
> ---
>
> Key: LUCENE-2629
> URL: https://issues.apache.org/jira/browse/LUCENE-2629
> Project: Lucene - Java
>  Issue Type: Bug
>  Components: Analysis
>Reporter: David Bowen
>Assignee: Robert Muir
> Attachments: gennorm2.patch
>
>
> Command to run gennorm2 does not work at present.  Also, icupkg needs to be 
> called to convert the binary file to big-endian.
> I will attach a patch.




[jira] Assigned: (LUCENE-2629) In modules/analysys/icu, ant gennorm2 does not work

2010-08-30 Thread Robert Muir (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-2629?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Muir reassigned LUCENE-2629:
---

Assignee: Robert Muir

> In modules/analysys/icu, ant gennorm2 does not work
> ---
>
> Key: LUCENE-2629
> URL: https://issues.apache.org/jira/browse/LUCENE-2629
> Project: Lucene - Java
>  Issue Type: Bug
>  Components: Analysis
>Reporter: David Bowen
>Assignee: Robert Muir
> Attachments: gennorm2.patch
>
>
> Command to run gennorm2 does not work at present.  Also, icupkg needs to be 
> called to convert the binary file to big-endian.
> I will attach a patch.




[jira] Updated: (SOLR-2058) Adds optional "phrase slop" to edismax "pf2", "pf3" and "pf" parameters with field~slop^boost syntax

2010-08-30 Thread Ron Mayer (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-2058?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ron Mayer updated SOLR-2058:


Attachment: edismax_pf_with_slop_v2.1.patch

Removed a couple of unnecessary lines compared to the last version.

> Adds optional "phrase slop" to edismax "pf2", "pf3" and "pf" parameters with 
> field~slop^boost syntax
> 
>
> Key: SOLR-2058
> URL: https://issues.apache.org/jira/browse/SOLR-2058
> Project: Solr
>  Issue Type: Improvement
>  Components: SearchComponents - other
>Affects Versions: 3.1, 4.0
> Environment: n/a
>Reporter: Ron Mayer
>Priority: Minor
> Attachments: edismax_pf_with_slop_v2.1.patch, 
> edismax_pf_with_slop_v2.patch, pf2_with_slop.patch
>
>
> http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201008.mbox/%3c4c659119.2010...@0ape.com%3e
> {quote}
> From  Ron Mayer 
> ... my results might be even better if I had a couple of different "pf2"s with 
> different "ps"'s at the same time. In particular, one with ps=0 to put a 
> high boost on ones that have the right ordering of words. For example, 
> ensuring that [the query]:
>   "red hat black jacket"
>  boosts only documents with "red hats" and not "black hats". And another 
> pf2 with a more modest boost with ps=5 or so to handle the query above also 
> boosting docs with 
>   "red baseball hat".
> {quote}
> [http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201008.mbox/%3caanlktimd+v3g6d_mnhp+jykkd+dej8fvmvf_1lqoi...@mail.gmail.com%3e]
> {quote}
> From  Yonik Seeley 
> Perhaps fold it into the pf/pf2 syntax?
> pf=text^2    // current syntax... makes phrases with a boost of 2
> pf=text~1^2  // proposed syntax... makes phrases with a slop of 1 and
>              // a boost of 2
> That actually seems pretty natural given the lucene query syntax - an
> actual boosted sloppy phrase query already looks like
> {{text:"foo bar"~1^2}}
> -Yonik
> {quote}
> [http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201008.mbox/%3calpine.deb.1.10.1008161300510.6...@radix.cryptio.net%3e]
> {quote}
> From  Chris Hostetter 
> Big +1 to this idea ... the existing "ps" param can stick around as the 
> default for any field that doesn't specify its own slop in the pf/pf2/pf3 
> fields using the "~" syntax.
> -Hoss
> {quote}
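
The field~slop^boost form discussed above can be sketched as a small parser. This is only an illustration under stated assumptions: PfSpec is a hypothetical class (it is not the FieldParams class from the attached patches), the regular expression is one possible reading of the syntax, and the default boost of 1.0 is assumed here rather than taken from the issue.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration; this is not the FieldParams class from the patch.
class PfSpec {
    // Field name, then an optional ~slop, then an optional ^boost.
    private static final Pattern SPEC =
            Pattern.compile("([^~^\\s]+)(?:~(\\d+))?(?:\\^(\\d+(?:\\.\\d+)?))?");

    final String field;
    final Integer slop;  // null: no per-field slop, so the global "ps" would apply
    final float boost;   // assumed to default to 1.0 when no ^boost is given

    PfSpec(String field, Integer slop, float boost) {
        this.field = field;
        this.slop = slop;
        this.boost = boost;
    }

    // Parse one entry of a pf/pf2/pf3 value, e.g. "text~1^2".
    static PfSpec parse(String spec) {
        Matcher m = SPEC.matcher(spec.trim());
        if (!m.matches()) {
            throw new IllegalArgumentException("Unrecognized pf entry: " + spec);
        }
        Integer slop = (m.group(2) == null) ? null : Integer.valueOf(m.group(2));
        float boost = (m.group(3) == null) ? 1.0f : Float.parseFloat(m.group(3));
        return new PfSpec(m.group(1), slop, boost);
    }

    public static void main(String[] args) {
        for (String s : new String[] {"text^2", "text~1^2"}) {
            PfSpec p = parse(s);
            System.out.println(s + " -> field=" + p.field
                    + " slop=" + p.slop + " boost=" + p.boost);
        }
    }
}
```

Keeping the slop nullable mirrors the suggestion in the thread that "ps" stays as the default for any field that does not carry its own "~" value.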




[jira] Updated: (LUCENE-2629) In modules/analysys/icu, ant gennorm2 does not work

2010-08-30 Thread David Bowen (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-2629?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Bowen updated LUCENE-2629:


Attachment: gennorm2.patch

Just a build.xml tweak.

I included a couple of extra tests for the ICUFoldingFilter, on the basis that 
more tests can't hurt.



> In modules/analysys/icu, ant gennorm2 does not work
> ---
>
> Key: LUCENE-2629
> URL: https://issues.apache.org/jira/browse/LUCENE-2629
> Project: Lucene - Java
>  Issue Type: Bug
>  Components: Analysis
>Reporter: David Bowen
> Attachments: gennorm2.patch
>
>
> Command to run gennorm2 does not work at present.  Also, icupkg needs to be 
> called to convert the binary file to big-endian.
> I will attach a patch.




[jira] Created: (LUCENE-2629) In modules/analysys/icu, ant gennorm2 does not work

2010-08-30 Thread David Bowen (JIRA)
In modules/analysys/icu, ant gennorm2 does not work
---

 Key: LUCENE-2629
 URL: https://issues.apache.org/jira/browse/LUCENE-2629
 Project: Lucene - Java
  Issue Type: Bug
  Components: Analysis
Reporter: David Bowen



Command to run gennorm2 does not work at present.  Also, icupkg needs to be 
called to convert the binary file to big-endian.

I will attach a patch.




[jira] Assigned: (SOLR-2013) ASCIIFoldingFilter => MappingCharFilterFactory as a mapping file

2010-08-30 Thread Koji Sekiguchi (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-2013?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Koji Sekiguchi reassigned SOLR-2013:


Assignee: Koji Sekiguchi

I'm going to commit the attached file (w/ perl script) to example conf 
directory of trunk and 3.x.

> ASCIIFoldingFilter => MappingCharFilterFactory as a mapping file
> 
>
> Key: SOLR-2013
> URL: https://issues.apache.org/jira/browse/SOLR-2013
> Project: Solr
>  Issue Type: New Feature
>Affects Versions: 3.1, Next
>Reporter: Steven Rowe
>Assignee: Koji Sekiguchi
>Priority: Minor
> Fix For: 3.1, Next
>
> Attachments: mapping-FoldToASCII.txt, mapping-FoldToASCII.txt
>
>
> Attached is a mapping file to provide the equivalent of ASCIIFoldingFilter 
> through the MappingCharFilterFactory.
> I'm not sure where this should go in the source tree.




[jira] Updated: (SOLR-2058) Adds optional "phrase slop" to edismax "pf2", "pf3" and "pf" parameters with field~slop^boost syntax

2010-08-30 Thread Ron Mayer (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-2058?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ron Mayer updated SOLR-2058:


Attachment: edismax_pf_with_slop_v2.patch

Updated to use a more sane FieldParams class to pass fields, boosts, and phrase 
slops instead of the bizarre Map> I was using before.

> Adds optional "phrase slop" to edismax "pf2", "pf3" and "pf" parameters with 
> field~slop^boost syntax
> 
>
> Key: SOLR-2058
> URL: https://issues.apache.org/jira/browse/SOLR-2058
> Project: Solr
>  Issue Type: Improvement
>  Components: SearchComponents - other
>Affects Versions: 3.1, 4.0
> Environment: n/a
>Reporter: Ron Mayer
>Priority: Minor
> Attachments: edismax_pf_with_slop_v2.patch, pf2_with_slop.patch
>
>




[jira] Issue Comment Edited: (SOLR-2058) Adds optional "phrase slop" to edismax "pf2", "pf3" and "pf" parameters with field~slop^boost syntax

2010-08-30 Thread Ron Mayer (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2058?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12904390#action_12904390
 ] 

Ron Mayer edited comment on SOLR-2058 at 8/30/10 6:40 PM:
--

Submitted an updated patch to use a more sane FieldParams class to pass 
fields, boosts, and phrase slops instead of the bizarre 
Map> I was using before.

  was (Author: ramayer):
Updated to use more sane FieldParams class to pass fields,, boosts, and 
phrase slops instead of the bizarre Map> I was using 
before.
  
> Adds optional "phrase slop" to edismax "pf2", "pf3" and "pf" parameters with 
> field~slop^boost syntax
> 
>
> Key: SOLR-2058
> URL: https://issues.apache.org/jira/browse/SOLR-2058
> Project: Solr
>  Issue Type: Improvement
>  Components: SearchComponents - other
>Affects Versions: 3.1, 4.0
> Environment: n/a
>Reporter: Ron Mayer
>Priority: Minor
> Attachments: edismax_pf_with_slop_v2.patch, pf2_with_slop.patch
>
>




[jira] Commented: (LUCENE-2628) Extract OpenBitSet to Apache Commons

2010-08-30 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12904386#action_12904386
 ] 

Robert Muir commented on LUCENE-2628:
-

bq. FWIW: I can ... the snag robert ran into in SOLR-2034. we don't want SolrJ 
to have a dependency on lucene-core, but it would be nice to re-use the UTF-8 
serialization code instead of duplicating it

But maybe not, the stuff in UnicodeUtil isn't the best there anyway as it's doing 
either:
* incremental conversion [and wasting cpu updating useless offsets]
* computing terms hash as it goes [and wasting cpu computing useless hash codes]


> Extract OpenBitSet to Apache Commons
> 
>
> Key: LUCENE-2628
> URL: https://issues.apache.org/jira/browse/LUCENE-2628
> Project: Lucene - Java
>  Issue Type: Wish
>Reporter: Stu Hood
>
> o.a.l.util.OpenBitSet is a great alternative to java.util.BitSet, and it is 
> generally useful outside of the search field. It would be great if OpenBitSet 
> were available outside of Lucene proper, perhaps as part of Apache Commons.
> Aside from the communication required to accomplish this, there is the small 
> issue of OpenBitSet extending o.a.l.search.DocIdSet in Lucene 3.0. There is 
> very little logic contained in DocIdSet, so it could probably become an 
> interface: Lucene proper could then extend the extracted version of OpenBitSet 
> to implement DocIdSet.
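
The arrangement proposed in the description can be sketched roughly as follows. All three types are hypothetical stand-ins: DocIdSetLike is not Lucene's DocIdSet, StandaloneBitSet is not OpenBitSet, and LuceneBitSet does not exist; the sketch only shows a dependency-free bit set being extended to satisfy a search-specific interface.

```java
// Hypothetical stand-in for DocIdSet turned into an interface.
interface DocIdSetLike {
    boolean contains(int docId);
}

// Hypothetical stand-in for a bit set extracted into a library that has no
// dependency on Lucene.
class StandaloneBitSet {
    private final long[] bits;

    StandaloneBitSet(int numBits) {
        bits = new long[(numBits + 63) >>> 6]; // each long holds 64 bits
    }

    void set(int index) {
        bits[index >>> 6] |= 1L << (index & 63);
    }

    boolean get(int index) {
        return (bits[index >>> 6] & (1L << (index & 63))) != 0;
    }
}

// What Lucene proper could then do: extend the extracted class and add the
// search-specific interface on top of it.
class LuceneBitSet extends StandaloneBitSet implements DocIdSetLike {
    LuceneBitSet(int maxDoc) {
        super(maxDoc);
    }

    @Override
    public boolean contains(int docId) {
        return get(docId);
    }
}
```

The point of the split is that only LuceneBitSet knows about the search interface, so the base class could live in another project without pulling Lucene along.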




[jira] Commented: (LUCENE-2628) Extract OpenBitSet to Apache Commons

2010-08-30 Thread Hoss Man (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12904379#action_12904379
 ] 

Hoss Man commented on LUCENE-2628:
--

bq. so lets be honest, the "lucene" part boils down to: 'please delete this 
class and depend on this jar file instead.'

agreed ... if commons wants to include OpenBitSet, and promote its use to the 
java community at large, i'm all for that, but i don't really see any Lucene 
issue here at the moment.  If Commons's version of OpenBitSet takes off and 
becomes the de facto "bit set" impl people use in Java, then Lucene may want to 
reconsider its current "no deps for core" policy and start depending on 
commons-bitset, but we aren't there yet, so this is really a non-issue.


{quote}
I think a util jar is a great idea but not so we can publish it for others. As 
we modularise more, there will be utility classes that are useful across 
multiple modules. I don't think they should be stuck into lucene-core just 
because it's the only consistent dependency. But I don't think OBS fits into 
this pool necessarily since it really is tuned for the search func in 
lucene-core. 
{quote}
...
{quote}
Can you give a concrete example how a "utility jar" would be useful?

I didn't think so.
{quote}

FWIW: I can ... the snag robert ran into in SOLR-2034.  we don't want SolrJ to 
have a dependency on lucene-core, but it would be nice to re-use the UTF-8 
serialization code instead of duplicating it.

> Extract OpenBitSet to Apache Commons
> 
>
> Key: LUCENE-2628
> URL: https://issues.apache.org/jira/browse/LUCENE-2628
> Project: Lucene - Java
>  Issue Type: Wish
>Reporter: Stu Hood
>
> o.a.l.util.OpenBitSet is a great alternative to java.util.BitSet, and it is 
> generally useful outside of the search field. It would be great if OpenBitSet 
> were available outside of Lucene proper, perhaps as part of Apache Commons.
> Aside from the communication required to accomplish this, there is the small 
> issue of OpenBitSet extending o.a.l.search.DocIdSet in Lucene 3.0. There is 
> very little logic contained in DocIdSet, so it could probably become an 
> interface: Lucene proper could then extend the extract version of OpenBitSet 
> to implement DocIdSet.




[jira] Commented: (LUCENE-2609) Generate jar containing test classes.

2010-08-30 Thread Grant Ingersoll (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12904367#action_12904367
 ] 

Grant Ingersoll commented on LUCENE-2609:
-

Both trunk and 3.x of Lucene.

> Generate jar containing test classes.
> -
>
> Key: LUCENE-2609
> URL: https://issues.apache.org/jira/browse/LUCENE-2609
> Project: Lucene - Java
>  Issue Type: Improvement
>  Components: Build
>Affects Versions: 3.0.2
>Reporter: Drew Farris
>Assignee: Grant Ingersoll
>Priority: Minor
> Attachments: LUCENE-2609.patch, LUCENE-2609.patch
>
>
> The test classes are useful for writing unit tests for code external to the 
> Lucene project. It would be helpful to build a jar of these classes and 
> publish them as a maven dependency.




[jira] Commented: (SOLR-2058) Adds optional "phrase slop" to edismax "pf2", "pf3" and "pf" parameters with field~slop^boost syntax

2010-08-30 Thread Ron Mayer (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2058?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12904352#action_12904352
 ] 

Ron Mayer commented on SOLR-2058:
-

Also wanted to note - I've been using this on a QA machine with 4 million 
documents, and it has been working extremely well for me, with multiple 
simultaneous phrase slops.

In particular, if I use:
* a high boost (500) on pf with a slop of 0
* a moderate boost (50) on pf with a slop of 50
* a moderate boost (50) on pf2 with a slop of 0
* a low boost (10) on pf2 with a slop of 10

it's doing a *great* job of getting the most relevant document in the #1 spot, 
and a very good job at getting the entire first page of results filled with 
highly relevant documents.
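
As a rough illustration of the four settings listed above, the request
parameters might be assembled like this in Python. This is only a sketch: it
assumes the field~slop^boost syntax proposed in this issue, a hypothetical
field named "text", and that several entries in one parameter are separated
by whitespace as with the existing pf parameter. The helper name
phrase_field is made up for the example.

```python
from urllib.parse import urlencode


def phrase_field(field, slop, boost):
    """Format one entry in the proposed field~slop^boost syntax."""
    return f"{field}~{slop}^{boost}"


# "text" is a hypothetical field name; the slop and boost values are the
# ones listed in the comment above.
params = {
    "defType": "edismax",
    "q": "red hat black jacket",
    "pf": " ".join([phrase_field("text", 0, 500), phrase_field("text", 50, 50)]),
    "pf2": " ".join([phrase_field("text", 0, 50), phrase_field("text", 10, 10)]),
}

# URL-encoded form, ready to append to a select request.
query_string = urlencode(params)
```

The pf entries favour whole-query phrase matches and the pf2 entries favour
word pairs; whether repeating one field with different slops behaves this way
depends on the attached patch being applied.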



> Adds optional "phrase slop" to edismax "pf2", "pf3" and "pf" parameters with 
> field~slop^boost syntax
> 
>
> Key: SOLR-2058
> URL: https://issues.apache.org/jira/browse/SOLR-2058
> Project: Solr
>  Issue Type: Improvement
>  Components: SearchComponents - other
>Affects Versions: 3.1, 4.0
> Environment: n/a
>Reporter: Ron Mayer
>Priority: Minor
> Attachments: pf2_with_slop.patch
>
>
> http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201008.mbox/%3c4c659119.2010...@0ape.com%3e
> {quote}
> From  Ron Mayer 
> ... my results might be even better if I had a couple of different "pf2"s with 
> different "ps"s at the same time.  In particular, one with ps=0 to put a 
> high boost on the ones that have the right ordering of words.  For example, 
> ensuring that [the query]:
>   "red hat black jacket"
>  boosts only documents with "red hats" and not "black hats".   And another 
> pf2 with a more modest boost with ps=5 or so to handle the query above also 
> boosting docs with 
>   "red baseball hat".
> {quote}
> [http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201008.mbox/%3caanlktimd+v3g6d_mnhp+jykkd+dej8fvmvf_1lqoi...@mail.gmail.com%3e]
> {quote}
> From  Yonik Seeley 
> Perhaps fold it into the pf/pf2 syntax?
> pf=text^2    // current syntax... makes phrases with a boost of 2
> pf=text~1^2  // proposed syntax... makes phrases with a slop of 1 and
>              // a boost of 2
> That actually seems pretty natural given the lucene query syntax - an
> actual boosted sloppy phrase query already looks like
> {{text:"foo bar"~1^2}}
> -Yonik
> {quote}
> [http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201008.mbox/%3calpine.deb.1.10.1008161300510.6...@radix.cryptio.net%3e]
> {quote}
> From  Chris Hostetter 
> Big +1 to this idea ... the existing "ps" param can stick around as the 
> default for any field that doesn't specify its own slop in the pf/pf2/pf3 
> fields using the "~" syntax.
> -Hoss
> {quote}




[jira] Commented: (SOLR-2058) Adds optional "phrase slop" to edismax "pf2", "pf3" and "pf" parameters with field~slop^boost syntax

2010-08-30 Thread Ron Mayer (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2058?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12904351#action_12904351
 ] 

Ron Mayer commented on SOLR-2058:
-

Totally agree that the way I encoded the boost was bizarre.

I did that in my first draft just to minimize the impact on the rest of the 
code (where some functions were expecting the "Map" pieces).


I'll post an updated patch with a more sane class like the one you described.   
I'm new enough to the code that I'm not sure where such a class should reside.  
Any opinions?
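
For illustration only, a small value class of the kind discussed here could
look like the Python sketch below. The actual patch is Java and its class is
not shown in this thread, so every name here (PhraseField,
parse_phrase_fields) is hypothetical, and the default boost of 1.0 is an
assumption. The sketch parses the field~slop^boost syntax into explicit
fields instead of packing slop and boost into a map value.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Matches "field", "field^boost", "field~slop" and "field~slop^boost".
_SPEC = re.compile(
    r"^(?P<field>[^~^\s]+)(?:~(?P<slop>\d+))?(?:\^(?P<boost>\d+(?:\.\d+)?))?$"
)


@dataclass(frozen=True)
class PhraseField:
    """Hypothetical holder for one pf/pf2/pf3 entry."""
    field: str
    slop: Optional[int] = None  # None means "fall back to the ps parameter"
    boost: float = 1.0          # assumed default when no ^boost is given


def parse_phrase_fields(value):
    """Parse a whitespace-separated pf-style value into PhraseField objects."""
    result = []
    for spec in value.split():
        match = _SPEC.match(spec)
        if match is None:
            raise ValueError(f"cannot parse phrase field spec: {spec!r}")
        slop = match.group("slop")
        boost = match.group("boost")
        result.append(PhraseField(
            field=match.group("field"),
            slop=int(slop) if slop is not None else None,
            boost=float(boost) if boost is not None else 1.0,
        ))
    return result
```

Keeping slop optional lets the existing "ps" parameter act as the default for
entries that do not use the "~" syntax.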


> Adds optional "phrase slop" to edismax "pf2", "pf3" and "pf" parameters with 
> field~slop^boost syntax
> 
>
> Key: SOLR-2058
> URL: https://issues.apache.org/jira/browse/SOLR-2058
> Project: Solr
>  Issue Type: Improvement
>  Components: SearchComponents - other
>Affects Versions: 3.1, 4.0
> Environment: n/a
>Reporter: Ron Mayer
>Priority: Minor
> Attachments: pf2_with_slop.patch
>
>
> http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201008.mbox/%3c4c659119.2010...@0ape.com%3e
> {quote}
> From  Ron Mayer 
> ... my results might be even better if I had a couple of different "pf2"s with 
> different "ps"s at the same time.  In particular, one with ps=0 to put a 
> high boost on the ones that have the right ordering of words.  For example, 
> ensuring that [the query]:
>   "red hat black jacket"
>  boosts only documents with "red hats" and not "black hats".   And another 
> pf2 with a more modest boost with ps=5 or so to handle the query above also 
> boosting docs with 
>   "red baseball hat".
> {quote}
> [http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201008.mbox/%3caanlktimd+v3g6d_mnhp+jykkd+dej8fvmvf_1lqoi...@mail.gmail.com%3e]
> {quote}
> From  Yonik Seeley 
> Perhaps fold it into the pf/pf2 syntax?
> pf=text^2    // current syntax... makes phrases with a boost of 2
> pf=text~1^2  // proposed syntax... makes phrases with a slop of 1 and
>              // a boost of 2
> That actually seems pretty natural given the lucene query syntax - an
> actual boosted sloppy phrase query already looks like
> {{text:"foo bar"~1^2}}
> -Yonik
> {quote}
> [http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201008.mbox/%3calpine.deb.1.10.1008161300510.6...@radix.cryptio.net%3e]
> {quote}
> From  Chris Hostetter 
> Big +1 to this idea ... the existing "ps" param can stick around as the 
> default for any field that doesn't specify its own slop in the pf/pf2/pf3 
> fields using the "~" syntax.
> -Hoss
> {quote}




[jira] Commented: (LUCENE-2609) Generate jar containing test classes.

2010-08-30 Thread Drew Farris (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12904337#action_12904337
 ] 

Drew Farris commented on LUCENE-2609:
-

Grant, I'll take a look. So I can reproduce the issue you're running into, 
which directory are you executing 'ant generate-maven-artifacts' from, 
branch_3x or branch_3x/lucene?



> Generate jar containing test classes.
> -
>
> Key: LUCENE-2609
> URL: https://issues.apache.org/jira/browse/LUCENE-2609
> Project: Lucene - Java
>  Issue Type: Improvement
>  Components: Build
>Affects Versions: 3.0.2
>Reporter: Drew Farris
>Assignee: Grant Ingersoll
>Priority: Minor
> Attachments: LUCENE-2609.patch, LUCENE-2609.patch
>
>
> The test classes are useful for writing unit tests for code external to the 
> Lucene project. It would be helpful to build a jar of these classes and 
> publish them as a maven dependency.




[jira] Commented: (LUCENE-2272) PayloadNearQuery has hardwired explanation for 'AveragePayloadFunction'

2010-08-30 Thread Grant Ingersoll (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12904335#action_12904335
 ] 

Grant Ingersoll commented on LUCENE-2272:
-

Committed to trunk and 3.x

> PayloadNearQuery has hardwired explanation for 'AveragePayloadFunction'
> ---
>
> Key: LUCENE-2272
> URL: https://issues.apache.org/jira/browse/LUCENE-2272
> Project: Lucene - Java
>  Issue Type: Bug
>  Components: Search
>Reporter: Peter Keegan
>Assignee: Grant Ingersoll
> Attachments: payloadfunctin-patch.txt, PNQ-patch.txt, PNQ-patch1.txt, 
> PNQ-patch2.txt
>
>
> The 'explain' method in PayloadNearSpanScorer assumes the 
> AveragePayloadFunction was used. This patch adds the 'explain' method to the 
> 'PayloadFunction' interface, where the Scorer can call it. Added unit tests 
> for 'explain' and for {Min,Max}PayloadFunction.
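
The design described in the issue summary can be sketched roughly as follows.
This is a Python analogue written for illustration, not the Lucene Java code:
the class and method names are hypothetical stand-ins, and the fallback score
of 1.0 when no payloads were seen is an assumption. The point it shows is
that the scorer delegates the explanation to whichever payload function it
was given, instead of hardwiring the averaging case.

```python
from abc import ABC, abstractmethod


class PayloadFunction(ABC):
    """Hypothetical Python analogue of the interface described above."""

    @abstractmethod
    def doc_score(self, num_payloads_seen, payload_score):
        """Combine the accumulated payload score into a per-document score."""

    @abstractmethod
    def explain(self, num_payloads_seen, payload_score):
        """Describe how doc_score arrived at its value."""


class AveragePayloadFunction(PayloadFunction):
    def doc_score(self, num_payloads_seen, payload_score):
        # payload_score is assumed to hold the running sum of payloads.
        return payload_score / num_payloads_seen if num_payloads_seen > 0 else 1.0

    def explain(self, num_payloads_seen, payload_score):
        value = self.doc_score(num_payloads_seen, payload_score)
        return f"AveragePayloadFunction: {payload_score} / {num_payloads_seen} = {value}"


class MaxPayloadFunction(PayloadFunction):
    def doc_score(self, num_payloads_seen, payload_score):
        # payload_score is assumed to already hold the running maximum.
        return payload_score if num_payloads_seen > 0 else 1.0

    def explain(self, num_payloads_seen, payload_score):
        value = self.doc_score(num_payloads_seen, payload_score)
        return f"MaxPayloadFunction: max payload = {value}"


class NearSpanScorer:
    """Stand-in scorer: it no longer assumes which function is in use."""

    def __init__(self, function):
        self.function = function

    def explain(self, num_payloads_seen, payload_score):
        # Delegate to the configured function rather than hardwiring one.
        return self.function.explain(num_payloads_seen, payload_score)
```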




[jira] Commented: (LUCENE-2609) Generate jar containing test classes.

2010-08-30 Thread Grant Ingersoll (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12904330#action_12904330
 ] 

Grant Ingersoll commented on LUCENE-2609:
-

ant generate-maven-artifacts does not appear to work.  I don't think it is the 
fault of this patch, but until it works, I can't test this.

> Generate jar containing test classes.
> -
>
> Key: LUCENE-2609
> URL: https://issues.apache.org/jira/browse/LUCENE-2609
> Project: Lucene - Java
>  Issue Type: Improvement
>  Components: Build
>Affects Versions: 3.0.2
>Reporter: Drew Farris
>Assignee: Grant Ingersoll
>Priority: Minor
> Attachments: LUCENE-2609.patch, LUCENE-2609.patch
>
>
> The test classes are useful for writing unit tests for code external to the 
> Lucene project. It would be helpful to build a jar of these classes and 
> publish them as a maven dependency.




[jira] Commented: (LUCENE-2272) PayloadNearQuery has hardwired explanation for 'AveragePayloadFunction'

2010-08-30 Thread Peter Keegan (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12904328#action_12904328
 ] 

Peter Keegan commented on LUCENE-2272:
--

That is weird! I hope you didn't spend too much time on it.

Thanks,
Peter



> PayloadNearQuery has hardwired explanation for 'AveragePayloadFunction'
> ---
>
> Key: LUCENE-2272
> URL: https://issues.apache.org/jira/browse/LUCENE-2272
> Project: Lucene - Java
>  Issue Type: Bug
>  Components: Search
>Reporter: Peter Keegan
>Assignee: Grant Ingersoll
> Attachments: payloadfunctin-patch.txt, PNQ-patch.txt, PNQ-patch1.txt, 
> PNQ-patch2.txt
>
>
> The 'explain' method in PayloadNearSpanScorer assumes the 
> AveragePayloadFunction was used. This patch adds the 'explain' method to the 
> 'PayloadFunction' interface, where the Scorer can call it. Added unit tests 
> for 'explain' and for {Min,Max}PayloadFunction.




[jira] Updated: (LUCENE-2590) Enable access to the freq information in a Query's sub-scorers

2010-08-30 Thread Simon Willnauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-2590?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Simon Willnauer updated LUCENE-2590:


Attachment: LUCENE-2590.patch

new patch - fixed the code dup

> Enable access to the freq information in a Query's sub-scorers
> --
>
> Key: LUCENE-2590
> URL: https://issues.apache.org/jira/browse/LUCENE-2590
> Project: Lucene - Java
>  Issue Type: Improvement
>  Components: Search
>Reporter: Michael McCandless
>Assignee: Simon Willnauer
> Attachments: LUCENE-2590.patch, LUCENE-2590.patch, LUCENE-2590.patch, 
> LUCENE-2590.patch, LUCENE-2590.patch, LUCENE-2590.patch
>
>
> The ability to gather more details than just the score, of how a given
> doc matches the current query, has come up a number of times on the
> user's lists.  (most recently in the thread "Query Match Count" by
> Ryan McV on java-user).
> EG if you have a simple TermQuery "foo", on each hit you'd like to
> know how many times "foo" occurred in that doc; or a BooleanQuery +foo
> +bar, being able to separately see the freq of foo and bar for the
> current hit.
> Lucene doesn't make this possible today, which is a shame because
> Lucene in fact does compute exactly this information; it's just not
> accessible from the Collector.




[jira] Commented: (LUCENE-2272) PayloadNearQuery has hardwired explanation for 'AveragePayloadFunction'

2010-08-30 Thread Grant Ingersoll (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12904317#action_12904317
 ] 

Grant Ingersoll commented on LUCENE-2272:
-

Wow, weird timing, Peter.  I was just looking at this today, hoping to finish 
it, and you put up a patch.

> PayloadNearQuery has hardwired explanation for 'AveragePayloadFunction'
> ---
>
> Key: LUCENE-2272
> URL: https://issues.apache.org/jira/browse/LUCENE-2272
> Project: Lucene - Java
>  Issue Type: Bug
>  Components: Search
>Reporter: Peter Keegan
>Assignee: Grant Ingersoll
> Attachments: payloadfunctin-patch.txt, PNQ-patch.txt, PNQ-patch1.txt, 
> PNQ-patch2.txt
>
>
> The 'explain' method in PayloadNearSpanScorer assumes the 
> AveragePayloadFunction was used. This patch adds the 'explain' method to the 
> 'PayloadFunction' interface, where the Scorer can call it. Added unit tests 
> for 'explain' and for {Min,Max}PayloadFunction.




[jira] Updated: (LUCENE-2272) PayloadNearQuery has hardwired explanation for 'AveragePayloadFunction'

2010-08-30 Thread Peter Keegan (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-2272?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Keegan updated LUCENE-2272:
-

Attachment: PNQ-patch2.txt

Well, this is embarrassing. 

I used Eclipse to generate the patch, and didn't exclude an existing text file 
in the project that already contained the patch. I have regenerated the patch 
against the trunk, which also restored the generics and missing annotations. 
Sorry for the confusion.

I also changed my JIRA e-mail so I don't miss updates on issues sent to me vs. 
the java-dev list.

> PayloadNearQuery has hardwired explanation for 'AveragePayloadFunction'
> ---
>
> Key: LUCENE-2272
> URL: https://issues.apache.org/jira/browse/LUCENE-2272
> Project: Lucene - Java
>  Issue Type: Bug
>  Components: Search
>Reporter: Peter Keegan
>Assignee: Grant Ingersoll
> Attachments: payloadfunctin-patch.txt, PNQ-patch.txt, PNQ-patch1.txt, 
> PNQ-patch2.txt
>
>
> The 'explain' method in PayloadNearSpanScorer assumes the 
> AveragePayloadFunction was used. This patch adds the 'explain' method to the 
> 'PayloadFunction' interface, where the Scorer can call it. Added unit tests 
> for 'explain' and for {Min,Max}PayloadFunction.




[jira] Commented: (LUCENE-2590) Enable access to the freq information in a Query's sub-scorers

2010-08-30 Thread Simon Willnauer (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12904313#action_12904313
 ] 

Simon Willnauer commented on LUCENE-2590:
-

bq. Though, can't BS and BS2 just call super.visitSubScorers first, and then 
visit their subs? (Ie right now they dup super's code right?).
Nah, good point mike :) I missed that, nice code reuse though! - will fix that 
soon.
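
The reuse suggested in the quoted comment can be sketched in Python as below.
This is an illustration of the pattern only: the classes are hypothetical
stand-ins, not the Lucene Scorer API, and the method name visit_sub_scorers
merely mirrors the visitSubScorers name mentioned above. The subclass calls
the base implementation first and then visits its own sub-scorers, so the
base logic is not duplicated.

```python
class Scorer:
    """Hypothetical stand-in for a scorer that can report itself to a visitor."""

    def __init__(self, name):
        self.name = name

    def visit_sub_scorers(self, visitor):
        # Base behaviour: report this scorer itself.
        visitor(self)


class BooleanScorer(Scorer):
    """Stand-in for a scorer that wraps several sub-scorers."""

    def __init__(self, name, subs):
        super().__init__(name)
        self.subs = list(subs)

    def visit_sub_scorers(self, visitor):
        # Reuse the base-class logic instead of duplicating it...
        super().visit_sub_scorers(visitor)
        # ...then descend into the sub-scorers.
        for sub in self.subs:
            sub.visit_sub_scorers(visitor)


visited = []
root = BooleanScorer("bool", [Scorer("foo"), Scorer("bar")])
root.visit_sub_scorers(lambda scorer: visited.append(scorer.name))
```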

> Enable access to the freq information in a Query's sub-scorers
> --
>
> Key: LUCENE-2590
> URL: https://issues.apache.org/jira/browse/LUCENE-2590
> Project: Lucene - Java
>  Issue Type: Improvement
>  Components: Search
>Reporter: Michael McCandless
>Assignee: Simon Willnauer
> Attachments: LUCENE-2590.patch, LUCENE-2590.patch, LUCENE-2590.patch, 
> LUCENE-2590.patch, LUCENE-2590.patch
>
>
> The ability to gather more details than just the score, of how a given
> doc matches the current query, has come up a number of times on the
> user's lists.  (most recently in the thread "Query Match Count" by
> Ryan McV on java-user).
> EG if you have a simple TermQuery "foo", on each hit you'd like to
> know how many times "foo" occurred in that doc; or a BooleanQuery +foo
> +bar, being able to separately see the freq of foo and bar for the
> current hit.
> Lucene doesn't make this possible today, which is a shame because
> Lucene in fact does compute exactly this information; it's just not
> accessible from the Collector.




[jira] Updated: (LUCENE-2611) IntelliJ IDEA setup

2010-08-30 Thread Robert Muir (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-2611?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Muir updated LUCENE-2611:


Attachment: LUCENE-2611_test.patch

Steven, I made it further and cleared up most fails. 
I've got 2 test fails now total, probably some statics or sysprops somewhere.

I turned off test forking in the patch to make these easier to find from 'ant'


> IntelliJ IDEA setup
> ---
>
> Key: LUCENE-2611
> URL: https://issues.apache.org/jira/browse/LUCENE-2611
> Project: Lucene - Java
>  Issue Type: New Feature
>  Components: Build
>Affects Versions: 3.1, 4.0
>Reporter: Steven Rowe
>Priority: Minor
> Fix For: 3.1, 4.0
>
> Attachments: LUCENE-2611-branch-3x.patch, LUCENE-2611.patch, 
> LUCENE-2611.patch, LUCENE-2611.patch, LUCENE-2611_test.patch, 
> LUCENE-2611_test.patch
>
>
> Setting up Lucene/Solr in IntelliJ IDEA can be time-consuming.
> The attached patch adds a new top level directory {{dev-tools/}} with sub-dir 
> {{idea/}} containing basic setup files for trunk, as well as a top-level ant 
> target named "idea" that copies these files into the proper locations.  This 
> arrangement avoids the messiness attendant to in-place project configuration 
> files directly checked into source control.
> The IDEA configuration includes modules for Lucene and Solr, each Lucene and 
> Solr contrib, and each analysis module.  A JUnit test run per module is 
> included.
> Once {{ant idea}} has been run, the only configuration that must be performed 
> manually is configuring the project-level JDK.
> If this patch is committed, Subversion svn:ignore properties should be 
> added/modified to ignore the destination module files (*.iml) in each 
> module's directory.




Re: Issues while connecting PyLucene code to Apache WSGI interface

2010-08-30 Thread TJ Ninneman
I've been running PyLucene within the Pylons framework under mod_wsgi for 
almost 2 years without any problems.  

I call the initVM within my .wsgi file:

import lucene

from paste.deploy import loadapp

lucene.initVM(classpath=lucene.CLASSPATH, maxheap="512m")
lucene.getVMEnv().attachCurrentThread()

application = loadapp('config:/usr/local/www/myapp/trunk/apache.ini')

And in my base controller I call attachCurrentThread on each request:

    def __before__(self):
        # Bind to JavaVM
        lucene.getVMEnv().attachCurrentThread()

I'm not sure how this compares to how it would be done in Django but it sure is 
flawless in my threaded Pylons setup.

TJ
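
For frameworks without a per-request hook like the one above, the same
"attach on every request" idea could be expressed as a small WSGI middleware.
The sketch below is an assumption-laden illustration, not something taken
from this thread: AttachPerRequest, hello_app and fake_attach are made-up
names, and fake_attach only records the calling thread so the example runs
without a JVM.

```python
import threading


class AttachPerRequest:
    """WSGI middleware that runs `attach` before handing off each request."""

    def __init__(self, app, attach):
        self.app = app
        self.attach = attach

    def __call__(self, environ, start_response):
        # Runs in whichever worker thread is serving this request.
        self.attach()
        return self.app(environ, start_response)


def hello_app(environ, start_response):
    """Trivial WSGI application used only to exercise the middleware."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"ok"]


# Stand-in for the real attach call: records the name of the calling thread.
attached_threads = []


def fake_attach():
    attached_threads.append(threading.current_thread().name)


application = AttachPerRequest(hello_app, fake_attach)
```

With PyLucene, the .wsgi file would presumably still call lucene.initVM(...)
once at import time, and the attach callable would be something like
`lambda: lucene.getVMEnv().attachCurrentThread()` in place of fake_attach.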





[jira] Commented: (LUCENE-2611) IntelliJ IDEA setup

2010-08-30 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12904299#action_12904299
 ] 

Robert Muir commented on LUCENE-2611:
-

bq. Sounds good. Thanks for your help.

No prob, no promises i can get that working, but ideally solr tests would run 
without forking, like lucene tests.
when we did this for lucene it cut the time down significantly... i can just 
turn on forkMode=perBatch and see the issues you see I think.


> IntelliJ IDEA setup
> ---
>
> Key: LUCENE-2611
> URL: https://issues.apache.org/jira/browse/LUCENE-2611
> Project: Lucene - Java
>  Issue Type: New Feature
>  Components: Build
>Affects Versions: 3.1, 4.0
>Reporter: Steven Rowe
>Priority: Minor
> Fix For: 3.1, 4.0
>
> Attachments: LUCENE-2611-branch-3x.patch, LUCENE-2611.patch, 
> LUCENE-2611.patch, LUCENE-2611.patch, LUCENE-2611_test.patch
>
>
> Setting up Lucene/Solr in IntelliJ IDEA can be time-consuming.
> The attached patch adds a new top level directory {{dev-tools/}} with sub-dir 
> {{idea/}} containing basic setup files for trunk, as well as a top-level ant 
> target named "idea" that copies these files into the proper locations.  This 
> arrangement avoids the messiness attendant to in-place project configuration 
> files directly checked into source control.
> The IDEA configuration includes modules for Lucene and Solr, each Lucene and 
> Solr contrib, and each analysis module.  A JUnit test run per module is 
> included.
> Once {{ant idea}} has been run, the only configuration that must be performed 
> manually is configuring the project-level JDK.
> If this patch is committed, Subversion svn:ignore properties should be 
> added/modified to ignore the destination module files (*.iml) in each 
> module's directory.




Re: Issues while connecting PyLucene code to Apache WSGI interface

2010-08-30 Thread Roman Chyla
I recently had a problem with this:
http://stackoverflow.com/questions/548493/jcc-initvm-doesnt-return-when-mod-wsgi-is-configured-as-daemon-mode

you may want to check that too

roman

On Mon, Aug 30, 2010 at 8:50 PM, Andi Vajda  wrote:
>
>
> On Mon, 30 Aug 2010, technology inspired wrote:
>
>> Thanks for the reply. My example runs fine when it runs alone (pure
>> python).
>> Here is the code:
>
> Ok, then the next step is to port it to a python http server such as [1] so
> that you get the threading and initialization story straight:
>  - initVM() must be called from the main thread, once
>  - any thread created from Python must call attachCurrentThread() before
>    making any other calls that involve the JVM
> I'm not sure how this is done in the apache2/wsgi environment, that is a
> question for another forum. That being said, if you solve this problem,
> posting your answer here would be helpful as this has come up before.
>
> About the errors you're reporting, what you're seeing in your browser is
> irrelevant. Instead, you must log errors that happen on the Python side and
> look for these stacktraces there.
>
> Andi..
>
> [1] http://docs.python.org/library/simplehttpserver.html
>
>
>>
>> #import sys, os
>> #sys.path.append("/home/v/workspace/example-project/src/trunk")
>> #os.environ['DJANGO_SETTINGS_MODULE'] = 'example.settings'
>> from lucene import (Field, Document, initVM, NIOFSDirectory, IndexWriter,
>>                     StandardAnalyzer, Version, File)
>> from lucene import (SimpleFSLockFactory, NumericField, IndexSearcher,
>>                     QueryParser, NumericRangeQuery)
>> from lucene import Integer, BooleanQuery, BooleanClause
>> #from django.shortcuts import render_to_response
>> def build():
>>     initVM()
>>     dir = NIOFSDirectory(File("/home/v/index"), SimpleFSLockFactory())
>>     analyzer = StandardAnalyzer(Version.LUCENE_30)
>>     writer = IndexWriter(dir, analyzer, True,
>>                          IndexWriter.MaxFieldLength(1024))
>>
>>     # Currently there is only one row in database
>>     field_rows = FieldDoc.objects.all()
>>     for row in field_rows:
>>         doc = Document()
>>         if row.category != "":
>>             doc.add(Field('category', row.category, Field.Store.YES,
>>                           Field.Index.NOT_ANALYZED))
>>         writer.addDocument(doc)
>>
>>     writer.close()
>>     #return render_to_response("index.html", {"var": "Success"})
>>
>> But when I connect it with httpd/mod_wsgi, I see the "Success" page
>> sometimes, and other times it says "Internal Server Error" with the errors
>> mentioned in my previous email. I am not aware of the best practice for
>> running Python Lucene code from a web server.
>>
>> You mentioned using attachCurrentThread(). I tried using it this way:
>> env = initVM()
>> env.attachCurrentThread()
>>
>> but there was no change in the response. I don't know if this is how
>> attachCurrentThread() should be used in the build function above. Please
>> advise how to connect Lucene code with Apache2/wsgi. My apache2/wsgi is
>> configured properly, as I can run non-Lucene web pages. Apache2 is using
>> mpm-worker, a threaded environment.
>>
>> Thanks.
>>
>> Regards,
>> Vin
>>
>>
>>
>> On Sun, Aug 29, 2010 at 12:21 PM, Andi Vajda  wrote:
>>
>>      On Sun, 29 Aug 2010, technology inspired wrote:
>>
>>            I am using PyLucene 3.0.2 on Ubuntu 10.04 with
>>            Python 2.6.5 and Sun Java
>>            1.6. I have written an example script to build an index
>>            and store it in a directory.
>>            Later on, I want to search it in my next example
>>            script, which as of now I
>>            haven't written.
>>
>>            There are two issues I have to mention and looking
>>            for your help:
>>
>>            ISSUE 1:
>>            I am using Apache2 with mod_wsgi 3.3. I have got the
>>            index building script
>>            connected to a GET request. When I call that GET
>>            request, I get following
>>            errors:
>>
>>            [error] [client 127.0.0.1] Premature end of script
>>            headers: wsgi
>>            [notice] child pid exit signal Aborted (6).
>>
>>            With this error, I see "Internal Server Error" on my
>>            browser screen. This
>>            error appears only if I make GET request very often,
>>            i.e. around 1 per 2
>>            seconds. If I issue GET at the interval of 10
>>            seconds, I don't see these
>>            errors.
>>
>>            ISSUE 2:
>>            When I index Date field using NumericField, the GET
>>            request gives "Internal
>>            Server Error" on every alternate request. and the
>>            Apache2 log files gets
>>            these errors:
>>            [error] [client 127.0.0.1] Premature end of script
>>            headers: wsgi
>>            [notice] child pid exit signal Segmentation fault
>>            (11)
>>
>>            I am looking for help to solve these problems. I am
>>            running WSGI daemon
>>            m

[jira] Commented: (LUCENE-2611) IntelliJ IDEA setup

2010-08-30 Thread Steven Rowe (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12904294#action_12904294
 ] 

Steven Rowe commented on LUCENE-2611:
-

{quote}
I don't think this is the same problem? This is just ensuring the class extends 
SolrTestCaseJ4 and doesn't have a 'nested test', there shouldn't be any others 
with the same problem as this test.

Then we can separately address your solr.solr.home problem within the base 
classes as you suggest, this is a separate problem I think.
{quote}

Sounds good.  Thanks for your help.

> IntelliJ IDEA setup
> ---
>
> Key: LUCENE-2611
> URL: https://issues.apache.org/jira/browse/LUCENE-2611
> Project: Lucene - Java
>  Issue Type: New Feature
>  Components: Build
>Affects Versions: 3.1, 4.0
>Reporter: Steven Rowe
>Priority: Minor
> Fix For: 3.1, 4.0
>
> Attachments: LUCENE-2611-branch-3x.patch, LUCENE-2611.patch, 
> LUCENE-2611.patch, LUCENE-2611.patch, LUCENE-2611_test.patch
>
>
> Setting up Lucene/Solr in IntelliJ IDEA can be time-consuming.
> The attached patch adds a new top level directory {{dev-tools/}} with sub-dir 
> {{idea/}} containing basic setup files for trunk, as well as a top-level ant 
> target named "idea" that copies these files into the proper locations.  This 
> arrangement avoids the messiness attendant to in-place project configuration 
> files directly checked into source control.
> The IDEA configuration includes modules for Lucene and Solr, each Lucene and 
> Solr contrib, and each analysis module.  A JUnit test run per module is 
> included.
> Once {{ant idea}} has been run, the only configuration that must be performed 
> manually is configuring the project-level JDK.
> If this patch is committed, Subversion svn:ignore properties should be 
> added/modified to ignore the destination module files (*.iml) in each 
> module's directory.




[jira] Commented: (LUCENE-2611) IntelliJ IDEA setup

2010-08-30 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12904292#action_12904292
 ] 

Robert Muir commented on LUCENE-2611:
-

bq. There are other tests in Solr that have the same problem (solr.solr.home 
cross-test environment pollution)

I don't think this is the same problem? This is just ensuring the class extends 
SolrTestCaseJ4 and doesn't have a 'nested test'; there shouldn't be any others 
with the same problem as this test.

Then we can separately address your solr.solr.home problem within the base 
classes as you suggest; this is a separate problem, I think.


> IntelliJ IDEA setup
> ---
>
> Key: LUCENE-2611
> URL: https://issues.apache.org/jira/browse/LUCENE-2611
> Project: Lucene - Java
>  Issue Type: New Feature
>  Components: Build
>Affects Versions: 3.1, 4.0
>Reporter: Steven Rowe
>Priority: Minor
> Fix For: 3.1, 4.0
>
> Attachments: LUCENE-2611-branch-3x.patch, LUCENE-2611.patch, 
> LUCENE-2611.patch, LUCENE-2611.patch, LUCENE-2611_test.patch
>
>
> Setting up Lucene/Solr in IntelliJ IDEA can be time-consuming.
> The attached patch adds a new top level directory {{dev-tools/}} with sub-dir 
> {{idea/}} containing basic setup files for trunk, as well as a top-level ant 
> target named "idea" that copies these files into the proper locations.  This 
> arrangement avoids the messiness attendant to in-place project configuration 
> files directly checked into source control.
> The IDEA configuration includes modules for Lucene and Solr, each Lucene and 
> Solr contrib, and each analysis module.  A JUnit test run per module is 
> included.
> Once {{ant idea}} has been run, the only configuration that must be performed 
> manually is configuring the project-level JDK.
> If this patch is committed, Subversion svn:ignore properties should be 
> added/modified to ignore the destination module files (*.iml) in each 
> module's directory.




[jira] Commented: (LUCENE-2611) IntelliJ IDEA setup

2010-08-30 Thread Steven Rowe (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12904288#action_12904288
 ] 

Steven Rowe commented on LUCENE-2611:
-

There are other tests in Solr that have the same problem ({{solr.solr.home}} 
cross-test environment pollution), and it would be nice if it were possible to 
just touch SolrTestCaseJ4, rather than each problematic Solr test; I was just 
asking if you had found some issue that disallowed this approach.

> IntelliJ IDEA setup
> ---
>
> Key: LUCENE-2611
> URL: https://issues.apache.org/jira/browse/LUCENE-2611
> Project: Lucene - Java
>  Issue Type: New Feature
>  Components: Build
>Affects Versions: 3.1, 4.0
>Reporter: Steven Rowe
>Priority: Minor
> Fix For: 3.1, 4.0
>
> Attachments: LUCENE-2611-branch-3x.patch, LUCENE-2611.patch, 
> LUCENE-2611.patch, LUCENE-2611.patch, LUCENE-2611_test.patch
>
>
> Setting up Lucene/Solr in IntelliJ IDEA can be time-consuming.
> The attached patch adds a new top level directory {{dev-tools/}} with sub-dir 
> {{idea/}} containing basic setup files for trunk, as well as a top-level ant 
> target named "idea" that copies these files into the proper locations.  This 
> arrangement avoids the messiness attendant to in-place project configuration 
> files directly checked into source control.
> The IDEA configuration includes modules for Lucene and Solr, each Lucene and 
> Solr contrib, and each analysis module.  A JUnit test run per module is 
> included.
> Once {{ant idea}} has been run, the only configuration that must be performed 
> manually is configuring the project-level JDK.
> If this patch is committed, Subversion svn:ignore properties should be 
> added/modified to ignore the destination module files (*.iml) in each 
> module's directory.




[jira] Commented: (LUCENE-2611) IntelliJ IDEA setup

2010-08-30 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12904286#action_12904286
 ] 

Robert Muir commented on LUCENE-2611:
-

bq. Although your patch switches the test so that it extends SolrTestCaseJ4, 
you didn't make these changes in SolrTestCaseJ4 itself, so that each test 
doesn't have to individually host these kinds of changes - is there some reason 
for that?

I didn't do anything except fix the test-within-a-test part of it...

Previously this class extended TestCase, but inside it was an inner class that 
extended SolrTestCase [yet didn't have any tests].
That was the cause of your problem, I think; in this patch the inner class just 
extends Object and the outer class extends SolrTestCaseJ4.


> IntelliJ IDEA setup
> ---
>
> Key: LUCENE-2611
> URL: https://issues.apache.org/jira/browse/LUCENE-2611
> Project: Lucene - Java
>  Issue Type: New Feature
>  Components: Build
>Affects Versions: 3.1, 4.0
>Reporter: Steven Rowe
>Priority: Minor
> Fix For: 3.1, 4.0
>
> Attachments: LUCENE-2611-branch-3x.patch, LUCENE-2611.patch, 
> LUCENE-2611.patch, LUCENE-2611.patch, LUCENE-2611_test.patch
>
>
> Setting up Lucene/Solr in IntelliJ IDEA can be time-consuming.
> The attached patch adds a new top level directory {{dev-tools/}} with sub-dir 
> {{idea/}} containing basic setup files for trunk, as well as a top-level ant 
> target named "idea" that copies these files into the proper locations.  This 
> arrangement avoids the messiness attendant to in-place project configuration 
> files directly checked into source control.
> The IDEA configuration includes modules for Lucene and Solr, each Lucene and 
> Solr contrib, and each analysis module.  A JUnit test run per module is 
> included.
> Once {{ant idea}} has been run, the only configuration that must be performed 
> manually is configuring the project-level JDK.
> If this patch is committed, Subversion svn:ignore properties should be 
> added/modified to ignore the destination module files (*.iml) in each 
> module's directory.




Re: Issues while connecting PyLucene code to Apache WSGI interface

2010-08-30 Thread Andi Vajda



On Mon, 30 Aug 2010, technology inspired wrote:


Thanks for the reply. My example runs fine when it runs alone (pure python).
Here is the code:


Ok, then the next step is to port it to a python http server such as [1] so 
that you get the threading and initialization story straight:

  - initVM() must be called from the main thread, once
  - any thread created from Python must call attachCurrentThread() before
    making any other calls that involve the JVM

I'm not sure how this is done in the apache2/wsgi environment; that is a 
question for another forum. That being said, if you solve this problem, 
posting your answer here would be helpful as this has come up before.
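
The two threading rules above can be sketched roughly as follows. This is only an illustration under stated assumptions, not a tested apache2/wsgi recipe: run_in_worker and demo_with_pylucene are hypothetical helper names, and env stands for whatever initVM() returned.

```python
import threading


def run_in_worker(env, work):
    """Run work() on a new Python thread that is attached to the JVM first.

    env is assumed to be the object returned by lucene.initVM().
    """
    outcome = {}

    def target():
        # Rule 2: a thread created from Python attaches before any JVM call.
        env.attachCurrentThread()
        try:
            outcome["value"] = work()
        except Exception as exc:
            # Hand the error back so it is raised on the calling thread.
            outcome["error"] = exc

    worker = threading.Thread(target=target)
    worker.start()
    worker.join()
    if "error" in outcome:
        raise outcome["error"]
    return outcome["value"]


def demo_with_pylucene():
    """Needs PyLucene installed; it is not called automatically."""
    import lucene

    env = lucene.initVM()  # Rule 1: once, from the main thread
    return run_in_worker(env, lambda: lucene.VERSION)
```

Passing env in as a parameter keeps the helper independent of how and where the JVM was started, which is the part that differs between a plain Python server and apache2/wsgi.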


About the errors you're reporting, what you're seeing in your browser is 
irrelevant. Instead, you must log errors that happen on the Python side and 
look for these stacktraces there.


Andi..

[1] http://docs.python.org/library/simplehttpserver.html




#import sys, os
#sys.path.append("/home/v/workspace/example-project/src/trunk")
#os.environ['DJANGO_SETTINGS_MODULE'] = 'example.settings'
from lucene import (Field, Document, initVM, NIOFSDirectory, IndexWriter,
                    StandardAnalyzer, Version, File)
from lucene import (SimpleFSLockFactory, NumericField, IndexSearcher,
                    QueryParser, NumericRangeQuery)
from lucene import Integer, BooleanQuery, BooleanClause
#from django.shortcuts import render_to_response

def build():
    initVM()
    dir = NIOFSDirectory(File("/home/v/index"), SimpleFSLockFactory())
    analyzer = StandardAnalyzer(Version.LUCENE_30)
    # True = create a new index, overwriting any existing one
    writer = IndexWriter(dir, analyzer, True,
                         IndexWriter.MaxFieldLength(1024))

    # FieldDoc is presumably a Django model; its import is not shown here.
    # Currently there is only one row in the database.
    field_rows = FieldDoc.objects.all()
    for row in field_rows:
        doc = Document()
        if row.category != "":
            doc.add(Field('category', row.category, Field.Store.YES,
                          Field.Index.NOT_ANALYZED))
        writer.addDocument(doc)

    writer.close()
    #return render_to_response("index.html", {"var": "Success"})

But when I connect it with httpd/mod_wsgi, I sometimes see the "Success" page
and at other times "Internal Server Error", with the errors mentioned in my
previous email. I don't know what the best practice is for running Python
Lucene code from a web server.

You mentioned using attachCurrentThread(). I tried using it this way:
env = initVM()
env.attachCurrentThread()

but there was no change in the response. I don't know if this is how
attachCurrentThread() should be used in the build function above. Please advise
on how to connect Lucene code with Apache2/wsgi. My apache2/wsgi is configured
properly, as I can run web pages that don't use Lucene. Apache2 is using
mpm-worker, a threaded environment.
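
One way the attachCurrentThread() call might be placed in a WSGI application is sketched below. It is an untested assumption about how this could fit together under mod_wsgi, not a confirmed fix for the errors above: make_app, get_env and handler are hypothetical names, and the PyLucene wiring is shown only in comments.

```python
def make_app(get_env, handler):
    """Wrap handler(environ) -> str in a WSGI callable.

    The wrapper attaches the serving thread to the JVM before the handler
    touches any Lucene class. get_env returns the JVM environment object.
    """

    def app(environ, start_response):
        # With PyLucene, get_env would be lucene.getVMEnv.
        get_env().attachCurrentThread()
        body = handler(environ).encode("utf-8")
        start_response("200 OK", [
            ("Content-Type", "text/plain; charset=utf-8"),
            ("Content-Length", str(len(body))),
        ])
        return [body]

    return app


# In the WSGI script file (assumption: PyLucene is installed and the
# module is imported once per daemon process):
#
#   import lucene
#   lucene.initVM()
#   application = make_app(lucene.getVMEnv, lambda environ: "Success")
```

Compared with the build() function above, initVM() moves out of the request path so that it runs once per process, and only the attach step is repeated on each request.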

Thanks.

Regards,
Vin



On Sun, Aug 29, 2010 at 12:21 PM, Andi Vajda  wrote:

  On Sun, 29 Aug 2010, technology inspired wrote:

I am using PyLucene 3.0.2 on Ubuntu 10.04 with Python 2.6.5 and Sun Java
1.6. I have written an example script to build an index and store it in a
directory. Later on, I want to search it from my next example script, which
I haven't written yet.

There are two issues I have to mention, and I am looking for your help:

ISSUE 1:
I am using Apache2 with mod_wsgi 3.3. I have the index-building script
connected to a GET request. When I call that GET request, I get the
following errors:

[error] [client 127.0.0.1] Premature end of script headers: wsgi
[notice] child pid exit signal Aborted (6).

With this error, I see "Internal Server Error" on my browser screen. This
error appears only if I make GET requests very often, i.e. around one every
2 seconds. If I issue GETs at intervals of 10 seconds, I don't see these
errors.

ISSUE 2:
When I index a Date field using NumericField, the GET request gives
"Internal Server Error" on every alternate request, and the Apache2 log
file gets these errors:
[error] [client 127.0.0.1] Premature end of script headers: wsgi
[notice] child pid exit signal Segmentation fault (11)

I am looking for help to solve these problems. I am running WSGI in daemon
mode. WSGI settings are:
...
WSGIDaemonProcess example.com user=www-data group-www-data thread 25
WSGIProcessGroup example.com
WSGIScriptAlias / /home/user1/workspace/http_wsgi/wsgi
...

Please advise on how to run PyLucene-based code from Apache2 mod_wsgi
(searching, indexing etc).


First, get your application to work outside of apache2/wsgi, as a
plain Python program. Then, once it's debug

[jira] Commented: (LUCENE-2611) IntelliJ IDEA setup

2010-08-30 Thread Steven Rowe (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12904279#action_12904279
 ] 

Steven Rowe commented on LUCENE-2611:
-

bq. Steven, here's a fix for that test. I think it should resolve the problem 
in your IDE 

Cool, thanks, I'll test it out tonight.

Although your patch switches the test so that it extends SolrTestCaseJ4, you 
didn't make these changes in SolrTestCaseJ4 itself, so that each test doesn't 
have to individually host these kinds of changes - is there some reason for 
that?

> IntelliJ IDEA setup
> ---
>
> Key: LUCENE-2611
> URL: https://issues.apache.org/jira/browse/LUCENE-2611
> Project: Lucene - Java
>  Issue Type: New Feature
>  Components: Build
>Affects Versions: 3.1, 4.0
>Reporter: Steven Rowe
>Priority: Minor
> Fix For: 3.1, 4.0
>
> Attachments: LUCENE-2611-branch-3x.patch, LUCENE-2611.patch, 
> LUCENE-2611.patch, LUCENE-2611.patch, LUCENE-2611_test.patch
>
>
> Setting up Lucene/Solr in IntelliJ IDEA can be time-consuming.
> The attached patch adds a new top level directory {{dev-tools/}} with sub-dir 
> {{idea/}} containing basic setup files for trunk, as well as a top-level ant 
> target named "idea" that copies these files into the proper locations.  This 
> arrangement avoids the messiness attendant to in-place project configuration 
> files directly checked into source control.
> The IDEA configuration includes modules for Lucene and Solr, each Lucene and 
> Solr contrib, and each analysis module.  A JUnit test run per module is 
> included.
> Once {{ant idea}} has been run, the only configuration that must be performed 
> manually is configuring the project-level JDK.
> If this patch is committed, Subversion svn:ignore properties should be 
> added/modified to ignore the destination module files (*.iml) in each 
> module's directory.




[jira] Commented: (SOLR-1123) Change the JSONResponseWriter content type

2010-08-30 Thread Chris Tucker (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12904275#action_12904275
 ] 

Chris Tucker commented on SOLR-1123:


I'd like to +1 the short-term fix.  The incorrect content type makes it 
difficult to filter/transform the response in a servlet filter or Jetty 
handler: one has to inspect the wt parameter on the request to establish 
(guess?) that JSON has been requested and is being sent back.
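
The workaround described above can be sketched as follows. This is a minimal illustration in Python rather than an actual servlet filter or Jetty handler; looks_like_json is a hypothetical helper, and it assumes JSON responses currently arrive with a text/plain content type, as the issue states.

```python
from urllib.parse import parse_qs


def looks_like_json(content_type, query_string):
    """Return True if a Solr response should be treated as JSON."""
    media_type = content_type.split(";", 1)[0].strip().lower()
    if media_type == "application/json":
        # What a filter could rely on once the content type is corrected.
        return True
    # Current workaround: guess from the wt parameter of the request.
    wt = parse_qs(query_string).get("wt", [""])[-1]
    return media_type == "text/plain" and wt.lower() == "json"
```

With the correct content type the second branch becomes unnecessary, which is the point of the short-term fix.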

> Change the JSONResponseWriter content type
> --
>
> Key: SOLR-1123
> URL: https://issues.apache.org/jira/browse/SOLR-1123
> Project: Solr
>  Issue Type: Improvement
>Reporter: Uri Boness
> Fix For: Next
>
> Attachments: JSON_contentType_incl_tests.patch
>
>
> Currently the JSON content type is not used. Instead the text/plain content 
> type is used. The reason for this, as I understand it, is to enable viewing the 
> JSON response as text in the browser. While this is a valid argument, I do 
> believe that there should at least be an option to configure this writer to 
> use the JSON content type. According to 
> [RFC4627|http://www.ietf.org/rfc/rfc4627.txt] the JSON content type needs to 
> be application/json (and not text/x-json). The reason this can be very 
> helpful is that today you have plugins for browsers (e.g. 
> [JSONView|http://brh.numbera.com/software/jsonview]) that can render any page 
> with the application/json content type in a user-friendly manner (just like XML 
> is supported).




[jira] Updated: (LUCENE-2611) IntelliJ IDEA setup

2010-08-30 Thread Robert Muir (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-2611?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Muir updated LUCENE-2611:


Attachment: LUCENE-2611_test.patch

Steven, here's a fix for that test. I think it should resolve the problem in 
your IDE

> IntelliJ IDEA setup
> ---
>
> Key: LUCENE-2611
> URL: https://issues.apache.org/jira/browse/LUCENE-2611
> Project: Lucene - Java
>  Issue Type: New Feature
>  Components: Build
>Affects Versions: 3.1, 4.0
>Reporter: Steven Rowe
>Priority: Minor
> Fix For: 3.1, 4.0
>
> Attachments: LUCENE-2611-branch-3x.patch, LUCENE-2611.patch, 
> LUCENE-2611.patch, LUCENE-2611.patch, LUCENE-2611_test.patch
>
>
> Setting up Lucene/Solr in IntelliJ IDEA can be time-consuming.
> The attached patch adds a new top level directory {{dev-tools/}} with sub-dir 
> {{idea/}} containing basic setup files for trunk, as well as a top-level ant 
> target named "idea" that copies these files into the proper locations.  This 
> arrangement avoids the messiness attendant to in-place project configuration 
> files directly checked into source control.
> The IDEA configuration includes modules for Lucene and Solr, each Lucene and 
> Solr contrib, and each analysis module.  A JUnit test run per module is 
> included.
> Once {{ant idea}} has been run, the only configuration that must be performed 
> manually is configuring the project-level JDK.
> If this patch is committed, Subversion svn:ignore properties should be 
> added/modified to ignore the destination module files (*.iml) in each 
> module's directory.




[jira] Commented: (SOLR-2010) Improvements to SpellCheckComponent Collate functionality

2010-08-30 Thread Grant Ingersoll (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12904264#action_12904264
 ] 

Grant Ingersoll commented on SOLR-2010:
---

Hi James,

First off, good work.  I like the overall design, etc.

Second, this patch no longer applies cleanly to trunk.  The issue is in the 
SearchHandler.

Third, in thinking some more about the whole distributed case, perhaps we are 
approaching this wrong.  I was originally thinking that we would have to go off 
and re-query all the shards (as in send another message) but we really 
shouldn't have to do that, right?  Why can't we just pass the collation request 
through to the shards as part of the get suggestions and then it can, if 
collation is asked for, return its collation suggestions.  Then, the question 
becomes how to merge the suggestions and pick the best one.  This should save a 
round trip at the cost of doing some extra collations, but since most people 
aren't going to ask for more than 5 or 10, it shouldn't be an issue.
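
The merge step is left open above; one possible strategy is sketched here purely as an assumption. merge_collations is a hypothetical function, each shard is assumed to return (collation query, hit count) pairs, and ranking by summed hits is an illustrative choice, not what the patch does.

```python
from collections import defaultdict


def merge_collations(shard_responses, max_collations=1):
    """Merge per-shard collation lists into one ranked list.

    shard_responses: iterable of lists of (collation_query, hits) pairs.
    Hits for the same collation are summed across shards; the result is
    ordered by total hits, with ties broken alphabetically.
    """
    totals = defaultdict(int)
    for collations in shard_responses:
        for query, hits in collations:
            totals[query] += hits
    ranked = sorted(totals.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:max_collations]
```

Because every shard computes its own collations, the coordinator only sums and sorts, so no second round trip is needed.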

-Grant

> Improvements to SpellCheckComponent Collate functionality
> -
>
> Key: SOLR-2010
> URL: https://issues.apache.org/jira/browse/SOLR-2010
> Project: Solr
>  Issue Type: New Feature
>  Components: clients - java, spellchecker
>Affects Versions: 1.4.1
> Environment: Tested against trunk revision 966633
>Reporter: James Dyer
>Assignee: Grant Ingersoll
>Priority: Minor
> Attachments: SOLR-2010.patch, SOLR-2010.patch, SOLR-2010.patch, 
> SOLR-2010.patch, SOLR-2010.txt
>
>
> Improvements to SpellCheckComponent Collate functionality
> Our project requires a better Spell Check Collator.  I'm contributing this as 
> a patch to get suggestions for improvements and in case there is a broader 
> need for these features.
> 1. Only return collations that are guaranteed to result in hits if re-queried 
> (applying original fq params also).  This is especially helpful when there is 
> more than one correction per query.  The 1.4 behavior does not verify that a 
> particular combination will actually return hits.
> 2. Provide the option to get multiple collation suggestions
> 3. Provide extended collation results including the # of hits re-querying 
> will return and a breakdown of each misspelled word and its correction.
> This patch is similar to what is described in SOLR-507 item #1.  Also, this 
> patch provides a viable workaround for the problem discussed in SOLR-1074.  A 
> dictionary could be created that combines the terms from the multiple fields. 
>  The collator then would prune out any spurious suggestions this would cause.
> This patch adds the following spellcheck parameters:
> 1. spellcheck.maxCollationTries - maximum # of collation possibilities to try 
> before giving up.  Lower values ensure better performance.  Higher values may 
> be necessary to find a collation that can return results.  Default is 0, 
> which maintains backwards-compatible behavior (do not check collations).
> 2. spellcheck.maxCollations - maximum # of collations to return.  Default is 
> 1, which maintains backwards-compatible behavior.
> 3. spellcheck.collateExtendedResult - if true, returns an expanded response 
> format detailing collations found.  default is false, which maintains 
> backwards-compatible behavior.  When true, output is like this (in context):
> 
>   
>   
>   94
>   7
>   11
>   
>   hope
>   how
>   hope
>   chops
>   hoped
>   etc
>   
>   
>   100
>   16
>   21
>   
>   fall
>   fails
>   fail
>   fill
>   faith
>   all
>   etc
>   
>   
>   
>   Title:(how AND fails)
>   2
>   
>   how
>   fails
>   
>   
>   
>   Title:(hope AND faith)
>   2
>   
>   hope
>   faith
>   
>   
>   
>   Title:(chops AND all)
>   1
>   
>   chops
>   
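
The three parameters listed in the description could be sent on a request roughly as follows. This is a sketch only: spellcheck_params is a hypothetical helper, the query text passed to it is made up, and spellcheck and spellcheck.collate are assumed to be the existing switches that enable spell checking and collation.

```python
from urllib.parse import urlencode


def spellcheck_params(query, max_tries=0, max_collations=1, extended=False):
    """Build a query string using the collation parameters from the patch.

    The defaults mirror the backwards-compatible defaults in the description.
    """
    params = [
        ("q", query),
        ("spellcheck", "true"),
        ("spellcheck.collate", "true"),
        ("spellcheck.maxCollationTries", str(max_tries)),
        ("spellcheck.maxCollations", str(max_collations)),
        ("spellcheck.collateExtendedResult", "true" if extended else "false"),
    ]
    return urlencode(params)
```

Leaving all three arguments at their defaults reproduces the 1.4 behavior, where collations are not verified against the index.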

[jira] Commented: (LUCENE-2590) Enable access to the freq information in a Query's sub-scorers

2010-08-30 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12904250#action_12904250
 ] 

Michael McCandless commented on LUCENE-2590:


Looks great!

Though, can't BS and BS2 just call super.visitSubScorers first, and then visit 
their subs?  (I.e., right now they duplicate super's code, right?)

> Enable access to the freq information in a Query's sub-scorers
> --
>
> Key: LUCENE-2590
> URL: https://issues.apache.org/jira/browse/LUCENE-2590
> Project: Lucene - Java
>  Issue Type: Improvement
>  Components: Search
>Reporter: Michael McCandless
>Assignee: Simon Willnauer
> Attachments: LUCENE-2590.patch, LUCENE-2590.patch, LUCENE-2590.patch, 
> LUCENE-2590.patch, LUCENE-2590.patch
>
>
> The ability to gather more details than just the score, of how a given
> doc matches the current query, has come up a number of times on the
> user's lists.  (most recently in the thread "Query Match Count" by
> Ryan McV on java-user).
> EG if you have a simple TermQuery "foo", on each hit you'd like to
> know how many times "foo" occurred in that doc; or a BooleanQuery +foo
> +bar, being able to separately see the freq of foo and bar for the
> current hit.
> Lucene doesn't make this possible today, which is a shame because
> Lucene in fact does compute exactly this information; it's just not
> accessible from the Collector.




[jira] Resolved: (LUCENE-2600) don't try to cache a composite reader's MultiBits deletedDocs

2010-08-30 Thread Michael McCandless (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-2600?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael McCandless resolved LUCENE-2600.


Resolution: Fixed

> don't try to cache a composite reader's MultiBits deletedDocs
> -
>
> Key: LUCENE-2600
> URL: https://issues.apache.org/jira/browse/LUCENE-2600
> Project: Lucene - Java
>  Issue Type: Bug
>Reporter: Michael McCandless
>Assignee: Michael McCandless
> Fix For: 4.0
>
> Attachments: LUCENE-2600.patch, LUCENE-2600.patch
>
>
> MultiFields.getDeletedDocs now builds up a MultiBits instance (so that one 
> can check if a top-level docID is deleted), but it now stuffs it into a 
> private cache on IndexReader.
> This is invalid when the composite reader is read/write, and can result in a 
> MultiReader falsely claiming a doc was not deleted.




[jira] Updated: (LUCENE-2611) IntelliJ IDEA setup

2010-08-30 Thread Steven Rowe (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-2611?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steven Rowe updated LUCENE-2611:


Fix Version/s: 3.1
Affects Version/s: 3.1

> IntelliJ IDEA setup
> ---
>
> Key: LUCENE-2611
> URL: https://issues.apache.org/jira/browse/LUCENE-2611
> Project: Lucene - Java
>  Issue Type: New Feature
>  Components: Build
>Affects Versions: 3.1, 4.0
>Reporter: Steven Rowe
>Priority: Minor
> Fix For: 3.1, 4.0
>
> Attachments: LUCENE-2611-branch-3x.patch, LUCENE-2611.patch, 
> LUCENE-2611.patch, LUCENE-2611.patch
>
>
> Setting up Lucene/Solr in IntelliJ IDEA can be time-consuming.
> The attached patch adds a new top level directory {{dev-tools/}} with sub-dir 
> {{idea/}} containing basic setup files for trunk, as well as a top-level ant 
> target named "idea" that copies these files into the proper locations.  This 
> arrangement avoids the messiness attendant to in-place project configuration 
> files directly checked into source control.
> The IDEA configuration includes modules for Lucene and Solr, each Lucene and 
> Solr contrib, and each analysis module.  A JUnit test run per module is 
> included.
> Once {{ant idea}} has been run, the only configuration that must be performed 
> manually is configuring the project-level JDK.
> If this patch is committed, Subversion svn:ignore properties should be 
> added/modified to ignore the destination module files (*.iml) in each 
> module's directory.




[jira] Closed: (LUCENE-2604) add regexpquery to queryparser

2010-08-30 Thread Simon Willnauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-2604?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Simon Willnauer closed LUCENE-2604.
---

Fix Version/s: 4.0
   Resolution: Fixed

 Committed revision 990836.


> add regexpquery to queryparser
> --
>
> Key: LUCENE-2604
> URL: https://issues.apache.org/jira/browse/LUCENE-2604
> Project: Lucene - Java
>  Issue Type: New Feature
>  Components: QueryParser
>Affects Versions: 4.0
>Reporter: Robert Muir
>Assignee: Simon Willnauer
> Fix For: 4.0
>
> Attachments: LUCENE-2604.patch, LUCENE-2604.patch, LUCENE-2604.patch
>
>
> Patch that adds RegexpQuery if you /enter an expression between slashes like 
> this/.
> I didn't do the contrib ones, but could add it there too if it seems like a 
> good idea.




[jira] Resolved: (LUCENE-1237) Developers Resources Documentation

2010-08-30 Thread Grant Ingersoll (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-1237?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Grant Ingersoll resolved LUCENE-1237.
-

Resolution: Incomplete

> Developers Resources Documentation
> --
>
> Key: LUCENE-1237
> URL: https://issues.apache.org/jira/browse/LUCENE-1237
> Project: Lucene - Java
>  Issue Type: Bug
>  Components: Website
>Reporter: Grant Ingersoll
>Priority: Trivial
>
> Some of the links on the developer resources page are broken.




[jira] Resolved: (SOLR-1819) Upgrade to Tika 0.7

2010-08-30 Thread Grant Ingersoll (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1819?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Grant Ingersoll resolved SOLR-1819.
---

Fix Version/s: 1.4.2
   3.1
   4.0
   (was: Next)
   Resolution: Fixed

> Upgrade to Tika 0.7
> ---
>
> Key: SOLR-1819
> URL: https://issues.apache.org/jira/browse/SOLR-1819
> Project: Solr
>  Issue Type: Improvement
>Reporter: Tricia Williams
>Assignee: Grant Ingersoll
>Priority: Minor
> Fix For: 1.4.2, 3.1, 4.0
>
>
> See title.




[jira] Resolved: (LUCENE-2608) Allow for specification of spell checker accuracy when calling suggestSimilar

2010-08-30 Thread Grant Ingersoll (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-2608?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Grant Ingersoll resolved LUCENE-2608.
-

Fix Version/s: 3.1
   4.0
   Resolution: Fixed

> Allow for specification of spell checker accuracy when calling suggestSimilar
> -
>
> Key: LUCENE-2608
> URL: https://issues.apache.org/jira/browse/LUCENE-2608
> Project: Lucene - Java
>  Issue Type: Improvement
>  Components: contrib/spellchecker
>Reporter: Grant Ingersoll
>Assignee: Grant Ingersoll
> Fix For: 3.1, 4.0
>
> Attachments: LUCENE-2608-3x.patch, LUCENE-2608.patch
>
>
> There is really no need for accuracy to be a class variable in the 
> Spellchecker




[jira] Closed: (SOLR-1429) Upgrade Solr dependencies if new versions are available

2010-08-30 Thread Grant Ingersoll (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1429?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Grant Ingersoll closed SOLR-1429.
-

Resolution: Won't Fix

Just handle this on an as-needed basis.

> Upgrade Solr dependencies if new versions are available
> ---
>
> Key: SOLR-1429
> URL: https://issues.apache.org/jira/browse/SOLR-1429
> Project: Solr
>  Issue Type: Improvement
>Reporter: Grant Ingersoll
>Assignee: Grant Ingersoll
>Priority: Minor
> Fix For: Next
>
>
> SLF4J and other dependencies have new releases available.  Test them out and 
> upgrade when/where appropriate.




[jira] Updated: (LUCENE-2611) IntelliJ IDEA setup

2010-08-30 Thread Steven Rowe (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-2611?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steven Rowe updated LUCENE-2611:


Attachment: LUCENE-2611.patch

Patch for trunk that adds Ant build integration.

> IntelliJ IDEA setup
> ---
>
> Key: LUCENE-2611
> URL: https://issues.apache.org/jira/browse/LUCENE-2611
> Project: Lucene - Java
>  Issue Type: New Feature
>  Components: Build
>Affects Versions: 4.0
>Reporter: Steven Rowe
>Priority: Minor
> Fix For: 4.0
>
> Attachments: LUCENE-2611-branch-3x.patch, LUCENE-2611.patch, 
> LUCENE-2611.patch, LUCENE-2611.patch
>
>
> Setting up Lucene/Solr in IntelliJ IDEA can be time-consuming.
> The attached patch adds a new top level directory {{dev-tools/}} with sub-dir 
> {{idea/}} containing basic setup files for trunk, as well as a top-level ant 
> target named "idea" that copies these files into the proper locations.  This 
> arrangement avoids the messiness attendant to in-place project configuration 
> files directly checked into source control.
> The IDEA configuration includes modules for Lucene and Solr, each Lucene and 
> Solr contrib, and each analysis module.  A JUnit test run per module is 
> included.
> Once {{ant idea}} has been run, the only configuration that must be performed 
> manually is configuring the project-level JDK.
> If this patch is committed, Subversion svn:ignore properties should be 
> added/modified to ignore the destination module files (*.iml) in each 
> module's directory.




[jira] Updated: (LUCENE-2590) Enable access to the freq information in a Query's sub-scorers

2010-08-30 Thread Simon Willnauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-2590?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Simon Willnauer updated LUCENE-2590:


Attachment: LUCENE-2590.patch

Another iteration with Weight as a protected member of Scorer. All scorers I 
looked at already had the weight as a member, so this change makes things much 
simpler. I think this is close to commit.

> Enable access to the freq information in a Query's sub-scorers
> --
>
> Key: LUCENE-2590
> URL: https://issues.apache.org/jira/browse/LUCENE-2590
> Project: Lucene - Java
>  Issue Type: Improvement
>  Components: Search
>Reporter: Michael McCandless
>Assignee: Simon Willnauer
> Attachments: LUCENE-2590.patch, LUCENE-2590.patch, LUCENE-2590.patch, 
> LUCENE-2590.patch, LUCENE-2590.patch
>
>
> The ability to gather more details than just the score, of how a given
> doc matches the current query, has come up a number of times on the
> user's lists.  (most recently in the thread "Query Match Count" by
> Ryan McV on java-user).
> EG if you have a simple TermQuery "foo", on each hit you'd like to
> know how many times "foo" occurred in that doc; or a BooleanQuery +foo
> +bar, being able to separately see the freq of foo and bar for the
> current hit.
> Lucene doesn't make this possible today, which is a shame because
> Lucene in fact does compute exactly this information; it's just not
> accessible from the Collector.




[jira] Commented: (SOLR-1940) SolrDispatchFilter sets content type as NULL

2010-08-30 Thread Sven Hoffmann (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12904195#action_12904195
 ] 

Sven Hoffmann commented on SOLR-1940:
-

After applying the patch to src/webapp/web/admin/index.jsp on WebSphere 
Application Server 6.1.0.29, I was able to open the 'Config' and 'Schema' pages.

> SolrDispatchFilter sets content type as NULL
> 
>
> Key: SOLR-1940
> URL: https://issues.apache.org/jira/browse/SOLR-1940
> Project: Solr
>  Issue Type: Bug
>Reporter: Lance Norskog
> Attachments: SOLR-1940.patch, SOLR-1940.patch
>
>
> o.a.s.h.SolrDispatchFilter can set the output Content-Type to a null pointer 
> instead of a string. Under WebSphere this results in a NullPointerException. 
> This is the offending code: 
> response.setContentType(responseWriter.getContentType(solrReq, solrRsp));
> Suggested fix: either use a default content type, or do not set the content 
> type and let the browser handle it. 
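
The first suggested fix (fall back to a default content type) could be sketched as below. This is only an illustration: the `ContentTypes` class, its `orDefault` method, and the chosen default value are assumptions made for this example and are not part of Solr or of the attached patches.

```java
// Hypothetical helper, not part of Solr: substitutes a default content type
// when a response writer reports none.
final class ContentTypes {
    // Assumed default; the issue does not say which value should be used.
    static final String DEFAULT_CONTENT_TYPE = "text/plain; charset=UTF-8";

    private ContentTypes() {}

    // Returns the given content type, or the default if it is null or blank.
    static String orDefault(String contentType) {
        if (contentType == null || contentType.trim().isEmpty()) {
            return DEFAULT_CONTENT_TYPE;
        }
        return contentType;
    }

    public static void main(String[] args) {
        System.out.println(orDefault(null));
        System.out.println(orDefault("text/xml; charset=UTF-8"));
    }
}
```

With such a helper, the offending line would presumably wrap the writer's value, along the lines of response.setContentType(ContentTypes.orDefault(responseWriter.getContentType(solrReq, solrRsp))), so that the servlet container never receives a null.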




[jira] Commented: (SOLR-773) Incorporate Local Lucene/Solr

2010-08-30 Thread Tamas Sandor (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-773?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12904184#action_12904184
 ] 

Tamas Sandor commented on SOLR-773:
---

I'm also interested in the status of Polygon search...

> Incorporate Local Lucene/Solr
> -
>
> Key: SOLR-773
> URL: https://issues.apache.org/jira/browse/SOLR-773
> Project: Solr
>  Issue Type: New Feature
>Reporter: Grant Ingersoll
>Assignee: Grant Ingersoll
>Priority: Minor
> Fix For: Next
>
> Attachments: exampleSpatial.zip, lucene-spatial-2.9-dev.jar, 
> lucene.tar.gz, screenshot-1.jpg, SOLR-773-local-lucene.patch, 
> SOLR-773-local-lucene.patch, SOLR-773-local-lucene.patch, 
> SOLR-773-local-lucene.patch, SOLR-773-local-lucene.patch, 
> SOLR-773-spatial_solr.patch, SOLR-773.patch, SOLR-773.patch, 
> solrGeoQuery.tar, spatial-solr.tar.gz
>
>
> Local Lucene has been donated to the Lucene project.  It has some Solr 
> components, but we should evaluate how best to incorporate it into Solr.
> See http://lucene.markmail.org/message/orzro22sqdj3wows?q=LocalLucene




[jira] Commented: (LUCENE-2628) Extract OpenBitSet to Apache Commons

2010-08-30 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12904154#action_12904154
 ] 

Robert Muir commented on LUCENE-2628:
-

bq. Should be pretty strong reasons to break things out into modules. I don't 
want to have to piece together 15 tiny jars every time i write some simple 
lucene code.

I agree. an example that might fit this case would be queryparsing: 
* there is a mix of functionality strewn across lucene/solr. 
* most of them are 99% the same and differ only in one small piece
* it's a back compat nightmare (given javacc grammars etc)
* functionality and consistency lag behind, maybe due to the above: e.g. lack 
of span support in most

for something like that, i wouldn't mind an additional jar if it was all cleaned 
up and organized into a module, where things like Version could be dropped
and it could evolve naturally: real versions and you just keep using the old 
jar if you want exact behavior.



> Extract OpenBitSet to Apache Commons
> 
>
> Key: LUCENE-2628
> URL: https://issues.apache.org/jira/browse/LUCENE-2628
> Project: Lucene - Java
>  Issue Type: Wish
>Reporter: Stu Hood
>
> o.a.l.util.OpenBitSet is a great alternative to java.util.BitSet, and it is 
> generally useful outside of the search field. It would be great if OpenBitSet 
> were available outside of Lucene proper, perhaps as part of Apache Commons.
> Aside from the communication required to accomplish this, there is the small 
> issue of OpenBitSet extending o.a.l.search.DocIdSet in Lucene 3.0. There is 
> very little logic contained in DocIdSet, so it could probably become an 
> interface: Lucene proper could then extend the extracted version of OpenBitSet 
> to implement DocIdSet.
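
The arrangement proposed in the description can be sketched as follows. Every type here is a hypothetical, heavily simplified stand-in invented for illustration: `StandaloneBitSet` plays the role of an extracted OpenBitSet, `DocIdSetSketch` is a toy interface rather than Lucene's actual DocIdSet, and `LuceneBitSet` is the Lucene-side subclass that joins the two.

```java
import java.util.Arrays;

// Hypothetical stand-in for a bit set extracted to a Lucene-free library.
class StandaloneBitSet {
    private long[] words;

    StandaloneBitSet(int numBits) {
        words = new long[(numBits + 63) >>> 6];
    }

    // Sets the bit at a non-negative index, growing the backing array if needed.
    void set(int index) {
        int word = index >>> 6;
        if (word >= words.length) {
            words = Arrays.copyOf(words, word + 1);
        }
        words[word] |= 1L << index; // long shifts use the low 6 bits of index
    }

    boolean get(int index) {
        int word = index >>> 6;
        return word < words.length && (words[word] & (1L << index)) != 0;
    }

    // Linear scan for the next set bit at or after 'from'; -1 if there is none.
    int nextSetBit(int from) {
        for (int i = Math.max(from, 0); i < words.length * 64; i++) {
            if (get(i)) {
                return i;
            }
        }
        return -1;
    }
}

// Toy interface standing in for a DocIdSet that has been turned into an interface.
interface DocIdSetSketch {
    int nextDoc(int from);
}

// Lucene-side class: extends the extracted bit set and implements the interface.
class LuceneBitSet extends StandaloneBitSet implements DocIdSetSketch {
    LuceneBitSet(int numBits) {
        super(numBits);
    }

    @Override
    public int nextDoc(int from) {
        return nextSetBit(from);
    }

    public static void main(String[] args) {
        LuceneBitSet docs = new LuceneBitSet(128);
        docs.set(3);
        docs.set(70);
        System.out.println(docs.nextDoc(0));
        System.out.println(docs.nextDoc(4));
    }
}
```

The point of the sketch is only the dependency direction: the bit set knows nothing about search, and the search-specific contract is added by a thin subclass inside Lucene.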




[jira] Commented: (LUCENE-2628) Extract OpenBitSet to Apache Commons

2010-08-30 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12904147#action_12904147
 ] 

Mark Miller commented on LUCENE-2628:
-

Making modules for modules' sake is not the right path I think. Some modules 
make sense and will be helpful - but modules in general are a pain in the ass 
for devs compared to one jar. Should be pretty strong reasons to break things 
out into modules. I don't want to have to piece together 15 tiny jars every 
time i write some simple lucene code.

> Extract OpenBitSet to Apache Commons
> 
>
> Key: LUCENE-2628
> URL: https://issues.apache.org/jira/browse/LUCENE-2628
> Project: Lucene - Java
>  Issue Type: Wish
>Reporter: Stu Hood
>
> o.a.l.util.OpenBitSet is a great alternative to java.util.BitSet, and it is 
> generally useful outside of the search field. It would be great if OpenBitSet 
> were available outside of Lucene proper, perhaps as part of Apache Commons.
> Aside from the communication required to accomplish this, there is the small 
> issue of OpenBitSet extending o.a.l.search.DocIdSet in Lucene 3.0. There is 
> very little logic contained in DocIdSet, so it could probably become an 
> interface: Lucene proper could then extend the extracted version of OpenBitSet 
> to implement DocIdSet.




[jira] Commented: (LUCENE-2628) Extract OpenBitSet to Apache Commons

2010-08-30 Thread Simon Willnauer (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12904142#action_12904142
 ] 

Simon Willnauer commented on LUCENE-2628:
-

I agree with Robert and Shai on this issue and that other folks should just 
pull out the class they need. Adding it to commons makes sense and they should 
likely go that way. Still we should try to move the modularization forward as 
we did with analyzers etc. 
Yet making a module out of almost 100% lucene.internal API doesn't make sense 
and would bring lots of disadvantages as folks have outlined above. 

bq. But I'll fight this fight in another issue when I propose such a module 
Once it is beneficial for Lucene I am with you! It was only one more option 
that came to my mind.

That said, I suggest we close this issue and move forward. If nobody objects I 
am going to do that by the end of the day.

> Extract OpenBitSet to Apache Commons
> 
>
> Key: LUCENE-2628
> URL: https://issues.apache.org/jira/browse/LUCENE-2628
> Project: Lucene - Java
>  Issue Type: Wish
>Reporter: Stu Hood
>
> o.a.l.util.OpenBitSet is a great alternative to java.util.BitSet, and it is 
> generally useful outside of the search field. It would be great if OpenBitSet 
> were available outside of Lucene proper, perhaps as part of Apache Commons.
> Aside from the communication required to accomplish this, there is the small 
> issue of OpenBitSet extending o.a.l.search.DocIdSet in Lucene 3.0. There is 
> very little logic contained in DocIdSet, so it could probably become an 
> interface: Lucene proper could then extend the extracted version of OpenBitSet 
> to implement DocIdSet.




[jira] Commented: (SOLR-2093) regular expression in PatternReplaceFilter can handle: /([^/]*)

2010-08-30 Thread Kuri Masta (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2093?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12904140#action_12904140
 ] 

Kuri Masta commented on SOLR-2093:
--

With a/b/c as input

You'll notice that I start searching from the end of the line.
1. (a$)      match everything to the left until: /
2. (/)       match /
3. ($1 = b)  repeat the previous but capture the match
4. (/)       match /

I wouldn't even know how to write a regexp that concatenates two separate 
matches, divided by '/', into one var.

Before I posted I tried two regexp tools besides Solr.

I would like you to try again. But please keep in mind that I don't need this 
fix, I just found a bug and am reporting it.

> regular expression in PatternReplaceFilter can handle: /([^/]*)
> ---
>
> Key: SOLR-2093
> URL: https://issues.apache.org/jira/browse/SOLR-2093
> Project: Solr
>  Issue Type: Bug
>  Components: Schema and Analysis
>Affects Versions: 1.4
> Environment: debian,JRE1.6,solr1.4
>Reporter: Kuri Masta
>Priority: Minor
>   Original Estimate: 2h
>  Remaining Estimate: 2h
>
> Using PatternReplaceFilter I want to extract a certain word out of the URI.
> Although I now understand that I should handle this outside of Solr, the fact 
> remains that Solr does not adequately handle regular expressions.
> Viewing the source code, I don't see any problems since it uses the java 
> library.
> The problem:
>   
> 
>  pattern="/([^/]*)/[^/]*$" replacement="$1"  
> replace="all" />
>   
> Input text:
> - a/b/c
> Expected
> - b
> Result Solr
> - ab
> An online JAVA regexp tester (http://www.regexplanet.com/simple/index.html):
> - b
> So the problem area lies at /([^/])
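
The reported behaviour can be reproduced outside Solr with plain java.util.regex, which is the library the description says the filter uses. The sketch below does not exercise PatternReplaceFilter itself, and the class and method names are made up for this example. It shows that a replace-all only rewrites the matched portion of the input: the pattern from the report matches "/b/c", so the leading "a" is left untouched. The second pattern, an assumed variant that anchors at the start so the whole input is consumed, yields the expected value.

```java
import java.util.regex.Pattern;

// Illustration with plain java.util.regex; names are hypothetical.
final class PatternReplaceDemo {
    private PatternReplaceDemo() {}

    // Replaces every match of 'regex' in 'input' with 'replacement'.
    static String replaceAll(String regex, String input, String replacement) {
        return Pattern.compile(regex).matcher(input).replaceAll(replacement);
    }

    public static void main(String[] args) {
        // Pattern from the report: only "/b/c" is matched and replaced by "b".
        System.out.println(replaceAll("/([^/]*)/[^/]*$", "a/b/c", "$1"));    // prints "ab"

        // Assumed variant: a leading ^.* makes the match cover the whole input.
        System.out.println(replaceAll("^.*/([^/]*)/[^/]*$", "a/b/c", "$1")); // prints "b"
    }
}
```

A tester that prints only the captured group would show "b" for the original pattern, which may explain the difference between the two results quoted in the description.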




[jira] Commented: (LUCENE-2628) Extract OpenBitSet to Apache Commons

2010-08-30 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12904132#action_12904132
 ] 

Robert Muir commented on LUCENE-2628:
-

Can you give a concrete example how a "utility jar" would be useful?

I didn't think so.

> Extract OpenBitSet to Apache Commons
> 
>
> Key: LUCENE-2628
> URL: https://issues.apache.org/jira/browse/LUCENE-2628
> Project: Lucene - Java
>  Issue Type: Wish
>Reporter: Stu Hood
>
> o.a.l.util.OpenBitSet is a great alternative to java.util.BitSet, and it is 
> generally useful outside of the search field. It would be great if OpenBitSet 
> were available outside of Lucene proper, perhaps as part of Apache Commons.
> Aside from the communication required to accomplish this, there is the small 
> issue of OpenBitSet extending o.a.l.search.DocIdSet in Lucene 3.0. There is 
> very little logic contained in DocIdSet, so it could probably become an 
> interface: Lucene proper could then extend the extracted version of OpenBitSet 
> to implement DocIdSet.




[jira] Commented: (LUCENE-1344) Make the Lucene jar an OSGi bundle

2010-08-30 Thread Enrico Schnepel (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-1344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12904112#action_12904112
 ] 

Enrico Schnepel commented on LUCENE-1344:
-

Is there any progress on this issue?

> Make the Lucene jar an OSGi bundle
> --
>
> Key: LUCENE-1344
> URL: https://issues.apache.org/jira/browse/LUCENE-1344
> Project: Lucene - Java
>  Issue Type: Improvement
>  Components: Build
>Reporter: Nicolas Lalevée
>Priority: Minor
> Attachments: LUCENE-1344-r679133.patch, LUCENE-1344-r690675.patch, 
> LUCENE-1344-r690691.patch, LUCENE-1344-r696747.patch, LUCENE-1344.patch, 
> LUCENE-1344.patch, LUCENE-1344.patch, LUCENE-1344.patch, LUCENE-1344.patch, 
> LUCENE-1344.patch, MANIFEST.MF.diff
>
>
> In order to use Lucene in an OSGi environment, some additional headers are 
> needed in the manifest of the jar. As Lucene has no dependency, it is pretty 
> straightforward and it will be easy to maintain I think.




[jira] Commented: (SOLR-2013) ASCIIFoldingFilter => MappingCharFilterFactory as a mapping file

2010-08-30 Thread Koji Sekiguchi (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12904092#action_12904092
 ] 

Koji Sekiguchi commented on SOLR-2013:
--

I think this is ready to go. Any objections?

> ASCIIFoldingFilter => MappingCharFilterFactory as a mapping file
> 
>
> Key: SOLR-2013
> URL: https://issues.apache.org/jira/browse/SOLR-2013
> Project: Solr
>  Issue Type: New Feature
>Affects Versions: 3.1, Next
>Reporter: Steven Rowe
>Priority: Minor
> Fix For: 3.1, Next
>
> Attachments: mapping-FoldToASCII.txt, mapping-FoldToASCII.txt
>
>
> Attached is a mapping file to provide the equivalent of ASCIIFoldingFilter 
> through the MappingCharFilterFactory.
> I'm not sure where this should go in the source tree.
