date:20110721

[jira] [Commented] (SOLR-2382) DIH Cache Improvements

2011-07-21 Thread Noble Paul (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-2382?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13069414#comment-13069414
 ] 

Noble Paul commented on SOLR-2382:
--

SOLR-2382-solrwriter.patch is committed. Please check whether 
SOLR-2382-entities.patch needs an update.

> DIH Cache Improvements
> --
>
> Key: SOLR-2382
> URL: https://issues.apache.org/jira/browse/SOLR-2382
> Project: Solr
>  Issue Type: New Feature
>  Components: contrib - DataImportHandler
>Reporter: James Dyer
>Priority: Minor
> Attachments: SOLR-2382-dihwriter.patch, SOLR-2382-entities.patch, 
> SOLR-2382-properties.patch, SOLR-2382-properties.patch, 
> SOLR-2382-solrwriter.patch, SOLR-2382-solrwriter.patch, 
> SOLR-2382-solrwriter.patch, SOLR-2382.patch, SOLR-2382.patch, 
> SOLR-2382.patch, SOLR-2382.patch, SOLR-2382.patch, SOLR-2382.patch, 
> SOLR-2382.patch, SOLR-2382.patch
>
>
> Functionality:
>  1. Provide a pluggable caching framework for DIH so that users can choose a 
> cache implementation that best suits their data and application.
>  
>  2. Provide a means to temporarily cache a child Entity's data without 
> needing to create a special cached implementation of the Entity Processor 
> (such as CachedSqlEntityProcessor).
>  
>  3. Provide a means to write the final (root entity) DIH output to a cache 
> rather than to Solr.  Then provide a way for a subsequent DIH call to use the 
> cache as an Entity input.  Also provide the ability to do delta updates on 
> such persistent caches.
>  
>  4. Provide the ability to partition data across multiple caches that can 
> then be fed back into DIH and indexed either to varying Solr Shards, or to 
> the same Core in parallel.
> Use Cases:
>  1. We needed a flexible & scalable way to temporarily cache child-entity 
> data prior to joining to parent entities.
>   - Using SqlEntityProcessor with Child Entities can cause an "n+1 select" 
> problem.
>   - CachedSqlEntityProcessor only supports an in-memory HashMap as a Caching 
> mechanism and does not scale.
>   - There is no way to cache non-SQL inputs (ex: flat files, xml, etc).
>  
>  2. We needed the ability to gather data from long-running entities by a 
> process that runs separate from our main indexing process.
>   
>  3. We wanted the ability to do a delta import of only the entities that 
> changed.
>   - Lucene/Solr requires entire documents to be re-indexed, even if only a 
> few fields changed.
>   - Our data comes from 50+ complex sql queries and/or flat files.
>   - We do not want to incur overhead re-gathering all of this data if only 1 
> entity's data changed.
>   - Persistent DIH caches solve this problem.
>   
>  4. We want the ability to index several documents in parallel (using 1.4.1, 
> which did not have the "threads" parameter).
>  
>  5. In the future, we may need to use Shards, creating a need to easily 
> partition our source data into Shards.
> Implementation Details:
>  1. De-couple EntityProcessorBase from caching.  
>   - Created a new interface, DIHCache & two implementations:  
> - SortedMapBackedCache - An in-memory cache, used as default with 
> CachedSqlEntityProcessor (now deprecated).
> - BerkleyBackedCache - A disk-backed cache, dependent on bdb-je, tested 
> with je-4.1.6.jar
>- NOTE: the existing Lucene Contrib "db" project uses je-3.3.93.jar.  
> I believe this may be incompatible due to Generic Usage.
>- NOTE: I did not modify the ant script to automatically get this jar, 
> so to use or evaluate this patch, download bdb-je from 
> http://www.oracle.com/technetwork/database/berkeleydb/downloads/index.html 
>  
>  2. Allow Entity Processors to take a "cacheImpl" parameter to cause the 
> entity data to be cached (see EntityProcessorBase & DIHCacheProperties).
>  
>  3. Partially De-couple SolrWriter from DocBuilder
>   - Created a new interface DIHWriter, & two implementations:
>- SolrWriter (refactored)
>- DIHCacheWriter (allows DIH to write ultimately to a Cache).
>
>  4. Create a new Entity Processor, DIHCacheProcessor, which reads a 
> persistent Cache as DIH Entity Input.
>  
>  5. Support a "partition" parameter with both DIHCacheWriter and 
> DIHCacheProcessor to allow for easy partitioning of source entity data.
>  
>  6. Change the semantics of entity.destroy()
>   - Previously, it was being called on each iteration of 
> DocBuilder.buildDocument().
>   - Now it does one-time cleanup tasks (like closing or deleting a 
> disk-backed cache) once the entity processor is completed.
>   - The only out-of-the-box entity processor that previously implemented 
> destroy() was LineEntityProcessor, so this is not a very invasive change.
> General Notes:
> We are near completion in converting our search functionality from a legacy 
> search engine to Solr.  However, I found that DIH di
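To make the "pluggable cache" idea from the description above concrete, here is a minimal sketch, assuming a much simpler contract than the patch actually defines. The names RowCache, SortedMapRowCache, childRow and cacheChildren are hypothetical and are not the DIHCache API from SOLR-2382. It shows child rows being read once and looked up per parent key, which is how a cache avoids the "n+1 select" problem.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.SortedMap;
import java.util.TreeMap;

class DihCacheSketch {
    /** Hypothetical cache contract (NOT the DIHCache interface from the patch). */
    interface RowCache {
        void add(String key, Map<String, Object> row);
        List<Map<String, Object>> lookup(String key);
    }

    /** In-memory implementation backed by a sorted map, keyed on the join column. */
    static class SortedMapRowCache implements RowCache {
        private final SortedMap<String, List<Map<String, Object>>> rows = new TreeMap<>();

        @Override
        public void add(String key, Map<String, Object> row) {
            rows.computeIfAbsent(key, k -> new ArrayList<>()).add(row);
        }

        @Override
        public List<Map<String, Object>> lookup(String key) {
            return rows.getOrDefault(key, Collections.emptyList());
        }
    }

    /** Builds one child row; stands in for a row read from SQL, a flat file, etc. */
    static Map<String, Object> childRow(String parentId, String feature) {
        Map<String, Object> row = new HashMap<>();
        row.put("parent_id", parentId);
        row.put("feature", feature);
        return row;
    }

    /** Reads every child row once (one pass instead of one query per parent) and caches it. */
    static RowCache cacheChildren(List<Map<String, Object>> childRows) {
        RowCache cache = new SortedMapRowCache();
        for (Map<String, Object> row : childRows) {
            cache.add((String) row.get("parent_id"), row);
        }
        return cache;
    }

    public static void main(String[] args) {
        List<Map<String, Object>> children = new ArrayList<>();
        children.add(childRow("1", "red"));
        children.add(childRow("1", "large"));
        children.add(childRow("2", "blue"));
        RowCache cache = cacheChildren(children);
        // Each parent is joined by a cache lookup rather than a new select.
        for (String parentId : new String[] {"1", "2", "3"}) {
            System.out.println(parentId + " -> " + cache.lookup(parentId).size() + " child rows");
        }
    }
}
```

A disk-backed implementation could sit behind the same hypothetical interface, which is the point of making the cache pluggable.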

[jira] [Assigned] (SOLR-2668) DIH - multithreaded DocBuilder ignores onError Attribute

2011-07-21 Thread Shalin Shekhar Mangar (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-2668?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shalin Shekhar Mangar reassigned SOLR-2668:
---

Assignee: Shalin Shekhar Mangar

> DIH - multithreaded DocBuilder ignores onError Attribute
> 
>
> Key: SOLR-2668
> URL: https://issues.apache.org/jira/browse/SOLR-2668
> Project: Solr
>  Issue Type: Bug
>  Components: contrib - DataImportHandler
>Affects Versions: 3.3
>Reporter: Frank Wesemann
>Assignee: Shalin Shekhar Mangar
> Attachments: SOLR-2668.patch
>
>
> If the EntityProcessor of a subentity throws an Exception in its init() 
> Method, DocBuilder ignores onError=continue or skip attributes on the parent 
> entity. DocBuilder stops immediately and logs "Import completed successfully".
>  

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

[jira] [Updated] (SOLR-2659) src/test-files/ should be moved under src/test-files// for all Solr modules except core

2011-07-21 Thread Steven Rowe (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-2659?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steven Rowe updated SOLR-2659:
--

Description: 
SOLR-2452 split the solrj & common tests and test-files out from under 
{{solr/src/test\{,-files\}}} and placed them under {{solr/solrj/}}.

Because IntelliJ's dependency scheme can't directly support the dependencies 
among the {{core/}}, {{solrj/}}, and {{test-framework/}} internal modules, 
IntelliJ runs {{core/}} and {{solrj/}} tests under the monolithic IntelliJ 
"solr" module, 

As a result, when IntelliJ copies {{core/src/test-files/\*\*}} and 
{{solrj/src/test-files/\*\*}} to {{solr/build/solr-idea/classes/test/}} (the 
test output directory), only one file from each same-named file pair can reside 
in the target directory, e.g. {{solr/conf/schema.xml}}.  When same-named files 
differ between the two {{test-files/}} directories, tests will fail.  E.g.: 
LUCENE-2048 introduced a {{nopositions}} fieldType and a {{nopositionstext}} 
field into {{core/src/test-files/solr/conf/schema.xml}}, but not into the 
same-named file under {{solrj/src/test-files/}}, so when IntelliJ chooses the 
solrj version when copying resources, the core test that depends on the 
{{nopositionstext}} field ({{TestOmitPositions}}) will fail.

I propose adding an extra directory level under {{solrj/src/test-files/}}: 
{{solrj/src/test-files/solrj/}}.  That way, files from {{core/src/test-files/}} 
can have the same names, but still co-exist when copied to the test output 
directory by IntelliJ.

To maintain consistency, as well as avoid future naming conflicts, all other 
solr modules except core should switch to the same layout: 
{{src/test-files//\*}}.  Currently all contribs' solr homes are 
named {{src/test-files/solr-/}} - these directories should be 
renamed to {{src/test-files//solr}}.

  was:
SOLR-2452 split the solrj & common tests and test-files out from under 
{{solr/src/test\{,-files\}}} and placed them under {{solr/solrj/}}.

Because IntelliJ's dependency scheme can't directly support the dependencies 
among the {{core/}}, {{solrj/}}, and {{test-framework/}} internal modules, 
IntelliJ runs {{core/}} and {{solrj/}} tests under the monolithic IntelliJ 
"solr" module, 

As a result, when IntelliJ copies {{core/src/test-files/\*\*}} and 
{{solrj/src/test-files/\*\*}} to {{solr/build/solr-idea/classes/test/}} (the 
test output directory), only one file from each same-named file pair can reside 
in the target directory, e.g. {{solr/conf/schema.xml}}.  When same-named files 
differ between the two {{test-files/}} directories, tests will fail.  E.g.: 
LUCENE-2048 introduced a {{nopositions}} fieldType and a {{nopositionstext}} 
field into {{core/src/test-files/solr/conf/schema.xml}}, but not into the 
same-named file under {{solrj/src/test-files/}}, so when IntelliJ chooses the 
solrj version when copying resources, the core test that depends on the 
{{nopositionstext}} field ({{TestOmitPositions}}) will fail.

I propose adding an extra directory level under {{solrj/src/test-files/}}: 
{{solrj/src/test-files/solrj/}}.  That way, files from {{core/src/test-files/}} 
can have the same names, but still co-exist when copied to the test output 
directory by IntelliJ.

Summary: src/test-files/** should be moved under 
src/test-files//** for all Solr modules except core  (was: 
IntelliJ resource copying of solrj/src/test-files/** and core/src/test-files/** 
to build output directory has to choose between/overwrite same-named files)

> src/test-files/** should be moved under src/test-files//** for 
> all Solr modules except core
> 
>
> Key: SOLR-2659
> URL: https://issues.apache.org/jira/browse/SOLR-2659
> Project: Solr
>  Issue Type: Improvement
>  Components: Build
>Affects Versions: 3.4, 4.0
>Reporter: Steven Rowe
>Assignee: Steven Rowe
>Priority: Minor
> Attachments: SOLR-2659.patch
>
>
> SOLR-2452 split the solrj & common tests and test-files out from under 
> {{solr/src/test\{,-files\}}} and placed them under {{solr/solrj/}}.
> Because IntelliJ's dependency scheme can't directly support the dependencies 
> among the {{core/}}, {{solrj/}}, and {{test-framework/}} internal modules, 
> IntelliJ runs {{core/}} and {{solrj/}} tests under the monolithic IntelliJ 
> "solr" module, 
> As a result, when IntelliJ copies {{core/src/test-files/\*\*}} and 
> {{solrj/src/test-files/\*\*}} to {{solr/build/solr-idea/classes/test/}} (the 
> test output directory), only one file from each same-named file pair can 
> reside in the target directory, e.g. {{solr/conf/schema.xml}}.  When 
> same-named files differ between the two {{test-files/}} directories, tests 
> will fail.  E.g.: LUCENE-2048 introduced a {{nopositions}} f

[jira] [Commented] (SOLR-2656) realtime get

2011-07-21 Thread Yonik Seeley (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-2656?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13069351#comment-13069351
 ] 

Yonik Seeley commented on SOLR-2656:


Whew, it was just a test bug.  A tripped assert (that I had backwards) doesn't 
trigger a catch(Exception e), so the read threads that decrement the counter 
all exited, leaving the write threads spinning forever.
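
The failure mode described above can be reproduced in isolation. The following is a minimal sketch, not the actual test from SOLR-2656_test.patch; the class and method names are made up for illustration. It relies only on the fact that AssertionError extends Error rather than Exception, so a catch(Exception e) block does not see it and the reader thread dies with the counter still above zero.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

class AssertEscapesCatch {
    /**
     * Starts one reader thread that trips an AssertionError on its first operation.
     * Returns {operations left on the counter, 1 if the catch block ran else 0}.
     */
    static int[] runReader(int totalOperations) {
        AtomicInteger operations = new AtomicInteger(totalOperations);
        AtomicBoolean handlerRan = new AtomicBoolean(false);
        Thread reader = new Thread(() -> {
            try {
                while (operations.decrementAndGet() >= 0) {
                    // Stand-in for a tripped assert (an assert written backwards).
                    throw new AssertionError("assert written backwards");
                }
            } catch (Exception e) {
                // Never reached: AssertionError is an Error, not an Exception.
                handlerRan.set(true);
            }
        });
        // Swallow the uncaught error so the demo output stays quiet.
        reader.setUncaughtExceptionHandler((thread, error) -> { });
        reader.start();
        try {
            reader.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return new int[] { operations.get(), handlerRan.get() ? 1 : 0 };
    }

    public static void main(String[] args) {
        int[] result = runReader(5);
        // A writer looping "while (operations.get() > 0)" would now spin forever.
        System.out.println("operations left=" + result[0] + ", handler ran=" + (result[1] == 1));
    }
}
```

Catching Throwable in the reader (or failing the test from an uncaught-exception handler) would surface the tripped assert instead of leaving the other threads waiting.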

> realtime get
> 
>
> Key: SOLR-2656
> URL: https://issues.apache.org/jira/browse/SOLR-2656
> Project: Solr
>  Issue Type: New Feature
>  Components: search
>Reporter: Yonik Seeley
>Assignee: Yonik Seeley
> Attachments: SOLR-2656.patch, SOLR-2656_test.patch
>
>
> Provide a non point-in-time interface to get a document.
> For example, if you add a new document, you will be able to get it, 
> regardless of if the searcher has been refreshed.


[jira] [Updated] (SOLR-2656) realtime get

2011-07-21 Thread Yonik Seeley (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-2656?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yonik Seeley updated SOLR-2656:
---

Attachment: SOLR-2656_test.patch

I coded up a test and then factored it out, since it's probably a good test 
even before we get realtime get committed (with the percent of realtime queries 
set to 0).

Bad news is, I'm getting a hang for some reason (just the test w/ straight 
trunk).  Currently looking into it further, but I thought I'd put up the patch 
in the meantime anyway.

> realtime get
> 
>
> Key: SOLR-2656
> URL: https://issues.apache.org/jira/browse/SOLR-2656
> Project: Solr
>  Issue Type: New Feature
>  Components: search
>Reporter: Yonik Seeley
>Assignee: Yonik Seeley
> Attachments: SOLR-2656.patch, SOLR-2656_test.patch
>
>
> Provide a non point-in-time interface to get a document.
> For example, if you add a new document, you will be able to get it, 
> regardless of if the searcher has been refreshed.


how to customize lucene index file

2011-07-21 Thread Ari . Ko


Hi, good morning everyone.

I would like to ask one question about the Lucene index file.

I want to know whether I can customize this index file.

I know this index file is created by Lucene in the background from the
documents directory, and that all of the words and their frequencies are
written to the index file.

In fact I want to calculate the weight of each word and write only the
words whose weight is high to the index file.

According to the index creation method as below

***
...
IndexWriter writer = new IndexWriter(index, new JapaneseAnalyzer(), create);
...
Document doc = new Document();
doc.add(Field.Text("contents",reader));

...
writer.addDocument(doc);
***

It seems impossible. But I really want to know whether there is some way to do
this in Lucene.

In fact I want to use Lucene not for retrieval but for text clustering.

Best regards.

Yali Hu
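
For illustration only, the weighting step asked about above can be sketched in plain Java, independent of Lucene. This is a sketch under stated assumptions: the weight is taken to be tf * ln(N / df), tokens are split on whitespace, and the class and method names are hypothetical. It uses no Lucene API and does not change what Lucene writes to its index files.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;

class TopWeightedTerms {
    /** Returns the k highest-weighted terms of docs.get(docIndex), using tf * ln(N / df). */
    static List<String> topTerms(List<String> docs, int docIndex, int k) {
        // Document frequency: in how many documents each term occurs.
        Map<String, Integer> df = new HashMap<>();
        for (String doc : docs) {
            for (String term : new HashSet<>(Arrays.asList(tokenize(doc)))) {
                df.merge(term, 1, Integer::sum);
            }
        }
        // Term frequency within the chosen document.
        Map<String, Integer> tf = new HashMap<>();
        for (String term : tokenize(docs.get(docIndex))) {
            tf.merge(term, 1, Integer::sum);
        }
        // Weight of each term in the chosen document.
        Map<String, Double> weights = new HashMap<>();
        for (Map.Entry<String, Integer> e : tf.entrySet()) {
            double idf = Math.log((double) docs.size() / df.get(e.getKey()));
            weights.put(e.getKey(), e.getValue() * idf);
        }
        // Highest weight first; ties broken alphabetically for a stable result.
        List<String> terms = new ArrayList<>(weights.keySet());
        terms.sort((a, b) -> {
            int byWeight = Double.compare(weights.get(b), weights.get(a));
            return byWeight != 0 ? byWeight : a.compareTo(b);
        });
        return new ArrayList<>(terms.subList(0, Math.min(k, terms.size())));
    }

    /** Naive whitespace tokenizer; a real setup would use an analyzer instead. */
    static String[] tokenize(String text) {
        return text.trim().toLowerCase().split("\\s+");
    }

    public static void main(String[] args) {
        List<String> docs = Arrays.asList(
            "lucene index index file",
            "lucene search",
            "lucene clustering text");
        System.out.println(topTerms(docs, 0, 2)); // prints [index, file]
    }
}
```

The selected terms could then be used as the feature vector for clustering, which is the stated goal of the question.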





[jira] [Updated] (SOLR-2668) DIH - multithreaded DocBuilder ignores onError Attribute

2011-07-21 Thread Frank Wesemann (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-2668?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Frank Wesemann updated SOLR-2668:
-

Attachment: SOLR-2668.patch

patch for the unittest to clarify the problem

> DIH - multithreaded DocBuilder ignores onError Attribute
> 
>
> Key: SOLR-2668
> URL: https://issues.apache.org/jira/browse/SOLR-2668
> Project: Solr
>  Issue Type: Bug
>  Components: contrib - DataImportHandler
>Affects Versions: 3.3
>Reporter: Frank Wesemann
> Attachments: SOLR-2668.patch
>
>
> If the EntityProcessor of a subentity throws an Exception in its init() 
> Method, DocBuilder ignores onError=continue or skip attributes on the parent 
> entity. DocBuilder stops immediately and logs "Import completed successfully".
>  


[jira] [Created] (SOLR-2668) DIH - multithreaded DocBuilder ignores onError Attribute

2011-07-21 Thread Frank Wesemann (JIRA)

DIH - multithreaded DocBuilder ignores onError Attribute


 Key: SOLR-2668
 URL: https://issues.apache.org/jira/browse/SOLR-2668
 Project: Solr
  Issue Type: Bug
  Components: contrib - DataImportHandler
Affects Versions: 3.3
Reporter: Frank Wesemann


If the EntityProcessor of a subentity throws an Exception in its init() Method, 
DocBuilder ignores onError=continue or skip attributes on the parent entity. 
DocBuilder stops immediately and logs "Import completed successfully".
 


[jira] [Updated] (LUCENE-3097) Post grouping faceting

2011-07-21 Thread Martijn van Groningen (JIRA)


 [ 
https://issues.apache.org/jira/browse/LUCENE-3097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Martijn van Groningen updated LUCENE-3097:
--

Attachment: LUCENE-30971.patch

Updated the patch.
* Included the grouping collector into the random test.
* Added more documentation.

I think this collector is ready to be committed. This collector implements the 
second grouping / faceting case that I've described in the issue description.

> Post grouping faceting
> --
>
> Key: LUCENE-3097
> URL: https://issues.apache.org/jira/browse/LUCENE-3097
> Project: Lucene - Java
>  Issue Type: New Feature
>  Components: modules/grouping
>Reporter: Martijn van Groningen
>Assignee: Martijn van Groningen
>Priority: Minor
> Fix For: 3.4, 4.0
>
> Attachments: LUCENE-3097.patch, LUCENE-3097.patch, LUCENE-3097.patch, 
> LUCENE-30971.patch
>
>
> This issue focuses on implementing post grouping faceting.
> * How to handle multivalued fields. What field value to show with the facet.
> * What the facet counts should be based on
> ** Facet counts can be based on the normal documents. Ungrouped counts. 
> ** Facet counts can be based on the groups. Grouped counts.
> ** Facet counts can be based on the combination of group value and facet 
> value. Matrix counts.   
> And probably more implementation options.
> The first two methods are implemented in the SOLR-236 patch. For the first 
> option it calculates a DocSet based on the individual documents from the 
> query result. For the second option it calculates a DocSet for all the most 
> relevant documents of a group. Once the DocSet is computed the FacetComponent 
> and StatsComponent use the DocSet to create facets and statistics.  
> The third option (matrix counts) is a bit more complex. I think it is best 
> explained with an example. Let's say we search on travel offers:
> ||hotel||departure_airport||duration||
> |Hotel a|AMS|5|
> |Hotel a|DUS|10|
> |Hotel b|AMS|5|
> |Hotel b|AMS|10|
> If we group by hotel and have a facet for airport, most end users expect 
> (in my experience, of course) the following airport facet:
> AMS: 2
> DUS: 1
> The above result can't be achieved by the first two methods. You either get 
> counts AMS:3 and DUS:1 or 1 for both airports.
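
The three counting modes in the description can be worked through on the four travel offers from the example. This is a minimal sketch in plain Java, not the collector from the patch; the class and method names are hypothetical, and for the grouped case it assumes that the most relevant document is the DUS offer for Hotel a and the first AMS offer for Hotel b.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

class PostGroupFacetCounts {
    /** Rows are {hotel, departure_airport, duration}, as in the issue description. */
    static final String[][] OFFERS = {
        {"Hotel a", "AMS", "5"},
        {"Hotel a", "DUS", "10"},
        {"Hotel b", "AMS", "5"},
        {"Hotel b", "AMS", "10"},
    };

    /** Ungrouped counts: every matching document contributes to its facet value. */
    static Map<String, Integer> ungrouped(String[][] docs) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String[] doc : docs) {
            counts.merge(doc[1], 1, Integer::sum);
        }
        return counts;
    }

    /** Grouped counts: only the most relevant document of each group contributes. */
    static Map<String, Integer> grouped(String[][] docs, int[] groupHeadIndexes) {
        Map<String, Integer> counts = new TreeMap<>();
        for (int index : groupHeadIndexes) {
            counts.merge(docs[index][1], 1, Integer::sum);
        }
        return counts;
    }

    /** Matrix counts: each distinct (group value, facet value) pair contributes once. */
    static Map<String, Integer> matrix(String[][] docs) {
        Set<List<String>> seenPairs = new HashSet<>();
        Map<String, Integer> counts = new TreeMap<>();
        for (String[] doc : docs) {
            if (seenPairs.add(Arrays.asList(doc[0], doc[1]))) {
                counts.merge(doc[1], 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        System.out.println("ungrouped: " + ungrouped(OFFERS));                  // {AMS=3, DUS=1}
        System.out.println("grouped:   " + grouped(OFFERS, new int[] {1, 2})); // {AMS=1, DUS=1}
        System.out.println("matrix:    " + matrix(OFFERS));                     // {AMS=2, DUS=1}
    }
}
```

Only the matrix variant yields the AMS: 2, DUS: 1 facet that the description says end users expect.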


Re: [VOTE] Release PyLucene 3.3 (rc3)

2011-07-21 Thread Christian Heimes

Am 21.07.2011 18:47, schrieb Andi Vajda:
> Please vote to release these artifacts as PyLucene 3.3-3.

+1 from me

I've tested PyLucene on Linux (Ubuntu 10.04 as well as 11.04 on X86_64)
with Python 2.6 and 2.7. My bobobrowse integration and the unit test
suite of our applications are working, too. The grouping module is
available, too. :)

Christian

[jira] [Commented] (SOLR-2649) MM ignored in edismax queries with operators

2011-07-21 Thread Hoss Man (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-2649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13069215#comment-13069215
 ] 

Hoss Man commented on SOLR-2649:


I believe the intention here was that *if* a query string contains any query 
operators (AND/OR/NOT/+/-) then it's assumed the user wants *exactly* what they 
asked for, and the "mm" value should not be used.

I believe in the cases where {{false==doMinMatched}} then the {{q.op}} (which 
defaults to {{}} should come into play, 
so folks using {{mm=100%&q.op=AND}} or {{mm=0&q.op=OR}} should already get the 
behavior they expect (if it's not using q.op then that definitely seems like a 
bug)

When people are using middle-ground values for mm (i.e. {{mm=50%}} etc.), then 
it definitely seems like we need some way for them to indicate to edismax that 
the mm should *always* be used.

> MM ignored in edismax queries with operators
> 
>
> Key: SOLR-2649
> URL: https://issues.apache.org/jira/browse/SOLR-2649
> Project: Solr
>  Issue Type: Bug
>  Components: search
>Affects Versions: 3.3
>Reporter: Magnus Bergmark
>Priority: Minor
>
> Hypothetical scenario:
>   1. User searches for "stocks oil gold" with MM set to "50%"
>   2. User adds "-stockings" to the query: "stocks oil gold -stockings"
>   3. User gets no hits since MM was ignored and all terms were AND-ed 
> together
> The behavior seems to be intentional, although the reason why is never 
> explained:
>   // For correct lucene queries, turn off mm processing if there
>   // were explicit operators (except for AND).
>   boolean doMinMatched = (numOR + numNOT + numPluses + numMinuses) == 0; 
> (lines 232-234 taken from 
> tags/lucene_solr_3_3/solr/src/java/org/apache/solr/search/ExtendedDismaxQParserPlugin.java)
> This makes edismax unsuitable as a replacement for dismax; mm is one of the 
> primary features of dismax.
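
The quoted rule can be tried out on the scenario from the report. This is a minimal sketch: only the final boolean expression is taken from the quoted source line, while the whitespace tokenization and operator counting are simplifying assumptions and not the actual parsing done by ExtendedDismaxQParserPlugin.

```java
class MinMatchDecision {
    /**
     * Counts explicit operators in a whitespace-tokenized query string and applies
     * the rule quoted in the issue: mm is only used when no OR/NOT/+/- operators occur.
     */
    static boolean doMinMatched(String query) {
        int numOR = 0, numNOT = 0, numPluses = 0, numMinuses = 0;
        for (String token : query.trim().split("\\s+")) {
            if (token.equals("OR")) {
                numOR++;
            } else if (token.equals("NOT")) {
                numNOT++;
            } else if (token.startsWith("+")) {
                numPluses++;
            } else if (token.startsWith("-")) {
                numMinuses++;
            }
        }
        // Same expression as in the quoted source; AND is deliberately not counted.
        return (numOR + numNOT + numPluses + numMinuses) == 0;
    }

    public static void main(String[] args) {
        System.out.println(doMinMatched("stocks oil gold"));            // true: mm=50% applies
        System.out.println(doMinMatched("stocks oil gold -stockings")); // false: mm is ignored
    }
}
```

Under this rule a single "-stockings" term is enough to switch mm off for the whole query, which matches step 3 of the scenario.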


[jira] [Updated] (LUCENE-777) SpanWithinQuery - A SpanNotQuery that allows a specified number of intersections

2011-07-21 Thread Mark Miller (JIRA)


 [ 
https://issues.apache.org/jira/browse/LUCENE-777?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Miller updated LUCENE-777:
---

Attachment: LUCENE-777-3X.patch

patch for 3x

> SpanWithinQuery - A SpanNotQuery that allows a specified number of 
> intersections
> 
>
> Key: LUCENE-777
> URL: https://issues.apache.org/jira/browse/LUCENE-777
> Project: Lucene - Java
>  Issue Type: New Feature
>  Components: core/search
>Reporter: Mark Miller
>Priority: Minor
> Attachments: LUCENE-777-3X.patch, LUCENE-777.patch, LUCENE-777.patch, 
> SpanWithinQuery.java
>
>
> A SpanNotQuery that allows a specified number of intersections.


[jira] [Updated] (SOLR-2382) DIH Cache Improvements

2011-07-21 Thread James Dyer (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-2382?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

James Dyer updated SOLR-2382:
-

Attachment: SOLR-2382-solrwriter.patch

Here is the "solrwriter" patch, sync'ed to the latest trunk, which now has the 
first patch ("properties") committed...

> DIH Cache Improvements
> --
>
> Key: SOLR-2382
> URL: https://issues.apache.org/jira/browse/SOLR-2382
> Project: Solr
>  Issue Type: New Feature
>  Components: contrib - DataImportHandler
>Reporter: James Dyer
>Priority: Minor
> Attachments: SOLR-2382-dihwriter.patch, SOLR-2382-entities.patch, 
> SOLR-2382-properties.patch, SOLR-2382-properties.patch, 
> SOLR-2382-solrwriter.patch, SOLR-2382-solrwriter.patch, 
> SOLR-2382-solrwriter.patch, SOLR-2382.patch, SOLR-2382.patch, 
> SOLR-2382.patch, SOLR-2382.patch, SOLR-2382.patch, SOLR-2382.patch, 
> SOLR-2382.patch, SOLR-2382.patch
>
>
> Functionality:
>  1. Provide a pluggable caching framework for DIH so that users can choose a 
> cache implementation that best suits their data and application.
>  
>  2. Provide a means to temporarily cache a child Entity's data without 
> needing to create a special cached implementation of the Entity Processor 
> (such as CachedSqlEntityProcessor).
>  
>  3. Provide a means to write the final (root entity) DIH output to a cache 
> rather than to Solr.  Then provide a way for a subsequent DIH call to use the 
> cache as an Entity input.  Also provide the ability to do delta updates on 
> such persistent caches.
>  
>  4. Provide the ability to partition data across multiple caches that can 
> then be fed back into DIH and indexed either to varying Solr Shards, or to 
> the same Core in parallel.
> Use Cases:
>  1. We needed a flexible & scalable way to temporarily cache child-entity 
> data prior to joining to parent entities.
>   - Using SqlEntityProcessor with Child Entities can cause an "n+1 select" 
> problem.
>   - CachedSqlEntityProcessor only supports an in-memory HashMap as a Caching 
> mechanism and does not scale.
>   - There is no way to cache non-SQL inputs (ex: flat files, xml, etc).
>  
>  2. We needed the ability to gather data from long-running entities by a 
> process that runs separate from our main indexing process.
>   
>  3. We wanted the ability to do a delta import of only the entities that 
> changed.
>   - Lucene/Solr requires entire documents to be re-indexed, even if only a 
> few fields changed.
>   - Our data comes from 50+ complex sql queries and/or flat files.
>   - We do not want to incur overhead re-gathering all of this data if only 1 
> entity's data changed.
>   - Persistent DIH caches solve this problem.
>   
>  4. We want the ability to index several documents in parallel (using 1.4.1, 
> which did not have the "threads" parameter).
>  
>  5. In the future, we may need to use Shards, creating a need to easily 
> partition our source data into Shards.
> Implementation Details:
>  1. De-couple EntityProcessorBase from caching.  
>   - Created a new interface, DIHCache & two implementations:  
> - SortedMapBackedCache - An in-memory cache, used as default with 
> CachedSqlEntityProcessor (now deprecated).
> - BerkleyBackedCache - A disk-backed cache, dependent on bdb-je, tested 
> with je-4.1.6.jar
>- NOTE: the existing Lucene Contrib "db" project uses je-3.3.93.jar.  
> I believe this may be incompatible due to Generic Usage.
>- NOTE: I did not modify the ant script to automatically get this jar, 
> so to use or evaluate this patch, download bdb-je from 
> http://www.oracle.com/technetwork/database/berkeleydb/downloads/index.html 
>  
>  2. Allow Entity Processors to take a "cacheImpl" parameter to cause the 
> entity data to be cached (see EntityProcessorBase & DIHCacheProperties).
>  
>  3. Partially De-couple SolrWriter from DocBuilder
>   - Created a new interface DIHWriter, & two implementations:
>- SolrWriter (refactored)
>- DIHCacheWriter (allows DIH to write ultimately to a Cache).
>
>  4. Create a new Entity Processor, DIHCacheProcessor, which reads a 
> persistent Cache as DIH Entity Input.
>  
>  5. Support a "partition" parameter with both DIHCacheWriter and 
> DIHCacheProcessor to allow for easy partitioning of source entity data.
>  
>  6. Change the semantics of entity.destroy()
>   - Previously, it was being called on each iteration of 
> DocBuilder.buildDocument().
>   - Now it does one-time cleanup tasks (like closing or deleting a 
> disk-backed cache) once the entity processor is completed.
>   - The only out-of-the-box entity processor that previously implemented 
> destroy() was LineEntityProcessor, so this is not a very invasive change.
> General Notes:
> We are near completion in converting our search functionality from a legacy 
> search engine to Solr.  However, I f

[jira] [Updated] (LUCENE-777) SpanWithinQuery - A SpanNotQuery that allows a specified number of intersections

2011-07-21 Thread Mark Miller (JIRA)


 [ 
https://issues.apache.org/jira/browse/LUCENE-777?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Miller updated LUCENE-777:
---

Attachment: LUCENE-777.patch

New patch with new test from Peter Keegan and a slight change to make that test 
pass.

> SpanWithinQuery - A SpanNotQuery that allows a specified number of 
> intersections
> 
>
> Key: LUCENE-777
> URL: https://issues.apache.org/jira/browse/LUCENE-777
> Project: Lucene - Java
>  Issue Type: New Feature
>  Components: core/search
>Reporter: Mark Miller
>Priority: Minor
> Attachments: LUCENE-777.patch, LUCENE-777.patch, SpanWithinQuery.java
>
>
> A SpanNotQuery that allows a specified number of intersections.


Re: building PyLucene 3.0.2 on Win7/MinGW with Python 2.7

2011-07-21 Thread Bill Janssen

Thomas Koch  wrote:

> Bill,
> I just read through your posting about the MinGW issues in "DLL Hell". Did
> you ever manage to get MinGW compile JCC and link against msvcr90.dll?

No, but I didn't try very hard.  Other things came up and I haven't
gotten back to it yet.

Bill

> I think I'm facing a similar issue (see post of today) and tried to change
> MinGW spec to use msvcr90 (as mentioned in
> http://www.mingw.org/wiki/HOWTO_Use_the_GCC_specs_file) but then "python
> setup.py build --compiler=mingw32" runs in to ldd issues (I guess that's
> because I still need to "hack" minGW itself...) - see attached output.
> 
> > I'll try that -- getting mingw to use the same C library that Python
> > uses.  Looks like you can do an in-place update -- the pyMinGW toolkit
> > provides a tool which does that.
> >
> (How) did you manage that?
> 
> Regards
> Thomas
> --
> Here's the output:
> 
> I:\Software\Python26\PyLucene\src\pylucene-3.3-2\jcc>python setup.py build
> --compiler=mingw32
> ...
> running build_py
> writing I:\Software\Python26\PyLucene\src\pylucene-3.3-2\jcc\jcc\config.py
> copying jcc\config.py -> build\lib.win32-2.6\jcc
> copying jcc\jcc.lib -> build\lib.win32-2.6\jcc
> copying jcc\classes\org\apache\jcc\PythonVM.class ->
> build\lib.win32-2.6\jcc\classes\org\apache\jcc
> copying jcc\classes\org\apache\jcc\PythonException.class ->
> build\lib.win32-2.6\jcc\classes\org\apache\jcc
> running build_ext
> building 'jcc' extension
> C:\MinGW\bin\gcc.exe -mno-cygwin -mdll -O -Wall -D_jcc_lib -DJCC_VER="2.10"
> -IC:\Devel\Java\jdk1.6.0_14/include
> -IC:\Devel\Java\jdk1.6.0_14/include/win32 -I_jcc
> -Ijcc/sources -IC:\Devel\Python26\include -IC:\Devel\Python26\PC -c
> jcc/sources/jcc.cpp -o build\temp.win32-2.6\Release\jcc\sources\jcc.o
> -DPYTHON -specs=msvcr90 -fno-strict-aliasing -Wno-write-strings
> C:\MinGW\bin\gcc.exe -mno-cygwin -mdll -O -Wall -D_jcc_lib -DJCC_VER="2.10"
> -IC:\Devel\Java\jdk1.6.0_14/include
> -IC:\Devel\Java\jdk1.6.0_14/include/win32 -I_jcc
> -Ijcc/sources -IC:\Devel\Python26\include -IC:\Devel\Python26\PC -c
> jcc/sources/JCCEnv.cpp -o build\temp.win32-2.6\Release\jcc\sources\jccenv.o
> -DPYTHON -specs=msvcr90 -fno-strict-aliasing -Wno-write-strings
> writing build\temp.win32-2.6\Release\jcc\sources\jcc.def
> C:\MinGW\bin\g++.exe -mno-cygwin -shared
> -Wl,--out-implib,build\lib.win32-2.6\jcc\jcc.lib -s
> build\temp.win32-2.6\Release\jcc\sources\jcc.o
> build\temp.win32-2.6\Release\jcc\sources\jccenv.o
> build\temp.win32-2.6\Release\jcc\sources\jcc.def -LC:\Devel\Python26\libs
> -LC:\Devel\Python26\PCbuild -lpython26 -lmsvcr90
> -o build\lib.win32-2.6\jcc.dll -LC:\Devel\Java\jdk1.6.0_14/lib -ljvm -Wl,-S
> -Wl,--out-implib,jcc\jcc.lib
> Creating library file: jcc\jcc.lib
> c:/mingw/bin/../lib/gcc/mingw32/4.5.2/crtbegin.o:cygming-crtbegin.c:(.text+0xe): undefined reference to `GetModuleHandleA@4'
> c:/mingw/bin/../lib/gcc/mingw32/4.5.2/crtbegin.o:cygming-crtbegin.c:(.text+0x23): undefined reference to `GetProcAddress@8'
> c:/mingw/bin/../lib/gcc/mingw32/4.5.2/crtbegin.o:cygming-crtbegin.c:(.text+0x51): undefined reference to `GetModuleHandleA@4'
> c:/mingw/bin/../lib/gcc/mingw32/4.5.2/crtbegin.o:cygming-crtbegin.c:(.text+0x66): undefined reference to `GetProcAddress@8'
> c:/mingw/bin/../lib/gcc/mingw32/4.5.2/crtbegin.o:cygming-crtbegin.c:(.text+0x9a): undefined reference to `GetModuleHandleA@4'
> c:/mingw/bin/../lib/gcc/mingw32/4.5.2/crtbegin.o:cygming-crtbegin.c:(.text+0xaf): undefined reference to `GetProcAddress@8'
> build\temp.win32-2.6\Release\jcc\sources\jcc.o:jcc.cpp:(.text$_ZNK6JCCEnv10get_vm_envEv[JCCEnv::get_vm_env() const]+0xf): undefined reference to `TlsGetValue@4'
> build\temp.win32-2.6\Release\jcc\sources\jccenv.o:JCCEnv.cpp:(.text+0x10): undefined reference to `TlsAlloc@0'
> build\temp.win32-2.6\Release\jcc\sources\jccenv.o:JCCEnv.cpp:(.text+0x29): undefined reference to `TlsSetValue@8'
> build\temp.win32-2.6\Release\jcc\sources\jccenv.o:JCCEnv.cpp:(.text+0x24d7): undefined reference to `InitializeCriticalSection@4'
> build\temp.win32-2.6\Release\jcc\sources\jccenv.o:JCCEnv.cpp:(.text+0x254c): undefined reference to `EnterCriticalSection@4'
> build\temp.win32-2.6\Release\jcc\sources\jccenv.o:JCCEnv.cpp:(.text+0x263a): undefined reference to `LeaveCriticalSection@4'
> build\temp.win32-2.6\Release\jcc\sources\jccenv.o:JCCEnv.cpp:(.text+0x2677): undefined reference to `LeaveCriticalSection@4'
> build\temp.win32-2.6\Release\jcc\sources\jccenv.o:JCCEnv.cpp:(.text+0x26f2): undefined reference to `EnterCriticalSection@4'
> build\temp.win32-2.6\Release\jcc\sources\jccenv.o:JCCEnv.cpp:(.text+0x27e5): undefined reference to `LeaveCriticalSection@4'
> build\temp.win32-2.6\Release\jcc\sources\jccenv.o:JCCEnv.cpp:(.text$_ZN4lockD1Ev[lock::~lock()]+0x18): undefined reference to `LeaveCriticalSection@4'
> build\temp.win32-2.6\Release\jcc\sources\jccenv.o:JCCEnv.cpp:(.text$_ZN4lockD0Ev[lock::~lock()]+0x19): undefined reference

Re: [VOTE] Release PyLucene 3.3 (rc3)

2011-07-21 Thread Michael McCandless

+1 to release!

Smoke test passed and I see grouping module classes are visible by
default!  Thanks Andi :)

Mike McCandless

http://blog.mikemccandless.com

On Thu, Jul 21, 2011 at 12:47 PM, Andi Vajda  wrote:
>
> A problem was found with rc2. Please vote on rc3, thanks :-)
>
> The Apache PyLucene 3.3-3 release, closely tracking the recent release of
> Apache Lucene Java 3.3, is ready.
>
> A release candidate is available from:
> http://people.apache.org/~vajda/staging_area/
>
> This new release candidate fixes an issue with wrapping the new grouping
> contrib module which is now part of the PyLucene build.
>
> A list of changes in this release can be seen at:
> http://svn.apache.org/repos/asf/lucene/pylucene/branches/pylucene_3_3/CHANGES
>
> PyLucene 3.3 is built with JCC 2.10 included in these release artifacts.
>
> A list of Lucene Java changes can be seen at:
> http://svn.apache.org/repos/asf/lucene/dev/tags/lucene_solr_3_3/lucene/CHANGES.txt
>
> Please vote to release these artifacts as PyLucene 3.3-3.
>
> Thanks !
>
> Andi..
>
> ps: the KEYS file for PyLucene release signing is at:
>    http://svn.apache.org/repos/asf/lucene/pylucene/dist/KEYS
>    http://people.apache.org/~vajda/staging_area/KEYS
>
> pps: here is my +1
>

[VOTE] Release PyLucene 3.3 (rc3)

2011-07-21 Thread Andi Vajda



A problem was found with rc2. Please vote on rc3, thanks :-)

The Apache PyLucene 3.3-3 release, closely tracking the recent release of 
Apache Lucene Java 3.3, is ready.


A release candidate is available from:
http://people.apache.org/~vajda/staging_area/

This new release candidate fixes an issue with wrapping the new grouping 
contrib module which is now part of the PyLucene build.


A list of changes in this release can be seen at:
http://svn.apache.org/repos/asf/lucene/pylucene/branches/pylucene_3_3/CHANGES

PyLucene 3.3 is built with JCC 2.10 included in these release artifacts.

A list of Lucene Java changes can be seen at:
http://svn.apache.org/repos/asf/lucene/dev/tags/lucene_solr_3_3/lucene/CHANGES.txt

Please vote to release these artifacts as PyLucene 3.3-3.

Thanks !

Andi..

ps: the KEYS file for PyLucene release signing is at:
http://svn.apache.org/repos/asf/lucene/pylucene/dist/KEYS
http://people.apache.org/~vajda/staging_area/KEYS

pps: here is my +1

RE: building PyLucene 3.0.2 on Win7/MinGW with Python 2.7

2011-07-21 Thread Thomas Koch

Bill,
I just read through your posting about the MinGW issues in "DLL Hell". Did
you ever manage to get MinGW to compile JCC and link against msvcr90.dll?

I think I'm facing a similar issue (see my post from today). I tried to change
the MinGW specs file to use msvcr90 (as described in
http://www.mingw.org/wiki/HOWTO_Use_the_GCC_specs_file), but then "python
setup.py build --compiler=mingw32" runs into linker issues (I guess that's
because I still need to "hack" MinGW itself...) - see attached output.
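
For reference, here is a minimal sketch of how one might check which C runtime a given Python build expects, assuming the compiler tag in sys.version (e.g. "MSC v.1500") is a reliable indicator. The table mirrors, to my knowledge, the mapping distutils uses for its mingw32 compiler in Python 2.6/2.7; the name msvcr_for is made up for illustration.

```python
import re
import sys

# MSC compiler version -> C runtime library name.
# Assumption: same mapping as distutils' mingw32 support in Python 2.6/2.7.
MSC_TO_MSVCR = {
    "1300": "msvcr70",  # MSVC 7.0
    "1310": "msvcr71",  # MSVC 7.1 (VS 2003)
    "1400": "msvcr80",  # VS 2005
    "1500": "msvcr90",  # VS 2008
}


def msvcr_for(version_string):
    """Return the msvcr library name implied by a sys.version string.

    Returns None when the interpreter was not built with MSVC or the
    compiler version is not in the table.
    """
    match = re.search(r"MSC v\.(\d{4})", version_string)
    if match is None:
        return None
    return MSC_TO_MSVCR.get(match.group(1))


if __name__ == "__main__":
    # Report the runtime for the interpreter running this script.
    print(msvcr_for(sys.version))
```

For the Python 2.6.6 build used here ("MSC v.1500") this gives msvcr90, which matches the -lmsvcr90 in the attached link line. The undefined symbols in the attached output (GetModuleHandleA, TlsAlloc, EnterCriticalSection, ...) are all kernel32 exports, so one thing worth checking, as a guess only, is whether the modified specs file still links the default Windows libraries.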

> I'll try that -- getting mingw to use the same C library that Python
> uses.  Looks like you can do an in-place update -- the pyMinGW toolkit
> provides a tool which does that.
>
(How) did you manage that?

Regards
Thomas
--
Here's the output:

I:\Software\Python26\PyLucene\src\pylucene-3.3-2\jcc>python setup.py build
--compiler=mingw32
...
running build_py
writing I:\Software\Python26\PyLucene\src\pylucene-3.3-2\jcc\jcc\config.py
copying jcc\config.py -> build\lib.win32-2.6\jcc
copying jcc\jcc.lib -> build\lib.win32-2.6\jcc
copying jcc\classes\org\apache\jcc\PythonVM.class ->
build\lib.win32-2.6\jcc\classes\org\apache\jcc
copying jcc\classes\org\apache\jcc\PythonException.class ->
build\lib.win32-2.6\jcc\classes\org\apache\jcc
running build_ext
building 'jcc' extension
C:\MinGW\bin\gcc.exe -mno-cygwin -mdll -O -Wall -D_jcc_lib -DJCC_VER="2.10"
-IC:\Devel\Java\jdk1.6.0_14/include
-IC:\Devel\Java\jdk1.6.0_14/include/win32 -I_jcc
-Ijcc/sources -IC:\Devel\Python26\include -IC:\Devel\Python26\PC -c
jcc/sources/jcc.cpp -o build\temp.win32-2.6\Release\jcc\sources\jcc.o
-DPYTHON -specs=msvcr90 -fno-strict-aliasing -Wno-write-strings
C:\MinGW\bin\gcc.exe -mno-cygwin -mdll -O -Wall -D_jcc_lib -DJCC_VER="2.10"
-IC:\Devel\Java\jdk1.6.0_14/include
-IC:\Devel\Java\jdk1.6.0_14/include/win32 -I_jcc
-Ijcc/sources -IC:\Devel\Python26\include -IC:\Devel\Python26\PC -c
jcc/sources/JCCEnv.cpp -o build\temp.win32-2.6\Release\jcc\sources\jccenv.o
-DPYTHON -specs=msvcr90 -fno-strict-aliasing -Wno-write-strings
writing build\temp.win32-2.6\Release\jcc\sources\jcc.def
C:\MinGW\bin\g++.exe -mno-cygwin -shared
-Wl,--out-implib,build\lib.win32-2.6\jcc\jcc.lib -s
build\temp.win32-2.6\Release\jcc\sources\jcc.o
build\temp.win32-2.6\Release\jcc\sources\jccenv.o
build\temp.win32-2.6\Release\jcc\sources\jcc.def -LC:\Devel\Python26\libs
-LC:\Devel\Python26\PCbuild -lpython26 -lmsvcr90
-o build\lib.win32-2.6\jcc.dll -LC:\Devel\Java\jdk1.6.0_14/lib -ljvm -Wl,-S
-Wl,--out-implib,jcc\jcc.lib
Creating library file: jcc\jcc.lib
c:/mingw/bin/../lib/gcc/mingw32/4.5.2/crtbegin.o:cygming-crtbegin.c:(.text+0xe): undefined reference to `GetModuleHandleA@4'
c:/mingw/bin/../lib/gcc/mingw32/4.5.2/crtbegin.o:cygming-crtbegin.c:(.text+0x23): undefined reference to `GetProcAddress@8'
c:/mingw/bin/../lib/gcc/mingw32/4.5.2/crtbegin.o:cygming-crtbegin.c:(.text+0x51): undefined reference to `GetModuleHandleA@4'
c:/mingw/bin/../lib/gcc/mingw32/4.5.2/crtbegin.o:cygming-crtbegin.c:(.text+0x66): undefined reference to `GetProcAddress@8'
c:/mingw/bin/../lib/gcc/mingw32/4.5.2/crtbegin.o:cygming-crtbegin.c:(.text+0x9a): undefined reference to `GetModuleHandleA@4'
c:/mingw/bin/../lib/gcc/mingw32/4.5.2/crtbegin.o:cygming-crtbegin.c:(.text+0xaf): undefined reference to `GetProcAddress@8'
build\temp.win32-2.6\Release\jcc\sources\jcc.o:jcc.cpp:(.text$_ZNK6JCCEnv10get_vm_envEv[JCCEnv::get_vm_env() const]+0xf): undefined reference to `TlsGetValue@4'
build\temp.win32-2.6\Release\jcc\sources\jccenv.o:JCCEnv.cpp:(.text+0x10): undefined reference to `TlsAlloc@0'
build\temp.win32-2.6\Release\jcc\sources\jccenv.o:JCCEnv.cpp:(.text+0x29): undefined reference to `TlsSetValue@8'
build\temp.win32-2.6\Release\jcc\sources\jccenv.o:JCCEnv.cpp:(.text+0x24d7): undefined reference to `InitializeCriticalSection@4'
build\temp.win32-2.6\Release\jcc\sources\jccenv.o:JCCEnv.cpp:(.text+0x254c): undefined reference to `EnterCriticalSection@4'
build\temp.win32-2.6\Release\jcc\sources\jccenv.o:JCCEnv.cpp:(.text+0x263a): undefined reference to `LeaveCriticalSection@4'
build\temp.win32-2.6\Release\jcc\sources\jccenv.o:JCCEnv.cpp:(.text+0x2677): undefined reference to `LeaveCriticalSection@4'
build\temp.win32-2.6\Release\jcc\sources\jccenv.o:JCCEnv.cpp:(.text+0x26f2): undefined reference to `EnterCriticalSection@4'
build\temp.win32-2.6\Release\jcc\sources\jccenv.o:JCCEnv.cpp:(.text+0x27e5): undefined reference to `LeaveCriticalSection@4'
build\temp.win32-2.6\Release\jcc\sources\jccenv.o:JCCEnv.cpp:(.text$_ZN4lockD1Ev[lock::~lock()]+0x18): undefined reference to `LeaveCriticalSection@4'
build\temp.win32-2.6\Release\jcc\sources\jccenv.o:JCCEnv.cpp:(.text$_ZN4lockD0Ev[lock::~lock()]+0x19): undefined reference to `LeaveCriticalSection@4'
c:/mingw/bin/../lib/gcc/mingw32/4.5.2/../../../libmingw32.a(tlssup.o):tlssup.c:(.text+0xa8): undefined reference to `LoadLibraryA@4'
c:/mingw/bin/../lib/gcc/mingw32/4.5.2/../../../libmingw32.a(tlssup.o):tlssup.c:(.text+0xc8): undefined reference to `GetProcAddress@8'
c:/ming

[jira] [Commented] (LUCENE-3328) Specialize BooleanQuery if all clauses are TermQueries

2011-07-21 Thread Michael Busch (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-3328?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13069061#comment-13069061
 ] 

Michael Busch commented on LUCENE-3328:
---

{quote}
The ConjunctionTermScorer sorts the DocsEnums by their frequency in the ctor. 
The leader will always be the least frequent term in the set. Is this what you 
mean here?
{quote}

Cool, yeah that's roughly what I meant. In general, it's best to always pick 
the lowest-df enum as leader:
1) after initialization
2) after a hit was found
3) whenever a doc matched m out of n enums, 1 < m < n

I think what you described covers situation 1), does it also cover 2) and 3)?
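
To illustrate the leader idea, here is a minimal Python sketch of a conjunction over sorted doc-id lists, assuming list length stands in for document frequency. It only covers situation 1) (the leader is chosen once, up front); re-picking the leader after a hit or a partial match is not shown. The function name and the list-based postings are illustrative only and are not Lucene's DocsEnum API.

```python
from bisect import bisect_left


def conjunction(postings):
    """Intersect sorted doc-id lists, driving from the rarest list.

    postings: list of sorted lists of doc ids, one per term.
    Returns the doc ids present in every list.
    """
    if not postings:
        return []
    # Lowest-df list first: it produces the fewest candidates to verify.
    ordered = sorted(postings, key=len)
    leader, others = ordered[0], ordered[1:]
    # Per-list cursor so each list is only ever scanned forward.
    positions = [0] * len(others)
    hits = []
    for doc in leader:
        matched = True
        for i, plist in enumerate(others):
            # "advance" list i to the first entry >= doc
            pos = bisect_left(plist, doc, positions[i])
            positions[i] = pos
            if pos == len(plist) or plist[pos] != doc:
                matched = False
                break
        if matched:
            hits.append(doc)
    return hits
```

With a static leader the number of candidates checked is bounded by the smallest df; situations 2) and 3) would additionally need an estimate of how many docs each enum has left.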

> Specialize BooleanQuery if all clauses are TermQueries
> --
>
> Key: LUCENE-3328
> URL: https://issues.apache.org/jira/browse/LUCENE-3328
> Project: Lucene - Java
>  Issue Type: Improvement
>  Components: core/search
>Affects Versions: 3.4, 4.0
>Reporter: Simon Willnauer
> Fix For: 4.0
>
> Attachments: LUCENE-3328.patch, LUCENE-3328.patch, LUCENE-3328.patch
>
>
> During work on LUCENE-3319 I ran into issues with BooleanQuery compared to 
> PhraseQuery in the exact case. If I disable scoring on PhraseQuery and bypass 
> the position matching, essentially doing a conjunction match, 
> ExactPhraseScorer beats plain boolean scorer by 40% which is a sizeable gain. 
> I converted a ConjunctionScorer to use DocsEnum directly but still didn't get 
> all the 40% from PhraseQuery. Yet, it turned out with further optimizations 
> this gets very close to PhraseQuery. The biggest gain here came from 
> converting the hand crafted loop in ConjunctionScorer#doNext to a for loop 
> which seems to be less confusing to hotspot. In this particular case I think 
> code specialization makes lots of sense since BQ with TQ is by far one of the 
> most common queries.
> I will upload a patch shortly

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

Re: [VOTE] Release PyLucene 3.3 (rc2)

2011-07-21 Thread Andi Vajda



On Thu, 21 Jul 2011, Michael McCandless wrote:


+1 to release.

I ran my usual smoke test (index first 100K wikipedia docs and run a
few searches).

The grouping module isn't actually enabled right?  I see this:

   JARS+=$(GROUPING_JAR)   # grouping module

But then GROUPING_JAR isn't defined anywhere, I think?  I went and
defined it (just copied how SPATIAL_JAR was set up), and was then able
to compile it just fine this time around.


Ouch, there goes that release candidate. My bad, I had a merging snafu.
Still, I tested the grouping build on the 3.x branch.
Producing rc3 in a bit...

My apologies and thanks for finding this !

Andi..



Mike McCandless

http://blog.mikemccandless.com

On Thu, Jul 21, 2011 at 10:50 AM, Thomas Koch  wrote:

Hi,
I've just tried to build PyLucene 3.3 on win32 and failed.
This may be unrelated to V3.3, though, as I tried the build process with MinGW
for the first time (I used MSVC before)! Just wondering if I'm doing something
wrong, or if anyone has had this issue and can help:

Using
- Python 2.6.6 (r266:84297, Aug 24 2010, 18:46:32) [MSC v.1500 32 bit
(Intel)] on win32
- jdk1.6.0_14
- mingw32-gcc (gcc (GCC) 4.5.2)
- setuptools.__version__'0.6c11'
- msys-make (GNU Make 3.81)
- ant 1.7.1

I'm able to build (and install) JCC:
./jcc-2.10-py2.6-win32.egg

jcc.initVM()



Then "make" runs fine until the final "python -m jcc.__main__ xyz.jar ...
--build"
Here python.exe crashes (windows popup asking to send details to
somewhere...)
 make: *** [compile] Error 5

The only warnings I get are

which: icupkg: unknown command
ICU not installed
(is there a minGW package for that?)

This is Windows-Vista 32-bit. Attached is my Makefile chunk.

Any ideas? Has anyone yet tried to build PyLucene within
virtualenv(ironment) and is this suggested?

I fear the answer is "you have to build Python with MinGW" ,-(

I may get back to MSVC but would also like to get this running with MinGW...

Regards
Thomas
--
# Makefile [...]
# Windows   (Win32, msys/MinGW, Python 2.6.6, Java 1.6, ant 1.7.1)
PREFIX_PYTHON=C:\\Devel\\Python26
ANT=C:\\Devel\\Eclipse\\eclipse36\\plugins\\org.apache.ant_1.7.1.v20100518-1145\\bin\\ant
PYTHON=$(PREFIX_PYTHON)\\python.exe
JCC=$(PYTHON) -m jcc.__main__ --shared --compiler mingw32
NUM_FILES=3


Imports work to some extent ... here's the make output with -v -d args for
python:
...
import jcc # directory
c:\Devel\Python26\lib\site-packages\jcc-2.10-py2.6-win32.egg\jcc
#
c:\Devel\Python26\lib\site-packages\jcc-2.10-py2.6-win32.egg\jcc\__init__.py
c
matches
c:\Devel\Python26\lib\site-packages\jcc-2.10-py2.6-win32.egg\jcc\__init__.py
import jcc # precompiled from
c:\Devel\Python26\lib\site-packages\jcc-2.10-py2.6-win32.egg\jcc\__init__.py
c
#
c:\Devel\Python26\lib\site-packages\jcc-2.10-py2.6-win32.egg\jcc\config.pyc
matches
c:\Devel\Python26\lib\site-packages\jcc-2.10-py2.6-win32.egg\jcc\config.py
import jcc.config # precompiled from
c:\Devel\Python26\lib\site-packages\jcc-2.10-py2.6-win32.egg\jcc\config.pyc
import jcc._jcc # dynamically loaded from
c:\Devel\Python26\lib\site-packages\jcc-2.10-py2.6-win32.egg\jcc\_jcc.pyd
# c:\Devel\Python26\lib\site-packages\jcc-2.10-py2.6-win32.egg\jcc\cpp.pyc
matches
c:\Devel\Python26\lib\site-packages\jcc-2.10-py2.6-win32.egg\jcc\cpp.py
import jcc.cpp # precompiled from
c:\Devel\Python26\lib\site-packages\jcc-2.10-py2.6-win32.egg\jcc\cpp.pyc
# c:\Devel\Python26\lib\zipfile.pyc matches c:\Devel\Python26\lib\zipfile.py
import zipfile # precompiled from c:\Devel\Python26\lib\zipfile.pyc
# c:\Devel\Python26\lib\struct.pyc matches c:\Devel\Python26\lib\struct.py
import struct # precompiled from c:\Devel\Python26\lib\struct.pyc
import _struct # builtin
import time # builtin
# c:\Devel\Python26\lib\shutil.pyc matches c:\Devel\Python26\lib\shutil.py
import shutil # precompiled from c:\Devel\Python26\lib\shutil.pyc
# c:\Devel\Python26\lib\fnmatch.pyc matches c:\Devel\Python26\lib\fnmatch.py
import fnmatch # precompiled from c:\Devel\Python26\lib\fnmatch.pyc
import binascii # builtin
import cStringIO # builtin
import zlib # builtin
#
c:\Devel\Python26\lib\site-packages\jcc-2.10-py2.6-win32.egg\jcc\python.pyc
matches
c:\Devel\Python26\lib\site-packages\jcc-2.10-py2.6-win32.egg\jcc\python.py
import jcc.python # precompiled from
c:\Devel\Python26\lib\site-packages\jcc-2.10-py2.6-win32.egg\jcc\python.pyc
# c:\Devel\Python26\lib\platform.pyc matches
c:\Devel\Python26\lib\platform.py
import platform # precompiled from c:\Devel\Python26\lib\platform.pyc
# c:\Devel\Python26\lib\string.pyc matches c:\Devel\Python26\lib\string.py
import string # precompiled from c:\Devel\Python26\lib\string.pyc
import strop # builtin
import itertools # builtin
make: *** [compile] Error 5


-Original Message-
From: Andi Vajda [mailto:va...@apache.org]
Sent: Thursday, July 21, 2011 12:47 PM
To: pylucene-...@lucene.apache.org
Cc: gene...@lucene.apache.org
Subject: [VOTE] Release PyLucene 3.3 (rc2)


The Apache PyLucene 3.3-2 release closely tracking the recent release of
Apache L

RE: [VOTE] Release PyLucene 3.3 (rc2)

2011-07-21 Thread Andi Vajda



 Hi Thomas,

I don't use MinGW, so I can't tell you much about it, but others on this list 
do use it and may have more to say...


About the missing 'icupkg' utility: this can be ignored. You just won't get 
the PyLucene/PyICU integration, which is not essential.


Andi..

On Thu, 21 Jul 2011, Thomas Koch wrote:


I've just tried to build PyLucene 3.3 on win32 and failed.
This may be unrelated to V3.3, though, as I tried the build process with MinGW
for the first time (I used MSVC before)! Just wondering if I'm doing something
wrong, or if anyone has had this issue and can help:

Using
- Python 2.6.6 (r266:84297, Aug 24 2010, 18:46:32) [MSC v.1500 32 bit
(Intel)] on win32
- jdk1.6.0_14
- mingw32-gcc (gcc (GCC) 4.5.2)
- setuptools.__version__'0.6c11'
- msys-make (GNU Make 3.81)
- ant 1.7.1

I'm able to build (and install) JCC:
./jcc-2.10-py2.6-win32.egg

jcc.initVM()



Then "make" runs fine until the final "python -m jcc.__main__ xyz.jar ...
--build"
Here python.exe crashes (windows popup asking to send details to
somewhere...)
make: *** [compile] Error 5

The only warnings I get are

which: icupkg: unknown command
ICU not installed
(is there a minGW package for that?)

This is Windows-Vista 32-bit. Attached is my Makefile chunk.

Any ideas? Has anyone yet tried to build PyLucene within
virtualenv(ironment) and is this suggested?

I fear the answer is "you have to build Python with MinGW" ,-(

I may get back to MSVC but would also like to get this running with MinGW...

Regards
Thomas
--
# Makefile [...]
# Windows   (Win32, msys/MinGW, Python 2.6.6, Java 1.6, ant 1.7.1)
PREFIX_PYTHON=C:\\Devel\\Python26
ANT=C:\\Devel\\Eclipse\\eclipse36\\plugins\\org.apache.ant_1.7.1.v20100518-1145\\bin\\ant
PYTHON=$(PREFIX_PYTHON)\\python.exe
JCC=$(PYTHON) -m jcc.__main__ --shared --compiler mingw32
NUM_FILES=3


Imports work to some extent ... here's the make output with -v -d args for
python:
...
import jcc # directory
c:\Devel\Python26\lib\site-packages\jcc-2.10-py2.6-win32.egg\jcc
#
c:\Devel\Python26\lib\site-packages\jcc-2.10-py2.6-win32.egg\jcc\__init__.py
c
matches
c:\Devel\Python26\lib\site-packages\jcc-2.10-py2.6-win32.egg\jcc\__init__.py
import jcc # precompiled from
c:\Devel\Python26\lib\site-packages\jcc-2.10-py2.6-win32.egg\jcc\__init__.py
c
#
c:\Devel\Python26\lib\site-packages\jcc-2.10-py2.6-win32.egg\jcc\config.pyc
matches
c:\Devel\Python26\lib\site-packages\jcc-2.10-py2.6-win32.egg\jcc\config.py
import jcc.config # precompiled from
c:\Devel\Python26\lib\site-packages\jcc-2.10-py2.6-win32.egg\jcc\config.pyc
import jcc._jcc # dynamically loaded from
c:\Devel\Python26\lib\site-packages\jcc-2.10-py2.6-win32.egg\jcc\_jcc.pyd
# c:\Devel\Python26\lib\site-packages\jcc-2.10-py2.6-win32.egg\jcc\cpp.pyc
matches
c:\Devel\Python26\lib\site-packages\jcc-2.10-py2.6-win32.egg\jcc\cpp.py
import jcc.cpp # precompiled from
c:\Devel\Python26\lib\site-packages\jcc-2.10-py2.6-win32.egg\jcc\cpp.pyc
# c:\Devel\Python26\lib\zipfile.pyc matches c:\Devel\Python26\lib\zipfile.py
import zipfile # precompiled from c:\Devel\Python26\lib\zipfile.pyc
# c:\Devel\Python26\lib\struct.pyc matches c:\Devel\Python26\lib\struct.py
import struct # precompiled from c:\Devel\Python26\lib\struct.pyc
import _struct # builtin
import time # builtin
# c:\Devel\Python26\lib\shutil.pyc matches c:\Devel\Python26\lib\shutil.py
import shutil # precompiled from c:\Devel\Python26\lib\shutil.pyc
# c:\Devel\Python26\lib\fnmatch.pyc matches c:\Devel\Python26\lib\fnmatch.py
import fnmatch # precompiled from c:\Devel\Python26\lib\fnmatch.pyc
import binascii # builtin
import cStringIO # builtin
import zlib # builtin
#
c:\Devel\Python26\lib\site-packages\jcc-2.10-py2.6-win32.egg\jcc\python.pyc
matches
c:\Devel\Python26\lib\site-packages\jcc-2.10-py2.6-win32.egg\jcc\python.py
import jcc.python # precompiled from
c:\Devel\Python26\lib\site-packages\jcc-2.10-py2.6-win32.egg\jcc\python.pyc
# c:\Devel\Python26\lib\platform.pyc matches
c:\Devel\Python26\lib\platform.py
import platform # precompiled from c:\Devel\Python26\lib\platform.pyc
# c:\Devel\Python26\lib\string.pyc matches c:\Devel\Python26\lib\string.py
import string # precompiled from c:\Devel\Python26\lib\string.pyc
import strop # builtin
import itertools # builtin
make: *** [compile] Error 5


-Original Message-
From: Andi Vajda [mailto:va...@apache.org]
Sent: Thursday, July 21, 2011 12:47 PM
To: pylucene-...@lucene.apache.org
Cc: gene...@lucene.apache.org
Subject: [VOTE] Release PyLucene 3.3 (rc2)


The Apache PyLucene 3.3-2 release closely tracking the recent release of
Apache Lucene Java 3.3 is ready.

A release candidate is available from:
http://people.apache.org/~vajda/staging_area/

This new release candidate fixes an issue with wrapping the new grouping
contrib module which is now part of the PyLucene build.

A list of changes in this release can be seen at:


http://svn.apache.org/repos/asf/lucene/pylucene/branches/pylucene_3_3/CHANGES


PyLucene 3.3 is built with JCC 2.10 included in these release artifacts.

Re: [VOTE] Release PyLucene 3.3 (rc2)

2011-07-21 Thread Michael McCandless

+1 to release.

I ran my usual smoke test (index first 100K wikipedia docs and run a
few searches).

The grouping module isn't actually enabled right?  I see this:

JARS+=$(GROUPING_JAR)   # grouping module

But then GROUPING_JAR isn't defined anywhere, I think?  I went and
defined it (just copied how SPATIAL_JAR was set up), and was then able
to compile it just fine this time around.
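
A rough way to catch this kind of thing is to scan the Makefile for variables that are referenced but never assigned. Below is a minimal Python sketch, assuming simple NAME=value / NAME+=value assignments and $(NAME) references; it ignores includes, environment variables and make built-ins, so treat hits as hints only. The helper name is hypothetical.

```python
import re

# NAME =, NAME +=, NAME := or NAME ?= at the start of a line
ASSIGNMENT = re.compile(r"\s*([A-Za-z_][A-Za-z0-9_]*)\s*(?:\+|:|\?)?=")
# $(NAME) anywhere in a line
REFERENCE = re.compile(r"\$\(([A-Za-z_][A-Za-z0-9_]*)\)")


def undefined_make_vars(makefile_text):
    """Return sorted names used as $(NAME) but never assigned in the text."""
    defined = set()
    referenced = set()
    for line in makefile_text.splitlines():
        line = line.split("#", 1)[0]  # drop trailing comments
        match = ASSIGNMENT.match(line)
        if match:
            defined.add(match.group(1))
        referenced.update(REFERENCE.findall(line))
    return sorted(referenced - defined)
```

Run against a Makefile that contains the JARS+=$(GROUPING_JAR) line without a matching GROUPING_JAR= assignment, this should list GROUPING_JAR.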

Mike McCandless

http://blog.mikemccandless.com

On Thu, Jul 21, 2011 at 10:50 AM, Thomas Koch  wrote:
> Hi,
> I've just tried to build PyLucene 3.3 on win32 and failed.
> This may be unrelated to V3.3, though, as I tried the build process with MinGW
> for the first time (I used MSVC before)! Just wondering if I'm doing something
> wrong, or if anyone has had this issue and can help:
>
> Using
> - Python 2.6.6 (r266:84297, Aug 24 2010, 18:46:32) [MSC v.1500 32 bit
> (Intel)] on win32
> - jdk1.6.0_14
> - mingw32-gcc (gcc (GCC) 4.5.2)
> - setuptools.__version__'0.6c11'
> - msys-make (GNU Make 3.81)
> - ant 1.7.1
>
> I'm able to build (and install) JCC:
> ./jcc-2.10-py2.6-win32.egg
 jcc.initVM()
> 
>
> Then "make" runs fine until the final "python -m jcc.__main__ xyz.jar ...
> --build"
> Here python.exe crashes (windows popup asking to send details to
> somewhere...)
>  make: *** [compile] Error 5
>
> The only warnings I get are
>
> which: icupkg: unknown command
> ICU not installed
> (is there a minGW package for that?)
>
> This is Windows-Vista 32-bit. Attached is my Makefile chunk.
>
> Any ideas? Has anyone yet tried to build PyLucene within
> virtualenv(ironment) and is this suggested?
>
> I fear the answer is "you have to build Python with MinGW" ,-(
>
> I may get back to MSVC but would also like to get this running with MinGW...
>
> Regards
> Thomas
> --
> # Makefile [...]
> # Windows   (Win32, msys/MinGW, Python 2.6.6, Java 1.6, ant 1.7.1)
> PREFIX_PYTHON=C:\\Devel\\Python26
> ANT=C:\\Devel\\Eclipse\\eclipse36\\plugins\\org.apache.ant_1.7.1.v20100518-1145\\bin\\ant
> PYTHON=$(PREFIX_PYTHON)\\python.exe
> JCC=$(PYTHON) -m jcc.__main__ --shared --compiler mingw32
> NUM_FILES=3
>
>
> Imports work to some extent ... here's the make output with -v -d args for
> python:
> ...
> import jcc # directory
> c:\Devel\Python26\lib\site-packages\jcc-2.10-py2.6-win32.egg\jcc
> #
> c:\Devel\Python26\lib\site-packages\jcc-2.10-py2.6-win32.egg\jcc\__init__.py
> c
> matches
> c:\Devel\Python26\lib\site-packages\jcc-2.10-py2.6-win32.egg\jcc\__init__.py
> import jcc # precompiled from
> c:\Devel\Python26\lib\site-packages\jcc-2.10-py2.6-win32.egg\jcc\__init__.py
> c
> #
> c:\Devel\Python26\lib\site-packages\jcc-2.10-py2.6-win32.egg\jcc\config.pyc
> matches
> c:\Devel\Python26\lib\site-packages\jcc-2.10-py2.6-win32.egg\jcc\config.py
> import jcc.config # precompiled from
> c:\Devel\Python26\lib\site-packages\jcc-2.10-py2.6-win32.egg\jcc\config.pyc
> import jcc._jcc # dynamically loaded from
> c:\Devel\Python26\lib\site-packages\jcc-2.10-py2.6-win32.egg\jcc\_jcc.pyd
> # c:\Devel\Python26\lib\site-packages\jcc-2.10-py2.6-win32.egg\jcc\cpp.pyc
> matches
> c:\Devel\Python26\lib\site-packages\jcc-2.10-py2.6-win32.egg\jcc\cpp.py
> import jcc.cpp # precompiled from
> c:\Devel\Python26\lib\site-packages\jcc-2.10-py2.6-win32.egg\jcc\cpp.pyc
> # c:\Devel\Python26\lib\zipfile.pyc matches c:\Devel\Python26\lib\zipfile.py
> import zipfile # precompiled from c:\Devel\Python26\lib\zipfile.pyc
> # c:\Devel\Python26\lib\struct.pyc matches c:\Devel\Python26\lib\struct.py
> import struct # precompiled from c:\Devel\Python26\lib\struct.pyc
> import _struct # builtin
> import time # builtin
> # c:\Devel\Python26\lib\shutil.pyc matches c:\Devel\Python26\lib\shutil.py
> import shutil # precompiled from c:\Devel\Python26\lib\shutil.pyc
> # c:\Devel\Python26\lib\fnmatch.pyc matches c:\Devel\Python26\lib\fnmatch.py
> import fnmatch # precompiled from c:\Devel\Python26\lib\fnmatch.pyc
> import binascii # builtin
> import cStringIO # builtin
> import zlib # builtin
> #
> c:\Devel\Python26\lib\site-packages\jcc-2.10-py2.6-win32.egg\jcc\python.pyc
> matches
> c:\Devel\Python26\lib\site-packages\jcc-2.10-py2.6-win32.egg\jcc\python.py
> import jcc.python # precompiled from
> c:\Devel\Python26\lib\site-packages\jcc-2.10-py2.6-win32.egg\jcc\python.pyc
> # c:\Devel\Python26\lib\platform.pyc matches
> c:\Devel\Python26\lib\platform.py
> import platform # precompiled from c:\Devel\Python26\lib\platform.pyc
> # c:\Devel\Python26\lib\string.pyc matches c:\Devel\Python26\lib\string.py
> import string # precompiled from c:\Devel\Python26\lib\string.pyc
> import strop # builtin
> import itertools # builtin
> make: *** [compile] Error 5
>
>> -Original Message-
>> From: Andi Vajda [mailto:va...@apache.org]
>> Sent: Thursday, July 21, 2011 12:47 PM
>> To: pylucene-...@lucene.apache.org
>> Cc: gene...@lucene.apache.org
>> Subject: [VOTE] Release PyLucene 3.3 (rc2)
>>
>>
>> The Apache PyLucene 3.3-2 release closely tracking the recent release of
>> Apache Lucene Java 3.3 is ready.
>>

[jira] [Updated] (SOLR-2667) Finish Solr Admin UI

2011-07-21 Thread Stefan Matheis (steffkes) (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-2667?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Matheis (steffkes) updated SOLR-2667:


Attachment: SOLR-2667-110721-analysis-exception.patch

bq. Found a minor issue: From the analysis page, pick a numeric field and put 
text into it. This will return a 500 with: java.lang.NumberFormatException: For 
input string: "asdgasg"

Oh, yes :/ The attached patch (based on {{1149224}}) will fix that.

It's still not bulletproof: it uses a regex to extract the error message from 
the response -- a typical case for SOLR-141!

> Finish Solr Admin UI
> 
>
> Key: SOLR-2667
> URL: https://issues.apache.org/jira/browse/SOLR-2667
> Project: Solr
>  Issue Type: Improvement
>Reporter: Ryan McKinley
>Assignee: Ryan McKinley
> Fix For: 4.0
>
> Attachments: SOLR-2667-110721-analysis-exception.patch
>
>
> In SOLR-2399, we added a new admin UI. The issue has gotten too long to 
> follow, so this is a new issue to track remaining tasks.


[jira] [Commented] (SOLR-2667) Finish Solr Admin UI

2011-07-21 Thread Ryan McKinley (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-2667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13069015#comment-13069015
 ] 

Ryan McKinley commented on SOLR-2667:
-

On the [query|http://localhost:8983/solr/#/singlecore/query] page...   I like 
that it keeps the query options next to the results, and that it shows the raw 
URL -- it would also be nice if the URL it displays was a direct link to that 
query.

What about including wt as a drop-down (xml/json/python/ruby/php/csv)? Maybe 
also a checkbox for &indent=true/false.
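
As a sketch of what the drop-down and checkbox would map to, here is how such a request URL could be assembled in Python 3. wt and indent are the Solr request parameters mentioned above; the base URL passed in and the helper name are assumptions for illustration.

```python
from urllib.parse import urlencode

# Response writers listed in the comment above.
WT_CHOICES = ("xml", "json", "python", "ruby", "php", "csv")


def solr_query_url(base_url, q, wt="xml", indent=False):
    """Build a query URL with the chosen response writer and indent flag."""
    if wt not in WT_CHOICES:
        raise ValueError("unsupported wt: %r" % (wt,))
    params = [("q", q), ("wt", wt)]
    if indent:
        # Only send indent when the checkbox is ticked.
        params.append(("indent", "true"))
    return base_url + "?" + urlencode(params)
```

The returned string could double as the direct link shown next to the results.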




> Finish Solr Admin UI
> 
>
> Key: SOLR-2667
> URL: https://issues.apache.org/jira/browse/SOLR-2667
> Project: Solr
>  Issue Type: Improvement
>Reporter: Ryan McKinley
>Assignee: Ryan McKinley
> Fix For: 4.0
>
>
> In SOLR-2399, we added a new admin UI. The issue has gotten too long to 
> follow, so this is a new issue to track remaining tasks.


[jira] [Commented] (SOLR-2667) Finish Solr Admin UI

2011-07-21 Thread Ryan McKinley (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-2667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13069010#comment-13069010
 ] 

Ryan McKinley commented on SOLR-2667:
-

On the 
[plugins|http://localhost:/manage/index/solr/#/v0/plugins/cache?entry=fieldValueCache]
 page, I like the default accordion behavior -- what do you think about adding 
a button at the bottom that would 'show all details' or something?  It is nice 
to be able to see all the cache values at once and just scroll through to see if 
anything looks funny, rather than having to open each one.

> Finish Solr Admin UI
> 
>
> Key: SOLR-2667
> URL: https://issues.apache.org/jira/browse/SOLR-2667
> Project: Solr
>  Issue Type: Improvement
>Reporter: Ryan McKinley
>Assignee: Ryan McKinley
> Fix For: 4.0
>
>
> In SOLR-2399, we added a new admin UI. The issue has gotten too long to 
> follow, so this is a new issue to track remaining tasks.


RE: [VOTE] Release PyLucene 3.3 (rc2)

2011-07-21 Thread Thomas Koch

Hi,
I've just tried to build PyLucene 3.3 on win32 and failed.
This may be unrelated to V3.3, though, as I tried the build process with MinGW
for the first time (I used MSVC before)! Just wondering if I'm doing something
wrong, or if anyone has had this issue and can help:

Using 
- Python 2.6.6 (r266:84297, Aug 24 2010, 18:46:32) [MSC v.1500 32 bit
(Intel)] on win32
- jdk1.6.0_14
- mingw32-gcc (gcc (GCC) 4.5.2)
- setuptools.__version__'0.6c11'
- msys-make (GNU Make 3.81)
- ant 1.7.1

I'm able to build (and install) JCC:
./jcc-2.10-py2.6-win32.egg
>>> jcc.initVM()


Then "make" runs fine until the final "python -m jcc.__main__ xyz.jar ...
--build"
Here python.exe crashes (windows popup asking to send details to
somewhere...)
 make: *** [compile] Error 5

The only warnings I get are

which: icupkg: unknown command
ICU not installed
(is there a minGW package for that?)

This is Windows-Vista 32-bit. Attached is my Makefile chunk.

Any ideas? Has anyone yet tried to build PyLucene within
virtualenv(ironment) and is this suggested?

I fear the answer is "you have to build Python with MinGW" ,-(

I may get back to MSVC but would also like to get this running with MinGW...

Regards
Thomas
--
# Makefile [...]
# Windows   (Win32, msys/MinGW, Python 2.6.6, Java 1.6, ant 1.7.1)
PREFIX_PYTHON=C:\\Devel\\Python26
ANT=C:\\Devel\\Eclipse\\eclipse36\\plugins\\org.apache.ant_1.7.1.v20100518-1145\\bin\\ant
PYTHON=$(PREFIX_PYTHON)\\python.exe
JCC=$(PYTHON) -m jcc.__main__ --shared --compiler mingw32
NUM_FILES=3


Imports work to some extent ... here's make output with -v -d args for
python:
...
import jcc # directory c:\Devel\Python26\lib\site-packages\jcc-2.10-py2.6-win32.egg\jcc
# c:\Devel\Python26\lib\site-packages\jcc-2.10-py2.6-win32.egg\jcc\__init__.pyc matches c:\Devel\Python26\lib\site-packages\jcc-2.10-py2.6-win32.egg\jcc\__init__.py
import jcc # precompiled from c:\Devel\Python26\lib\site-packages\jcc-2.10-py2.6-win32.egg\jcc\__init__.pyc
# c:\Devel\Python26\lib\site-packages\jcc-2.10-py2.6-win32.egg\jcc\config.pyc matches c:\Devel\Python26\lib\site-packages\jcc-2.10-py2.6-win32.egg\jcc\config.py
import jcc.config # precompiled from c:\Devel\Python26\lib\site-packages\jcc-2.10-py2.6-win32.egg\jcc\config.pyc
import jcc._jcc # dynamically loaded from c:\Devel\Python26\lib\site-packages\jcc-2.10-py2.6-win32.egg\jcc\_jcc.pyd
# c:\Devel\Python26\lib\site-packages\jcc-2.10-py2.6-win32.egg\jcc\cpp.pyc matches c:\Devel\Python26\lib\site-packages\jcc-2.10-py2.6-win32.egg\jcc\cpp.py
import jcc.cpp # precompiled from c:\Devel\Python26\lib\site-packages\jcc-2.10-py2.6-win32.egg\jcc\cpp.pyc
# c:\Devel\Python26\lib\zipfile.pyc matches c:\Devel\Python26\lib\zipfile.py
import zipfile # precompiled from c:\Devel\Python26\lib\zipfile.pyc
# c:\Devel\Python26\lib\struct.pyc matches c:\Devel\Python26\lib\struct.py
import struct # precompiled from c:\Devel\Python26\lib\struct.pyc
import _struct # builtin
import time # builtin
# c:\Devel\Python26\lib\shutil.pyc matches c:\Devel\Python26\lib\shutil.py
import shutil # precompiled from c:\Devel\Python26\lib\shutil.pyc
# c:\Devel\Python26\lib\fnmatch.pyc matches c:\Devel\Python26\lib\fnmatch.py
import fnmatch # precompiled from c:\Devel\Python26\lib\fnmatch.pyc
import binascii # builtin
import cStringIO # builtin
import zlib # builtin
# c:\Devel\Python26\lib\site-packages\jcc-2.10-py2.6-win32.egg\jcc\python.pyc matches c:\Devel\Python26\lib\site-packages\jcc-2.10-py2.6-win32.egg\jcc\python.py
import jcc.python # precompiled from c:\Devel\Python26\lib\site-packages\jcc-2.10-py2.6-win32.egg\jcc\python.pyc
# c:\Devel\Python26\lib\platform.pyc matches c:\Devel\Python26\lib\platform.py
import platform # precompiled from c:\Devel\Python26\lib\platform.pyc
# c:\Devel\Python26\lib\string.pyc matches c:\Devel\Python26\lib\string.py
import string # precompiled from c:\Devel\Python26\lib\string.pyc
import strop # builtin
import itertools # builtin
make: *** [compile] Error 5

> -Original Message-
> From: Andi Vajda [mailto:va...@apache.org]
> Sent: Thursday, July 21, 2011 12:47 PM
> To: pylucene-...@lucene.apache.org
> Cc: gene...@lucene.apache.org
> Subject: [VOTE] Release PyLucene 3.3 (rc2)
> 
> 
> The Apache PyLucene 3.3-2 release closely tracking the recent release of
> Apache Lucene Java 3.3 is ready.
> 
> A release candidate is available from:
> http://people.apache.org/~vajda/staging_area/
> 
> This new release candidate fixes an issue with wrapping the new grouping
> contrib module which is now part of the PyLucene build.
> 
> A list of changes in this release can be seen at:
> http://svn.apache.org/repos/asf/lucene/pylucene/branches/pylucene_3_3/CHANGES
> 
> PyLucene 3.3 is built with JCC 2.10 included in these release artifacts.
> 
> A list of Lucene Java changes can be seen at:
> http://svn.apache.org/repos/asf/lucene/dev/tags/lucene_solr_3_3/lucene/CHANGES.txt
> 
> Please vote to release these artifacts as PyLucene 3.3-2.
> 
> Thanks !
> 
> Andi..
> 
> ps: the KEYS file for PyLucene relea

[jira] [Commented] (SOLR-2667) Finish Solr Admin UI

2011-07-21 Thread Ryan McKinley (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-2667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13069002#comment-13069002
 ] 

Ryan McKinley commented on SOLR-2667:
-

Found a minor issue:  From the 
[analysis|http://localhost:8983/solr/#/singlecore/analysis] page, pick a 
numeric field and put text into it.  This will return a 500 with: 
java.lang.NumberFormatException: For input string: "asdgasg"

BUT the UI says "This Functionality requires the /analysis/field"

Looks like the error handling should catch 404 vs 500 (or maybe just non 200)
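
A minimal sketch of the distinction suggested above. The helper name `analysis_error_message` is hypothetical and not taken from the admin UI code (which is JavaScript); Python is used here only for illustration. The idea is to show the "requires" hint only on a 404 and to surface the server's own error for any other non-200 status:

```python
def analysis_error_message(status, body=""):
    """Map the HTTP status of an analysis request to a UI message.

    Hypothetical helper for illustration; the real admin UI may differ.
    """
    if status == 200:
        return None  # success, nothing to report
    if status == 404:
        # Only a missing handler should produce the "requires" hint.
        return "This Functionality requires the /analysis/field"
    # Any other non-200 status: show what the server actually reported.
    return "Request failed with HTTP %d: %s" % (status, body)
```

With this split, the NumberFormatException case above would be reported as a failed request rather than as a missing handler.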

> Finish Solr Admin UI
> 
>
> Key: SOLR-2667
> URL: https://issues.apache.org/jira/browse/SOLR-2667
> Project: Solr
>  Issue Type: Improvement
>Reporter: Ryan McKinley
>Assignee: Ryan McKinley
> Fix For: 4.0
>
>
> In SOLR-2399, we added a new admin UI. The issue has gotten too long to 
> follow, so this is a new issue to track remaining tasks.


[jira] [Resolved] (SOLR-2399) Solr Admin Interface, reworked

2011-07-21 Thread Ryan McKinley (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-2399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ryan McKinley resolved SOLR-2399.
-

Resolution: Fixed

This issue has become too large, so let's move subsequent work to SOLR-2667

> Solr Admin Interface, reworked
> --
>
> Key: SOLR-2399
> URL: https://issues.apache.org/jira/browse/SOLR-2399
> Project: Solr
>  Issue Type: Improvement
>  Components: web gui
>Reporter: Stefan Matheis (steffkes)
>Assignee: Ryan McKinley
>Priority: Minor
> Fix For: 4.0
>
> Attachments: SOLR-2399-110603-2.patch, SOLR-2399-110603.patch, 
> SOLR-2399-110606.patch, SOLR-2399-110622.patch, SOLR-2399-110702.patch, 
> SOLR-2399-110702.patch, SOLR-2399-110702.patch, SOLR-2399-110721.patch, 
> SOLR-2399-admin-interface.patch, SOLR-2399-analysis-stopwords.patch, 
> SOLR-2399-fluid-width.patch, SOLR-2399-sorting-fields.patch, 
> SOLR-2399-wip-notice.patch, SOLR-2399.patch
>
>
> *The idea was to create a new, fresh (and hopefully clean) Solr Admin 
> Interface.* [Based on this 
> [ML-Thread|http://www.lucidimagination.com/search/document/ae35e236d29d225e/solr_admin_interface_reworked_go_on_go_away]]
> *Features:*
> * [Dashboard|http://files.mathe.is/solr-admin/01_dashboard.png]
> * [Query-Form|http://files.mathe.is/solr-admin/02_query.png]
> * [Plugins|http://files.mathe.is/solr-admin/05_plugins.png]
> * [Analysis|http://files.mathe.is/solr-admin/04_analysis.png] (SOLR-2476, 
> SOLR-2400)
> * [Schema-Browser|http://files.mathe.is/solr-admin/06_schema-browser.png]
> * [Dataimport|http://files.mathe.is/solr-admin/08_dataimport.png] (SOLR-2482)
> * [Core-Admin|http://files.mathe.is/solr-admin/09_coreadmin.png]
> * [Replication|http://files.mathe.is/solr-admin/10_replication.png]
> * [Zookeeper|http://files.mathe.is/solr-admin/11_cloud.png]
> * [Logging|http://files.mathe.is/solr-admin/07_logging.png] (SOLR-2459)
> ** Stub (using static data)
> Newly created Wiki-Page: http://wiki.apache.org/solr/ReworkedSolrAdminGUI
> I've quickly created a Github-Repository (Just for me, to keep track of the 
> changes)
> » https://github.com/steffkes/solr-admin


[jira] [Created] (SOLR-2667) Finish Solr Admin UI

2011-07-21 Thread Ryan McKinley (JIRA)

Finish Solr Admin UI


 Key: SOLR-2667
 URL: https://issues.apache.org/jira/browse/SOLR-2667
 Project: Solr
  Issue Type: Improvement
Reporter: Ryan McKinley
Assignee: Ryan McKinley
 Fix For: 4.0


In SOLR-2399, we added a new admin UI. The issue has gotten too long to follow, 
so this is a new issue to track remaining tasks.


[jira] [Commented] (LUCENE-3328) Specialize BooleanQuery if all clauses are TermQueries

2011-07-21 Thread Simon Willnauer (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-3328?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13068982#comment-13068982
 ] 

Simon Willnauer commented on LUCENE-3328:
-

I think this is ready. I will commit this tomorrow if nobody objects.

> Specialize BooleanQuery if all clauses are TermQueries
> --
>
> Key: LUCENE-3328
> URL: https://issues.apache.org/jira/browse/LUCENE-3328
> Project: Lucene - Java
>  Issue Type: Improvement
>  Components: core/search
>Affects Versions: 3.4, 4.0
>Reporter: Simon Willnauer
> Fix For: 4.0
>
> Attachments: LUCENE-3328.patch, LUCENE-3328.patch, LUCENE-3328.patch
>
>
> During work on LUCENE-3319 I ran into issues with BooleanQuery compared to 
> PhraseQuery in the exact case. If I disable scoring on PhraseQuery and bypass 
> the position matching, essentially doing a conjunction match, 
> ExactPhraseScorer beats plain boolean scorer by 40% which is a sizeable gain. 
> I converted a ConjunctionScorer to use DocsEnum directly but still didn't get 
> all the 40% from PhraseQuery. Yet, it turned out with further optimizations 
> this gets very close to PhraseQuery. The biggest gain here came from 
> converting the hand crafted loop in ConjunctionScorer#doNext to a for loop 
> which seems to be less confusing to hotspot. In this particular case I think 
> code specialization makes lots of sense since BQ with TQ is by far one of the 
> most common queries.
> I will upload a patch shortly
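
To illustrate the conjunction match described in the issue, here is a minimal sketch. It is not the Lucene implementation: the function name `conjunction` and the use of plain sorted lists as postings are assumptions made for illustration. It intersects sorted doc-id lists by advancing each lagging list to the current candidate inside a simple for loop:

```python
from bisect import bisect_left

def conjunction(postings):
    """Return the doc ids present in every sorted postings list."""
    if not postings or any(not p for p in postings):
        return []
    # Lead with the shortest list to minimize the number of candidates.
    postings = sorted(postings, key=len)
    positions = [0] * len(postings)
    result = []
    for doc in postings[0]:
        matched = True
        for i in range(1, len(postings)):
            p = postings[i]
            # Advance list i to the first doc id >= doc.
            positions[i] = bisect_left(p, doc, positions[i])
            if positions[i] == len(p):
                return result  # list exhausted: no further matches possible
            if p[positions[i]] != doc:
                matched = False
                break
        if matched:
            result.append(doc)
    return result
```

Scoring and position matching are left out entirely; the sketch only covers the doc-id intersection step.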


[JENKINS] Lucene-Solr-tests-only-3.x - Build # 9678 - Failure

2011-07-21 Thread Apache Jenkins Server

Build: https://builds.apache.org/job/Lucene-Solr-tests-only-3.x/9678/

1 tests failed.
REGRESSION:  
org.apache.solr.client.solrj.embedded.MultiCoreExampleJettyTest.testMultiCore

Error Message:
Index directory exists after core unload with deleteIndex=true

Stack Trace:
junit.framework.AssertionFailedError: Index directory exists after core unload 
with deleteIndex=true
at 
org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1299)
at 
org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1217)
at 
org.apache.solr.client.solrj.MultiCoreExampleTestBase.testMultiCore(MultiCoreExampleTestBase.java:163)




Build Log (for compile errors):
[...truncated 13475 lines...]




[jira] [Commented] (SOLR-2642) Sorting by function fails when using result grouping

2011-07-21 Thread Thomas Heigl (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-2642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13068949#comment-13068949
 ] 

Thomas Heigl commented on SOLR-2642:


Thanks Martijn! Works perfectly.

> Sorting by function fails when using result grouping
> 
>
> Key: SOLR-2642
> URL: https://issues.apache.org/jira/browse/SOLR-2642
> Project: Solr
>  Issue Type: Bug
>  Components: search
>Affects Versions: 3.3
>Reporter: Thomas Heigl
>Assignee: Martijn van Groningen
> Fix For: 3.4, 4.0
>
>
> When using result grouping, sorting by distance with geodist() fails because 
> of missing weights for sorts.
> An example of a failing query on an index with the standard schema.xml looks 
> like this:
> {code}
> q=*:*&group=true&group.field=user.uniqueId_s&group.main=true&group.format=grouped&sfield=user.location_p&pt=48.20927,16.3728&sort=geodist() asc
> {code}
> The exception thrown is:
> {code}
> Caused by: org.apache.solr.common.SolrException: Unweighted use of sort 
> geodist(latlon(user.location_p),48.20927,16.3728)
>   at 
> org.apache.solr.search.function.ValueSource$1.newComparator(ValueSource.java:106)
>   at org.apache.lucene.search.SortField.getComparator(SortField.java:413)
>   at 
> org.apache.lucene.search.grouping.AbstractFirstPassGroupingCollector.<init>(AbstractFirstPassGroupingCollector.java:81)
>   at 
> org.apache.lucene.search.grouping.TermFirstPassGroupingCollector.<init>(TermFirstPassGroupingCollector.java:56)
>   at 
> org.apache.solr.search.Grouping$CommandField.createFirstPassCollector(Grouping.java:587)
>   at org.apache.solr.search.Grouping.execute(Grouping.java:256)
>   at 
> org.apache.solr.handler.component.QueryComponent.process(QueryComponent.java:237)
>   at 
> org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:194)
>   at 
> org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129)
>   at org.apache.solr.core.SolrCore.execute(SolrCore.java:1368)
>   at 
> org.apache.solr.client.solrj.embedded.EmbeddedSolrServer.request(EmbeddedSolrServer.java:140)
>   ... 39 more
> {code}


[jira] [Issue Comment Edited] (SOLR-64) strict hierarchical facets

2011-07-21 Thread Manuel (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-64?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13068945#comment-13068945
 ] 

Manuel edited comment on SOLR-64 at 7/21/11 12:10 PM:
--

I've applied the patch to the latest 3.x branch and it's working great for solr 
itself.
However, *solrj* fails when it tries to parse the facets.
{{java.lang.ClassCastException: org.apache.solr.common.util.NamedList cannot be 
cast to java.lang.Number}}
{{at 
org.apache.solr.client.solrj.response.QueryResponse.extractFacetInfo(QueryResponse.java:212)}}

Any chance, this patch can be updated to support solrj as well?

  was (Author: mabr):
I've applied the patch to the latest 3.x branch and it's working great.
However, solrj fails when it tries to parse the facets.
{{java.lang.ClassCastException: org.apache.solr.common.util.NamedList cannot be 
cast to java.lang.Number
at 
org.apache.solr.client.solrj.response.QueryResponse.extractFacetInfo(QueryResponse.java:212)}}

Any chance, this patch can be updated to support solrj as well?
  
> strict hierarchical facets
> --
>
> Key: SOLR-64
> URL: https://issues.apache.org/jira/browse/SOLR-64
> Project: Solr
>  Issue Type: New Feature
>  Components: search
>Reporter: Yonik Seeley
>Assignee: Koji Sekiguchi
> Fix For: 4.0
>
> Attachments: SOLR-64.patch, SOLR-64.patch, SOLR-64.patch, 
> SOLR-64.patch, SOLR-64_3.1.0.patch
>
>
> Strict Facet Hierarchies... each tag has at most one parent (a tree).


[jira] [Commented] (SOLR-64) strict hierarchical facets

2011-07-21 Thread Manuel (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-64?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13068945#comment-13068945
 ] 

Manuel commented on SOLR-64:


I've applied the patch to the latest 3.x branch and it's working great.
However, solrj fails when it tries to parse the facets.
{{java.lang.ClassCastException: org.apache.solr.common.util.NamedList cannot be 
cast to java.lang.Number
at 
org.apache.solr.client.solrj.response.QueryResponse.extractFacetInfo(QueryResponse.java:212)}}

Any chance, this patch can be updated to support solrj as well?

> strict hierarchical facets
> --
>
> Key: SOLR-64
> URL: https://issues.apache.org/jira/browse/SOLR-64
> Project: Solr
>  Issue Type: New Feature
>  Components: search
>Reporter: Yonik Seeley
>Assignee: Koji Sekiguchi
> Fix For: 4.0
>
> Attachments: SOLR-64.patch, SOLR-64.patch, SOLR-64.patch, 
> SOLR-64.patch, SOLR-64_3.1.0.patch
>
>
> Strict Facet Hierarchies... each tag has at most one parent (a tree).


[jira] [Commented] (SOLR-2649) MM ignored in edismax queries with operators

2011-07-21 Thread Ahmet Arslan (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-2649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13068925#comment-13068925
 ] 

Ahmet Arslan commented on SOLR-2649:


I experienced the same issue.  When I added one negative clause to a query 
string that has two optional clauses, mm was ignored and the default operator was 
used instead.
q=word1 word2 -word3&mm=100%&defType=edismax 
and 
q=word1 word2 -word3&mm=100%&defType=dismax 
return different result sets. 

edismax returns documents containing either word1 or word2, although there are 
two optional clauses in the query and mm is set to 100%.

> MM ignored in edismax queries with operators
> 
>
> Key: SOLR-2649
> URL: https://issues.apache.org/jira/browse/SOLR-2649
> Project: Solr
>  Issue Type: Bug
>  Components: search
>Affects Versions: 3.3
>Reporter: Magnus Bergmark
>Priority: Minor
>
> Hypothetical scenario:
>   1. User searches for "stocks oil gold" with MM set to "50%"
>   2. User adds "-stockings" to the query: "stocks oil gold -stockings"
>   3. User gets no hits since MM was ignored and all terms were AND-ed 
> together
> The behavior seems to be intentional, although the reason why is never 
> explained:
>   // For correct lucene queries, turn off mm processing if there
>   // were explicit operators (except for AND).
>   boolean doMinMatched = (numOR + numNOT + numPluses + numMinuses) == 0; 
> (lines 232-234 taken from 
> tags/lucene_solr_3_3/solr/src/java/org/apache/solr/search/ExtendedDismaxQParserPlugin.java)
> This makes edismax unsuitable as a replacement for dismax; mm is one of the 
> primary features of dismax.
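
The check quoted in the issue description can be mirrored in a small sketch. The operator counting below is a rough, whitespace-based assumption made for illustration and is not how ExtendedDismaxQParserPlugin tokenizes queries; only the final sum-equals-zero test follows the quoted line:

```python
def do_min_matched(num_or, num_not, num_pluses, num_minuses):
    """Mirror of the quoted check: mm applies only without explicit operators."""
    return (num_or + num_not + num_pluses + num_minuses) == 0

def count_operators(query):
    """Very rough operator count over whitespace-separated tokens (illustrative only)."""
    tokens = query.split()
    return {
        "num_or": sum(1 for t in tokens if t == "OR"),
        "num_not": sum(1 for t in tokens if t == "NOT"),
        "num_pluses": sum(1 for t in tokens if t.startswith("+")),
        "num_minuses": sum(1 for t in tokens if t.startswith("-")),
    }
```

Under these assumptions, "stocks oil gold" keeps mm processing on, while adding "-stockings" turns it off, which matches the scenario described in the issue.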


[jira] [Commented] (SOLR-2382) DIH Cache Improvements

2011-07-21 Thread Noble Paul (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-2382?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13068921#comment-13068921
 ] 

Noble Paul commented on SOLR-2382:
--

@James can you please update the next patch (SOLR-2382-solrwriter.patch) to 
trunk?

> DIH Cache Improvements
> --
>
> Key: SOLR-2382
> URL: https://issues.apache.org/jira/browse/SOLR-2382
> Project: Solr
>  Issue Type: New Feature
>  Components: contrib - DataImportHandler
>Reporter: James Dyer
>Priority: Minor
> Attachments: SOLR-2382-dihwriter.patch, SOLR-2382-entities.patch, 
> SOLR-2382-properties.patch, SOLR-2382-properties.patch, 
> SOLR-2382-solrwriter.patch, SOLR-2382-solrwriter.patch, SOLR-2382.patch, 
> SOLR-2382.patch, SOLR-2382.patch, SOLR-2382.patch, SOLR-2382.patch, 
> SOLR-2382.patch, SOLR-2382.patch, SOLR-2382.patch
>
>
> Functionality:
>  1. Provide a pluggable caching framework for DIH so that users can choose a 
> cache implementation that best suits their data and application.
>  
>  2. Provide a means to temporarily cache a child Entity's data without 
> needing to create a special cached implementation of the Entity Processor 
> (such as CachedSqlEntityProcessor).
>  
>  3. Provide a means to write the final (root entity) DIH output to a cache 
> rather than to Solr.  Then provide a way for a subsequent DIH call to use the 
> cache as an Entity input.  Also provide the ability to do delta updates on 
> such persistent caches.
>  
>  4. Provide the ability to partition data across multiple caches that can 
> then be fed back into DIH and indexed either to varying Solr Shards, or to 
> the same Core in parallel.
> Use Cases:
>  1. We needed a flexible & scalable way to temporarily cache child-entity 
> data prior to joining to parent entities.
>   - Using SqlEntityProcessor with Child Entities can cause an "n+1 select" 
> problem.
>   - CachedSqlEntityProcessor only supports an in-memory HashMap as a Caching 
> mechanism and does not scale.
>   - There is no way to cache non-SQL inputs (ex: flat files, xml, etc).
>  
>  2. We needed the ability to gather data from long-running entities by a 
> process that runs separate from our main indexing process.
>   
>  3. We wanted the ability to do a delta import of only the entities that 
> changed.
>   - Lucene/Solr requires entire documents to be re-indexed, even if only a 
> few fields changed.
>   - Our data comes from 50+ complex sql queries and/or flat files.
>   - We do not want to incur overhead re-gathering all of this data if only 1 
> entity's data changed.
>   - Persistent DIH caches solve this problem.
>   
>  4. We want the ability to index several documents in parallel (using 1.4.1, 
> which did not have the "threads" parameter).
>  
>  5. In the future, we may need to use Shards, creating a need to easily 
> partition our source data into Shards.
> Implementation Details:
>  1. De-couple EntityProcessorBase from caching.  
>   - Created a new interface, DIHCache & two implementations:  
> - SortedMapBackedCache - An in-memory cache, used as default with 
> CachedSqlEntityProcessor (now deprecated).
> - BerkleyBackedCache - A disk-backed cache, dependent on bdb-je, tested 
> with je-4.1.6.jar
>- NOTE: the existing Lucene Contrib "db" project uses je-3.3.93.jar.  
> I believe this may be incompatible due to Generic Usage.
>- NOTE: I did not modify the ant script to automatically get this jar, 
> so to use or evaluate this patch, download bdb-je from 
> http://www.oracle.com/technetwork/database/berkeleydb/downloads/index.html 
>  
>  2. Allow Entity Processors to take a "cacheImpl" parameter to cause the 
> entity data to be cached (see EntityProcessorBase & DIHCacheProperties).
>  
>  3. Partially De-couple SolrWriter from DocBuilder
>   - Created a new interface DIHWriter, & two implementations:
>- SolrWriter (refactored)
>- DIHCacheWriter (allows DIH to write ultimately to a Cache).
>
>  4. Create a new Entity Processor, DIHCacheProcessor, which reads a 
> persistent Cache as DIH Entity Input.
>  
>  5. Support a "partition" parameter with both DIHCacheWriter and 
> DIHCacheProcessor to allow for easy partitioning of source entity data.
>  
>  6. Change the semantics of entity.destroy()
>   - Previously, it was being called on each iteration of 
> DocBuilder.buildDocument().
>   - Now it does one-time cleanup tasks (like closing or deleting a 
> disk-backed cache) once the entity processor is completed.
>   - The only out-of-the-box entity processor that previously implemented 
> destroy() was LineEntityProcessor, so this is not a very invasive change.
> General Notes:
> We are near completion in converting our search functionality from a legacy 
> search engine to Solr.  However, I found that DIH did not support caching to 
> the level

[jira] [Updated] (SOLR-2399) Solr Admin Interface, reworked

2011-07-21 Thread Stefan Matheis (steffkes) (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-2399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Matheis (steffkes) updated SOLR-2399:


Attachment: SOLR-2399-110721.patch

Ryan,

there hasn't really been any progress for a few days ... but I realized that the 
directory structure was changed, so I've updated the last patch; it's now 
based on SVN-Rev {{1149113}}.

Closing this ticket and continuing with smaller todo-tickets sounds good, yes :)

Stefan

> Solr Admin Interface, reworked
> --
>
> Key: SOLR-2399
> URL: https://issues.apache.org/jira/browse/SOLR-2399
> Project: Solr
>  Issue Type: Improvement
>  Components: web gui
>Reporter: Stefan Matheis (steffkes)
>Assignee: Ryan McKinley
>Priority: Minor
> Fix For: 4.0
>
> Attachments: SOLR-2399-110603-2.patch, SOLR-2399-110603.patch, 
> SOLR-2399-110606.patch, SOLR-2399-110622.patch, SOLR-2399-110702.patch, 
> SOLR-2399-110702.patch, SOLR-2399-110702.patch, SOLR-2399-110721.patch, 
> SOLR-2399-admin-interface.patch, SOLR-2399-analysis-stopwords.patch, 
> SOLR-2399-fluid-width.patch, SOLR-2399-sorting-fields.patch, 
> SOLR-2399-wip-notice.patch, SOLR-2399.patch
>
>
> *The idea was to create a new, fresh (and hopefully clean) Solr Admin 
> Interface.* [Based on this 
> [ML-Thread|http://www.lucidimagination.com/search/document/ae35e236d29d225e/solr_admin_interface_reworked_go_on_go_away]]
> *Features:*
> * [Dashboard|http://files.mathe.is/solr-admin/01_dashboard.png]
> * [Query-Form|http://files.mathe.is/solr-admin/02_query.png]
> * [Plugins|http://files.mathe.is/solr-admin/05_plugins.png]
> * [Analysis|http://files.mathe.is/solr-admin/04_analysis.png] (SOLR-2476, 
> SOLR-2400)
> * [Schema-Browser|http://files.mathe.is/solr-admin/06_schema-browser.png]
> * [Dataimport|http://files.mathe.is/solr-admin/08_dataimport.png] (SOLR-2482)
> * [Core-Admin|http://files.mathe.is/solr-admin/09_coreadmin.png]
> * [Replication|http://files.mathe.is/solr-admin/10_replication.png]
> * [Zookeeper|http://files.mathe.is/solr-admin/11_cloud.png]
> * [Logging|http://files.mathe.is/solr-admin/07_logging.png] (SOLR-2459)
> ** Stub (using static data)
> Newly created Wiki-Page: http://wiki.apache.org/solr/ReworkedSolrAdminGUI
> I've quickly created a Github-Repository (Just for me, to keep track of the 
> changes)
> » https://github.com/steffkes/solr-admin


[jira] [Commented] (SOLR-2606) Solr sort no longer works on field names with some punctuation in them

2011-07-21 Thread Tor Henning Ueland (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-2606?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13068909#comment-13068909
 ] 

Tor Henning Ueland commented on SOLR-2606:
--

Confirming this issue for 3.2 (3.2-SNAPSHOT 1133561)
Data can be loaded from fields (fl), the issue only appears when trying to sort 
on such fields.

> Solr sort no longer works on field names with some punctuation in them
> --
>
> Key: SOLR-2606
> URL: https://issues.apache.org/jira/browse/SOLR-2606
> Project: Solr
>  Issue Type: Bug
>  Components: search
>Affects Versions: 3.1, 3.2
> Environment: Linux
>Reporter: Mitsu Hadeishi
>
> We just upgraded from Solr 1.4 to 3.2. For the most part the upgrade went 
> fine, however we discovered that sorting on field names with dashes in them 
> is no longer working properly. For example, the following query used to work:
> http://[our solr server]/select/?q=computer&sort=static-need-binary+asc
> and now it gives this error:
> HTTP Status 400 - undefined field static
> type Status report
> message undefined field static
> description The request sent by the client was syntactically incorrect 
> (undefined field static).
> It appears the parser for sorting has been changed so that it now tokenizes 
> differently, and assumes field names cannot have dashes in them. However, 
> field names clearly can have dashes in them. The exact same query which 
> worked fine for us in 1.4 is now breaking in 3.2. Changing the sort field to 
> use a field name that doesn't have a dash in it works just fine.
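
The reported symptom ("undefined field static") is consistent with a parser that reads the field name as an identifier and stops at the first dash. The sketch below only illustrates that hypothesis; both functions are hypothetical and neither is taken from Solr's sort parsing code:

```python
import re

def parse_sort_strict(spec):
    """Read the field as an identifier made of word characters only."""
    m = re.match(r"\s*(\w+)", spec)
    return m.group(1)

def parse_sort_lenient(spec):
    """Split on the last space: everything before it is the field name."""
    field, _, direction = spec.strip().rpartition(" ")
    if direction.lower() not in ("asc", "desc"):
        raise ValueError("missing sort direction: %r" % spec)
    return field, direction.lower()
```

For the spec "static-need-binary asc", the strict variant yields the field "static", while the lenient variant keeps the full dashed name.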


[jira] [Resolved] (SOLR-2655) DIH multi threaded mode does not resolves attributes correctly

2011-07-21 Thread Shalin Shekhar Mangar (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-2655?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shalin Shekhar Mangar resolved SOLR-2655.
-

Resolution: Fixed

Committed revision 1149108 on trunk and 1149112 on branch_3x.

> DIH multi threaded mode does not resolves attributes correctly
> --
>
> Key: SOLR-2655
> URL: https://issues.apache.org/jira/browse/SOLR-2655
> Project: Solr
>  Issue Type: Bug
>  Components: contrib - DataImportHandler
>Reporter: Shalin Shekhar Mangar
>Assignee: Shalin Shekhar Mangar
>Priority: Minor
> Fix For: 3.4, 4.0
>
> Attachments: SOLR-2655-branch_3x.patch, SOLR-2655.patch, 
> SOLR-2655.patch, SOLR-2655.patch
>
>
> DIH multi-threaded mode sometimes fails to resolve entity attributes.


[jira] [Updated] (SOLR-2655) DIH multi threaded mode does not resolves attributes correctly

2011-07-21 Thread Shalin Shekhar Mangar (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-2655?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shalin Shekhar Mangar updated SOLR-2655:


Attachment: SOLR-2655.patch

Cleaned up the tests a bit.

I'll commit this shortly.

> DIH multi threaded mode does not resolves attributes correctly
> --
>
> Key: SOLR-2655
> URL: https://issues.apache.org/jira/browse/SOLR-2655
> Project: Solr
>  Issue Type: Bug
>  Components: contrib - DataImportHandler
>Reporter: Shalin Shekhar Mangar
>Assignee: Shalin Shekhar Mangar
>Priority: Minor
> Fix For: 3.4, 4.0
>
> Attachments: SOLR-2655-branch_3x.patch, SOLR-2655.patch, 
> SOLR-2655.patch, SOLR-2655.patch
>
>
> DIH multi-threaded mode sometimes fails to resolve entity attributes.


[jira] [Commented] (SOLR-2242) Get distinct count of names for a facet field

2011-07-21 Thread Chris Male (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13068892#comment-13068892
 ] 

Chris Male commented on SOLR-2242:
--

I'm just jumping into this issue and considering the problem of loading all 
constraints just to get their size (or in fact, not wanting to do this).  Is 
there scope in SimpleFacets to have some sort of 'Collector' idea added? 
That way it would be easy to choose whether we want to collect the constraints, 
their counts and the total number of constraints, or just the total number.

Does anybody have any thoughts on the distribution issue?
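
A rough sketch of what such a 'Collector' idea could look like. All class and function names here are hypothetical and do not exist in SimpleFacets; the sketch is single-node only and does not address the distribution question:

```python
class ConstraintCollector:
    """Collects every facet constraint together with its count."""
    def __init__(self):
        self.counts = {}

    def collect(self, term, count):
        self.counts[term] = count

    def total(self):
        return len(self.counts)


class CountOnlyCollector:
    """Tracks only how many constraints were seen, without storing them."""
    def __init__(self):
        self._total = 0

    def collect(self, term, count):
        self._total += 1

    def total(self):
        return self._total


def run_facets(term_counts, collector, mincount=1):
    """Feed (term, count) pairs that meet mincount into the chosen collector."""
    for term, count in term_counts:
        if count >= mincount:
            collector.collect(term, count)
    return collector
```

The caller picks the collector, so asking only for the number of distinct values never requires holding the full constraint list in memory.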

> Get distinct count of names for a facet field
> -
>
> Key: SOLR-2242
> URL: https://issues.apache.org/jira/browse/SOLR-2242
> Project: Solr
>  Issue Type: New Feature
>  Components: Response Writers
>Affects Versions: 4.0
>Reporter: Bill Bell
>Assignee: Simon Willnauer
>Priority: Minor
> Fix For: 4.0
>
> Attachments: NumFacetTermsFacetsTest.java, 
> SOLR-2242-notworkingtest.patch, SOLR-2242.patch, SOLR-2242.patch, 
> SOLR-2242.shard.patch, SOLR-2242.shard.patch, 
> SOLR-2242.shard.withtests.patch, SOLR-2242.solr3.1.patch, 
> SOLR.2242.solr3.1.patch, SOLR.2242.v2.patch
>
>
> When returning facet.field= you will get a list of matches for 
> distinct values. This is normal behavior. This patch tells you how many 
> distinct values you have (# of rows). Use with limit=-1 and mincount=1.
> The feature is called "namedistinct". Here is an example:
> http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=2&facet.limit=-1&facet.field=price
> http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=0&facet.limit=-1&facet.field=price
> http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=1&facet.limit=-1&facet.field=price
> This currently only works on facet.field.
> {code}
> 
>   
> 14
> 31 name="19.95">111 name="179.99">111 name="329.95">111 name="479.95">111
>   
> 
> {code} 
> Several people use this to get the group.field count (the # of groups).


[jira] [Commented] (SOLR-2588) Make Velocity an optional dependency in SolrCore

2011-07-21 Thread Gunnar Wagenknecht (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-2588?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13068877#comment-13068877
 ] 

Gunnar Wagenknecht commented on SOLR-2588:
--

I like the proposal of making Velocity a module that is included and configured 
by default in the war file but not required for {{solr-core.jar}} to work. That 
would definitely solve my use-case where I'm embedding Solr in an application 
and don't want to include Velocity.

> Make Velocity an optional dependency in SolrCore
> 
>
> Key: SOLR-2588
> URL: https://issues.apache.org/jira/browse/SOLR-2588
> Project: Solr
>  Issue Type: Wish
>Affects Versions: 3.2
>Reporter: Gunnar Wagenknecht
>Assignee: David Smiley
>Priority: Minor
> Fix For: 3.4, 4.0
>
> Attachments: SOLR-2588_Don_t_fail_if_velocity_libs_not_present_.patch
>
>
> In 1.4 it was fine to run Solr without Velocity on the classpath. However, 
> in 3.2 SolrCore won't load because of a hard reference to the Velocity 
> response writer in a static initializer.
> {noformat}
> ... ERROR org.apache.solr.core.CoreContainer - 
> java.lang.NoClassDefFoundError: org/apache/velocity/context/Context
>   at org.apache.solr.core.SolrCore.<clinit>(SolrCore.java:1447)
>   at org.apache.solr.core.CoreContainer.create(CoreContainer.java:463)
>   at org.apache.solr.core.CoreContainer.load(CoreContainer.java:316)
>   at org.apache.solr.core.CoreContainer.load(CoreContainer.java:207)
> {noformat}
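
The failure mode described above can be avoided by registering optional 
classes by name and loading them reflectively, so that a missing library only 
disables that one entry. The following is a rough sketch under that 
assumption and is not necessarily how the attached patch works. The class 
OptionalWriterSketch, the map DEFAULT_WRITERS and the class name 
com.example.MissingVelocityWriter are hypothetical; java.lang.StringBuilder 
merely stands in for a writer that is always available.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class OptionalWriterSketch {
    // Writer name -> implementation class name. Storing names as strings means
    // that initializing this class never forces an optional library to load.
    static final Map<String, String> DEFAULT_WRITERS = new LinkedHashMap<>();
    static {
        // Stand-in for a writer whose class is always on the classpath.
        DEFAULT_WRITERS.put("builder", "java.lang.StringBuilder");
        // Hypothetical class name that is deliberately absent.
        DEFAULT_WRITERS.put("velocity", "com.example.MissingVelocityWriter");
    }

    // Instantiates every writer whose class can be loaded and skips the rest.
    static Map<String, Object> loadAvailable() {
        Map<String, Object> loaded = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : DEFAULT_WRITERS.entrySet()) {
            try {
                Class<?> clazz = Class.forName(e.getValue());
                loaded.put(e.getKey(), clazz.getDeclaredConstructor().newInstance());
            } catch (ReflectiveOperationException | LinkageError ex) {
                // ClassNotFoundException: the class itself is missing.
                // NoClassDefFoundError (a LinkageError): one of its dependencies is missing.
                System.err.println("Skipping optional writer '" + e.getKey() + "': " + ex);
            }
        }
        return loaded;
    }

    public static void main(String[] args) {
        System.out.println(loadAvailable().keySet());
    }
}
```

The point of the design is that the static initializer only touches strings, 
so a missing jar surfaces as a skipped entry instead of a 
NoClassDefFoundError while the core loads.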


[jira] [Updated] (SOLR-1604) Wildcards, ORs etc inside Phrase Queries

2011-07-21 Thread Ahmet Arslan (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-1604?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmet Arslan updated SOLR-1604:
---

Attachment: ComplexPhrase.zip

Update for Solr 3.3.0:
* Download apache-solr-3.3.0-src.tgz
* Download the latest ComplexPhrase.zip
* 'mvn package' will generate 3 files under the target folder; copy them to 
apache-solr-3.3.0/solr/lib/
** cp target/ComplexPhrase-* Downloads/apache-solr-3.3.0/solr/lib/
* Run 'ant clean dist' to create a new apache-solr-3.3-SNAPSHOT.war file under 
the solr/dist folder

> Wildcards, ORs etc inside Phrase Queries
> 
>
> Key: SOLR-1604
> URL: https://issues.apache.org/jira/browse/SOLR-1604
> Project: Solr
>  Issue Type: Improvement
>  Components: search
>Affects Versions: 1.4
>Reporter: Ahmet Arslan
>Priority: Minor
> Fix For: 3.4, 4.0
>
> Attachments: ComplexPhrase.zip, ComplexPhrase.zip, ComplexPhrase.zip, 
> ComplexPhrase.zip, ComplexPhrase.zip, ComplexPhraseQueryParser.java, 
> SOLR-1604.patch
>
>
> Solr Plugin for ComplexPhraseQueryParser (LUCENE-1486) which supports 
> wildcards, ORs, ranges, fuzzies inside phrase queries.


[jira] [Resolved] (SOLR-2584) Add a parameter in UIMAUpdateRequestProcessor to avoid duplicated values on insert

2011-07-21 Thread Koji Sekiguchi (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-2584?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Koji Sekiguchi resolved SOLR-2584.
--

Resolution: Fixed

Committed to trunk and 3x.

> Add a parameter in UIMAUpdateRequestProcessor to avoid duplicated values on 
> insert
> --
>
> Key: SOLR-2584
> URL: https://issues.apache.org/jira/browse/SOLR-2584
> Project: Solr
>  Issue Type: Improvement
>Affects Versions: 1.4.1, 3.3, 4.0
>Reporter: Elmer Garduno
>Assignee: Koji Sekiguchi
>Priority: Minor
>  Labels: uima
> Fix For: 3.4, 4.0
>
> Attachments: SOLR-2584.patch, SOLR-2584.patch, SOLR-2584.patch
>
>
> Hi folks, 
> I think that UIMAUpdateRequestProcessor should have a parameter to avoid 
> duplicate values on the updated field. 
> A typical use case is:
> If you are using DictionaryAnnotator and there is a term that matches more 
> than once it will be added two times in the mapped field. I think that we 
> should add a parameter to avoid inserting duplicates as we are not preserving 
> information on the position of the annotation. 
> What do you think about it? I've already implemented this for branch 3x; I'm 
> writing some tests and will submit a patch.
> Regards
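
The behaviour requested above can be sketched as a flag that decides whether 
repeated annotation values are collapsed before they are added to the mapped 
field. This is an illustration only, not the committed patch: the class 
DedupFieldValuesSketch, the method mapToField and the flag name 
allowDuplicates are all hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;

final class DedupFieldValuesSketch {
    // Returns the values destined for the mapped field. With allowDuplicates
    // set to false, repeated values are dropped and first-occurrence order is
    // kept; with true, the input is copied unchanged.
    static List<String> mapToField(List<String> annotationValues, boolean allowDuplicates) {
        if (allowDuplicates) {
            return new ArrayList<>(annotationValues);
        }
        return new ArrayList<>(new LinkedHashSet<>(annotationValues));
    }

    public static void main(String[] args) {
        // A term matched twice by an annotator yields the same value twice.
        List<String> matches = Arrays.asList("solr", "lucene", "solr");
        System.out.println(mapToField(matches, true));
        System.out.println(mapToField(matches, false));
    }
}
```

A LinkedHashSet is used so the surviving values stay in the order in which 
they were first matched; since annotation positions are not preserved in the 
field anyway, nothing else is lost by dropping the repeats.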

