[jira] Assigned: (SOLR-1096) Java Replication stalls and never exits

2009-04-13 Thread Shalin Shekhar Mangar (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1096?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shalin Shekhar Mangar reassigned SOLR-1096:
---

Assignee: Shalin Shekhar Mangar

> Java Replication stalls and never exits
> ---
>
> Key: SOLR-1096
> URL: https://issues.apache.org/jira/browse/SOLR-1096
> Project: Solr
>  Issue Type: Bug
>  Components: replication (java)
>Reporter: Noble Paul
>Assignee: Shalin Shekhar Mangar
> Fix For: 1.4
>
> Attachments: SOR-1096.patch
>
>
> replication hangs
> mail thread : http://markmail.org/thread/xgbptpzn52xprmwo
> The stacktrace 
> {code}
> user time=23940.ms at java.net.SocketInputStream.socketRead0(Native Method)
> at java.net.SocketInputStream.read(SocketInputStream.java:129)
> at java.io.BufferedInputStream.fill(BufferedInputStream.java:218)
> at java.io.BufferedInputStream.read1(BufferedInputStream.java:258)
> at java.io.BufferedInputStream.read(BufferedInputStream.java:317)
> at org.apache.commons.httpclient.ChunkedInputStream.read(ChunkedInputStream.java:182)
> at java.io.FilterInputStream.read(FilterInputStream.java:116)
> at org.apache.commons.httpclient.AutoCloseInputStream.read(AutoCloseInputStream.java:108)
> at org.apache.solr.common.util.FastInputStream.read(FastInputStream.java:91)
> at org.apache.solr.common.util.FastInputStream.readFully(FastInputStream.java:122)
> at org.apache.solr.handler.SnapPuller$FileFetcher.fetchPackets(SnapPuller.java:808)
> at org.apache.solr.handler.SnapPuller$FileFetcher.fetchFile(SnapPuller.java:764)
> ..
> {code}
> The httpclient is created without a read_timeout and connection_timeout, so it
> may hang indefinitely if the server sends no data.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



Solr nightly build failure

2009-04-13 Thread solr-dev

init-forrest-entities:
[mkdir] Created dir: /tmp/apache-solr-nightly/build
[mkdir] Created dir: /tmp/apache-solr-nightly/build/web

compile-solrj:
[mkdir] Created dir: /tmp/apache-solr-nightly/build/solrj
[javac] Compiling 76 source files to /tmp/apache-solr-nightly/build/solrj
[javac] Note: Some input files use or override a deprecated API.
[javac] Note: Recompile with -Xlint:deprecation for details.
[javac] Note: Some input files use unchecked or unsafe operations.
[javac] Note: Recompile with -Xlint:unchecked for details.

compile:
[mkdir] Created dir: /tmp/apache-solr-nightly/build/solr
[javac] Compiling 365 source files to /tmp/apache-solr-nightly/build/solr
[javac] Note: Some input files use or override a deprecated API.
[javac] Note: Recompile with -Xlint:deprecation for details.
[javac] Note: Some input files use unchecked or unsafe operations.
[javac] Note: Recompile with -Xlint:unchecked for details.

compileTests:
[mkdir] Created dir: /tmp/apache-solr-nightly/build/tests
[javac] Compiling 150 source files to /tmp/apache-solr-nightly/build/tests
[javac] Note: Some input files use or override a deprecated API.
[javac] Note: Recompile with -Xlint:deprecation for details.
[javac] Note: Some input files use unchecked or unsafe operations.
[javac] Note: Recompile with -Xlint:unchecked for details.

junit:
[mkdir] Created dir: /tmp/apache-solr-nightly/build/test-results
[junit] Running org.apache.solr.BasicFunctionalityTest
[junit] Tests run: 19, Failures: 0, Errors: 0, Time elapsed: 20.036 sec
[junit] Running org.apache.solr.ConvertedLegacyTest
[junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 10.297 sec
[junit] Running org.apache.solr.DisMaxRequestHandlerTest
[junit] Tests run: 3, Failures: 0, Errors: 0, Time elapsed: 6.175 sec
[junit] Running org.apache.solr.EchoParamsTest
[junit] Tests run: 4, Failures: 0, Errors: 0, Time elapsed: 2.781 sec
[junit] Running org.apache.solr.OutputWriterTest
[junit] Tests run: 3, Failures: 0, Errors: 0, Time elapsed: 2.996 sec
[junit] Running org.apache.solr.SampleTest
[junit] Tests run: 2, Failures: 0, Errors: 0, Time elapsed: 2.892 sec
[junit] Running org.apache.solr.SolrInfoMBeanTest
[junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 0.918 sec
[junit] Running org.apache.solr.TestDistributedSearch
[junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 35.621 sec
[junit] Running org.apache.solr.TestTrie
[junit] 
[junit] 0111.01995-12-31T23:59:59.999Z0.02009-04-13T00:00:00Z1.02009-04-14T00:00:00Z2.02009-04-15T00:00:00Z3.02009-04-16T00:00:00Z4.02009-04-17T00:00:00Z5.02009-04-18T00:00:00Z6.02009-04-19T00:00:00Z7.02009-04-20T00:00:00Z8.02009-04-21T00:00:00Z
[junit] 
[junit] )
[junit] Tests run: 7, Failures: 1, Errors: 0, Time elapsed: 8.495 sec
[junit] Test org.apache.solr.TestTrie FAILED
[junit] Running org.apache.solr.analysis.DoubleMetaphoneFilterFactoryTest
[junit] Tests run: 2, Failures: 0, Errors: 0, Time elapsed: 0.381 sec
[junit] Running org.apache.solr.analysis.DoubleMetaphoneFilterTest
[junit] Tests run: 6, Failures: 0, Errors: 0, Time elapsed: 0.333 sec
[junit] Running org.apache.solr.analysis.EnglishPorterFilterFactoryTest
[junit] Tests run: 2, Failures: 0, Errors: 0, Time elapsed: 1.415 sec
[junit] Running org.apache.solr.analysis.HTMLStripReaderTest
[junit] Tests run: 9, Failures: 0, Errors: 0, Time elapsed: 0.979 sec
[junit] Running org.apache.solr.analysis.LengthFilterTest
[junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 0.957 sec
[junit] Running org.apache.solr.analysis.SnowballPorterFilterFactoryTest
[junit] Tests run: 2, Failures: 0, Errors: 0, Time elapsed: 1.506 sec
[junit] Running org.apache.solr.analysis.TestBufferedTokenStream
[junit] Tests run: 2, Failures: 0, Errors: 0, Time elapsed: 1.089 sec
[junit] Running org.apache.solr.analysis.TestCapitalizationFilter
[junit] Tests run: 2, Failures: 0, Errors: 0, Time elapsed: 1.037 sec
[junit] Running org.apache.solr.analysis.TestCharFilter
[junit] Tests run: 4, Failures: 0, Errors: 0, Time elapsed: 0.345 sec
[junit] Running org.apache.solr.analysis.TestHyphenatedWordsFilter
[junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 0.885 sec
[junit] Running org.apache.solr.analysis.TestKeepFilterFactory
[junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 2.597 sec
[junit] Running org.apache.solr.analysis.TestKeepWordFilter
[junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 0.921 sec
[junit] Running org.apache.solr.analysis.TestMappingCharFilter
[junit] Tests run: 11, Failures: 0, Errors: 0, Time elapsed: 0.363 sec
[junit] Running org.apache.solr.analysis.TestMappingCharFilterFactory
[junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 0.378 sec
[junit] Runn

Build failed in Hudson: Solr-trunk #770

2009-04-13 Thread Apache Hudson Server
See http://hudson.zones.apache.org/hudson/job/Solr-trunk/770/changes

Changes:

[koji] SOLR-620: added a feature of applying velocity.properties by 
v.properties parameter

--
[...truncated 1976 lines...]
[junit] Tests run: 6, Failures: 0, Errors: 0, Time elapsed: 0.507 sec
[junit] Running org.apache.solr.analysis.EnglishPorterFilterFactoryTest
[junit] Tests run: 2, Failures: 0, Errors: 0, Time elapsed: 3.774 sec
[junit] Running org.apache.solr.analysis.HTMLStripReaderTest
[junit] Tests run: 9, Failures: 0, Errors: 0, Time elapsed: 1.682 sec
[junit] Running org.apache.solr.analysis.LengthFilterTest
[junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 1.205 sec
[junit] Running org.apache.solr.analysis.SnowballPorterFilterFactoryTest
[junit] Tests run: 2, Failures: 0, Errors: 0, Time elapsed: 1.525 sec
[junit] Running org.apache.solr.analysis.TestBufferedTokenStream
[junit] Tests run: 2, Failures: 0, Errors: 0, Time elapsed: 1.138 sec
[junit] Running org.apache.solr.analysis.TestCapitalizationFilter
[junit] Tests run: 2, Failures: 0, Errors: 0, Time elapsed: 1.434 sec
[junit] Running org.apache.solr.analysis.TestCharFilter
[junit] Tests run: 4, Failures: 0, Errors: 0, Time elapsed: 0.318 sec
[junit] Running org.apache.solr.analysis.TestHyphenatedWordsFilter
[junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 1.052 sec
[junit] Running org.apache.solr.analysis.TestKeepFilterFactory
[junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 2.115 sec
[junit] Running org.apache.solr.analysis.TestKeepWordFilter
[junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 1.047 sec
[junit] Running org.apache.solr.analysis.TestMappingCharFilter
[junit] Tests run: 11, Failures: 0, Errors: 0, Time elapsed: 0.43 sec
[junit] Running org.apache.solr.analysis.TestMappingCharFilterFactory
[junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 0.379 sec
[junit] Running org.apache.solr.analysis.TestPatternReplaceFilter
[junit] Tests run: 5, Failures: 0, Errors: 0, Time elapsed: 2.959 sec
[junit] Running org.apache.solr.analysis.TestPatternTokenizerFactory
[junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 1.187 sec
[junit] Running org.apache.solr.analysis.TestPhoneticFilter
[junit] Tests run: 2, Failures: 0, Errors: 0, Time elapsed: 2.248 sec
[junit] Running org.apache.solr.analysis.TestRemoveDuplicatesTokenFilter
[junit] Tests run: 3, Failures: 0, Errors: 0, Time elapsed: 1.75 sec
[junit] Running org.apache.solr.analysis.TestStopFilterFactory
[junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 3.643 sec
[junit] Running org.apache.solr.analysis.TestSynonymFilter
[junit] Tests run: 7, Failures: 0, Errors: 0, Time elapsed: 1.891 sec
[junit] Running org.apache.solr.analysis.TestSynonymMap
[junit] Tests run: 5, Failures: 0, Errors: 0, Time elapsed: 1.876 sec
[junit] Running org.apache.solr.analysis.TestTrimFilter
[junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 1.082 sec
[junit] Running org.apache.solr.analysis.TestWordDelimiterFilter
[junit] Tests run: 11, Failures: 0, Errors: 0, Time elapsed: 13.7 sec
[junit] Running org.apache.solr.client.solrj.SolrExceptionTest
[junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 0.828 sec
[junit] Running org.apache.solr.client.solrj.SolrQueryTest
[junit] Tests run: 5, Failures: 0, Errors: 0, Time elapsed: 0.33 sec
[junit] Running org.apache.solr.client.solrj.TestBatchUpdate
[junit] Tests run: 3, Failures: 0, Errors: 0, Time elapsed: 11.347 sec
[junit] Running org.apache.solr.client.solrj.TestLBHttpSolrServer
[junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 5.924 sec
[junit] Running org.apache.solr.client.solrj.beans.TestDocumentObjectBinder
[junit] Tests run: 3, Failures: 0, Errors: 0, Time elapsed: 0.504 sec
[junit] Running org.apache.solr.client.solrj.embedded.JettyWebappTest
[junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 5.655 sec
[junit] Running 
org.apache.solr.client.solrj.embedded.LargeVolumeBinaryJettyTest
[junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 7.066 sec
[junit] Running 
org.apache.solr.client.solrj.embedded.LargeVolumeEmbeddedTest
[junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 4.975 sec
[junit] Running org.apache.solr.client.solrj.embedded.LargeVolumeJettyTest
[junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 6.735 sec
[junit] Running org.apache.solr.client.solrj.embedded.MultiCoreEmbeddedTest
[junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 4.283 sec
[junit] Running 
org.apache.solr.client.solrj.embedded.MultiCoreExampleJettyTest
[junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 5.034 sec
[junit] Running 
org.apache.solr.client.solrj.embedd

[jira] Updated: (SOLR-1096) Java Replication stalls and never exits

2009-04-13 Thread Shalin Shekhar Mangar (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1096?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shalin Shekhar Mangar updated SOLR-1096:


Attachment: SOLR-1096.patch

# Introduces httpConnTimeout and httpReadTimeout configuration parameters
# HttpClient is created per core only if timeout is specified. Otherwise the 
core uses the static shared instance
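
To illustrate why the two timeouts matter, here is a minimal sketch. It uses plain java.net sockets rather than the commons-httpclient instance that SnapPuller uses, and the class name, method name and timeout values are assumptions made for the example, not code from the patch. The server side never sends a byte, which mimics the stalled replication in the stack trace; with a read timeout set, the blocked read is aborted instead of hanging forever.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

public class ReadTimeoutDemo {

    /**
     * Connects to a local server that never sends data and reports whether
     * the read was aborted by the read timeout.
     * Both parameters are illustrative stand-ins for a connection timeout
     * and a read timeout, in milliseconds.
     */
    static boolean readTimesOut(int connectTimeoutMs, int readTimeoutMs) throws IOException {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        // The server socket is bound but never accept()ed: the OS backlog
        // completes the TCP handshake, yet no data is ever written.
        try (ServerSocket silentServer = new ServerSocket(0, 1, loopback);
             Socket client = new Socket()) {
            client.connect(
                new InetSocketAddress(loopback, silentServer.getLocalPort()),
                connectTimeoutMs);
            // Without this call, read() below would block indefinitely.
            client.setSoTimeout(readTimeoutMs);
            InputStream in = client.getInputStream();
            try {
                in.read();
                return false; // data or EOF arrived; not expected here
            } catch (SocketTimeoutException e) {
                return true;  // the read gave up after readTimeoutMs
            }
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("read timed out: " + readTimesOut(1000, 200));
    }
}
```

With commons-httpclient the equivalent settings would live on the client and connection manager parameters; the sketch only shows the underlying socket behaviour.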

I plan to commit shortly.

> Java Replication stalls and never exits
> ---
>
> Key: SOLR-1096
> URL: https://issues.apache.org/jira/browse/SOLR-1096
> Project: Solr
>  Issue Type: Bug
>  Components: replication (java)
>Reporter: Noble Paul
>Assignee: Shalin Shekhar Mangar
> Fix For: 1.4
>
> Attachments: SOLR-1096.patch, SOR-1096.patch



[jira] Resolved: (SOLR-1096) Java Replication stalls and never exits

2009-04-13 Thread Shalin Shekhar Mangar (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1096?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shalin Shekhar Mangar resolved SOLR-1096.
-

Resolution: Fixed

Committed revision 764371.




[jira] Updated: (SOLR-1059) Add special variables for deleting documents, skipping rows and transforms in DIH

2009-04-13 Thread Noble Paul (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1059?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Noble Paul updated SOLR-1059:
-

Attachment: SOLR-1059.patch

cleaned up a bit

> Add special variables for deleting documents, skipping rows and transforms in 
> DIH
> -
>
> Key: SOLR-1059
> URL: https://issues.apache.org/jira/browse/SOLR-1059
> Project: Solr
>  Issue Type: New Feature
>  Components: contrib - DataImportHandler
>Reporter: Noble Paul
>Assignee: Shalin Shekhar Mangar
> Fix For: 1.4
>
> Attachments: SOLR-1059.patch, SOLR-1059.patch, SOLR-1059.patch, 
> SOLR-1059.patch, SOLR-1059.patch, SOLR-1059.patch, SOLR-1059.patch
>
>
> There is no means to delete docs in DIH.
> add two special variables 
> # $deleteDocId
> # $deleteDocQuery
> if the returned row contains these fields DIH will delete docs by id or query 
> depending on what is present
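
A rough sketch of the idea in the description, using plain java.util maps instead of the real DataImportHandler transformer API. The class name, the method name and the "id" / "deleted" column names are hypothetical; only the $deleteDocId key comes from the issue text.

```java
import java.util.HashMap;
import java.util.Map;

public class DeleteMarkerSketch {

    /**
     * Returns a copy of the row. If the (hypothetical) "deleted" column is
     * true, the copy also carries the special $deleteDocId variable so that
     * the importer can delete the document with that id.
     */
    static Map<String, Object> markForDeletion(Map<String, Object> row) {
        Map<String, Object> out = new HashMap<>(row);
        if (Boolean.TRUE.equals(row.get("deleted"))) {
            out.put("$deleteDocId", row.get("id"));
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, Object> row = new HashMap<>();
        row.put("id", "42");
        row.put("deleted", Boolean.TRUE);
        System.out.println(markForDeletion(row));
    }
}
```

A row marked for deletion by query would set $deleteDocQuery in the same way, with a query string as the value.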




[jira] Resolved: (SOLR-1059) Add special variables for deleting documents, skipping rows and transforms in DIH

2009-04-13 Thread Shalin Shekhar Mangar (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1059?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shalin Shekhar Mangar resolved SOLR-1059.
-

Resolution: Fixed

Committed revision 764379.

Thanks Noble!




[jira] Resolved: (SOLR-620) Velocity Response Writer

2009-04-13 Thread Erik Hatcher (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-620?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erik Hatcher resolved SOLR-620.
---

   Resolution: Fixed
Fix Version/s: 1.4

I think we can resolve this issue and open new ones for any additional work 
needed on the VelocityResponseWriter.

> Velocity Response Writer
> 
>
> Key: SOLR-620
> URL: https://issues.apache.org/jira/browse/SOLR-620
> Project: Solr
>  Issue Type: New Feature
>  Components: search
>Affects Versions: 1.3
>Reporter: Erik Hatcher
>Assignee: Erik Hatcher
>Priority: Minor
> Fix For: 1.4
>
> Attachments: SOLR-620-velocity.properties.patch, 
> SOLR-620-velocity.properties.patch, SOLR-620.patch, SOLR-620.patch, 
> SOLR-620.patch, SOLR-620.patch, SOLR-620.patch, SOLR-620.zip, SOLR-620.zip
>
>
> Add a Velocity - http://velocity.apache.org - response writer, making it 
> possible to generate a decent search UI straight from Solr itself.  Designed 
> to work standalone or in conjunction with the JSON response writer (or 
> SolrJS) for Ajaxy things.




[jira] Created: (SOLR-1111) fix FieldCache usage in Solr

2009-04-13 Thread Yonik Seeley (JIRA)
fix FieldCache usage in Solr


 Key: SOLR-1111
 URL: https://issues.apache.org/jira/browse/SOLR-1111
 Project: Solr
  Issue Type: Bug
Reporter: Yonik Seeley
 Fix For: 1.4


Recent changes in Lucene have altered how the FieldCache is used and as-is 
could lead to previously working Solr installations blowing up when they 
upgrade to 1.4.  We need to fix this, or document the effects of these changes.




[jira] Commented: (SOLR-1111) fix FieldCache usage in Solr

2009-04-13 Thread Yonik Seeley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1111?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12698399#action_12698399
 ] 

Yonik Seeley commented on SOLR-1111:


The major issue is that Lucene now creates scorers per-segment, and if you use 
Lucene's searcher.search(...,sort) then the FieldCache populations will also be 
per-segment.

The biggest issue:  If the FieldCache gets populated at both the top-level reader 
and per-segment, memory usage doubles (as does un-inversion time).
 - Faceting on single-valued fields uses the FieldCache at the top-level (and 
would be
   - This is non-trivial to change...  if we started counting per-segment, 
counts would somehow have to be merged across segments.
 - Sorting in Solr currently uses the FieldCache at the top level
   - This can't easily be changed to use Lucene's searcher.search(...,sort) 
since we are using a hit collector (which can be wrapped in a time limited 
collector).
 - Distributed search uses the top-level FieldCache to retrieve sort field 
values.
 - FunctionQuery now derives values at the segment level
   - This also applies to the function range query
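
The doubling described above can be sketched with a toy cache keyed by reader. This is not Lucene's FieldCache; the Reader class, the document counts and the method names are assumptions chosen only to show that a top-level entry and per-segment entries for the same field coexist.

```java
import java.util.IdentityHashMap;
import java.util.Map;

public class FieldCacheDoublingSketch {

    /** Hypothetical stand-in for an index reader; not a Lucene class. */
    static final class Reader {
        final int maxDoc;
        Reader(int maxDoc) { this.maxDoc = maxDoc; }
    }

    // One un-inverted array per reader instance, as in a reader-keyed cache.
    private final Map<Reader, int[]> cache = new IdentityHashMap<>();

    /** Populates the entry for this reader on first use and returns it. */
    int[] get(Reader reader) {
        return cache.computeIfAbsent(reader, r -> new int[r.maxDoc]);
    }

    /** Total number of cached values across all readers. */
    long cachedValues() {
        long total = 0;
        for (int[] values : cache.values()) {
            total += values.length;
        }
        return total;
    }

    /** A 1000-document index made of two segments, used both ways. */
    static long demo() {
        Reader segment1 = new Reader(600);
        Reader segment2 = new Reader(400);
        Reader topLevel = new Reader(1000);

        FieldCacheDoublingSketch fieldCache = new FieldCacheDoublingSketch();
        fieldCache.get(topLevel);  // e.g. faceting at the top level
        fieldCache.get(segment1);  // e.g. per-segment sorting
        fieldCache.get(segment2);
        return fieldCache.cachedValues();
    }

    public static void main(String[] args) {
        System.out.println(demo()); // 2000 values cached for 1000 documents
    }
}
```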

> fix FieldCache usage in Solr
> 
>
> Key: SOLR-1111
> URL: https://issues.apache.org/jira/browse/SOLR-1111
> Project: Solr
>  Issue Type: Bug
>Reporter: Yonik Seeley
> Fix For: 1.4
>
>
> Recent changes in Lucene have altered how the FieldCache is used and as-is 
> could lead to previously working Solr installations blowing up when they 
> upgrade to 1.4.  We need to fix this, or document the effects of these changes.




SOLR-1106 - Custom Admin Action handler

2009-04-13 Thread Kay Kay
For one of our projects, we need custom admin monitoring hooks that get
access to multiple cores for a given Solr web app (through the CoreContainer
interface).

There are common admin handler commands with the actions register / swap /
load etc. that seem to be available by default.

I have submitted a patch to add custom admin handlers for custom
actions (it also refactors the existing action handlers that are
available by default).

This would be useful for extending the handlers that need access to multiple
cores.  Just curious whether this is something that could be looked into.
Thanks.


SOLR-1107 Lifecycle management for Solr

2009-04-13 Thread Kay Kay
In CoreContainer, before the individual SolrCores are instantiated,
we need to perform book-keeping / loading that could be used by the request
handlers and search components in the SolrCores.
Currently there seems to be no way to provide a hook / lifecycle handler
to CoreContainer to do this.

I have submitted a patch (SOLR-1107, as in the subject) to do this. Can
somebody help review it?


Re: SOLR-1106 - Custom Admin Action handler

2009-04-13 Thread Noble Paul നോബിള്‍ नोब्ळ्
Hi Kay,

The idea of one handler per command looks like overkill. How about
having protected methods for all the known commands and a
separate method invokeCommand() which can choose to implement any
extra commands if need be? This way the changes needed would be
minimal.
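
Roughly, the shape I have in mind could look like the sketch below. The class names, the returned strings and the MONITOR action are made up for illustration; only the invokeCommand() name and the idea of protected per-command methods come from the suggestion above.

```java
import java.util.Locale;

public class AdminHandlerSketch {

    /** Hypothetical base handler: one protected method per known command. */
    static class BaseAdminHandler {

        public final String handle(String action) {
            switch (action.toUpperCase(Locale.ROOT)) {
                case "SWAP":
                    return handleSwap();
                case "LOAD":
                    return handleLoad();
                default:
                    // Anything unknown is passed to the extension hook.
                    return invokeCommand(action);
            }
        }

        protected String handleSwap() { return "swapped"; }

        protected String handleLoad() { return "loaded"; }

        /** Extension hook; subclasses override it to add extra commands. */
        protected String invokeCommand(String action) {
            throw new IllegalArgumentException("Unsupported action: " + action);
        }
    }

    /** Hypothetical subclass adding one custom action. */
    static class MonitoringAdminHandler extends BaseAdminHandler {
        @Override
        protected String invokeCommand(String action) {
            if ("MONITOR".equalsIgnoreCase(action)) {
                return "monitoring all cores";
            }
            return super.invokeCommand(action);
        }
    }

    public static void main(String[] args) {
        BaseAdminHandler handler = new MonitoringAdminHandler();
        System.out.println(handler.handle("swap"));
        System.out.println(handler.handle("monitor"));
    }
}
```

The known commands stay in one class, and a subclass only has to override one method to add its own.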

On Mon, Apr 13, 2009 at 8:53 PM, Kay Kay  wrote:
> For one of our projects - we need custom admin monitoring hooks that gets
> access to multiple cores for a given solr web app (through the CoreContainer
> interface).
>
> There are common admin handler commands with the actions - register / swap /
> load etc. that seem to be available by default.
>
> I have submitted a patch to add custom admin handlers , against custom
> actions  ( that also refactors the existing action handlers that are
> available by default as well ).
>
> This would be useful to extend the handlers that need access to multiple
> cores.  Just curious if this is something that could be looked into .
> Thanks.
>



-- 
--Noble Paul


Re: SOLR-1106 - Custom Admin Action handler

2009-04-13 Thread Kay Kay
These custom action handlers need not reside in Solr. Hence I needed a
hook (listener) that they can register themselves with and be loaded by
the SolrResourceLoader (./lib/*.jar).  Also, I believe the default
handlers are very useful and necessary, and hence ported them to
the listener for consistency.

Also, if we have a protected method called invokeCommand(), how do we
inject that type as the admin handler (as opposed to CoreAdminHandler)?
Right now the type information seems to be hardcoded in CoreContainer.

  //  Multicore self related methods ---
  /**
   * Creates a CoreAdminHandler for this MultiCore.
   * @return a CoreAdminHandler
   */
  protected CoreAdminHandler createMultiCoreHandler() {
    return new CoreAdminHandler() {
      @Override
      public CoreContainer getCoreContainer() {
        return CoreContainer.this;
      }
    };
  }


2009/4/13 Noble Paul നോബിള്‍ नोब्ळ् 

> Hi Kay,
>
> The idea of one handler per command looks like an overkill. How about
> having a protected methods for all the known commands and have a
> separate method invokeCommand() which can choose to implement any
> extra commands if need be. This way the changes needed would be
> minimal.
>
> On Mon, Apr 13, 2009 at 8:53 PM, Kay Kay  wrote:
> > For one of our projects - we need custom admin monitoring hooks that gets
> > access to multiple cores for a given solr web app (through the
> CoreContainer
> > interface).
> >
> > There are common admin handler commands with the actions - register /
> swap /
> > load etc. that seem to be available by default.
> >
> > I have submitted a patch to add custom admin handlers , against custom
> > actions  ( that also refactors the existing action handlers that are
> > available by default as well ).
> >
> > This would be useful to extend the handlers that need access to multiple
> > cores.  Just curious if this is something that could be looked into .
> > Thanks.
> >
>
>
>
> --
> --Noble Paul
>


[jira] Commented: (SOLR-773) Incorporate Local Lucene/Solr

2009-04-13 Thread Grant Ingersoll (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-773?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12698429#action_12698429
 ] 

Grant Ingersoll commented on SOLR-773:
--

This latest patch doesn't compile b/c it is missing the SpatialParams class.

> Incorporate Local Lucene/Solr
> -
>
> Key: SOLR-773
> URL: https://issues.apache.org/jira/browse/SOLR-773
> Project: Solr
>  Issue Type: New Feature
>Reporter: Grant Ingersoll
>Priority: Minor
> Attachments: SOLR-773-local-lucene.patch, 
> SOLR-773-local-lucene.patch, SOLR-773-local-lucene.patch, 
> SOLR-773-local-lucene.patch, spatial-solr.tar.gz
>
>
> Local Lucene has been donated to the Lucene project.  It has some Solr 
> components, but we should evaluate how best to incorporate it into Solr.
> See http://lucene.markmail.org/message/orzro22sqdj3wows?q=LocalLucene




parsing bool type in solrconfig.xml

2009-04-13 Thread Koji Sekiguchi
I wasn't aware of this until now...

In solrconfig.xml, if I want to set hl=on by default, <str name="hl">on</str> or
<str name="hl">true</str> will give me what I want. But if I set <bool name="hl">on</bool>,
hl will be false. If I set <bool name="hl">true</bool>, hl will be true.
When posting a search request, any of hl=true|on|yes is fine.

The reason is that bool-typed arguments in solrconfig.xml are parsed by
Boolean.valueOf() in DOMUtil:

  :
  } else if ("bool".equals(type)) {
    val = Boolean.valueOf(getText(nd));
  :

On the other hand, at request time, HighlightComponent takes the hl
parameter as a string and parses it with parseBool().

Should we accept not only <bool name="hl">true</bool>, but also <bool name="hl">on</bool>
and <bool name="hl">yes</bool>?
I think it would be easy to do by using parseBool() instead of Boolean.valueOf() in
DOMUtil.
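
A small sketch of the difference. Boolean.valueOf() is the real JDK method; lenientParseBool() is a hypothetical stand-in written for this example and is not Solr's parseBool() implementation.

```java
import java.util.Locale;

public class BoolParsingSketch {

    /**
     * Illustrative lenient parser (an assumption, not Solr code):
     * treats "true", "on" and "yes" as true, everything else as false.
     */
    static boolean lenientParseBool(String value) {
        if (value == null) {
            return false;
        }
        switch (value.trim().toLowerCase(Locale.ROOT)) {
            case "true":
            case "on":
            case "yes":
                return true;
            default:
                return false;
        }
    }

    public static void main(String[] args) {
        for (String value : new String[] {"true", "on", "yes", "off"}) {
            // Boolean.valueOf only recognises "true" (ignoring case).
            System.out.println(value
                + " -> Boolean.valueOf=" + Boolean.valueOf(value)
                + ", lenient=" + lenientParseBool(value));
        }
    }
}
```

Boolean.valueOf() returns true only for the string "true" (ignoring case), which is why a bool-typed "on" in solrconfig.xml ends up as false.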

Thoughts?

Koji



Make ant example faster

2009-04-13 Thread Shalin Shekhar Mangar
Hello,

As part of SOLR-934, I'd like to set up an example for indexing mailboxes
with the existing example/example-DIH demo. I see that ant example has a
dependency on example-contrib. Do we want that? I vaguely remember
Yonik complaining about the time ant example takes.

To set up the MailEntityProcessor, I'd have to copy the mail, activation
and tika jars to example-DIH/solr/mail/lib, which will make it even slower.
How about we remove the dependency on example-contrib and keep it as an
independent target?

-- 
Regards,
Shalin Shekhar Mangar.


[jira] Updated: (SOLR-773) Incorporate Local Lucene/Solr

2009-04-13 Thread Ryan McKinley (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-773?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ryan McKinley updated SOLR-773:
---

Attachment: SOLR-773-local-lucene.patch

dooh -- here is a patch that includes SpatialParams

I just ran 'svn up' and 'ant test', and a bunch of solrj things fail -- I can't 
look into it just now, but I'll post anyway.

- - - - -

Note, this patch has a bunch of weirdness to try to avoid a memory error with 
custom sorting in lucene.  The new field options in LUCENE-1483 should avoid 
this problem, but LocalLucene must be refactored to use the new sorting classes 
first.

> Incorporate Local Lucene/Solr
> -
>
> Key: SOLR-773
> URL: https://issues.apache.org/jira/browse/SOLR-773
> Project: Solr
>  Issue Type: New Feature
>Reporter: Grant Ingersoll
>Priority: Minor
> Attachments: SOLR-773-local-lucene.patch, 
> SOLR-773-local-lucene.patch, SOLR-773-local-lucene.patch, 
> SOLR-773-local-lucene.patch, SOLR-773-local-lucene.patch, spatial-solr.tar.gz



[jira] Created: (SOLR-1112) EmbeddedSolrServer is not being packaged with Solrj jar

2009-04-13 Thread Shalin Shekhar Mangar (JIRA)
EmbeddedSolrServer is not being packaged with Solrj jar
---

 Key: SOLR-1112
 URL: https://issues.apache.org/jira/browse/SOLR-1112
 Project: Solr
  Issue Type: Bug
  Components: clients - java
Reporter: Shalin Shekhar Mangar
 Fix For: 1.4


EmbeddedSolrServer is not being packaged with Solrj jar

http://markmail.org/thread/6wbthe5w2a2nwlc6




Re: EmbeddedSolrServer not in solrj jar?

2009-04-13 Thread Shalin Shekhar Mangar
On Mon, Apr 6, 2009 at 3:44 AM, Shalin Shekhar Mangar <
shalinman...@gmail.com> wrote:

> I was wondering why EmbeddedSolrServer is in src/webapps/src rather than in
> src/solrj? I think it accidentally got moved in there when solrj was brought
> into 'src' from the 'client' directory? I also noticed that
> EmbeddedSolrServer is not being packaged into the solrj jar when ant dist is
> run.
>

OK. I opened SOLR-1112 to track this.

-- 
Regards,
Shalin Shekhar Mangar.


[jira] Commented: (SOLR-1112) EmbeddedSolrServer is not being packaged with Solrj jar

2009-04-13 Thread Ryan McKinley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1112?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12698472#action_12698472
 ] 

Ryan McKinley commented on SOLR-1112:
-

EmbeddedSolrServer can not be in the solrj package since it would require the 
rest of solr to also be installed.

The solrj jar is limited to stuff that is just needed for the client.

To run embedded solr, you need client+server.



> EmbeddedSolrServer is not being packaged with Solrj jar
> ---
>
> Key: SOLR-1112
> URL: https://issues.apache.org/jira/browse/SOLR-1112
> Project: Solr
>  Issue Type: Bug
>  Components: clients - java
>Reporter: Shalin Shekhar Mangar
> Fix For: 1.4
>
>
> EmbeddedSolrServer is not being packaged with Solrj jar
> http://markmail.org/thread/6wbthe5w2a2nwlc6

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



Re: Contributing Translations

2009-04-13 Thread Shalin Shekhar Mangar
On Fri, Apr 10, 2009 at 11:05 PM, Green Crescent Translations <
ad...@greentranslations.com> wrote:

>
> Sure, we're actually using Solr for a project we're working on and thought
> it would be nice to lend a hand.  Great software.  Do you have any current
> translations or would this be something totally new?  I'd guess the place to
> start would be Spanish or maybe Russian.  Chinese, Japanese and French are
> also big.  I'd guess Solr can generate a whole new audience if translated
> into the right languages.
>
> Do you have an essential text that we could take a look at to start with?
>  Or maybe we could start with the website?
>
>
The website is a good start, especially the Solr tutorial page. Then there
are a couple of important wiki pages that would be great. For example,

   1. Tomcat installation page - http://wiki.apache.org/solr/SolrTomcat
   2. Jetty - http://wiki.apache.org/solr/SolrJetty
   3. Adding documents in XML -
   http://wiki.apache.org/solr/UpdateXmlMessages
   4. DataImportHandler (this is a huge one, not everything may be needed)
   -- http://wiki.apache.org/solr/DataImportHandler
   5. Common Query parameters -
   http://wiki.apache.org/solr/CommonQueryParameters
   6. Simple facet params -
   http://wiki.apache.org/solr/SimpleFacetParameters
   7. Using the Solrj client - http://wiki.apache.org/solr/Solrj

We can also convert the comments in the example solrconfig.xml and
schema.xml to other languages as they contain very comprehensive and perhaps
the most up-to-date documentation.

I know this is a lot so please do not feel like everything here is needed :)

We should be able to pare things down quite a bit. This is also a good
opportunity to organize the documentation better and expand the tutorial to
make it more comprehensive. If we can do that, translating that one tutorial
should be good enough.

-- 
Regards,
Shalin Shekhar Mangar.


[jira] Commented: (SOLR-1112) EmbeddedSolrServer is not being packaged with Solrj jar

2009-04-13 Thread Shalin Shekhar Mangar (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1112?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12698477#action_12698477
 ] 

Shalin Shekhar Mangar commented on SOLR-1112:
-

bq. The solrj jar is limited to stuff that is just needed for the client.

Makes sense. But isn't this something that was changed recently? I do remember 
the 1.3 solrj jar containing this class. Although, one had to add the core jar 
to make it work.

> EmbeddedSolrServer is not being packaged with Solrj jar
> ---
>
> Key: SOLR-1112
> URL: https://issues.apache.org/jira/browse/SOLR-1112
> Project: Solr
>  Issue Type: Bug
>  Components: clients - java
>Reporter: Shalin Shekhar Mangar
> Fix For: 1.4
>
>
> EmbeddedSolrServer is not being packaged with Solrj jar
> http://markmail.org/thread/6wbthe5w2a2nwlc6

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (SOLR-773) Incorporate Local Lucene/Solr

2009-04-13 Thread patrick o'leary (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-773?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12698496#action_12698496
 ] 

patrick o'leary commented on SOLR-773:
--

Thanks Ryan. I've also updated local / spatial lucene to use the new 
FieldComparatorSource with LUCENE-1588,
but haven't had a chance to test it in Solr yet.

> Incorporate Local Lucene/Solr
> -
>
> Key: SOLR-773
> URL: https://issues.apache.org/jira/browse/SOLR-773
> Project: Solr
>  Issue Type: New Feature
>Reporter: Grant Ingersoll
>Priority: Minor
> Attachments: SOLR-773-local-lucene.patch, 
> SOLR-773-local-lucene.patch, SOLR-773-local-lucene.patch, 
> SOLR-773-local-lucene.patch, SOLR-773-local-lucene.patch, spatial-solr.tar.gz
>
>
> Local Lucene has been donated to the Lucene project.  It has some Solr 
> components, but we should evaluate how best to incorporate it into Solr.
> See http://lucene.markmail.org/message/orzro22sqdj3wows?q=LocalLucene

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Resolved: (SOLR-804) include lucene misc jar in solr distro

2009-04-13 Thread Grant Ingersoll (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-804?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Grant Ingersoll resolved SOLR-804.
--

Resolution: Fixed

Committed revision 764580.

Added lucene-misc-2.9-dev.jar from rev 764281 which should match the Lucene 
version on trunk.

> include lucene misc jar in solr distro
> --
>
> Key: SOLR-804
> URL: https://issues.apache.org/jira/browse/SOLR-804
> Project: Solr
>  Issue Type: Wish
>Affects Versions: 1.3
> Environment: all
>Reporter: solrize
>Assignee: Grant Ingersoll
>Priority: Minor
> Fix For: 1.4
>
>
> It would be useful to have the lucene misc jar file included with solr.  My 
> immediate goal is to build several solr indexes in parallel on separate 
> servers, then run the index merge utility at the end to combine them into a 
> single index.  Erik H suggested I post an issue requesting including the misc 
> jar with solr.  Thanks

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



Re: Make ant example faster

2009-04-13 Thread Shalin Shekhar Mangar
On Tue, Apr 14, 2009 at 12:33 AM, Grant Ingersoll wrote:

>
> Instead of a kitchen-sink example directory, we "revert" it back to being
> the tutorial example.  It still can get built by ant example, but ultimately
> we "deprecate" it (more later).
>
> Then, as a replacement, we create a directory containing what I would call
> Solr Templates, which contain subdirectories named appropriately for the
> kind of example.  Rather than explain, I'll give an example:
>
> The templates directory would contain the configurations (i.e. schema.xml
> and solrconfig.xml) and any sample docs (but not the libraries) for:
>tutorial - The current tutorial example
>dih - The DIH example
>extraction - Solr Cell example
>geo - geo spatial example (once 773 is committed)
>clustering - once SOLR-769 is committed
>simple - A barebones schema and config (mainly used for
> bootstrapping a new project for experienced users)
>exploratory - Basically, the same as simple, but the schema defines
> a single dynamic field -  Think of Hoss's Solr Out of the Box talk from
> ApacheCon whereby you want to quickly explore a new data set without having
> to define a schema.
>[other] -
>
> Note, the templates directory could also live under each contrib, but it
> isn't necessarily a 1-1 thing (e.g. simple and exploratory templates are not
> contrib-specific).
>
> Then, typing "ant example" would copy the necessary tutorial stuff to the
> example directory (which still contains the Jetty stuff) but would not have
> to recurse into any of the contribs.
>
> Typing "ant example -Dtype=clustering"  would copy the clustering
> requirements, plus go to contrib/clustering (or whatever) and get the
> appropriate material such that the example directory.  Similarly for any of
> the other "templates"
>

Isn't this the same as the current setup with the name of the directory
changed and different ant targets to set them up? The new ant target will
set up the default solr instance to be 'extraction' or 'dih' or 'clustering'
and avoid the need to type -Dsolr.solr.home.


>
> Additionally, you could also define -DoutputDir such that it would take and
> copy the whole example directory (including the appropriate type) to some
> output dir.  This would allow one to quickly bootstrap a Solr project
> without having to do a lot of schema editing.
>

I like this idea. I have myself needed to do this a couple of times.

-- 
Regards,
Shalin Shekhar Mangar.


Re: Contributing Translations

2009-04-13 Thread Green Crescent Translations

Hi Shalin,

This looks pretty doable actually.  We'll do them all for you.  Would 
you like us to submit the page to you in HTML or would you like it in 
MS Word or Excel so that you can input the translations 
yourselves?  Either way is fine for us.


In fact, if you wouldn't mind putting a little acknowledgement at the 
bottom of all the pages that we translate, we'll do these pages in 
numerous languages for you.  How does that sound?  We can start off 
in Spanish, Russian, Chinese, and Japanese if that sounds good.


What I had in mind is something like this (for the Spanish 
translation, for example)...


last edited 2009-03-01 16:36:09 by YonikSeeley

Spanish translation by Green Crescent Translations
(http://www.greentranslations.com/spanish-translation).


Hopefully, this acknowledgement over time will help us recoup the 
cost of doing the translations by generating some paid work for us so 
I think this could be a win-win.


Let me know if that's ok for you and we'll get started on it.

Thanks,

Jonathan




At 01:23 PM 4/13/2009, you wrote:

On Fri, Apr 10, 2009 at 11:05 PM, Green Crescent Translations <
ad...@greentranslations.com> wrote:

>
> Sure, we're actually using Solr for a project we're working on and thought
> it would be nice to lend a hand.  Great software.  Do you have any current
> translations or would this be something totally new?  I'd guess the place to
> start would be Spanish or maybe Russian.  Chinese, Japanese and French are
> also big.  I'd guess Solr can generate a whole new audience if translated
> into the right languages.
>
> Do you have an essential text that we could take a look at to start with?
>  Or maybe we could start with the website?
>
>
The website is a good start, especially the Solr tutorial page. Then there
are a couple of important wiki pages that would be great. For example,

   1. Tomcat installation page - http://wiki.apache.org/solr/SolrTomcat
   2. Jetty - http://wiki.apache.org/solr/SolrJetty
   3. Adding documents in XML -
   http://wiki.apache.org/solr/UpdateXmlMessages
   4. DataImportHandler (this is a huge one, not everything may be needed)
   -- http://wiki.apache.org/solr/DataImportHandler
   5. Common Query parameters -
   http://wiki.apache.org/solr/CommonQueryParameters
   6. Simple facet params -
   http://wiki.apache.org/solr/SimpleFacetParameters
   7. Using the Solrj client - http://wiki.apache.org/solr/Solrj

We can also convert the comments in the example solrconfig.xml and
schema.xml to other languages as they contain very comprehensive and perhaps
the most up-to-date documentation.

I know this is a lot so please do not feel like everything here is needed :)

We should be able to pare things down quite a bit. This is also a good
opportunity to organize the documentation better and expand the tutorial to
make it more comprehensive. If we can do that, translating that one tutorial
should be good enough.

--
Regards,
Shalin Shekhar Mangar.


Jonathan W. Fabian
Project Manager
(888) 963 0006 - U.S. Toll Free
(702) 966 5160 - International
(702) 966 5130 - Fax
www.greencrescent.com


Re: Contributing Translations

2009-04-13 Thread Grant Ingersoll
First off, let me say that I would love to see translations of Solr  
docs.


My main concern is one of maintainability.  If we agree to commit  
translations, then we as committers need to be able to maintain them  
as well.  I am not sure which is worse, no translations or out of date  
translations.


Say, for example, that I make a patch that changes how the spell  
checker works in Solr.  As an English speaker, I can easily update the  
English docs as part of my patch, but I wouldn't even know where to  
begin with, say, Swahili (picking a language I feel safe saying that  
none of our committers speak for an example, not b/c anyone is  
proposing a Swahili translation).  So, now, it is up to the community  
to fix that documentation.  Which is, of course, fine, except I'd  
venture to say most committers wouldn't even be in the position to  
know whether the patch is good, so we'd have to take it on faith.   
Committing on faith isn't usually a good thing.


We should look into how other Apache projects handle it before  
committing to saying we are going to support other languages.  I can  
ask over on commun...@apache.org if people would like.


On Apr 9, 2009, at 10:40 PM, Green Crescent Translations wrote:


Hello,

I'm a project manager for Green Crescent Translations and I'm always  
looking to assist the open source community by providing  
translations of web sites, manuals, user interfaces and such.  If  
you're interested, please let us know.  We'd be happy to translate  
your web site documentation into needed languages.  Just let me know  
which languages and what texts are essential and we'd be happy to  
help.


Many thanks,

Jonathan








[jira] Updated: (SOLR-804) include lucene misc jar in solr distro

2009-04-13 Thread Grant Ingersoll (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-804?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Grant Ingersoll updated SOLR-804:
-

Fix Version/s: (was: 1.5)
   1.4

> include lucene misc jar in solr distro
> --
>
> Key: SOLR-804
> URL: https://issues.apache.org/jira/browse/SOLR-804
> Project: Solr
>  Issue Type: Wish
>Affects Versions: 1.3
> Environment: all
>Reporter: solrize
>Assignee: Grant Ingersoll
>Priority: Minor
> Fix For: 1.4
>
>
> It would be useful to have the lucene misc jar file included with solr.  My 
> immediate goal is to build several solr indexes in parallel on separate 
> servers, then run the index merge utility at the end to combine them into a 
> single index.  Erik H suggested I post an issue requesting including the misc 
> jar with solr.  Thanks

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



Re: Make ant example faster

2009-04-13 Thread Grant Ingersoll
Funny you should mention it, b/c I had an idea the other day of how to  
speed all this up, plus will satisfy one of my other annoyances with  
the example and make it easier for people to get started (I think).   
So, here goes:


Instead of a kitchen-sink example directory, we "revert" it back to  
being the tutorial example.  It still can get built by ant example,  
but ultimately we "deprecate" it (more later).


Then, as a replacement, we create a directory containing what I would  
call Solr Templates, which contain subdirectories named appropriately  
for the kind of example.  Rather than explain, I'll give an example:


The templates directory would contain the configurations (i.e.  
schema.xml and solrconfig.xml) and any sample docs (but not the  
libraries) for:

    tutorial - The current tutorial example
    dih - The DIH example
    extraction - Solr Cell example
    geo - geo spatial example (once 773 is committed)
    clustering - once SOLR-769 is committed
    simple - A barebones schema and config (mainly used for bootstrapping
      a new project for experienced users)
    exploratory - Basically, the same as simple, but the schema defines a
      single dynamic field -  Think of Hoss's Solr Out of the Box talk from
      ApacheCon whereby you want to quickly explore a new data set without
      having to define a schema.
    [other] -

Note, the templates directory could also live under each contrib, but  
it isn't necessarily a 1-1 thing (e.g. simple and exploratory  
templates are not contrib-specific).


Then, typing "ant example" would copy the necessary tutorial stuff to  
the example directory (which still contains the Jetty stuff) but would  
not have to recurse into any of the contribs.


Typing "ant example -Dtype=clustering"  would copy the clustering  
requirements, plus go to contrib/clustering (or whatever) and get the  
appropriate material such that the example directory.  Similarly for  
any of the other "templates"


Additionally, you could also define -DoutputDir such that it would  
take and copy the whole example directory (including the appropriate  
type) to some output dir.  This would allow one to quickly bootstrap a  
Solr project without having to do a lot of schema editing.


WDYT?

-Grant




On Apr 13, 2009, at 1:56 PM, Shalin Shekhar Mangar wrote:


> Hello,
>
> As part of SOLR-934, I'd like to set up an example for indexing mail boxes
> with the existing example/example-DIH demo. I see that ant example has a
> dependency on example-contrib. Do we want to do that? I vaguely remember
> Yonik complaining about the time ant example takes.
>
> For setting up the MailEntityProcessor, I'd have to copy mail, activation
> and tika jars to example-DIH/solr/mail/lib, which will make it extra slow.
> How about we remove the dependency to example-contrib and keep it as an
> independent target?
>
> --
> Regards,
> Shalin Shekhar Mangar.





[jira] Updated: (SOLR-934) Enable importing of mails into a solr index through DIH.

2009-04-13 Thread Shalin Shekhar Mangar (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-934?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shalin Shekhar Mangar updated SOLR-934:
---

Attachment: SOLR-934.patch

Updated NOTICE.txt and LICENSE.txt with the license information given at the 
following:
* 
http://repo2.maven.org/maven2/javax/activation/activation/1.1/activation-1.1.pom
* http://repo2.maven.org/maven2/javax/mail/mail/1.4.1/mail-1.4.1.pom

I'll commit this shortly.

> Enable importing of mails into a solr index through DIH.
> 
>
> Key: SOLR-934
> URL: https://issues.apache.org/jira/browse/SOLR-934
> Project: Solr
>  Issue Type: New Feature
>  Components: contrib - DataImportHandler
>Affects Versions: 1.4
>Reporter: Preetam Rao
>Assignee: Shalin Shekhar Mangar
> Fix For: 1.4
>
> Attachments: SOLR-934.patch, SOLR-934.patch, SOLR-934.patch, 
> SOLR-934.patch, SOLR-934.patch, SOLR-934.patch
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> Enable importing of mails into solr through DIH. Take one or more mailbox 
> credentials, download and index their content along with the content from 
> attachments. The folders to fetch can be made configurable based on various 
> criteria. Apache Tika is used for extracting content from different kinds of 
> attachments. JavaMail is used for mail box related operations like fetching 
> mails, filtering them etc.
> The basic configuration for one mail box is as below:
> {code:xml}
> 
> password="something" host="imap.gmail.com" protocol="imaps"/>
> 
> {code}
> The below is the list of all configuration available:
> {color:green}Required{color}
> -
> *user* 
> *pwd* 
> *protocol*  (only "imaps" supported now)
> *host* 
> {color:green}Optional{color}
> -
> *folders* - comma separated list of folders. 
> If not specified, default folder is used. Nested folders can be specified 
> like a/b/c
> *recurse* - index subfolders. Defaults to true.
> *exclude* - comma separated list of patterns. 
> *include* - comma separated list of patterns.
> *batchSize* - mails to fetch at once in a given folder. 
> Only headers can be prefetched in Javamail IMAP.
> *readTimeout* - defaults to 60000ms
> *conectTimeout* - defaults to 30000ms
> *fetchSize* - IMAP config. 32KB default
> *fetchMailsSince* -
> date/time in "yyyy-MM-dd HH:mm:ss" format, mails received after which will be 
> fetched. Useful for delta import.
> *customFilter* - class name.  
> {code}
> import javax.mail.Folder;
> import javax.mail.search.SearchTerm;
> // customFilter names a class that implements this interface,
> // which is nested in MailEntityProcessor:
> public interface CustomFilter {
>   SearchTerm getCustomSearch(Folder folder);
> }
> {code}
> *processAttachement* - defaults to true
> The below are the indexed fields.
> {code}
>   // Fields To Index
>   // single valued
>   private static final String SUBJECT = "subject";
>   private static final String FROM = "from";
>   private static final String SENT_DATE = "sentDate";
>   private static final String XMAILER = "xMailer";
>   // multi valued
>   private static final String TO_CC_BCC = "allTo";
>   private static final String FLAGS = "flags";
>   private static final String CONTENT = "content";
>   private static final String ATTACHMENT = "attachement";
>   private static final String ATTACHMENT_NAMES = "attachementNames";
>   // flag values
>   private static final String FLAG_ANSWERED = "answered";
>   private static final String FLAG_DELETED = "deleted";
>   private static final String FLAG_DRAFT = "draft";
>   private static final String FLAG_FLAGGED = "flagged";
>   private static final String FLAG_RECENT = "recent";
>   private static final String FLAG_SEEN = "seen";
> {code}

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Resolved: (SOLR-934) Enable importing of mails into a solr index through DIH.

2009-04-13 Thread Shalin Shekhar Mangar (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-934?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shalin Shekhar Mangar resolved SOLR-934.


Resolution: Fixed

Committed revision 764601.

Thanks Preetam!

> Enable importing of mails into a solr index through DIH.
> 
>
> Key: SOLR-934
> URL: https://issues.apache.org/jira/browse/SOLR-934
> Project: Solr
>  Issue Type: New Feature
>  Components: contrib - DataImportHandler
>Affects Versions: 1.4
>Reporter: Preetam Rao
>Assignee: Shalin Shekhar Mangar
> Fix For: 1.4
>
> Attachments: SOLR-934.patch, SOLR-934.patch, SOLR-934.patch, 
> SOLR-934.patch, SOLR-934.patch, SOLR-934.patch
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> Enable importing of mails into solr through DIH. Take one or more mailbox 
> credentials, download and index their content along with the content from 
> attachments. The folders to fetch can be made configurable based on various 
> criteria. Apache Tika is used for extracting content from different kinds of 
> attachments. JavaMail is used for mail box related operations like fetching 
> mails, filtering them etc.
> The basic configuration for one mail box is as below:
> {code:xml}
> 
> password="something" host="imap.gmail.com" protocol="imaps"/>
> 
> {code}
> The below is the list of all configuration available:
> {color:green}Required{color}
> -
> *user* 
> *pwd* 
> *protocol*  (only "imaps" supported now)
> *host* 
> {color:green}Optional{color}
> -
> *folders* - comma separated list of folders. 
> If not specified, default folder is used. Nested folders can be specified 
> like a/b/c
> *recurse* - index subfolders. Defaults to true.
> *exclude* - comma separated list of patterns. 
> *include* - comma separated list of patterns.
> *batchSize* - mails to fetch at once in a given folder. 
> Only headers can be prefetched in Javamail IMAP.
> *readTimeout* - defaults to 60000ms
> *conectTimeout* - defaults to 30000ms
> *fetchSize* - IMAP config. 32KB default
> *fetchMailsSince* -
> date/time in "yyyy-MM-dd HH:mm:ss" format, mails received after which will be 
> fetched. Useful for delta import.
> *customFilter* - class name.  
> {code}
> import javax.mail.Folder;
> import javax.mail.search.SearchTerm;
> // customFilter names a class that implements this interface,
> // which is nested in MailEntityProcessor:
> public interface CustomFilter {
>   SearchTerm getCustomSearch(Folder folder);
> }
> {code}
> *processAttachement* - defaults to true
> The below are the indexed fields.
> {code}
>   // Fields To Index
>   // single valued
>   private static final String SUBJECT = "subject";
>   private static final String FROM = "from";
>   private static final String SENT_DATE = "sentDate";
>   private static final String XMAILER = "xMailer";
>   // multi valued
>   private static final String TO_CC_BCC = "allTo";
>   private static final String FLAGS = "flags";
>   private static final String CONTENT = "content";
>   private static final String ATTACHMENT = "attachement";
>   private static final String ATTACHMENT_NAMES = "attachementNames";
>   // flag values
>   private static final String FLAG_ANSWERED = "answered";
>   private static final String FLAG_DELETED = "deleted";
>   private static final String FLAG_DRAFT = "draft";
>   private static final String FLAG_FLAGGED = "flagged";
>   private static final String FLAG_RECENT = "recent";
>   private static final String FLAG_SEEN = "seen";
> {code}

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Resolved: (SOLR-1112) EmbeddedSolrServer is not being packaged with Solrj jar

2009-04-13 Thread Ryan McKinley (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1112?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ryan McKinley resolved SOLR-1112.
-

Resolution: Won't Fix

> EmbeddedSolrServer is not being packaged with Solrj jar
> ---
>
> Key: SOLR-1112
> URL: https://issues.apache.org/jira/browse/SOLR-1112
> Project: Solr
>  Issue Type: Bug
>  Components: clients - java
>Reporter: Shalin Shekhar Mangar
> Fix For: 1.4
>
>
> EmbeddedSolrServer is not being packaged with Solrj jar
> http://markmail.org/thread/6wbthe5w2a2nwlc6

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (SOLR-1112) EmbeddedSolrServer is not being packaged with Solrj jar

2009-04-13 Thread Ryan McKinley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1112?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12698549#action_12698549
 ] 

Ryan McKinley commented on SOLR-1112:
-

In 1.3 I think it is in the solrj jar file.  We moved it out when we tried to 
clean up the dependency graph so the code/jars reflect the real dependencies.  
(1.3 has a pretty strange dependency graph if you look carefully!)

I'll mark this issue as closed -- unless you think it is a problem



> EmbeddedSolrServer is not being packaged with Solrj jar
> ---
>
> Key: SOLR-1112
> URL: https://issues.apache.org/jira/browse/SOLR-1112
> Project: Solr
>  Issue Type: Bug
>  Components: clients - java
>Reporter: Shalin Shekhar Mangar
> Fix For: 1.4
>
>
> EmbeddedSolrServer is not being packaged with Solrj jar
> http://markmail.org/thread/6wbthe5w2a2nwlc6

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Issue Comment Edited: (SOLR-1111) fix FieldCache usage in Solr

2009-04-13 Thread Yonik Seeley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1111?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12698399#action_12698399
 ] 

Yonik Seeley edited comment on SOLR-1111 at 4/13/09 2:50 PM:
-

The major issue is that Lucene now creates scorers per-segment, and if you use 
Lucene's searcher.search(...,sort) then the FieldCache populations will also be 
per-segment.

The biggest issue:  If FieldCache gets populated at both the top-level reader 
and per-segment, memory usage doubles (as does un-inversion time).
 - Faceting on single-valued fields uses the FieldCache at the top-level (and 
would be
   - This is non-trivial to change...  if we started counting per-segment, 
counts would somehow have to be merged across segments.
 - Sorting in Solr currently uses the FieldCache at the top level
   - This can't easily be changed to use Lucene's searcher.search(...,sort) 
since we are using a hit collector (which can be wrapped in a time limited 
collector).
 - Distributed search uses the top-level FieldCache to retrieve sort field 
values.
 - FunctionQuery now derives values at the segment level
   - This also applies to the function range query

Another issue for function query is the use of ord()... it won't be valid in 
multi-segment indexes if evaluated at the segment level.
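
To make the merging concern above concrete, here is a minimal Java sketch. It is not Solr's faceting code: the class name SegmentCountMerger and the map-based representation are assumptions made purely for illustration. It sums per-segment term counts by term text, which is the kind of merge step that per-segment facet counting would need.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical helper, for illustration only; not part of Solr or Lucene. */
public final class SegmentCountMerger {
    private SegmentCountMerger() {}

    /**
     * Sums per-segment term counts into a single map keyed by term text.
     * Each input map stands for the facet counts computed on one segment.
     */
    public static Map<String, Integer> merge(List<Map<String, Integer>> perSegment) {
        Map<String, Integer> total = new TreeMap<>();
        for (Map<String, Integer> segment : perSegment) {
            for (Map.Entry<String, Integer> e : segment.entrySet()) {
                // Add this segment's count to the running total for the term.
                total.merge(e.getKey(), e.getValue(), Integer::sum);
            }
        }
        return total;
    }
}
```

The sketch keys on the term text rather than on an ordinal because an ordinal is only meaningful within the segment that produced it, which is the same reason ord() is a problem when evaluated per segment.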

  was (Author: ysee...@gmail.com):
The major issue is that Lucene now creates scorers per-segment, and if you 
use Lucene's searcher.search(...,sort) then the FieldCache populations will 
also be per-segment.

The biggest issue:  If FieldCache gets populated at both the top-level reader 
and per-segment, memory usage doubles (as does un-inversion time).
 - Faceting on single-valued fields uses the FieldCache at the top-level (and 
would be
   - This is non-trivial to change...  if we started counting per-segment, 
counts would somehow have to be merged across segments.
 - Sorting in Solr currently uses the FieldCache at the top level
   - This can't easily be changed to use Lucene's searcher.search(...,sort) 
since we are using a hit collector (which can be wrapped in a time limited 
collector).
 - Distributed search uses the top-level FieldCache to retrieve sort field 
values.
 - FunctionQuery now derives values at the segment level
   - This also applies to the function range query
  
> fix FieldCache usage in Solr
> 
>
> Key: SOLR-1111
> URL: https://issues.apache.org/jira/browse/SOLR-1111
> Project: Solr
>  Issue Type: Bug
>Reporter: Yonik Seeley
> Fix For: 1.4
>
>
> Recent changes in Lucene have altered how the FieldCache is used and as-is 
> could lead to previously working Solr installations blowing up when they 
> upgrade to 1.4.  We need to fix, or document the effects of these changes.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



Re: Make ant example faster

2009-04-13 Thread Grant Ingersoll


On Apr 13, 2009, at 3:44 PM, Shalin Shekhar Mangar wrote:

> On Tue, Apr 14, 2009 at 12:33 AM, Grant Ingersoll wrote:
>
>> Instead of a kitchen-sink example directory, we "revert" it back to being
>> the tutorial example.  It still can get built by ant example, but ultimately
>> we "deprecate" it (more later).
>>
>> Then, as a replacement, we create a directory containing what I would call
>> Solr Templates, which contain subdirectories named appropriately for the
>> kind of example.  Rather than explain, I'll give an example:
>>
>> The templates directory would contain the configurations (i.e. schema.xml
>> and solrconfig.xml) and any sample docs (but not the libraries) for:
>>   tutorial - The current tutorial example
>>   dih - The DIH example
>>   extraction - Solr Cell example
>>   geo - geo spatial example (once 773 is committed)
>>   clustering - once SOLR-769 is committed
>>   simple - A barebones schema and config (mainly used for
>> bootstrapping a new project for experienced users)
>>   exploratory - Basically, the same as simple, but the schema defines
>> a single dynamic field -  Think of Hoss's Solr Out of the Box talk from
>> ApacheCon whereby you want to quickly explore a new data set without having
>> to define a schema.
>>   [other] -
>>
>> Note, the templates directory could also live under each contrib, but it
>> isn't necessarily a 1-1 thing (e.g. simple and exploratory templates are not
>> contrib-specific).
>>
>> Then, typing "ant example" would copy the necessary tutorial stuff to the
>> example directory (which still contains the Jetty stuff) but would not have
>> to recurse into any of the contribs.
>>
>> Typing "ant example -Dtype=clustering"  would copy the clustering
>> requirements, plus go to contrib/clustering (or whatever) and get the
>> appropriate material such that the example directory.  Similarly for any of
>> the other "templates"
>
> Isn't this the same as the current setup with the name of the directory
> changed and different ant targets to set them up? The new ant target will
> setup the default solr instance to be 'extraction' or 'dih' or 'clustering'
> and avoid the need to type -Dsolr.solr.home.



It is similar, indeed, but I think it results in there only ever being  
one active Solr example and the user need not worry about setting solr  
home.


-Grant


[jira] Commented: (SOLR-1112) EmbeddedSolrServer is not being packaged with Solrj jar

2009-04-13 Thread Shalin Shekhar Mangar (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1112?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12698639#action_12698639
 ] 

Shalin Shekhar Mangar commented on SOLR-1112:
-

Sure, thanks for the clarification, Ryan.

> EmbeddedSolrServer is not being packaged with Solrj jar
> ---
>
> Key: SOLR-1112
> URL: https://issues.apache.org/jira/browse/SOLR-1112
> Project: Solr
>  Issue Type: Bug
>  Components: clients - java
>Reporter: Shalin Shekhar Mangar
> Fix For: 1.4
>
>
> EmbeddedSolrServer is not being packaged with Solrj jar
> http://markmail.org/thread/6wbthe5w2a2nwlc6

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



Re: parsing bool type in solrconfig.xml

2009-04-13 Thread Shalin Shekhar Mangar
2009/4/13 Koji Sekiguchi 

> Should we accept not only true, but also on
> and yes?
> I think it is easy by using parseBool() instead of Boolean.valueOf() in
> DOMUtil.
>

+1

I know it is inconsistent but so are the request parameters like hl,
debugQuery etc which I doubt will be changed.
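
For reference, Boolean.valueOf(String) returns true only for the string "true" (ignoring case), so "on" and "yes" come out as false. A minimal sketch of the more lenient parsing being proposed might look like the following. The class name LenientBool is hypothetical, and this is not Solr's actual parseBool() implementation, just an illustration of the behaviour discussed above.

```java
import java.util.Locale;

/** Hypothetical helper, for illustration only; not Solr's parseBool(). */
public final class LenientBool {
    private LenientBool() {}

    /**
     * Returns true for "true", "on" or "yes" (case-insensitive, surrounding
     * whitespace ignored); returns false for anything else, including null.
     */
    public static boolean parse(String s) {
        if (s == null) {
            return false;
        }
        String v = s.trim().toLowerCase(Locale.ROOT);
        return v.equals("true") || v.equals("on") || v.equals("yes");
    }
}
```

With this, LenientBool.parse("on") and LenientBool.parse("Yes") are both true, while Boolean.valueOf("on") is false.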

-- 
Regards,
Shalin Shekhar Mangar.


Re: Make ant example faster

2009-04-13 Thread Shalin Shekhar Mangar
On Tue, Apr 14, 2009 at 4:25 AM, Grant Ingersoll wrote:

>
> On Apr 13, 2009, at 3:44 PM, Shalin Shekhar Mangar wrote:
>
>>
>> Isn't this the same as the current setup with the name of the directory
>> changed and different ant targets to set them up? The new ant target will
>> setup the default solr instance to be 'extraction' or 'dih' or
>> 'clustering'
>> and avoid the need to type -Dsolr.solr.home.
>>
>
>
> It is similar, indeed, but I think it results in there only ever being one
> active Solr example and the user need not worry about setting solr home.
>

+1

Let's do it.

-- 
Regards,
Shalin Shekhar Mangar.