[jira] Commented: (SOLR-705) Distributed search should optionally return docID-shard map

2009-08-09 Thread Noble Paul (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12741046#action_12741046 ] Noble Paul commented on SOLR-705: Let us have a special field of something like _meta_ (to

structure preservation for solr

2009-08-09 Thread swamynathan
Hi, I'm Swamynathan, a computer science engineering student at Jaya Engg College, which is under Anna University, Chennai, India. As a part of my curriculum in the final year I need to do a project. I spoke with some Solr users and programmers and found out that all content that is indexed to it is

[jira] Created: (SOLR-1352) DIH: MultiThreaded

2009-08-09 Thread Noble Paul (JIRA)
DIH: MultiThreaded -- Key: SOLR-1352 URL: https://issues.apache.org/jira/browse/SOLR-1352 Project: Solr Issue Type: Improvement Components: contrib - DataImportHandler Reporter: Noble Paul

Re: structure preservation for solr

2009-08-09 Thread Avlesh Singh
First of all, why is this a solr-dev question? Second, it seems you have some strong misconceptions regarding what Solr is and what it is supposed to do. Read up a bit on the Solr wiki and text analyzers. If you still have questions, ask them on the solr-user mailing list. Cheers Avlesh On Sun, Aug 9,

[jira] Commented: (SOLR-1352) DIH: MultiThreaded

2009-08-09 Thread Avlesh Singh (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-1352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12741047#action_12741047 ] Avlesh Singh commented on SOLR-1352: Thanks, once again :), for creating the ticket,

Re: structure preservation for solr

2009-08-09 Thread swamynathan
ok On Sun, Aug 9, 2009 at 1:22 PM, Avlesh Singh avl...@gmail.com wrote: First of all, why is this a solr-dev question? Second, it seems you have some strong misconceptions regarding what Solr is and what it is supposed to do. Read up a bit on the Solr wiki and text analyzers. If you still have

[jira] Commented: (SOLR-1348) JdbcDataSource does not import Blob values correctly by default

2009-08-09 Thread Avlesh Singh (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-1348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12741050#action_12741050 ] Avlesh Singh commented on SOLR-1348: I have used cast() function in MySQL to convert my

Re: solr 1.4 schedule

2009-08-09 Thread Grant Ingersoll
There is also a whole lot unassigned still, we probably should at least give them a glance to see if there is anything that slipped that should be in 1.4 On Aug 8, 2009, at 6:34 PM, Yonik Seeley wrote: 24 open issues left - and nothing too difficult left. It's looking like we should be to

Re: structure preservation for solr

2009-08-09 Thread Grant Ingersoll
On Aug 9, 2009, at 3:46 AM, swamynathan wrote: Hi, I'm Swamynathan, a computer science engineering student at Jaya Engg College, which is under Anna University, Chennai, India. As a part of my curriculum in the final year I need to do a project. I spoke with some Solr users and programmers and

[jira] Updated: (SOLR-1335) load core properties from a properties file

2009-08-09 Thread Noble Paul (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-1335?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Noble Paul updated SOLR-1335: - Attachment: SOLR-1335.patch with a testcase load core properties from a properties file

indexing slowdown with latest lucene udpate

2009-08-09 Thread Yonik Seeley
I did some quick indexing performance tests right before and right after the last lucene jar update - the results are not good... about 30% slower. The test was an 80 MB text field, 100K documents, 6 short text fields per document, with the solrconfig/schema from trunk copied to both environments.

Re: indexing slowdown with latest lucene udpate

2009-08-09 Thread Grant Ingersoll
On Aug 9, 2009, at 10:29 AM, Yonik Seeley wrote: I did some quick indexing performance tests right before and right after the last lucene jar update - the results are not good... about 30% slower. The test was an 80 MB text field, 100K documents, 6 short text fields per document, with the

Re: indexing slowdown with latest lucene udpate

2009-08-09 Thread Yonik Seeley
On Sun, Aug 9, 2009 at 12:01 PM, Grant Ingersoll gsing...@apache.org wrote: Or bite the bullet and upgrade to the incrementToken() method. Right - I'm not sure if that would fix it or not - I haven't been involved in the new Token attribute stuff... I'm currently writing a basic indexing unit

Re: indexing slowdown with latest lucene udpate

2009-08-09 Thread Yonik Seeley
OK, I've isolated (magnified) the effect with a test I just checked in. Indexing documents directly at the UpdateHandler was 85% faster before the latest lucene update. Run the test like this: ant test -Dtestcase=TestIndexingPerformance -Dargs=-server -Diter=10; grep throughput

Re: indexing slowdown with latest lucene udpate

2009-08-09 Thread Yonik Seeley
It looks like implementing the new attribute stuff will not be enough - the token architecture has changed enough that it looks like we must cache tokenstreams to get back to good performance. -Yonik http://www.lucidimagination.com On Sun, Aug 9, 2009 at 12:57 PM, Yonik

Re: indexing slowdown with latest lucene udpate

2009-08-09 Thread Yonik Seeley
FYI https://issues.apache.org/jira/browse/SOLR-1353 On Sun, Aug 9, 2009 at 2:02 PM, Yonik Seeley yo...@lucidimagination.com wrote: It looks like implementing the new attribute stuff will not be enough - the token architecture has changed enough that it looks like we must cache tokenstreams to

[jira] Created: (SOLR-1353) implement reusable token streams for all Solr tokenizers / token filters

2009-08-09 Thread Yonik Seeley (JIRA)
implement reusable token streams for all Solr tokenizers / token filters Key: SOLR-1353 URL: https://issues.apache.org/jira/browse/SOLR-1353 Project: Solr Issue Type:
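
A minimal sketch of the reuse idea behind this issue, under stated assumptions: `ReusableTokenizer` and `TokenizerPool` are hypothetical names invented for illustration, and the whitespace splitting stands in for real analysis. It does not use Lucene's TokenStream API and is not the patch proposed in SOLR-1353. It only shows the general pattern of building one tokenizer per thread and resetting it for each new input instead of constructing a fresh one per field.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for an analysis component that is costly to construct.
final class ReusableTokenizer {
    // Counts constructions so the effect of reuse can be observed.
    static final AtomicInteger CREATED = new AtomicInteger();

    private String text = "";
    private int pos = 0;

    ReusableTokenizer() {
        CREATED.incrementAndGet();
    }

    // Point the existing instance at new input instead of allocating a new one.
    void reset(String newText) {
        text = newText;
        pos = 0;
    }

    // Returns the next whitespace-delimited token, or null when input is exhausted.
    String next() {
        int n = text.length();
        while (pos < n && Character.isWhitespace(text.charAt(pos))) {
            pos++;
        }
        if (pos >= n) {
            return null;
        }
        int start = pos;
        while (pos < n && !Character.isWhitespace(text.charAt(pos))) {
            pos++;
        }
        return text.substring(start, pos);
    }
}

// Hypothetical holder: one tokenizer per thread, created lazily on first use.
final class TokenizerPool {
    private static final ThreadLocal<ReusableTokenizer> PER_THREAD =
            ThreadLocal.withInitial(ReusableTokenizer::new);

    static List<String> tokenize(String text) {
        ReusableTokenizer tokenizer = PER_THREAD.get();
        tokenizer.reset(text);
        List<String> tokens = new ArrayList<>();
        for (String tok = tokenizer.next(); tok != null; tok = tokenizer.next()) {
            tokens.add(tok);
        }
        return tokens;
    }
}
```

Calling `TokenizerPool.tokenize(...)` repeatedly on the same thread constructs at most one `ReusableTokenizer`, so any per-instance setup cost is paid once per thread rather than once per document field.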

[jira] Commented: (SOLR-1353) implement reusable token streams for all Solr tokenizers / token filters

2009-08-09 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-1353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12741134#action_12741134 ] Robert Muir commented on SOLR-1353: Yonik, at least in the case of analyzer class=xxx, I

Re: indexing slowdown with latest lucene udpate

2009-08-09 Thread Michael Busch
Are you sure that the initialization costs of the TokenStream/AttributeSource cause the slowdown? With the bw-comp. code now every call of a Token method goes through a delegation layer. I'm afraid that might cause a slowdown? The code that figures out what Attributes to put into the map

Re: indexing slowdown with latest lucene udpate

2009-08-09 Thread Robert Muir
I am concerned about this one as well. Especially since the majority of the language analyzers in lucene-contrib do not implement reusableTokenStream. On Sun, Aug 9, 2009 at 5:06 PM, Michael Busch busch...@gmail.com wrote: Are you sure that the initialization costs of the

Severe QTime issues to do with updates

2009-08-09 Thread Kaktu Chakarabati
Hey, I've recently noticed that there is a very large spike in the QTime for nodes serving queries, immediately after snappulling and snapinstalling. The numbers I'm seeing there are obviously some kind of lock-contention/concurrency issue, as I've monitored iostat/sar and it's not a disk IO issue

Re: indexing slowdown with latest lucene udpate

2009-08-09 Thread Mark Miller
Looks like there are a couple spots to blame, but mostly, TokenStream$isMethodOverloaded takes most of the blame. Appears very slow. - Mark

Re: indexing slowdown with latest lucene udpate

2009-08-09 Thread Mark Miller
Mark Miller wrote: Looks like there are a couple spots to blame, but mostly, TokenStream$isMethodOverloaded takes most of the blame. Appears very slow. - Mark Or it's just called too often the way Solr does things for how fast it is. Here are the profiling results: Before r801845

[jira] Created: (SOLR-1354) Experimental new feature: allow out-of-the-box apps by passing HTTP request parameters through to XSL scripts

2009-08-09 Thread Lance Norskog (JIRA)
Experimental new feature: allow out-of-the-box apps by passing HTTP request parameters through to XSL scripts - Key: SOLR-1354 URL:

[jira] Updated: (SOLR-1354) Experimental new feature: allow out-of-the-box apps by passing HTTP request parameters through to XSL scripts

2009-08-09 Thread Lance Norskog (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-1354?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lance Norskog updated SOLR-1354: Attachment: rss2.patch Experimental new feature: allow out-of-the-box apps by passing HTTP request

[jira] Commented: (SOLR-1354) Experimental new feature: allow out-of-the-box apps by passing HTTP request parameters through to XSL scripts

2009-08-09 Thread Lance Norskog (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-1354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12741157#action_12741157 ] Lance Norskog commented on SOLR-1354: This experiment has two aims: 1) It should be

[jira] Updated: (SOLR-1354) Pass HTTP request parameters through to XSL scripts

2009-08-09 Thread Lance Norskog (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-1354?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lance Norskog updated SOLR-1354: Summary: Pass HTTP request parameters through to XSL scripts (was: Experimental new feature: allow

Re: indexing slowdown with latest lucene udpate

2009-08-09 Thread Mark Miller
Michael Busch wrote: Are you sure that the initialization costs of the TokenStream/AttributeSource cause the slowdown? With the bw-comp. code now every call of a Token method goes through a delegation layer. I'm afraid that might cause a slowdown? It's isMethodOverriden and

Re: indexing slowdown with latest lucene udpate

2009-08-09 Thread Mark Miller
isMethodOverriden is just nasty - copying Methods, security checks, walking the type hierarchy, this, that, some more. I bet cglib has a really fast version - too bad there is no built-in equivalent. It's not nearly as clean, but what if a new TokenStream simply identified itself as supporting
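
The cost described in this thread comes from repeating a reflective "is this method overridden?" check on every stream construction. One generic way to blunt that cost, which is a different technique from the self-identification idea raised in the message above, is to run the reflective scan once per class and cache the answer. The sketch below assumes a hypothetical `OverrideCache` helper and demo classes invented for illustration; it is not Lucene's implementation and makes no claim about how the issue was eventually resolved.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical helper (not part of Lucene or Solr): caches, per subclass,
// whether a given method of a base class has been redeclared below the base.
final class OverrideCache {
    private final Class<?> baseClass;
    private final String methodName;
    private final Class<?>[] parameterTypes;
    private final ConcurrentMap<Class<?>, Boolean> cache = new ConcurrentHashMap<>();

    // Counts how many times the slow reflective scan actually ran.
    final AtomicInteger reflectiveScans = new AtomicInteger();

    OverrideCache(Class<?> baseClass, String methodName, Class<?>... parameterTypes) {
        this.baseClass = baseClass;
        this.methodName = methodName;
        this.parameterTypes = parameterTypes;
    }

    // Fast path: a map lookup. The reflective scan runs at most once per class.
    boolean isOverridden(Class<?> subclass) {
        return cache.computeIfAbsent(subclass, this::scan);
    }

    // Slow path: walk from the subclass up to (but excluding) the base class,
    // looking for a declaration of the method.
    private boolean scan(Class<?> subclass) {
        reflectiveScans.incrementAndGet();
        for (Class<?> c = subclass; c != null && c != baseClass; c = c.getSuperclass()) {
            try {
                c.getDeclaredMethod(methodName, parameterTypes);
                return true;
            } catch (NoSuchMethodException e) {
                // Not declared at this level; keep walking up the hierarchy.
            }
        }
        return false;
    }
}

// Hypothetical demo hierarchy standing in for a stream base class and its subclasses.
class DemoBase {
    boolean next() { return false; }
}

class DemoOverriding extends DemoBase {
    @Override
    boolean next() { return true; }
}

class DemoPlain extends DemoBase {
}
```

With `new OverrideCache(DemoBase.class, "next")`, repeated calls to `isOverridden(DemoOverriding.class)` trigger one reflective scan and then hit the map, so the per-construction cost becomes a hash lookup.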