[jira] Issue Comment Edited: (SOLR-341) PHP Solr Client

2008-11-14 Thread Rich R (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12647555#action_12647555
 ] 

richr edited comment on SOLR-341 at 11/14/08 1:45 AM:
---

Hi

I've encountered an issue in the _documentToXmlFragment() method of the 
Apache_Solr_Service class.

To cut a long story short, I'm building up documents from database rows, some 
of which contain NULL values.  I've noticed that NULL values interfere with the 
following (Iterator-based) code in the above method:


foreach ($document as $key => $value)
{
    ...
}


What basically happens is that a NULL $value seems to cause the foreach loop 
to terminate prematurely: any fields 'beyond' the one with the NULL value do 
not get added to the index.

The very simple workaround for this was to replace the above code fragment with 
the following:


$keys = $document->getFieldNames();
foreach ($keys as $key)
{
    $value = $document->$key;
    ...
}


As you can see, it's essentially the same, and fully backwards-compatible.  It 
just avoids the issue I've been experiencing with the iterator.

I'd really like to see this change make it to the code if possible (I can 
submit a patch if necessary).

For reference, I'm using PHP 5.3 on OSX.

Best regards
Rich

> PHP Solr Client
> ---
>
> Key: SOLR-341
> URL: https://issues.apache.org/jira/browse/SOLR-341
> Project: Solr
>  Issue Type: New Feature
>  Components: clients - php
>Affects Versions: 1.2
> Environment: PHP >= 5.2.0 (or older with JSON PECL extension or other 
> json_decode function implementation). Solr >= 1.2
>Reporter: Donovan Jimenez
>Priority: Trivial
> Fix For: 1.4
>
> Attachments: SolrPhpClient.2008-09-02.zip, SolrPhpClient.zip
>
>
> Developed this client when the example PHP source didn't meet our needs.  The 
> company I work for agreed to release it under the terms of the Apache License.
> This version is slightly different from what I originally linked to on the 
> dev mailing list.  I've incorporated feedback from Yonik and "hossman" to 
> simplify the client and only accept one response format (JSON currently).
> When Solr 1.3 is released the client can be updated to use the PHP or 
> Serialized PHP response writer.
> example usage from my original mailing list post:
> <?php
> require_once('Solr/Service.php');
>
> $start = microtime(true);
> $solr = new Solr_Service(); // or explicitly new Solr_Service('localhost', 8180, '/solr')
>
> try
> {
>     $response = $solr->search('solr', 0, 10,
>         array(/* you can include other parameters here */));
>
>     echo 'search returned with status = ', $response->responseHeader->status,
>         ' and took ', microtime(true) - $start, ' seconds', "\n";
>
>     // here's how you would access results
>     // Notice that I've mapped the values by name into a tree of stdClass objects
>     // and arrays (actually, most of this is done by json_decode)
>     if ($response->response->numFound > 0)
>     {
>         $doc_number = $response->response->start;
>         foreach ($response->response->docs as $doc)
>         {
>             $doc_number++;
>             echo $doc_number, ': ', $doc->text, "\n";
>         }
>     }
>
>     // for the purposes of seeing the available structure of the response
>     // NOTE: Solr_Response::_parsedData is lazy loaded, so a print_r on the response
>     // before any values are accessed may result in different behavior (in case
>     // anyone has some troubles debugging)
>     //print_r($response);
> }
> catch (Exception $e)
> {
>     echo $e->getMessage(), "\n";
> }
> ?>


[jira] Created: (SOLR-856) Support for "Accept-Encoding: gzip" in SolrDispatchFilter

2008-11-14 Thread Noble Paul (JIRA)
Support for "Accept-Encoding: gzip" in SolrDispatchFilter
-

 Key: SOLR-856
 URL: https://issues.apache.org/jira/browse/SOLR-856
 Project: Solr
  Issue Type: Improvement
Reporter: Noble Paul


If the client sends an "Accept-Encoding: gzip" header, then SolrDispatchFilter 
should respect it and send the response back gzipped.
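
A minimal sketch of the idea, against the pre-3.0 javax.servlet API 
(illustrative only, not taken from the attached patch; a production version 
would also need to handle getWriter() and Content-Length):

import java.io.IOException;
import java.util.zip.GZIPOutputStream;
import javax.servlet.*;
import javax.servlet.http.*;

// Illustrative only: a gzip filter kept separate from SolrDispatchFilter.
public class GzipResponseFilter implements Filter {
  public void init(FilterConfig cfg) {}
  public void destroy() {}

  public void doFilter(ServletRequest req, ServletResponse res, FilterChain chain)
      throws IOException, ServletException {
    HttpServletRequest request = (HttpServletRequest) req;
    HttpServletResponse response = (HttpServletResponse) res;
    String accept = request.getHeader("Accept-Encoding");
    if (accept == null || accept.indexOf("gzip") < 0) {
      chain.doFilter(req, res);  // client did not ask for gzip
      return;
    }
    response.setHeader("Content-Encoding", "gzip");
    final GZIPOutputStream gzip = new GZIPOutputStream(response.getOutputStream());
    final ServletOutputStream out = new ServletOutputStream() {
      public void write(int b) throws IOException { gzip.write(b); }
    };
    // Wrap the response so downstream code writes through the gzip stream.
    HttpServletResponseWrapper wrapped = new HttpServletResponseWrapper(response) {
      public ServletOutputStream getOutputStream() { return out; }
    };
    chain.doFilter(req, wrapped);
    gzip.finish();  // flush the trailing gzip block
  }
}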

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (SOLR-856) Support for "Accept-Encoding: gzip" in SolrDispatchFilter

2008-11-14 Thread Noble Paul (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-856?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Noble Paul updated SOLR-856:


Attachment: SOLR-856.patch

Implemented.




[jira] Commented: (SOLR-667) Alternate LRUCache implementation

2008-11-14 Thread Shalin Shekhar Mangar (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12647587#action_12647587
 ] 

Shalin Shekhar Mangar commented on SOLR-667:


Yonik, what do you think about using a new cleanup thread?

> Alternate LRUCache implementation
> -
>
> Key: SOLR-667
> URL: https://issues.apache.org/jira/browse/SOLR-667
> Project: Solr
>  Issue Type: New Feature
>  Components: search
>Affects Versions: 1.3
>Reporter: Noble Paul
>Assignee: Yonik Seeley
> Fix For: 1.4
>
> Attachments: ConcurrentLRUCache.java, ConcurrentLRUCache.java, 
> ConcurrentLRUCache.java, SOLR-667-alternate.patch, SOLR-667-alternate.patch, 
> SOLR-667-updates.patch, SOLR-667.patch, SOLR-667.patch, SOLR-667.patch, 
> SOLR-667.patch, SOLR-667.patch, SOLR-667.patch, SOLR-667.patch, 
> SOLR-667.patch, SOLR-667.patch, SOLR-667.patch
>
>
> The only available SolrCache, i.e. LRUCache, is based on _LinkedHashMap_, 
> which synchronizes _get()_ as well.  This can cause severe bottlenecks for 
> faceted search.  Any alternate implementation that can be faster/better must 
> be considered. 
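
For context on the bottleneck: an access-ordered LinkedHashMap mutates its 
internal linked list on every lookup, so reads have to take the same lock as 
writes.  A minimal sketch of that pattern (illustrative, not the actual 
LRUCache source):

import java.util.LinkedHashMap;
import java.util.Map;

// Classic LinkedHashMap-based LRU: correct, but get() must hold the same
// lock as put(), because access-order lookups reorder the internal list.
public class SimpleLruCache<K, V> {
  private final Map<K, V> map;

  public SimpleLruCache(final int capacity) {
    // true = access order, so the eldest entry is the least recently used
    this.map = new LinkedHashMap<K, V>(16, 0.75f, true) {
      protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > capacity;
      }
    };
  }

  public synchronized V get(K key) { return map.get(key); }  // reads serialize too
  public synchronized void put(K key, V value) { map.put(key, value); }
}

The attached ConcurrentLRUCache takes the other route: approximate LRU over a 
ConcurrentHashMap, with eviction deferred to a cleanup pass (which is where 
the cleanup-thread discussion below comes from).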




[jira] Commented: (SOLR-856) Support for "Accept-Encoding: gzip" in SolrDispatchFilter

2008-11-14 Thread Erik Hatcher (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-856?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12647589#action_12647589
 ] 

Erik Hatcher commented on SOLR-856:
---

I'd rather see this implemented as a separate filter rather than blended into 
SolrDispatchFilter.  Why?  Separation of concerns.  I have ideas for using a 
custom dispatch filter (for smarter request routing, cleaner URLs), but would 
still want to benefit from gzip.




[jira] Commented: (SOLR-852) Refactor common code in various handlers for working with ContentStream Objects

2008-11-14 Thread Erik Hatcher (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12647591#action_12647591
 ] 

Erik Hatcher commented on SOLR-852:
---

The setErr(or)Header is in the code you committed, fyi.

> Refactor common code in various handlers for working with ContentStream 
> Objects
> ---
>
> Key: SOLR-852
> URL: https://issues.apache.org/jira/browse/SOLR-852
> Project: Solr
>  Issue Type: Improvement
>  Components: update
>Reporter: Grant Ingersoll
>Assignee: Grant Ingersoll
>Priority: Minor
> Attachments: SOLR-852.patch, SOLR-852.patch
>
>
> See 
> http://lucene.markmail.org/message/srbnucwor6kyxv2e?q=ContentStream+refactor
> There is a fair amount of shared code between the XMLUpdateRequestHandler and 
> the CSVRequestHandler (and the soon to be RichDocumentHandler).  Let's 
> refactor into a common set of reusable pieces.
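
The shared shape being factored out is roughly the following (a sketch against 
the Solr 1.3-era ContentStream/SolrQueryRequest types; the class and method 
names here are made up for illustration):

import java.io.InputStream;
import org.apache.solr.common.util.ContentStream;
import org.apache.solr.request.SolrQueryRequest;

// Hypothetical helper: each handler currently repeats this loop-and-parse
// boilerplate, and only the parse step really differs per format.
public class ContentStreamLoaderSketch {
  public void load(SolrQueryRequest req) throws Exception {
    Iterable<ContentStream> streams = req.getContentStreams();
    if (streams == null) return;  // nothing was posted
    for (ContentStream stream : streams) {
      InputStream in = stream.getStream();
      try {
        parse(in, stream.getContentType());
      } finally {
        in.close();  // always release the stream
      }
    }
  }

  // The format-specific part each handler would supply (XML, CSV, rich docs).
  protected void parse(InputStream in, String contentType) throws Exception {
  }
}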




[jira] Updated: (SOLR-84) Logo Contests

2008-11-14 Thread Lozer T. User (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-84?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lozer T. User updated SOLR-84:
--

Attachment: logo_remake.jpg

This is the JPEG version, called logo_remake.

> Logo Contests
> -
>
> Key: SOLR-84
> URL: https://issues.apache.org/jira/browse/SOLR-84
> Project: Solr
>  Issue Type: Improvement
>Reporter: Bertrand Delacretaz
>Priority: Minor
> Attachments: apache-solr-004.png, apache_solr_burning.png, 
> apache_solr_contour.png, apache_solr_sun.png, logo-grid.jpg, logo-solr-d.jpg, 
> logo-solr-e.jpg, logo-solr-source-files-take2.zip, logo_remake.jpg, 
> logo_remake.svg, solr-84-source-files.zip, solr-f.jpg, solr-greyscale.png, 
> solr-logo-20061214.jpg, solr-logo-20061218.JPG, solr-logo-20070124.JPG, 
> solr-logo.jpg, solr-nick.gif, solr.jpg, solr.png, solr.s1.jpg, solr.s2.jpg, 
> solr.s3.jpg, solr.svg, solr_attempt.jpg, solr_attempt2.jpg, 
> sslogo-solr-flare.jpg, sslogo-solr.jpg, sslogo-solr2-flare.jpg, 
> sslogo-solr2.jpg, sslogo-solr3.jpg
>
>
> This issue was originally a scratch pad for various ideas for new logos.  It 
> is now being used as a repository for submissions for the Solr Logo Contest...
>http://wiki.apache.org/solr/LogoContest
> Note that many of the images currently attached are not eligible for the 
> contest since they do not meet the official guidelines for new Apache project 
> logos (in particular that the full project name "Apache Solr" must be 
> included in the Logo).  Only eligible attachments will be included in the 
> official voting.




[jira] Updated: (SOLR-84) Logo Contests

2008-11-14 Thread Lozer T. User (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-84?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lozer T. User updated SOLR-84:
--

Attachment: logo_remake.svg

Since it is getting close to the deadline and I don't want to see Moritz's 
entry fail to qualify, I'll enter it with the Apache name.




[jira] Commented: (SOLR-856) Support for "Accept-Encoding: gzip" in SolrDispatchFilter

2008-11-14 Thread Donovan Jimenez (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-856?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12647612#action_12647612
 ] 

Donovan Jimenez commented on SOLR-856:
--

Isn't this usually handled as a configuration of the container rather than at 
the servlet level?

For example, in Tomcat I'd use the compression option on the HTTP connector 
described in the documentation: 
http://tomcat.apache.org/tomcat-6.0-doc/config/http.html

I'm sure other containers have similar functionality already.




[jira] Commented: (SOLR-284) Parsing Rich Document Types

2008-11-14 Thread Grant Ingersoll (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12647618#action_12647618
 ] 

Grant Ingersoll commented on SOLR-284:
--

Question for the people watching this:

Would you prefer a new wiki page, keeping the old one for those using 
Chris/Eric's patch, or would you rather I overwrite/edit the current one?

FWIW, some of the parameters will be the same, but I'm also adding quite a bit 
more: boosting; XPath expression support (Tika returns everything as XHTML, so 
it becomes possible to restrict which parts you want to pay attention to); 
extraction-only mode (i.e. no indexing); support for metadata extraction and 
indexing; support for sending in "literals", which work like the current 
fieldnames parameter; and likely some other pieces.

FYI: out of the box, Tika has support for the formats listed at 
http://incubator.apache.org/tika/formats.html and I know they are adding more, 
like Flash, etc.

It should also be noted that if you are just indexing metadata about a file, 
it makes more sense to do the work on the client side.
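
For anyone unfamiliar with the Tika side of this, the extraction step is 
roughly the following (a sketch using Tika's AutoDetectParser; the details are 
illustrative and not from the SOLR-284 patch):

import java.io.FileInputStream;
import java.io.InputStream;
import org.apache.tika.metadata.Metadata;
import org.apache.tika.parser.AutoDetectParser;
import org.apache.tika.sax.BodyContentHandler;

// Detect the file type, then extract body text plus metadata.  Tika emits
// the document as XHTML SAX events, which is what makes XPath-style
// restriction of the indexed content possible.
public class TikaExtractSketch {
  public static void main(String[] args) throws Exception {
    InputStream in = new FileInputStream(args[0]);
    try {
      Metadata metadata = new Metadata();
      BodyContentHandler text = new BodyContentHandler();  // collects body text
      new AutoDetectParser().parse(in, text, metadata);
      System.out.println("text: " + text.toString());
      for (String name : metadata.names()) {  // e.g. title, author, content type
        System.out.println(name + ": " + metadata.get(name));
      }
    } finally {
      in.close();
    }
  }
}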

> Parsing Rich Document Types
> ---
>
> Key: SOLR-284
> URL: https://issues.apache.org/jira/browse/SOLR-284
> Project: Solr
>  Issue Type: New Feature
>  Components: update
>Reporter: Eric Pugh
>Assignee: Grant Ingersoll
> Fix For: 1.4
>
> Attachments: libs.zip, rich.patch, rich.patch, rich.patch, 
> rich.patch, rich.patch, rich.patch, rich.patch, source.zip, test-files.zip, 
> test-files.zip, test.zip, un-hardcode-id.diff
>
>
> I have developed a RichDocumentRequestHandler based on the CSVRequestHandler 
> that supports streaming a PDF, Word, PowerPoint, or Excel document into 
> Solr.
> There is a wiki page with information here: 
> http://wiki.apache.org/solr/UpdateRichDocuments
>  




[jira] Commented: (SOLR-284) Parsing Rich Document Types

2008-11-14 Thread Erik Hatcher (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12647619#action_12647619
 ] 

Erik Hatcher commented on SOLR-284:
---

I'd rather see the old (err, current) wiki page replaced/renamed, and kept 
current with the latest patch/commit from this issue.  Nice work, Grant!




[jira] Resolved: (SOLR-854) Add 'run example' to build.xml

2008-11-14 Thread Erik Hatcher (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-854?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erik Hatcher resolved SOLR-854.
---

   Resolution: Fixed
Fix Version/s: 1.4
 Assignee: Erik Hatcher

Thanks Mark!   I've always hacked something like this into projects for easy 
running.  I refined it a little, making it possible to launch with a different 
Solr home and data directory:

   ant run-example -Dexample.solr.home= -Dexample.data.dir=



> Add 'run example' to build.xml
> --
>
> Key: SOLR-854
> URL: https://issues.apache.org/jira/browse/SOLR-854
> Project: Solr
>  Issue Type: New Feature
>Reporter: Mark Miller
>Assignee: Erik Hatcher
>Priority: Trivial
> Fix For: 1.4
>
> Attachments: SOLR-854.patch
>
>
> Working in Eclipse, I find it really convenient for debugging/testing to have 
> a 'run-example' target in the build file. Anyone else?




[jira] Commented: (SOLR-284) Parsing Rich Document Types

2008-11-14 Thread Chris Harris (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12647632#action_12647632
 ] 

Chris Harris commented on SOLR-284:
---

Grant,

I don't really care if you take over the old wiki page's name or start a new 
one; maybe it depends on whether the updated handler will keep a similar name 
or be called something else.  I do think, though, that it might be handy to 
have *some* wiki page (and maybe some JIRA issue) to maintain the older patch 
on a temporary basis.

Thanks,
Chris




[jira] Created: (SOLR-857) Memory Leak during the indexing of large xml files

2008-11-14 Thread Ruben Jimenez (JIRA)
Memory Leak during the indexing of large xml files
--

 Key: SOLR-857
 URL: https://issues.apache.org/jira/browse/SOLR-857
 Project: Solr
  Issue Type: Bug
Affects Versions: 1.3
Environment: Verified on Ubuntu 8.04 (1.7GB RAM, 2.4GHz dual core) 
and Windows XP (2GB RAM, 2GHz Pentium), both with a Java 5 SDK
Reporter: Ruben Jimenez


While indexing a set of Solr XML files that each contain 5000 document adds 
and are about 30MB in size, Solr 1.3 seems to use more and more memory until 
the heap is exhausted, while the same files are indexed without issue by Solr 
1.2.

Steps used to reproduce:

1 - Download Solr 1.3
2 - Modify the example schema.xml to match the required fields
3 - Start the example server with: java -Xms512m -Xmx1024m 
-XX:MaxPermSize=128m -jar start.jar
4 - Index the files with: java -Xmx128m -jar .../examples/exampledocs/post.jar 
*.xml

The directory contains about 100 XML files of roughly 30MB each.  Solr 1.3 
runs out of memory after about the 25th file, while Solr 1.2 is able to index 
the entire set of files without any problems.
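
(For reference, post.jar amounts to an HTTP POST of each file to the update 
handler; a bare-bones equivalent, assuming the example server's default URL of 
http://localhost:8983/solr/update, looks like this:)

import java.io.FileInputStream;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;

// Minimal stand-in for post.jar: POST one Solr XML file to /update.
public class PostFileSketch {
  public static void main(String[] args) throws Exception {
    URL url = new URL("http://localhost:8983/solr/update");
    HttpURLConnection conn = (HttpURLConnection) url.openConnection();
    conn.setRequestMethod("POST");
    conn.setDoOutput(true);
    conn.setRequestProperty("Content-Type", "text/xml; charset=UTF-8");
    InputStream in = new FileInputStream(args[0]);
    OutputStream out = conn.getOutputStream();
    byte[] buf = new byte[8192];
    for (int n; (n = in.read(buf)) != -1; ) {
      out.write(buf, 0, n);  // stream the file body up to Solr
    }
    out.close();
    in.close();
    System.out.println("HTTP " + conn.getResponseCode());
  }
}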




[jira] Commented: (SOLR-667) Alternate LRUCache implementation

2008-11-14 Thread Yonik Seeley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12647657#action_12647657
 ] 

Yonik Seeley commented on SOLR-667:
---

The ability to use a separate cleanup thread is interesting... but I'm not sure 
that having the ability to spin off a new thread for *each* cleanup is 
something one would ever want to do.  The cleanup thread logic should probably 
be fixed too (no sleeping and polling... it should wait until notified that a 
cleanup is needed)
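
Something along these lines, presumably (a rough sketch of the wait/notify 
idiom, not the attached patch):

// One long-lived cleanup thread that blocks until signalled, instead of
// sleeping and polling on an interval.
public class CleanupThreadSketch extends Thread {
  private boolean cleanupRequested = false;

  public synchronized void wakeup() {  // called when the cache overflows
    cleanupRequested = true;
    notify();
  }

  public void run() {
    while (!isInterrupted()) {
      synchronized (this) {
        while (!cleanupRequested) {
          try {
            wait();  // parked until notified; no polling
          } catch (InterruptedException e) {
            return;
          }
        }
        cleanupRequested = false;
      }
      evictExcessEntries();  // the actual cache cleanup
    }
  }

  private void evictExcessEntries() {
    // placeholder for the real eviction work
  }
}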




[jira] Updated: (SOLR-857) Memory Leak during the indexing of large xml files

2008-11-14 Thread Ruben Jimenez (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-857?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ruben Jimenez updated SOLR-857:
---

Attachment: schema.xml

This is the schema file that we are using.




[jira] Commented: (SOLR-84) Logo Contests

2008-11-14 Thread Doug Cutting (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-84?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12647660#action_12647660
 ] 

Doug Cutting commented on SOLR-84:
--

I like 
https://issues.apache.org/jira/secure/attachment/12349896/logo-solr-e.jpg and 
https://issues.apache.org/jira/secure/attachment/12358494/sslogo-solr.jpg, 
because they're simple and scale down well.  It should be possible to scale the 
logo, or a salient part of it, as small as a favicon (16x16) and still have it 
easily recognized.  Most of the designs above require a lot of pixels to be 
recognizable.  A good logo should be iconic more than textual--an abstract 
symbol.

Often you can sample an element of a logo to form a favicon (like we do with 
Lucene's 'L').  So, when voting, think about whether there's an easily 
identifiable sample (e.g., is the typeface of the 'S' distinctive?).

I note that Steve Stedman did not provide his logos under the Apache license.  
Was that intentional?  I like his quite a lot...






[jira] Updated: (SOLR-857) Memory Leak during the indexing of large xml files

2008-11-14 Thread Ruben Jimenez (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-857?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ruben Jimenez updated SOLR-857:
---

Attachment: OQ_SOLR_1.xml.zip

This is a sample of the files that we are indexing.




[jira] Commented: (SOLR-84) Logo Contests

2008-11-14 Thread Erik Hatcher (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-84?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12647662#action_12647662
 ] 

Erik Hatcher commented on SOLR-84:
--

I just pinged Steve Stedman (a personal friend) about getting his contributions 
ASL'd.  Thanks for noting that, Doug.




[jira] Commented: (SOLR-284) Parsing Rich Document Types

2008-11-14 Thread Grant Ingersoll (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12647670#action_12647670
 ] 

Grant Ingersoll commented on SOLR-284:
--

OK, I've created http://wiki.apache.org/solr/ExtractingRequestHandler and 
linked it from the old page.  I will have a preliminary patch up today.




[jira] Commented: (SOLR-857) Memory Leak during the indexing of large xml files

2008-11-14 Thread Otis Gospodnetic (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12647676#action_12647676
 ] 

Otis Gospodnetic commented on SOLR-857:
---

What is your ramBufferSizeMB set to?
What happens if you commit periodically?

I suggest you take this to [EMAIL PROTECTED] first.  This may not be a bug, but 
simply an operations/config issue.




[jira] Commented: (SOLR-857) Memory Leak during the indexing of large xml files

2008-11-14 Thread Ruben Jimenez (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12647679#action_12647679
 ] 

Ruben Jimenez commented on SOLR-857:


Changing ramBufferSizeMB in solrconfig didn't seem to help; I tried varying 
that as well as the mergeFactor, but neither made a difference.

I've tried indexing the entire set as well as batches of 10 files at a time.  
In both cases the failure occurred at the same point unless I completely 
restarted at some point before the 25th file.

I'll send something out to the mailing list as well.




[jira] Commented: (SOLR-857) Memory Leak during the indexing of large xml files

2008-11-14 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12647683#action_12647683
 ] 

Mark Miller commented on SOLR-857:
--

Do any of the files fail or have errors?  It almost looks like the parser 
might not be closed on an exception.

I'm looping over copies of your test file and no trouble so far; the memory 
footprint looks decent.  Surviving generations are on a constant upward path, 
so that's a bit worrying, but so far memory looks good.




[jira] Commented: (SOLR-857) Memory Leak during the indexing of large xml files

2008-11-14 Thread Bill Au (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12647684#action_12647684
 ] 

Bill Au commented on SOLR-857:
--

Have you tried starting your JVM with -XX:+HeapDumpOnOutOfMemoryError and then 
looking at the heap dump?
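
(You can also trigger a dump on demand through the HotSpot diagnostic MBean; a 
quick sketch, assuming a Sun/HotSpot JVM.  Note it dumps the heap of the JVM 
it runs in, so it would have to run inside Solr's JVM or attach over remote 
JMX:)

import java.lang.management.ManagementFactory;
import javax.management.MBeanServer;
import javax.management.ObjectName;

// Trigger a heap dump programmatically, without waiting for the OOME.
public class HeapDumpSketch {
  public static void main(String[] args) throws Exception {
    MBeanServer server = ManagementFactory.getPlatformMBeanServer();
    // dumpHeap(fileName, live): live=true dumps only reachable objects
    server.invoke(new ObjectName("com.sun.management:type=HotSpotDiagnostic"),
        "dumpHeap",
        new Object[] { "solr-heap.hprof", Boolean.TRUE },
        new String[] { "java.lang.String", "boolean" });
  }
}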



[jira] Commented: (SOLR-857) Memory Leak during the indexing of large xml files

2008-11-14 Thread Ruben Jimenez (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12647685#action_12647685
 ] 

Ruben Jimenez commented on SOLR-857:


I've double-checked that there are no file errors and have tested with 
different batches of files, with more or less the same results.  Essentially 
what I see is that memory usage continues to rise during the indexing process 
until the heap is exhausted.  Unfortunately, one file isn't enough to 
reproduce the error.  I can provide more, but figured I'd hold off on that 
until asked, due to the size of each one.




[jira] Commented: (SOLR-857) Memory Leak during the indexing of large xml files

2008-11-14 Thread Ruben Jimenez (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12647695#action_12647695
 ] 

Ruben Jimenez commented on SOLR-857:


Bill,

I haven't tried -XX:+HeapDumpOnOutOfMemoryError yet.  I'll give that a go 
and post the results.





Re: [jira] Commented: (SOLR-857) Memory Leak during the indexing of large xml files

2008-11-14 Thread Mark Miller

Ruben Jimenez (JIRA) wrote:

> Unfortunately, one file isn't enough to reproduce the error.  I can provide
> more, but figured I'd hold off on that until asked, due to the size of each
> one.

I just copied it over 30 times after modifying it to allow dupes.


BitDocSet code oddity

2008-11-14 Thread Mark Miller
Maybe for the future? Wondering, because it looks odd at the moment - 
OpenBitSet doesn't extend DocSet, so the if can never be true. Something 
seems odd anyway - or maybe it's just waiting for future code changes...


public DocSet andNot(DocSet other) {
  OpenBitSet newbits = (OpenBitSet)(bits.clone());
  if (other instanceof OpenBitSet) {
    newbits.andNot(((BitDocSet)other).bits);
  } else {
    DocIterator iter = other.iterator();
    while (iter.hasNext()) newbits.clear(iter.nextDoc());
  }
  return new BitDocSet(newbits);
}


Re: BitDocSet code oddity

2008-11-14 Thread Yonik Seeley
Thanks Mark, looks like it should be if (other instanceof BitDocSet).
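
That is, presumably:

// Corrected check: the fast path applies when the other set is also a
// BitDocSet, whose backing OpenBitSet we can andNot against directly.
public DocSet andNot(DocSet other) {
  OpenBitSet newbits = (OpenBitSet)(bits.clone());
  if (other instanceof BitDocSet) {
    newbits.andNot(((BitDocSet)other).bits);  // bulk bitwise clear
  } else {
    DocIterator iter = other.iterator();
    while (iter.hasNext()) newbits.clear(iter.nextDoc());  // per-doc fallback
  }
  return new BitDocSet(newbits);
}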



Re: BitDocSet code oddity

2008-11-14 Thread Yonik Seeley
Could you open up a JIRA so we can refer to it in CHANGES.txt?
It's a trivial change that I might normally put there, but the
performance impact is probably non-trivial for negative queries (in
main query or in filters).

-Yonik



[jira] Reopened: (SOLR-465) Add configurable DirectoryProvider so that alternate Directory implementations can be specified via solrconfig.xml

2008-11-14 Thread Yonik Seeley (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-465?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yonik Seeley reopened SOLR-465:
---


Reopening.  LUCENE-1451 has been committed, adding public constructors to 
FSDirectory and its subclasses.

> Add configurable DirectoryProvider so that alternate Directory 
> implementations can be specified via solrconfig.xml
> --
>
> Key: SOLR-465
> URL: https://issues.apache.org/jira/browse/SOLR-465
> Project: Solr
>  Issue Type: New Feature
>Affects Versions: 1.3
>Reporter: TJ Laurenzo
> Fix For: 1.4
>
> Attachments: SOLR-465.patch, SOLR-465.patch, 
> solr-directory-provider.patch
>
>   Original Estimate: 0.25h
>  Remaining Estimate: 0.25h
>
> Solr is presently hard-coded to use the FSDirectory implementation in Lucene. 
>  Other Directory implementations are possible.  This patch creates a new 
> DirectoryProvider interface and extends SolrCore to load an implementation of 
> it from solrconfig.xml (if specified).  If not specified, then it will 
> fallback to the FSDirectory.
> A DirectoryProvider plugin can be configured in solrconfig.xml with the 
> following XML:
>
>   
>
> This patch was created against solr trunk checked out on 11/20/2007.  Most of 
> it is new code and should apply cleanly or with minor relocation.  If it does 
> not, let me know and I will update.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (SOLR-84) Logo Contests

2008-11-14 Thread Steve Stedman (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-84?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Stedman updated SOLR-84:
--

Attachment: sslogo-solr-classic.png
source_logo-solr.zip

Attached is my revised logo with the full Apache Solr project name, plus the 
source files. I can also revise my other previous logos (and correct the 
license issue) if it seems worthwhile.

> Logo Contests
> -
>
> Key: SOLR-84
> URL: https://issues.apache.org/jira/browse/SOLR-84
> Project: Solr
>  Issue Type: Improvement
>Reporter: Bertrand Delacretaz
>Priority: Minor
> Attachments: apache-solr-004.png, apache_solr_burning.png, 
> apache_solr_contour.png, apache_solr_sun.png, logo-grid.jpg, logo-solr-d.jpg, 
> logo-solr-e.jpg, logo-solr-source-files-take2.zip, logo_remake.jpg, 
> logo_remake.svg, solr-84-source-files.zip, solr-f.jpg, solr-greyscale.png, 
> solr-logo-20061214.jpg, solr-logo-20061218.JPG, solr-logo-20070124.JPG, 
> solr-logo.jpg, solr-nick.gif, solr.jpg, solr.png, solr.s1.jpg, solr.s2.jpg, 
> solr.s3.jpg, solr.svg, solr_attempt.jpg, solr_attempt2.jpg, 
> source_logo-solr.zip, sslogo-solr-classic.png, sslogo-solr-flare.jpg, 
> sslogo-solr.jpg, sslogo-solr2-flare.jpg, sslogo-solr2.jpg, sslogo-solr3.jpg
>
>
> This issue was originally a scratch pad for various ideas for new Logos.  It is 
> now being used as a repository for submissions for the Solr Logo Contest...
>http://wiki.apache.org/solr/LogoContest
> Note that many of the images currently attached are not eligible for the 
> contest since they do not meet the official guidelines for new Apache project 
> logos (in particular that the full project name "Apache Solr" must be 
> included in the Logo).  Only eligible attachments will be included in the 
> official voting.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (SOLR-858) BitDocSet.andNot calls instanceof on the wrong type

2008-11-14 Thread Mark Miller (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-858?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Miller updated SOLR-858:
-

Summary: BitDocSet.andNot calls instanceof on the wrong type  (was: 
BitDocSet.andNot casts to the wrong type)

> BitDocSet.andNot calls instanceof on the wrong type
> ---
>
> Key: SOLR-858
> URL: https://issues.apache.org/jira/browse/SOLR-858
> Project: Solr
>  Issue Type: Bug
>Affects Versions: 1.3
>Reporter: Mark Miller
>Priority: Trivial
> Fix For: 1.3.1
>
>
> The andNot(DocSet other) method checks for an instance of OpenBitSet - it 
> should be BitDocSet. It looks like you lose an optimization.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Created: (SOLR-858) BitDocSet.andNot casts to the wrong type

2008-11-14 Thread Mark Miller (JIRA)
BitDocSet.andNot casts to the wrong type


 Key: SOLR-858
 URL: https://issues.apache.org/jira/browse/SOLR-858
 Project: Solr
  Issue Type: Bug
Affects Versions: 1.3
Reporter: Mark Miller
Priority: Trivial
 Fix For: 1.3.1


The andNot(DocSet other) method checks for an instance of OpenBitSet - it 
should be BitDocSet. It looks like you lose an optimization.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (SOLR-620) Velocity Response Writer

2008-11-14 Thread Grant Ingersoll (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12647706#action_12647706
 ] 

Grant Ingersoll commented on SOLR-620:
--

Just noticed that this contrib is not added into the release/signing/packaging 
stuff in the top level build.

> Velocity Response Writer
> 
>
> Key: SOLR-620
> URL: https://issues.apache.org/jira/browse/SOLR-620
> Project: Solr
>  Issue Type: New Feature
>  Components: search
>Affects Versions: 1.3
>Reporter: Erik Hatcher
>Assignee: Erik Hatcher
>Priority: Minor
> Attachments: SOLR-620.patch, SOLR-620.patch, SOLR-620.patch, 
> SOLR-620.patch, SOLR-620.zip, SOLR-620.zip
>
>
> Add a Velocity - http://velocity.apache.org - response writer, making it 
> possible to generate a decent search UI straight from Solr itself.  Designed 
> to work standalone or in conjunction with the JSON response writer (or 
> SolrJS) for Ajaxy things.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (SOLR-84) Logo Contests

2008-11-14 Thread Steve Stedman (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-84?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Stedman updated SOLR-84:
--

Attachment: sslogo-solr-song.png
sslogo-solr-light.png
sslogo-solr-70s.png

...and some additional variations. Thanks for the heads-up, Erik.

> Logo Contests
> -
>
> Key: SOLR-84
> URL: https://issues.apache.org/jira/browse/SOLR-84
> Project: Solr
>  Issue Type: Improvement
>Reporter: Bertrand Delacretaz
>Priority: Minor
> Attachments: apache-solr-004.png, apache_solr_burning.png, 
> apache_solr_contour.png, apache_solr_sun.png, logo-grid.jpg, logo-solr-d.jpg, 
> logo-solr-e.jpg, logo-solr-source-files-take2.zip, logo_remake.jpg, 
> logo_remake.svg, solr-84-source-files.zip, solr-f.jpg, solr-greyscale.png, 
> solr-logo-20061214.jpg, solr-logo-20061218.JPG, solr-logo-20070124.JPG, 
> solr-logo.jpg, solr-nick.gif, solr.jpg, solr.png, solr.s1.jpg, solr.s2.jpg, 
> solr.s3.jpg, solr.svg, solr_attempt.jpg, solr_attempt2.jpg, 
> source_logo-solr.zip, sslogo-solr-70s.png, sslogo-solr-classic.png, 
> sslogo-solr-flare.jpg, sslogo-solr-light.png, sslogo-solr-song.png, 
> sslogo-solr.jpg, sslogo-solr2-flare.jpg, sslogo-solr2.jpg, sslogo-solr3.jpg
>
>
> This issue was originally a scratch pad for various ideas for new Logos.  It is 
> now being used as a repository for submissions for the Solr Logo Contest...
>http://wiki.apache.org/solr/LogoContest
> Note that many of the images currently attached are not eligible for the 
> contest since they do not meet the official guidelines for new Apache project 
> logos (in particular that the full project name "Apache Solr" must be 
> included in the Logo).  Only eligible attachments will be included in the 
> official voting.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



Re: [jira] Commented: (SOLR-84) Logo Contests

2008-11-14 Thread Mike Klaas


On 14-Nov-08, at 8:54 AM, Doug Cutting (JIRA) wrote:



    [ https://issues.apache.org/jira/browse/SOLR-84?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12647660#action_12647660 ]


Doug Cutting commented on SOLR-84:
--

I like https://issues.apache.org/jira/secure/attachment/12349896/logo-solr-e.jpg 
 and https://issues.apache.org/jira/secure/attachment/12358494/sslogo-solr.jpg 
, because they're simple and scale down well.  It should be possible  
to scale the logo, or a salient part of it, as small as a favicon  
(16x16) and still have it easily recognized.  Most of the designs  
above require a lot of pixels to be recognizable.  A good logo  
should be iconic more than textual--an abstract symbol.


Often you can sample an element of a logo to form a favicon (like we  
do with Lucene's 'L').  So, when voting, think about whether there's  
an easily identifiable sample (e.g., is the typeface of the 'S'  
distinctive?).


Lots of the designs do have distinctive "suns" that would make good  
favicons (after re-vectorizing; those gradients would not rescale  
nicely).


-Mike


[jira] Created: (SOLR-859) solr-ruby rewrite, jruby support

2008-11-14 Thread Matt Mitchell (JIRA)
solr-ruby rewrite, jruby support


 Key: SOLR-859
 URL: https://issues.apache.org/jira/browse/SOLR-859
 Project: Solr
  Issue Type: Improvement
  Components: clients - ruby - flare
Affects Versions: 1.3
Reporter: Matt Mitchell


Here is a rewrite of the solr ruby client library. It has a simplified set of 
features, really only missing the mapper/indexer code that's present in the 
current solr-ruby. Bringing those in will be the next step.

Tests are in the test directory. Please read the README; it has instructions 
for running the tests.

A Solr 1.3 distribution is included for tests.

* find patch in file uploads "rewrite.1.patch"

- Matt Mitchell

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (SOLR-859) solr-ruby rewrite, jruby support

2008-11-14 Thread Matt Mitchell (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-859?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt Mitchell updated SOLR-859:
---

Attachment: rewrite.1.patch

Apply this patch to a checkout of:

https://svn.apache.org/repos/asf/lucene/solr/branches/solr-ruby-refactoring

> solr-ruby rewrite, jruby support
> 
>
> Key: SOLR-859
> URL: https://issues.apache.org/jira/browse/SOLR-859
> Project: Solr
>  Issue Type: Improvement
>  Components: clients - ruby - flare
>Affects Versions: 1.3
>Reporter: Matt Mitchell
> Attachments: rewrite.1.patch
>
>
> Here is a re-write of the solr ruby client library. This is a simplified set 
> of features, but really only missing the mapper/indexer code that's present 
> in the current solr-ruby. Bringing those in will be the next step.
> Tests are in test. Please read the README, there are instructions for running 
> the tests.
> A Solr 1.3 distribution is included for tests.
> * find patch in file uploads "rewrite.1.patch"
> - Matt Mitchell

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



Re: svn commit: r714115 - /lucene/solr/trunk/example/solr/lib/

2008-11-14 Thread Ryan McKinley


On Nov 14, 2008, at 2:48 PM, [EMAIL PROTECTED] wrote:


Author: gsingers
Date: Fri Nov 14 11:48:47 2008
New Revision: 714115

URL: http://svn.apache.org/viewvc?rev=714115&view=rev
Log:
put in the lib dir, as we're going to have some libs to put in there  
for SOLR-282




will SOLR-282 be a 'contrib' package?

In general we should figure out what we want the 'example' directory  
to be.  Currently each contrib has a full copy of it -- i don't think  
that is ideal.  Perhaps we have a 'minimum' example, and then a  
kitchen sink example...


ryan


Classloader and SolrResourceLoader fun

2008-11-14 Thread Grant Ingersoll
I'm working on SOLR-282.  I'm building it out as a contrib.  The  
contrib has some source code implementing an ExtractingReqHandler  
(ERH) and the Tika library.  For testing, I copied the JAR file I  
create for the contrib and the Tika library into solr/lib and fired up  
the example, which led to a NoClassDefFound exception (see [1])  
caused by the classloader that is loading the ERH failing to find the  
Tika dependencies, even though the Tika library is in the lib directory.


So, in debugging this, I noticed that we construct two different  
SolrResourceLoaders, one right after the other in the CoreContainer at  
lines 117 and 118.  Well, we create a SRL at 117, then the SolrConfig  
creates one in its constructor, which gets called at 118.


I'm not sure if this is the problem, but it leads to [2] in my log  
files, which seems really weird.


Anyone with some classloader magic knowledge have any insight?

I also tried merging my contrib jar and the tika jar together, but  
that didn't resolve anything.


Thanks,
Grant

[1]
Nov 14, 2008 5:10:29 PM org.apache.solr.common.SolrException log
SEVERE: java.lang.NoClassDefFoundError: org/apache/tika/exception/TikaException
        at java.lang.Class.forName0(Native Method)
        at java.lang.Class.forName(Class.java:247)
        at org.apache.solr.core.SolrResourceLoader.findClass(SolrResourceLoader.java:260)
        at org.apache.solr.core.SolrResourceLoader.newInstance(SolrResourceLoader.java:281)
        at org.apache.solr.util.plugin.AbstractPluginLoader.create(AbstractPluginLoader.java:84)
        at org.apache.solr.core.RequestHandlers$1.create(RequestHandlers.java:153)
        at org.apache.solr.core.RequestHandlers$1.create(RequestHandlers.java:138)
        at org.apache.solr.util.plugin.AbstractPluginLoader.load(AbstractPluginLoader.java:141)
        at org.apache.solr.core.RequestHandlers.initHandlersFromConfig(RequestHandlers.java:170)
        at org.apache.solr.core.SolrCore.<init>(SolrCore.java:527)
        at org.apache.solr.core.CoreContainer$Initializer.initialize(CoreContainer.java:120)
        at org.apache.solr.servlet.SolrDispatchFilter.init(SolrDispatchFilter.java:69)
        at org.mortbay.jetty.servlet.FilterHolder.doStart(FilterHolder.java:99)
Caused by: java.lang.ClassNotFoundException: org.apache.tika.exception.TikaException
        at java.net.URLClassLoader$1.run(URLClassLoader.java:200)
        at java.security.AccessController.doPrivileged(Native Method)
        at java.net.URLClassLoader.findClass(URLClassLoader.java:188)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:316)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:251)
        at org.mortbay.jetty.webapp.WebAppClassLoader.loadClass(WebAppClassLoader.java:375)
        at org.mortbay.jetty.webapp.WebAppClassLoader.loadClass(WebAppClassLoader.java:337)
        at java.lang.ClassLoader.loadClassInternal(ClassLoader.java:374)



[2]  Note, in this case I merged the Tika and the Solr contrib jars together:

INFO: Solr home set to 'solr/'
Nov 14, 2008 5:07:29 PM org.apache.solr.core.SolrResourceLoader createClassLoader
INFO: Adding 'file:.../solr-clean/example/solr/lib/apache-solr-extraction-1.4-dev.jar' to Solr classloader
Nov 14, 2008 5:08:07 PM org.apache.solr.core.SolrResourceLoader locateInstanceDir
INFO: JNDI not configured for solr (NoInitialContextEx)
Nov 14, 2008 5:08:07 PM org.apache.solr.core.SolrResourceLoader locateInstanceDir
INFO: solr home defaulted to 'solr/' (could not find system property or JNDI)
Nov 14, 2008 5:08:07 PM org.apache.solr.core.SolrResourceLoader <init>
INFO: Solr home set to 'solr/'
Nov 14, 2008 5:08:07 PM org.apache.solr.core.SolrResourceLoader createClassLoader
INFO: Adding 'file:.../solr-clean/example/solr/lib/apache-solr-extraction-1.4-dev.jar' to Solr classloader
Nov 14, 2008 5:10:15 PM org.apache.solr.core.SolrConfig <init>
INFO: Loaded SolrConfig: solrconfig.xml


Re: Classloader and SolrResourceLoader fun

2008-11-14 Thread Grant Ingersoll
Never mind.  Found the problem.  There was a name collision due to  
overlap between using Tika's standalone JAR (which packages all of the  
dependencies into a single JAR) and some Solr deps, such that the  
wrong classloader was being invoked.


Although, it still does seem weird that two SolrResourceLoader classes  
get created.


-Grant

On Nov 14, 2008, at 5:24 PM, Grant Ingersoll wrote:

I'm working on SOLR-282.  I'm building it out as a contrib.  The  
contrib has some source code implementing an ExtractingReqHandler  
(ERH) and the Tika library.  For testing, I copied the JAR file I  
create for the contrib and the Tika library into solr/lib and fired  
up the example, which led to a NoClassDefFound exception (see [1])  
caused by the classloader that is loading the ERH failing to find  
the Tika dependencies, even though the Tika library is in the lib  
directory.


So, in debugging this, I noticed that we construct two different  
SolrResourceLoaders, one right after the other in the CoreContainer  
at lines 117 and 118.  Well, we create a SRL at 117, then the  
SolrConfig creates one in its constructor, which gets called at 118.


I'm not sure if this is the problem, but it leads to [2] in my log  
files, which seems really weird.


Anyone with some classloader magic knowledge have any insight?

I also tried merging my contrib jar and the tika jar together, but  
that didn't resolve anything.


Thanks,
Grant






[jira] Updated: (SOLR-341) PHP Solr Client

2008-11-14 Thread Donovan Jimenez (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-341?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Donovan Jimenez updated SOLR-341:
-

Attachment: SolrPhpClient.2008-11-14.zip

Rich Robinson helped me track down an issue where documents with values of 
false were causing foreach loops to exit prematurely (when iterating a 
document). I determined the cause was the valid() implementation of the 
Iterator interface, and after fussing with it I decided to just implement 
IteratorAggregate instead and use the pre-existing SPL ArrayIterator class. 
This simplifies the Apache_Solr_Document code, and now all document values are 
looped over even when some are false.
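
(For illustration, a minimal sketch of the IteratorAggregate approach; 
SketchDocument and its field array are hypothetical stand-ins, not the actual 
Apache_Solr_Document code:)

<?php
// SketchDocument is a hypothetical stand-in for Apache_Solr_Document.
// With IteratorAggregate + ArrayIterator, foreach visits every field, even
// ones whose value is false or null - unlike a hand-rolled Iterator whose
// valid() can mistake a falsy current value for the end of iteration.
class SketchDocument implements IteratorAggregate
{
    private $_fields = array('id' => 1, 'inStock' => false, 'name' => 'widget');

    public function getIterator()
    {
        return new ArrayIterator($this->_fields);
    }
}

$doc = new SketchDocument();
foreach ($doc as $key => $value)
{
    var_dump($key, $value); // all three fields are visited, including false
}
?>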

> PHP Solr Client
> ---
>
> Key: SOLR-341
> URL: https://issues.apache.org/jira/browse/SOLR-341
> Project: Solr
>  Issue Type: New Feature
>  Components: clients - php
>Affects Versions: 1.2
> Environment: PHP >= 5.2.0 (or older with JSON PECL extension or other 
> json_decode function implementation). Solr >= 1.2
>Reporter: Donovan Jimenez
>Priority: Trivial
> Fix For: 1.4
>
> Attachments: SolrPhpClient.2008-09-02.zip, 
> SolrPhpClient.2008-11-14.zip, SolrPhpClient.zip
>
>
> Developed this client when the example PHP source didn't meet our needs.  The 
> company I work for agreed to release it under the terms of the Apache License.
> This version is slightly different from what I originally linked to on the 
> dev mailing list.  I've incorporated feedback from Yonik and "hossman" to 
> simplify the client and only accept one response format (JSON currently).
> When Solr 1.3 is released the client can be updated to use the PHP or 
> Serialized PHP response writer.
> example usage from my original mailing list post:
>  require_once('Solr/Service.php');
> $start = microtime(true);
> $solr = new Solr_Service(); //Or explicitly new Solr_Service('localhost', 
> 8180, '/solr');
> try
> {
> $response = $solr->search('solr', 0, 10,
> array(/* you can include other parameters here */));
> echo 'search returned with status = ', 
> $response->responseHeader->status,
> ' and took ', microtime(true) - $start, ' seconds', "\n";
> //here's how you would access results
> //Notice that I've mapped the values by name into a tree of stdClass 
> objects
> //and arrays (actually, most of this is done by json_decode )
> if ($response->response->numFound > 0)
> {
> $doc_number = $response->response->start;
> foreach ($response->response->docs as $doc)
> {
> $doc_number++;
> echo $doc_number, ': ', $doc->text, "\n";
> }
> }
> //for the purposes of seeing the available structure of the response
> //NOTE: Solr_Response::_parsedData is lazy loaded, so a print_r on 
> the response before
> //any values are accessed may result in different behavior (in case
> //anyone has some troubles debugging)
> //print_r($response);
> }
> catch (Exception $e)
> {
> echo $e->getMessage(), "\n";
> }
> ?>

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



Re: svn commit: r714115 - /lucene/solr/trunk/example/solr/lib/

2008-11-14 Thread Grant Ingersoll


On Nov 14, 2008, at 5:11 PM, Ryan McKinley wrote:



On Nov 14, 2008, at 2:48 PM, [EMAIL PROTECTED] wrote:


Author: gsingers
Date: Fri Nov 14 11:48:47 2008
New Revision: 714115

URL: http://svn.apache.org/viewvc?rev=714115&view=rev
Log:
put in the lib dir, as we're going to have some libs to put in  
there for SOLR-282




will SOLR-282 be a 'contrib' package?

In general we should figure out what we want the 'example' directory  
to be.  Currently each contrib has a full copy of it -- I don't  
think that is ideal.


Not following.  You mean the conf directory?  For SOLR-282, I hooked  
in a contrib-example target, similar to dist-contrib, that calls each  
contrib and asks it to do what it wants for the example.  In my case,  
I copy in some jars to the main example.



Perhaps we have a 'minimum' example, and then a kitchen sink  
example...





I've been thinking lately that we should look to simplify the mess  
that is the example directory...


FWIW, I rolled back the addition of the dir.  Came across some docs  
that said to go create it, so I assumed someone explicitly didn't  
create it for a reason.  Presumably to make sure no one accidentally  
checks in some jars, but I don't know for sure.


-Grant


[jira] Created: (SOLR-860) moreLikeThis Debug

2008-11-14 Thread Jeff (JIRA)
moreLikeThis Debug
--

 Key: SOLR-860
 URL: https://issues.apache.org/jira/browse/SOLR-860
 Project: Solr
  Issue Type: New Feature
  Components: search
Affects Versions: 1.3
 Environment: Gentoo Linux, Solr 1.4, tomcat webserver
Reporter: Jeff
 Fix For: 1.4


The moreLikeThis search component currently has no way to debug or see 
information on the process.  This means that if moreLikeThis suggests another 
document, there is no way to view why it picked that document in order to hone 
the search.  Adding an explain output would be extremely useful in determining 
why Solr is recommending the items.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (SOLR-284) Parsing Rich Document Types

2008-11-14 Thread Grant Ingersoll (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-284?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Grant Ingersoll updated SOLR-284:
-

Attachment: SOLR-282.patch

First crack at this.  You'll need to download 
http://people.apache.org/~gsingers/extraction-libs.tar as it is too big to fit 
in JIRA.

There's probably lots wrong with it, so be gentle!  See 
http://wiki.apache.org/solr/ExtractingRequestHandler to get started.

> Parsing Rich Document Types
> ---
>
> Key: SOLR-284
> URL: https://issues.apache.org/jira/browse/SOLR-284
> Project: Solr
>  Issue Type: New Feature
>  Components: update
>Reporter: Eric Pugh
>Assignee: Grant Ingersoll
> Fix For: 1.4
>
> Attachments: libs.zip, rich.patch, rich.patch, rich.patch, 
> rich.patch, rich.patch, rich.patch, rich.patch, SOLR-282.patch, source.zip, 
> test-files.zip, test-files.zip, test.zip, un-hardcode-id.diff
>
>
> I have developed a RichDocumentRequestHandler based on the CSVRequestHandler 
> that supports streaming a PDF, Word, Powerpoint, Excel, or PDF document into 
> Solr.
> There is a wiki page with information here: 
> http://wiki.apache.org/solr/UpdateRichDocuments
>  

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (SOLR-284) Parsing Rich Document Types

2008-11-14 Thread Grant Ingersoll (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12647748#action_12647748
 ] 

Grant Ingersoll commented on SOLR-284:
--

Things to do:

1. Documentation
2. Way more testing, esp. unit tests of the various parameters
3. Update NOTICES and LICENSE.txt for the new dependencies.

> Parsing Rich Document Types
> ---
>
> Key: SOLR-284
> URL: https://issues.apache.org/jira/browse/SOLR-284
> Project: Solr
>  Issue Type: New Feature
>  Components: update
>Reporter: Eric Pugh
>Assignee: Grant Ingersoll
> Fix For: 1.4
>
> Attachments: libs.zip, rich.patch, rich.patch, rich.patch, 
> rich.patch, rich.patch, rich.patch, rich.patch, SOLR-282.patch, source.zip, 
> test-files.zip, test-files.zip, test.zip, un-hardcode-id.diff
>
>
> I have developed a RichDocumentRequestHandler based on the CSVRequestHandler 
> that supports streaming a PDF, Word, Powerpoint, Excel, or PDF document into 
> Solr.
> There is a wiki page with information here: 
> http://wiki.apache.org/solr/UpdateRichDocuments
>  

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (SOLR-857) Memory Leak during the indexing of large xml files

2008-11-14 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12647757#action_12647757
 ] 

Mark Miller commented on SOLR-857:
--

bq. I haven't tried the -XX:+HeapDumpOnOutOfMemoryError yet. I'll give that a 
go and post the results.

Please do. I looped trunk on that file 40-some times and got all of my memory 
back at the end. I'll try again on just 1.3 if taking a look at the heap 
doesn't help. NetBeans has a great heap inspector, and I think IBM has a great 
one that can be used with much larger dumps.

> Memory Leak during the indexing of large xml files
> --
>
> Key: SOLR-857
> URL: https://issues.apache.org/jira/browse/SOLR-857
> Project: Solr
>  Issue Type: Bug
>Affects Versions: 1.3
> Environment: Verified on Ubuntu 8.0.4 (1.7GB RAM, 2.4GHz dual core) 
> and Windows XP (2GB RAM, 2GHz pentium) both with a Java5 SDK
>Reporter: Ruben Jimenez
> Attachments: OQ_SOLR_1.xml.zip, schema.xml
>
>
> While indexing a set of SOLR xml files that contain 5000 document adds within 
> them and are about 30MB each, SOLR 1.3 seems to continually use more and more 
> memory until the heap is exhausted, while the same files are indexed without 
> issue with SOLR 1.2.
> Steps used to reproduce.
> 1 - Download SOLR 1.3
> 2 - Modify example schema.xml to match fields required
> 3 - Start the example server with the following command: java -Xms512m -Xmx1024m 
> -XX:MaxPermSize=128m -jar start.jar
> 4 - Index the files as follows: java -Xmx128m -jar 
> .../examples/exampledocs/post.jar *.xml
> The directory with the xml files contains about 100 xml files of about 30MB 
> each.  After about the 25th file, SOLR 1.3 runs out of memory while indexing, 
> while SOLR 1.2 is able to index the entire set of files without any problems.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



Re: svn commit: r714115 - /lucene/solr/trunk/example/solr/lib/

2008-11-14 Thread Ryan McKinley


On Nov 14, 2008, at 5:37 PM, Grant Ingersoll wrote:



On Nov 14, 2008, at 5:11 PM, Ryan McKinley wrote:



On Nov 14, 2008, at 2:48 PM, [EMAIL PROTECTED] wrote:


Author: gsingers
Date: Fri Nov 14 11:48:47 2008
New Revision: 714115

URL: http://svn.apache.org/viewvc?rev=714115&view=rev
Log:
put in the lib dir, as we're going to have some libs to put in  
there for SOLR-282




will SOLR-282 be a 'contrib' package?

In general we should figure out what we want the 'example'  
directory to be.  Currently each contrib has a full copy of it -- I  
don't think that is ideal.


Not following.  You mean the conf directory?  For SOLR-282, I hooked  
in a contrib-example target, similar to dist-contrib, that calls  
each contrib and asks it to do what it wants for the example.  In my  
case, I copy in some jars to the main example.




ya -- dataimporthandler and velocity each have a duplicate 'conf'  
section:


http://svn.apache.org/repos/asf/lucene/solr/trunk/contrib/velocity/src/main/solr/
http://svn.apache.org/repos/asf/lucene/solr/trunk/contrib/dataimporthandler/src/test/resources/solr/

short of having solrconfig #include, I'm not sure what the best way  
might be to have the contribs only specify the part relevant to their  
function.   I'm afraid we will have to punt until spring is involved...


ryan


[jira] Commented: (SOLR-856) Support for "Accept-Encoding : gzip" in SolrDispatchFilter

2008-11-14 Thread Noble Paul (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-856?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12647797#action_12647797
 ] 

Noble Paul commented on SOLR-856:
-

bq. I'd rather see this implemented as a separate filter, and not blending this 
in with SolrDispatchFilter.

I considered that option too. But it means users editing web.xml (which is 
not a good idea), and the extra filter means more overhead. I guess we should 
not do that. 

bq. Isn't this usually handled as a configuration of the Container rather than 
at the servlet level? 

I would love that, but can we rely on it and expect that all containers will 
have it? Consider the fact that a feature (SOLR-829) relies on this for 
critical functionality.
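
(A rough sketch of the separate-filter alternative quoted above, using only 
the standard Servlet API; the class is hypothetical, and a real implementation 
would also need to handle getWriter() and Content-Length:)

import java.io.IOException;
import java.util.zip.GZIPOutputStream;
import javax.servlet.*;
import javax.servlet.http.*;

// Hypothetical sketch - not part of Solr; error handling omitted.
public class GzipFilter implements Filter {
  public void init(FilterConfig config) {}
  public void destroy() {}

  public void doFilter(ServletRequest req, ServletResponse res, FilterChain chain)
      throws IOException, ServletException {
    HttpServletRequest request = (HttpServletRequest) req;
    HttpServletResponse response = (HttpServletResponse) res;
    String accept = request.getHeader("Accept-Encoding");
    if (accept == null || accept.indexOf("gzip") < 0) {
      chain.doFilter(req, res);  // client did not ask for gzip
      return;
    }
    response.setHeader("Content-Encoding", "gzip");
    final GZIPOutputStream gzip = new GZIPOutputStream(response.getOutputStream());
    // Wrap the response so whatever the handler writes goes through gzip.
    HttpServletResponseWrapper wrapped = new HttpServletResponseWrapper(response) {
      public ServletOutputStream getOutputStream() {
        return new ServletOutputStream() {
          public void write(int b) throws IOException { gzip.write(b); }
        };
      }
    };
    chain.doFilter(request, wrapped);
    gzip.finish();  // write the gzip trailer
  }
}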

> Support for "Accept-Encoding : gzip" in SolrDispatchFilter
> -
>
> Key: SOLR-856
> URL: https://issues.apache.org/jira/browse/SOLR-856
> Project: Solr
>  Issue Type: Improvement
>Reporter: Noble Paul
> Attachments: SOLR-856.patch
>
>
> If the client sends an Accept-Encoding : gzip header then SolrDispatchFilter 
> should respect that and send the data back gzipped

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (SOLR-667) Alternate LRUCache implementation

2008-11-14 Thread Noble Paul (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12647798#action_12647798
 ] 

Noble Paul commented on SOLR-667:
-

OK, that is a good idea. But this is an important functionality. 

> Alternate LRUCache implementation
> -
>
> Key: SOLR-667
> URL: https://issues.apache.org/jira/browse/SOLR-667
> Project: Solr
>  Issue Type: New Feature
>  Components: search
>Affects Versions: 1.3
>Reporter: Noble Paul
>Assignee: Yonik Seeley
> Fix For: 1.4
>
> Attachments: ConcurrentLRUCache.java, ConcurrentLRUCache.java, 
> ConcurrentLRUCache.java, SOLR-667-alternate.patch, SOLR-667-alternate.patch, 
> SOLR-667-updates.patch, SOLR-667.patch, SOLR-667.patch, SOLR-667.patch, 
> SOLR-667.patch, SOLR-667.patch, SOLR-667.patch, SOLR-667.patch, 
> SOLR-667.patch, SOLR-667.patch, SOLR-667.patch
>
>
> The only available SolrCache, i.e. LRUCache, is based on _LinkedHashMap_ which 
> has _get()_ also synchronized. This can cause severe bottlenecks for faceted 
> search. Any alternate implementation which can be faster/better must be 
> considered. 
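
(For the flavor of the alternative: a get() with no synchronization, as in the 
sketch below, at the cost of only approximate LRU eviction. Names and eviction 
policy here are illustrative only, not the actual SOLR-667 patch:)

import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// Illustrative sketch only - not the SOLR-667 ConcurrentLRUCache.
public class ApproxLRUCache<K, V> {
  private static class Entry<V> {
    final V value;
    volatile long lastAccess;
    Entry(V value, long t) { this.value = value; this.lastAccess = t; }
  }

  private final ConcurrentHashMap<K, Entry<V>> map = new ConcurrentHashMap<K, Entry<V>>();
  private final AtomicLong clock = new AtomicLong();
  private final int maxSize;

  public ApproxLRUCache(int maxSize) { this.maxSize = maxSize; }

  public V get(K key) {
    Entry<V> e = map.get(key);  // lock-free read, unlike a synchronized LinkedHashMap
    if (e == null) return null;
    e.lastAccess = clock.incrementAndGet();
    return e.value;
  }

  public void put(K key, V value) {
    map.put(key, new Entry<V>(value, clock.incrementAndGet()));
    if (map.size() > maxSize) evictOldest();
  }

  // O(n) scan; a real implementation would amortize or batch this work.
  private void evictOldest() {
    K oldestKey = null;
    long oldest = Long.MAX_VALUE;
    for (Map.Entry<K, Entry<V>> e : map.entrySet()) {
      if (e.getValue().lastAccess < oldest) {
        oldest = e.getValue().lastAccess;
        oldestKey = e.getKey();
      }
    }
    if (oldestKey != null) map.remove(oldestKey);
  }
}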

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (SOLR-859) solr-ruby rewrite, jruby support

2008-11-14 Thread Noble Paul (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12647799#action_12647799
 ] 

Noble Paul commented on SOLR-859:
-

Why can't the JRuby version wrap SolrJ? That way we get the performance 
benefits of wt=javabin.

> solr-ruby rewrite, jruby support
> 
>
> Key: SOLR-859
> URL: https://issues.apache.org/jira/browse/SOLR-859
> Project: Solr
>  Issue Type: Improvement
>  Components: clients - ruby - flare
>Affects Versions: 1.3
>Reporter: Matt Mitchell
> Attachments: rewrite.1.patch
>
>
> Here is a re-write of the solr ruby client library. This is a simplified set 
> of features, but really only missing the mapper/indexer code that's present 
> in the current solr-ruby. Bringing those in will be the next step.
> Tests are in test. Please read the README, there are instructions for running 
> the tests.
> A Solr 1.3 distribution is included for tests.
> * find patch in file uploads "rewrite.1.patch"
> - Matt Mitchell

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (SOLR-859) solr-ruby rewrite, jruby support

2008-11-14 Thread Noble Paul (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-859?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Noble Paul updated SOLR-859:


Comment: was deleted

> solr-ruby rewrite, jruby support
> 
>
> Key: SOLR-859
> URL: https://issues.apache.org/jira/browse/SOLR-859
> Project: Solr
>  Issue Type: Improvement
>  Components: clients - ruby - flare
>Affects Versions: 1.3
>Reporter: Matt Mitchell
> Attachments: rewrite.1.patch
>
>
> Here is a re-write of the solr ruby client library. This is a simplified set 
> of features, but really only missing the mapper/indexer code that's present 
> in the current solr-ruby. Bringing those in will be the next step.
> Tests are in test. Please read the README, there are instructions for running 
> the tests.
> A Solr 1.3 distribution is included for tests.
> * find patch in file uploads "rewrite.1.patch"
> - Matt Mitchell

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Created: (SOLR-861) SOLRJ Client does not release connections 'nicely' by default

2008-11-14 Thread Ian Holsman (JIRA)
SOLRJ Client does not release connections 'nicely' by default
-

 Key: SOLR-861
 URL: https://issues.apache.org/jira/browse/SOLR-861
 Project: Solr
  Issue Type: Bug
  Components: clients - java
Affects Versions: 1.3
 Environment: linux
Reporter: Ian Holsman


As-is, the SolrJ CommonsHttpSolrServer uses the multi-threaded HTTP connection 
manager. This manager seems to keep the connection alive for the client and 
does not close it when the object is dereferenced.

When you keep opening new CommonsHttpSolrServer instances, each one results in 
a socket that is stuck in the CLOSE_WAIT state. Eventually this will use up all 
your available file handles, causing your client to die a painful death.

The solution I propose is that, when you don't specify an HttpClient, it use a 
SimpleHttpConnectionManager that is set to not reuse connections.
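
(A sketch of what the proposed default amounts to, and of the workaround 
available today: pass in your own HttpClient. Commons HttpClient 3.1 API; the 
URL and the exact CommonsHttpSolrServer constructor used are assumptions:)

import org.apache.commons.httpclient.HttpClient;
import org.apache.commons.httpclient.SimpleHttpConnectionManager;
import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;

// Sketch; constructor availability assumed (SolrJ 1.3 / HttpClient 3.1).
public class SimpleClientSketch {
  public static void main(String[] args) throws Exception {
    // 'true' = always close the connection when it is released, so no
    // sockets linger in CLOSE_WAIT after the server object is dropped.
    SimpleHttpConnectionManager mgr = new SimpleHttpConnectionManager(true);
    HttpClient client = new HttpClient(mgr);
    CommonsHttpSolrServer server =
        new CommonsHttpSolrServer("http://localhost:8983/solr", client);
    System.out.println(server.ping().getStatus());
  }
}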

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (SOLR-861) SOLRJ Client does not release connections 'nicely' by default

2008-11-14 Thread Ian Holsman (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-861?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ian Holsman updated SOLR-861:
-

Attachment: SimpleClient.patch

patch to change the default HttpClient SolrJ uses to be a more conservative 
option

> SOLRJ Client does not release connections 'nicely' by default
> -
>
> Key: SOLR-861
> URL: https://issues.apache.org/jira/browse/SOLR-861
> Project: Solr
>  Issue Type: Bug
>  Components: clients - java
>Affects Versions: 1.3
> Environment: linux
>Reporter: Ian Holsman
> Attachments: SimpleClient.patch
>
>
> as-is the SolrJ Commons HttpServer uses the multi-threaded http connection 
> manager. This manager seems to keep the connection alive for the client and 
> does not close it when the object is dereferenced.
> When you keep on opening new CommonsHttpSolrServer instances it results in a 
> socket that is stuck in the CLOSE_WAIT state. Eventually this will use up all 
> your available file handles, causing your client to die a painful death.
> The solution I propose is that it uses a 'Simple' HttpConnectionManager which 
> is set to not reuse connections if you don't specify a HttpClient.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (SOLR-853) Make DIH API friendly

2008-11-14 Thread Noble Paul (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-853?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Noble Paul updated SOLR-853:


Attachment: SOLR-853.patch

Removes dependency on non-common util directly. 

Need to add {{LuceneDataImporter}} and {{SolrJDataImporter }} and give examples 
on how this can be used externally

> Make DIH API friendly
> -
>
> Key: SOLR-853
> URL: https://issues.apache.org/jira/browse/SOLR-853
> Project: Solr
>  Issue Type: Improvement
>  Components: contrib - DataImportHandler
>Reporter: Noble Paul
> Attachments: SOLR-853.patch
>
>
> DIH currently can only be run inside Solr. But the core of DIH is quite 
> independent of Solr. There are only a few points where it requires Solr core 
> classes.They can be isolated out and we have an API in hand. If we limit the 
> dependency down to common util then DIH can be used by 
>  * Lucene users directly
>  * Run DIH remotely with SolrJ
>  * By any other tools using Lucene as their underlying  datastore

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Issue Comment Edited: (SOLR-853) Make DIH API friendly

2008-11-14 Thread Noble Paul (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-853?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12647801#action_12647801
 ] 

noble.paul edited comment on SOLR-853 at 11/14/08 9:49 PM:
---

Removes dependency on non-common util directly. 

Need to add {{LuceneDataImporter}} and {{SolrJDataImporter}} and give examples 
on how this can be used externally

  was (Author: noble.paul):
Removes dependency on non-common util directly. 

Need to add {{LuceneDataImporter}} and {{SolrJDataImporter }} and give examples 
on how this can be used externally
  
> Make DIH API friendly
> -
>
> Key: SOLR-853
> URL: https://issues.apache.org/jira/browse/SOLR-853
> Project: Solr
>  Issue Type: Improvement
>  Components: contrib - DataImportHandler
>Reporter: Noble Paul
> Attachments: SOLR-853.patch
>
>
> DIH currently can only be run inside Solr. But the core of DIH is quite 
> independent of Solr. There are only a few points where it requires Solr core 
> classes. They can be isolated out and we have an API in hand. If we limit the 
> dependency down to common util then DIH can be used by 
>  * Lucene users directly
>  * Run DIH remotely with SolrJ
>  * By any other tools using Lucene as their underlying  datastore

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.