[jira] [Updated] (LUCENE-7768) Use a different stored field for highlighting

2017-04-06 Thread Julien Martin (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-7768?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Julien Martin updated LUCENE-7768:
--
Description: 
UnifiedHighlighter uses stored field content to highlight. This has a 
disadvantage: with multilingual indexing the index grows quickly, because 
several fields have to be stored with the same content. 

Lucene portion of issue SOLR-1105, initially raised on Solr
See https://issues.apache.org/jira/browse/SOLR-1105

  was:
UnifiedHighlighter uses stored field content to highlight. This has a 
disadvantage: with multilingual indexing the index grows quickly, because 
several fields have to be stored with the same content. This patch allows 
DefaultSolrHighlighter to use a "contentField" attribute to look up content 
in an external field.

Lucene portion of issue SOLR-1105, initially raised on Solr
See https://issues.apache.org/jira/browse/SOLR-1105


> Use a different stored field for highlighting
> -
>
> Key: LUCENE-7768
> URL: https://issues.apache.org/jira/browse/LUCENE-7768
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: modules/highlighter
>Reporter: Julien Martin
>
> UnifiedHighlighter uses stored field content to highlight. This has a 
> disadvantage: with multilingual indexing the index grows quickly, because 
> several fields have to be stored with the same content. 
> Lucene portion of issue SOLR-1105, initially raised on Solr
> See https://issues.apache.org/jira/browse/SOLR-1105



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-1105) Use a different stored field for highlighting

2017-04-06 Thread Julien Martin (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15958671#comment-15958671
 ] 

Julien Martin commented on SOLR-1105:
-

Thanks for your comments David.

Automatic highlighting on copyField targets would be nice indeed.

I just created the Lucene portion issue at 
https://issues.apache.org/jira/browse/LUCENE-7768

> Use a different stored field for highlighting
> -
>
> Key: SOLR-1105
> URL: https://issues.apache.org/jira/browse/SOLR-1105
> Project: Solr
>  Issue Type: Improvement
>  Components: highlighter
>Reporter: Dmitry Lihachev
>Assignee: David Smiley
> Attachments: SOLR-1105-1_4_1.patch, SOLR-1105.patch, 
> SOLR-1105_shared_content_field_1.3.0.patch
>
>
> DefaultSolrHighlighter uses stored field content to highlight. This has a 
> disadvantage: with multilingual indexing the index grows quickly, because 
> several fields have to be stored with the same content. This patch allows 
> DefaultSolrHighlighter to use a "contentField" attribute to look up content 
> in an external field.
> Excerpt from old schema:
> {code:xml}
> 
> 
> 
> 
> {code}
> The same after patching, highlighter will now get content stored in "title" 
> field
> {code:xml}
> 
>  contentField="title"/>
>  contentField="title"/>
>  contentField="title"/>
> {code}
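
The lookup described in the issue can be sketched in plain Java, with no Solr or Lucene dependency. This is only an illustration under stated assumptions, not the code from the attached patch: the class `ContentFieldResolver`, its methods, and the field names `title_en` and `title_ru` are hypothetical. The idea is that a highlighted field may name a "contentField", and the text to highlight is then read from that stored field, falling back to the field itself when no "contentField" is configured.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical helper: maps a highlighted field to the stored field that holds its text. */
final class ContentFieldResolver {
    // fieldName -> contentField, as it might be read from schema attributes (assumption)
    private final Map<String, String> contentFields = new HashMap<>();

    /** Records that fieldName should be highlighted from the text stored in contentField. */
    void register(String fieldName, String contentField) {
        contentFields.put(fieldName, contentField);
    }

    /** Returns the stored field to read text from; defaults to the field itself. */
    String resolve(String fieldName) {
        return contentFields.getOrDefault(fieldName, fieldName);
    }

    /** Looks up the text to highlight for fieldName among a document's stored fields. */
    String storedText(Map<String, String> storedFields, String fieldName) {
        return storedFields.get(resolve(fieldName));
    }

    public static void main(String[] args) {
        ContentFieldResolver resolver = new ContentFieldResolver();
        resolver.register("title_en", "title");
        resolver.register("title_ru", "title");

        // Only "title" is stored; the language-specific fields are indexed but not stored.
        Map<String, String> storedFields = new HashMap<>();
        storedFields.put("title", "Some stored title");

        System.out.println(resolver.storedText(storedFields, "title_ru"));
    }
}
```

With this arrangement the content is stored once, and each language-specific field only needs to be indexed.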






[jira] [Created] (LUCENE-7768) Use a different stored field for highlighting

2017-04-06 Thread Julien Martin (JIRA)
Julien Martin created LUCENE-7768:
-

 Summary: Use a different stored field for highlighting
 Key: LUCENE-7768
 URL: https://issues.apache.org/jira/browse/LUCENE-7768
 Project: Lucene - Core
  Issue Type: Improvement
  Components: modules/highlighter
Reporter: Julien Martin


UnifiedHighlighter uses stored field content to highlight. This has a 
disadvantage: with multilingual indexing the index grows quickly, because 
several fields have to be stored with the same content. This patch allows 
DefaultSolrHighlighter to use a "contentField" attribute to look up content 
in an external field.

Lucene portion of issue SOLR-1105, initially raised on Solr
See https://issues.apache.org/jira/browse/SOLR-1105






[jira] [Updated] (SOLR-1105) Use a different stored field for highlighting

2017-03-25 Thread Julien Martin (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1105?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Julien Martin updated SOLR-1105:

Attachment: SOLR-1105.patch

> Use a different stored field for highlighting
> -
>
> Key: SOLR-1105
> URL: https://issues.apache.org/jira/browse/SOLR-1105
> Project: Solr
>  Issue Type: Improvement
>  Components: highlighter
>Reporter: Dmitry Lihachev
>Assignee: David Smiley
> Attachments: SOLR-1105-1_4_1.patch, SOLR-1105.patch, 
> SOLR-1105_shared_content_field_1.3.0.patch
>
>
> DefaultSolrHighlighter uses stored field content to highlight. This has a 
> disadvantage: with multilingual indexing the index grows quickly, because 
> several fields have to be stored with the same content. This patch allows 
> DefaultSolrHighlighter to use a "contentField" attribute to look up content 
> in an external field.
> Excerpt from old schema:
> {code:xml}
> 
> 
> 
> 
> {code}
> The same after patching, highlighter will now get content stored in "title" 
> field
> {code:xml}
> 
>  contentField="title"/>
>  contentField="title"/>
>  contentField="title"/>
> {code}






[jira] [Updated] (SOLR-1105) Use a different stored field for highlighting

2017-03-25 Thread Julien Martin (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1105?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Julien Martin updated SOLR-1105:

Attachment: (was: SOLR-1105.patch)

> Use a different stored field for highlighting
> -
>
> Key: SOLR-1105
> URL: https://issues.apache.org/jira/browse/SOLR-1105
> Project: Solr
>  Issue Type: Improvement
>  Components: highlighter
>Reporter: Dmitry Lihachev
>Assignee: David Smiley
> Attachments: SOLR-1105-1_4_1.patch, 
> SOLR-1105_shared_content_field_1.3.0.patch
>
>
> DefaultSolrHighlighter uses stored field content to highlight. This has a 
> disadvantage: with multilingual indexing the index grows quickly, because 
> several fields have to be stored with the same content. This patch allows 
> DefaultSolrHighlighter to use a "contentField" attribute to look up content 
> in an external field.
> Excerpt from old schema:
> {code:xml}
> 
> 
> 
> 
> {code}
> The same after patching, highlighter will now get content stored in "title" 
> field
> {code:xml}
> 
>  contentField="title"/>
>  contentField="title"/>
>  contentField="title"/>
> {code}






[jira] [Updated] (SOLR-1105) Use a different stored field for highlighting

2017-03-24 Thread Julien Martin (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1105?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Julien Martin updated SOLR-1105:

Attachment: SOLR-1105.patch

> Use a different stored field for highlighting
> -
>
> Key: SOLR-1105
> URL: https://issues.apache.org/jira/browse/SOLR-1105
> Project: Solr
>  Issue Type: Improvement
>  Components: highlighter
>Reporter: Dmitry Lihachev
>Assignee: David Smiley
> Attachments: SOLR-1105-1_4_1.patch, SOLR-1105.patch, 
> SOLR-1105_shared_content_field_1.3.0.patch
>
>
> DefaultSolrHighlighter uses stored field content to highlight. This has a 
> disadvantage: with multilingual indexing the index grows quickly, because 
> several fields have to be stored with the same content. This patch allows 
> DefaultSolrHighlighter to use a "contentField" attribute to look up content 
> in an external field.
> Excerpt from old schema:
> {code:xml}
> 
> 
> 
> 
> {code}
> The same after patching, highlighter will now get content stored in "title" 
> field
> {code:xml}
> 
>  contentField="title"/>
>  contentField="title"/>
>  contentField="title"/>
> {code}






[jira] [Updated] (SOLR-1105) Use a different stored field for highlighting

2017-03-24 Thread Julien Martin (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1105?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Julien Martin updated SOLR-1105:

Attachment: (was: SOLR-1105.patch)

> Use a different stored field for highlighting
> -
>
> Key: SOLR-1105
> URL: https://issues.apache.org/jira/browse/SOLR-1105
> Project: Solr
>  Issue Type: Improvement
>  Components: highlighter
>Reporter: Dmitry Lihachev
>Assignee: David Smiley
> Attachments: SOLR-1105-1_4_1.patch, SOLR-1105.patch, 
> SOLR-1105_shared_content_field_1.3.0.patch
>
>
> DefaultSolrHighlighter uses stored field content to highlight. This has a 
> disadvantage: with multilingual indexing the index grows quickly, because 
> several fields have to be stored with the same content. This patch allows 
> DefaultSolrHighlighter to use a "contentField" attribute to look up content 
> in an external field.
> Excerpt from old schema:
> {code:xml}
> 
> 
> 
> 
> {code}
> The same after patching, highlighter will now get content stored in "title" 
> field
> {code:xml}
> 
>  contentField="title"/>
>  contentField="title"/>
>  contentField="title"/>
> {code}






[jira] [Commented] (SOLR-1105) Use a different stored field for highlighting

2017-03-24 Thread Julien Martin (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15940671#comment-15940671
 ] 

Julien Martin commented on SOLR-1105:
-

Thank you for looking at it David! We really need the feature over here :)

As for unique field loading, my understanding is that the stored fields visitor 
pattern applied to the index searcher object ensures that no field is loaded 
twice per document. 

But this was a good point anyway because I had other issues with multiple 
fields highlighting which I solved in a new version of the patch you can find 
attached here.

Sincerely,
Julien

> Use a different stored field for highlighting
> -
>
> Key: SOLR-1105
> URL: https://issues.apache.org/jira/browse/SOLR-1105
> Project: Solr
>  Issue Type: Improvement
>  Components: highlighter
>Reporter: Dmitry Lihachev
>Assignee: David Smiley
> Attachments: SOLR-1105-1_4_1.patch, SOLR-1105.patch, 
> SOLR-1105_shared_content_field_1.3.0.patch
>
>
> DefaultSolrHighlighter uses stored field content to highlight. This has a 
> disadvantage: with multilingual indexing the index grows quickly, because 
> several fields have to be stored with the same content. This patch allows 
> DefaultSolrHighlighter to use a "contentField" attribute to look up content 
> in an external field.
> Excerpt from old schema:
> {code:xml}
> 
> 
> 
> 
> {code}
> The same after patching, highlighter will now get content stored in "title" 
> field
> {code:xml}
> 
>  contentField="title"/>
>  contentField="title"/>
>  contentField="title"/>
> {code}






[jira] [Updated] (SOLR-1105) Use a different stored field for highlighting

2017-03-24 Thread Julien Martin (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1105?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Julien Martin updated SOLR-1105:

Attachment: (was: SOLR-1105.patch)

> Use a different stored field for highlighting
> -
>
> Key: SOLR-1105
> URL: https://issues.apache.org/jira/browse/SOLR-1105
> Project: Solr
>  Issue Type: Improvement
>  Components: highlighter
>Reporter: Dmitry Lihachev
>Assignee: David Smiley
> Attachments: SOLR-1105-1_4_1.patch, SOLR-1105.patch, 
> SOLR-1105_shared_content_field_1.3.0.patch
>
>
> DefaultSolrHighlighter uses stored field content to highlight. This has a 
> disadvantage: with multilingual indexing the index grows quickly, because 
> several fields have to be stored with the same content. This patch allows 
> DefaultSolrHighlighter to use a "contentField" attribute to look up content 
> in an external field.
> Excerpt from old schema:
> {code:xml}
> 
> 
> 
> 
> {code}
> The same after patching, highlighter will now get content stored in "title" 
> field
> {code:xml}
> 
>  contentField="title"/>
>  contentField="title"/>
>  contentField="title"/>
> {code}






[jira] [Updated] (SOLR-1105) Use a different stored field for highlighting

2017-03-24 Thread Julien Martin (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1105?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Julien Martin updated SOLR-1105:

Attachment: SOLR-1105.patch

> Use a different stored field for highlighting
> -
>
> Key: SOLR-1105
> URL: https://issues.apache.org/jira/browse/SOLR-1105
> Project: Solr
>  Issue Type: Improvement
>  Components: highlighter
>Reporter: Dmitry Lihachev
>Assignee: David Smiley
> Attachments: SOLR-1105-1_4_1.patch, SOLR-1105.patch, 
> SOLR-1105_shared_content_field_1.3.0.patch
>
>
> DefaultSolrHighlighter uses stored field content to highlight. This has a 
> disadvantage: with multilingual indexing the index grows quickly, because 
> several fields have to be stored with the same content. This patch allows 
> DefaultSolrHighlighter to use a "contentField" attribute to look up content 
> in an external field.
> Excerpt from old schema:
> {code:xml}
> 
> 
> 
> 
> {code}
> The same after patching, highlighter will now get content stored in "title" 
> field
> {code:xml}
> 
>  contentField="title"/>
>  contentField="title"/>
>  contentField="title"/>
> {code}






[jira] [Updated] (SOLR-1105) Use a different stored field for highlighting

2017-03-22 Thread Julien Martin (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1105?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Julien Martin updated SOLR-1105:

Attachment: (was: SOLR-1105.patch)

> Use a different stored field for highlighting
> -
>
> Key: SOLR-1105
> URL: https://issues.apache.org/jira/browse/SOLR-1105
> Project: Solr
>  Issue Type: Improvement
>  Components: highlighter
>Reporter: Dmitry Lihachev
> Attachments: SOLR-1105-1_4_1.patch, SOLR-1105.patch, 
> SOLR-1105_shared_content_field_1.3.0.patch
>
>
> DefaultSolrHighlighter uses stored field content to highlight. This has a 
> disadvantage: with multilingual indexing the index grows quickly, because 
> several fields have to be stored with the same content. This patch allows 
> DefaultSolrHighlighter to use a "contentField" attribute to look up content 
> in an external field.
> Excerpt from old schema:
> {code:xml}
> 
> 
> 
> 
> {code}
> The same after patching, highlighter will now get content stored in "title" 
> field
> {code:xml}
> 
>  contentField="title"/>
>  contentField="title"/>
>  contentField="title"/>
> {code}






[jira] [Updated] (SOLR-1105) Use a different stored field for highlighting

2017-03-22 Thread Julien Martin (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1105?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Julien Martin updated SOLR-1105:

Attachment: SOLR-1105.patch

> Use a different stored field for highlighting
> -
>
> Key: SOLR-1105
> URL: https://issues.apache.org/jira/browse/SOLR-1105
> Project: Solr
>  Issue Type: Improvement
>  Components: highlighter
>Reporter: Dmitry Lihachev
> Attachments: SOLR-1105-1_4_1.patch, SOLR-1105.patch, 
> SOLR-1105_shared_content_field_1.3.0.patch
>
>
> DefaultSolrHighlighter uses stored field content to highlight. This has a 
> disadvantage: with multilingual indexing the index grows quickly, because 
> several fields have to be stored with the same content. This patch allows 
> DefaultSolrHighlighter to use a "contentField" attribute to look up content 
> in an external field.
> Excerpt from old schema:
> {code:xml}
> 
> 
> 
> 
> {code}
> The same after patching, highlighter will now get content stored in "title" 
> field
> {code:xml}
> 
>  contentField="title"/>
>  contentField="title"/>
>  contentField="title"/>
> {code}






[jira] [Comment Edited] (SOLR-1105) Use a different stored field for highlighting

2017-03-22 Thread Julien Martin (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15936120#comment-15936120
 ] 

Julien Martin edited comment on SOLR-1105 at 3/22/17 11:05 AM:
---

Here is a patch proposal (SOLR-1105.patch).


was (Author: julm):
Here is a patch proposal.

> Use a different stored field for highlighting
> -
>
> Key: SOLR-1105
> URL: https://issues.apache.org/jira/browse/SOLR-1105
> Project: Solr
>  Issue Type: Improvement
>  Components: highlighter
>Reporter: Dmitry Lihachev
> Attachments: SOLR-1105-1_4_1.patch, SOLR-1105.patch, 
> SOLR-1105_shared_content_field_1.3.0.patch
>
>
> DefaultSolrHighlighter uses stored field content to highlight. This has a 
> disadvantage: with multilingual indexing the index grows quickly, because 
> several fields have to be stored with the same content. This patch allows 
> DefaultSolrHighlighter to use a "contentField" attribute to look up content 
> in an external field.
> Excerpt from old schema:
> {code:xml}
> 
> 
> 
> 
> {code}
> The same after patching, highlighter will now get content stored in "title" 
> field
> {code:xml}
> 
>  contentField="title"/>
>  contentField="title"/>
>  contentField="title"/>
> {code}






[jira] [Updated] (SOLR-1105) Use a different stored field for highlighting

2017-03-22 Thread Julien Martin (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1105?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Julien Martin updated SOLR-1105:

Attachment: (was: SOLR-1105.patch)

> Use a different stored field for highlighting
> -
>
> Key: SOLR-1105
> URL: https://issues.apache.org/jira/browse/SOLR-1105
> Project: Solr
>  Issue Type: Improvement
>  Components: highlighter
>Reporter: Dmitry Lihachev
> Attachments: SOLR-1105-1_4_1.patch, SOLR-1105.patch, 
> SOLR-1105_shared_content_field_1.3.0.patch
>
>
> DefaultSolrHighlighter uses stored field content to highlight. This has a 
> disadvantage: with multilingual indexing the index grows quickly, because 
> several fields have to be stored with the same content. This patch allows 
> DefaultSolrHighlighter to use a "contentField" attribute to look up content 
> in an external field.
> Excerpt from old schema:
> {code:xml}
> 
> 
> 
> 
> {code}
> The same after patching, highlighter will now get content stored in "title" 
> field
> {code:xml}
> 
>  contentField="title"/>
>  contentField="title"/>
>  contentField="title"/>
> {code}






[jira] [Issue Comment Deleted] (SOLR-1105) Use a different stored field for highlighting

2017-03-22 Thread Julien Martin (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1105?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Julien Martin updated SOLR-1105:

Comment: was deleted

(was: Here is a patch proposal.)

> Use a different stored field for highlighting
> -
>
> Key: SOLR-1105
> URL: https://issues.apache.org/jira/browse/SOLR-1105
> Project: Solr
>  Issue Type: Improvement
>  Components: highlighter
>Reporter: Dmitry Lihachev
> Attachments: SOLR-1105-1_4_1.patch, SOLR-1105.patch, 
> SOLR-1105_shared_content_field_1.3.0.patch
>
>
> DefaultSolrHighlighter uses stored field content to highlight. This has a 
> disadvantage: with multilingual indexing the index grows quickly, because 
> several fields have to be stored with the same content. This patch allows 
> DefaultSolrHighlighter to use a "contentField" attribute to look up content 
> in an external field.
> Excerpt from old schema:
> {code:xml}
> 
> 
> 
> 
> {code}
> The same after patching, highlighter will now get content stored in "title" 
> field
> {code:xml}
> 
>  contentField="title"/>
>  contentField="title"/>
>  contentField="title"/>
> {code}






[jira] [Updated] (SOLR-1105) Use a different stored field for highlighting

2017-03-22 Thread Julien Martin (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1105?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Julien Martin updated SOLR-1105:

Attachment: (was: SOLR-1105.patch)

> Use a different stored field for highlighting
> -
>
> Key: SOLR-1105
> URL: https://issues.apache.org/jira/browse/SOLR-1105
> Project: Solr
>  Issue Type: Improvement
>  Components: highlighter
>Reporter: Dmitry Lihachev
> Attachments: SOLR-1105-1_4_1.patch, SOLR-1105.patch, 
> SOLR-1105_shared_content_field_1.3.0.patch
>
>
> DefaultSolrHighlighter uses stored field content to highlight. This has a 
> disadvantage: with multilingual indexing the index grows quickly, because 
> several fields have to be stored with the same content. This patch allows 
> DefaultSolrHighlighter to use a "contentField" attribute to look up content 
> in an external field.
> Excerpt from old schema:
> {code:xml}
> 
> 
> 
> 
> {code}
> The same after patching, highlighter will now get content stored in "title" 
> field
> {code:xml}
> 
>  contentField="title"/>
>  contentField="title"/>
>  contentField="title"/>
> {code}






[jira] [Issue Comment Deleted] (SOLR-1105) Use a different stored field for highlighting

2017-03-22 Thread Julien Martin (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1105?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Julien Martin updated SOLR-1105:

Comment: was deleted

(was: Here is a patch proposal.)

> Use a different stored field for highlighting
> -
>
> Key: SOLR-1105
> URL: https://issues.apache.org/jira/browse/SOLR-1105
> Project: Solr
>  Issue Type: Improvement
>  Components: highlighter
>Reporter: Dmitry Lihachev
> Attachments: SOLR-1105-1_4_1.patch, SOLR-1105.patch, 
> SOLR-1105_shared_content_field_1.3.0.patch
>
>
> DefaultSolrHighlighter uses stored field content to highlight. This has a 
> disadvantage: with multilingual indexing the index grows quickly, because 
> several fields have to be stored with the same content. This patch allows 
> DefaultSolrHighlighter to use a "contentField" attribute to look up content 
> in an external field.
> Excerpt from old schema:
> {code:xml}
> 
> 
> 
> 
> {code}
> The same after patching, highlighter will now get content stored in "title" 
> field
> {code:xml}
> 
>  contentField="title"/>
>  contentField="title"/>
>  contentField="title"/>
> {code}






[jira] [Updated] (SOLR-1105) Use a different stored field for highlighting

2017-03-22 Thread Julien Martin (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1105?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Julien Martin updated SOLR-1105:

Attachment: SOLR-1105.patch

Here is a patch proposal.

> Use a different stored field for highlighting
> -
>
> Key: SOLR-1105
> URL: https://issues.apache.org/jira/browse/SOLR-1105
> Project: Solr
>  Issue Type: Improvement
>  Components: highlighter
>Reporter: Dmitry Lihachev
> Attachments: SOLR-1105-1_4_1.patch, SOLR-1105.patch, 
> SOLR-1105_shared_content_field_1.3.0.patch
>
>
> DefaultSolrHighlighter uses stored field content to highlight. This has a 
> disadvantage: with multilingual indexing the index grows quickly, because 
> several fields have to be stored with the same content. This patch allows 
> DefaultSolrHighlighter to use a "contentField" attribute to look up content 
> in an external field.
> Excerpt from old schema:
> {code:xml}
> 
> 
> 
> 
> {code}
> The same after patching, highlighter will now get content stored in "title" 
> field
> {code:xml}
> 
>  contentField="title"/>
>  contentField="title"/>
>  contentField="title"/>
> {code}






[jira] [Updated] (SOLR-1105) Use a different stored field for highlighting

2017-03-22 Thread Julien Martin (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1105?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Julien Martin updated SOLR-1105:

Attachment: SOLR-1105.patch

Here is a patch proposal.

> Use a different stored field for highlighting
> -
>
> Key: SOLR-1105
> URL: https://issues.apache.org/jira/browse/SOLR-1105
> Project: Solr
>  Issue Type: Improvement
>  Components: highlighter
>Reporter: Dmitry Lihachev
> Attachments: SOLR-1105-1_4_1.patch, SOLR-1105.patch, SOLR-1105.patch, 
> SOLR-1105_shared_content_field_1.3.0.patch
>
>
> DefaultSolrHighlighter uses stored field content to highlight. This has a 
> disadvantage: with multilingual indexing the index grows quickly, because 
> several fields have to be stored with the same content. This patch allows 
> DefaultSolrHighlighter to use a "contentField" attribute to look up content 
> in an external field.
> Excerpt from old schema:
> {code:xml}
> 
> 
> 
> 
> {code}
> The same after patching, highlighter will now get content stored in "title" 
> field
> {code:xml}
> 
>  contentField="title"/>
>  contentField="title"/>
>  contentField="title"/>
> {code}






[jira] [Updated] (SOLR-1105) Use a different stored field for highlighting

2017-03-22 Thread Julien Martin (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1105?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Julien Martin updated SOLR-1105:

Attachment: SOLR-1105.patch

Here is a patch proposal.

> Use a different stored field for highlighting
> -
>
> Key: SOLR-1105
> URL: https://issues.apache.org/jira/browse/SOLR-1105
> Project: Solr
>  Issue Type: Improvement
>  Components: highlighter
>Reporter: Dmitry Lihachev
> Attachments: SOLR-1105-1_4_1.patch, SOLR-1105.patch, 
> SOLR-1105_shared_content_field_1.3.0.patch
>
>
> DefaultSolrHighlighter uses stored field content to highlight. This has a 
> disadvantage: with multilingual indexing the index grows quickly, because 
> several fields have to be stored with the same content. This patch allows 
> DefaultSolrHighlighter to use a "contentField" attribute to look up content 
> in an external field.
> Excerpt from old schema:
> {code:xml}
> 
> 
> 
> 
> {code}
> The same after patching, highlighter will now get content stored in "title" 
> field
> {code:xml}
> 
>  contentField="title"/>
>  contentField="title"/>
>  contentField="title"/>
> {code}






[jira] [Commented] (SOLR-1105) Use a different stored field for highlighting

2017-03-22 Thread Julien Martin (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15936109#comment-15936109
 ] 

Julien Martin commented on SOLR-1105:
-

Here is a patch proposal for 

> Use a different stored field for highlighting
> -
>
> Key: SOLR-1105
> URL: https://issues.apache.org/jira/browse/SOLR-1105
> Project: Solr
>  Issue Type: Improvement
>  Components: highlighter
>Reporter: Dmitry Lihachev
> Attachments: SOLR-1105-1_4_1.patch, 
> SOLR-1105_shared_content_field_1.3.0.patch
>
>
> DefaultSolrHighlighter uses stored field content to highlight. This has a 
> disadvantage: with multilingual indexing the index grows quickly, because 
> several fields have to be stored with the same content. This patch allows 
> DefaultSolrHighlighter to use a "contentField" attribute to look up content 
> in an external field.
> Excerpt from old schema:
> {code:xml}
> 
> 
> 
> 
> {code}
> The same after patching, highlighter will now get content stored in "title" 
> field
> {code:xml}
> 
>  contentField="title"/>
>  contentField="title"/>
>  contentField="title"/>
> {code}






[jira] [Issue Comment Deleted] (SOLR-1105) Use a different stored field for highlighting

2017-03-22 Thread Julien Martin (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1105?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Julien Martin updated SOLR-1105:

Comment: was deleted

(was: Here is a patch proposal for )

> Use a different stored field for highlighting
> -
>
> Key: SOLR-1105
> URL: https://issues.apache.org/jira/browse/SOLR-1105
> Project: Solr
>  Issue Type: Improvement
>  Components: highlighter
>Reporter: Dmitry Lihachev
> Attachments: SOLR-1105-1_4_1.patch, 
> SOLR-1105_shared_content_field_1.3.0.patch
>
>
> DefaultSolrHighlighter uses stored field content to highlight. This has a 
> disadvantage: with multilingual indexing the index grows quickly, because 
> several fields have to be stored with the same content. This patch allows 
> DefaultSolrHighlighter to use a "contentField" attribute to look up content 
> in an external field.
> Excerpt from old schema:
> {code:xml}
> 
> 
> 
> 
> {code}
> The same after patching, highlighter will now get content stored in "title" 
> field
> {code:xml}
> 
>  contentField="title"/>
>  contentField="title"/>
>  contentField="title"/>
> {code}






[jira] [Created] (SOLR-10128) langid.map.individual set to "true" is ignored

2017-02-12 Thread William Martin (JIRA)
William Martin created SOLR-10128:
-

 Summary: langid.map.individual set to "true" is ignored
 Key: SOLR-10128
 URL: https://issues.apache.org/jira/browse/SOLR-10128
 Project: Solr
  Issue Type: Bug
  Security Level: Public (Default Security Level. Issues are Public)
 Environment: Solr 6.0.4+
Reporter: William Martin
Priority: Minor


The 
org.apache.solr.update.processor.LangDetectLanguageIdentifierUpdateProcessor 
has a bug in it where it does not respect the "langid.map.individual" parameter 
in solrconfig.xml. The documentation for langid.map.individual specifies:
{quote}
If you require detecting languages separately for each field, supply 
langid.map.individual=true. The supplied fields will then be renamed according 
to detected language on an individual field basis.
{quote}
However, when this parameter is set to "true", the fields are still mapped to 
the language code of the entire document. For example, with the following snippet:
{code:xml|title=solrconfig.xml}

   
 title,text
 language_s
 true
 true
   

{code}
a document that takes the form
{code:javascript}
{
  "title": "This is an English title",
  "text": "Pero el texto de este documento está en español."
}
{code}
will be turned into
{code:javascript}
{
  "title_es": "This is an English title",
  "text_es": "Pero el texto de este documento está en español.",
  "language_s": ["es"]
}
{code}
rather than
{code:javascript}
{
  "title_en": "This is an English title",
  "text_es": "Pero el texto de este documento está en español.",
  "language_s": ["es","en"]
}
{code}
during processing.

This bug seems to have been introduced in SOLR-3881 when the abstract method
{code:java|title=LangDetectLanguageIdentifierUpdateProcessor.java}
protected List<DetectedLanguage> detectLanguage(String content)
{code}
was changed to the signature
{code:java|title=LangDetectLanguageIdentifierUpdateProcessor.java}
protected List<DetectedLanguage> detectLanguage(SolrInputDocument doc)
{code}
which does not allow one to recognize individual fields while performing 
language detection. As it stands, the entire document is analyzed once per 
individual field (those listed in the "langid.fl" or "langid.map.individual.fl" 
parameters), and each field is mapped to the language of the entire document.
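
To make the difference concrete, here is a minimal, self-contained sketch (not the Solr implementation) that contrasts the two mapping strategies on the example document. The class `LangMapSketch`, its methods and the toy detector are assumptions made for illustration; the toy detector simply treats text containing the words "el" or "de" as Spanish.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;
import java.util.function.Function;

// Hypothetical sketch; none of these names exist in Solr.
class LangMapSketch {

    /** Toy detector (assumption): "es" if a Spanish marker word occurs, else "en". */
    static String toyDetect(String text) {
        String padded = " " + text.toLowerCase(Locale.ROOT) + " ";
        return (padded.contains(" el ") || padded.contains(" de ")) ? "es" : "en";
    }

    /** Reported behavior: every field is renamed with the language of the whole document. */
    static Map<String, String> mapByDocument(Map<String, String> doc,
                                             Function<String, String> detect) {
        String docLang = detect.apply(String.join(" ", doc.values()));
        Map<String, String> out = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : doc.entrySet()) {
            out.put(e.getKey() + "_" + docLang, e.getValue());
        }
        return out;
    }

    /** Documented behavior for langid.map.individual=true: detect per field. */
    static Map<String, String> mapIndividually(Map<String, String> doc,
                                               Function<String, String> detect) {
        Map<String, String> out = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : doc.entrySet()) {
            out.put(e.getKey() + "_" + detect.apply(e.getValue()), e.getValue());
        }
        return out;
    }
}
```

For a document with an English `title` and a Spanish `text`, `mapByDocument` produces `title_es` and `text_es`, which matches the reported output, whereas `mapIndividually` produces `title_en` and `text_es`. A fix would need the detection step to see one field's content at a time, which the `SolrInputDocument`-based signature does not provide by itself.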







[jira] [Updated] (LUCENE-7570) Tragic events during merges can lead to deadlock

2016-11-29 Thread AMIRAULT Martin (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-7570?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

AMIRAULT Martin updated LUCENE-7570:

Attachment: thread_dump.txt

Reproduced in production with Lucene 6.1.
Attaching an extract from the thread dump taken when it reproduced.

> Tragic events during merges can lead to deadlock
> 
>
> Key: LUCENE-7570
> URL: https://issues.apache.org/jira/browse/LUCENE-7570
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: core/index
>Affects Versions: 5.5, master (7.0)
>Reporter: Joey Echeverria
> Attachments: thread_dump.txt
>
>
> When an {{IndexWriter#commit()}} is stalled due to too many pending merges, 
> you can get a deadlock if the currently active merge thread hits a tragic 
> event.
> # The thread performing the commit synchronizes on the {{commitLock}} in 
> {{commitInternal}}.
> # The thread goes on to call {{ConcurrentMergeScheduler#doStall()}}, which 
> calls {{wait()}} on the {{ConcurrentMergeScheduler}} object. This releases the 
> merge scheduler's monitor lock, but not the {{commitLock}} in {{IndexWriter}}.
> # Sometime after this wait begins, the merge thread gets a tragic exception 
> and calls {{IndexWriter#tragicEvent()}}, which in turn calls 
> {{IndexWriter#rollbackInternal()}}.
> # {{IndexWriter#rollbackInternal()}} synchronizes on the {{commitLock}}, 
> which is still held by the committing thread from (1) above, which in turn is 
> waiting on the merge(s) to complete. Hence, deadlock.
> We hit this bug with Lucene 5.5, but I looked at the code in the master 
> branch and it looks like the deadlock still exists there as well.
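
The four steps above can be reproduced in miniature without Lucene. The following is only a sketch under stated assumptions: `CommitStallSketch`, its fields and its method are hypothetical stand-ins, and a `ReentrantLock` replaces the real `commitLock` monitor so that the second thread can give up after a timeout instead of hanging forever.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical model of the lock pattern; not Lucene code.
class CommitStallSketch {
    // stand-in for IndexWriter's commitLock
    private final ReentrantLock commitLock = new ReentrantLock();
    // stand-in for the ConcurrentMergeScheduler object that doStall() waits on
    private final Object mergeScheduler = new Object();

    /** Returns true if the "merge thread" obtained commitLock within the timeout. */
    boolean mergeThreadCanRollback(long timeoutMillis) {
        CountDownLatch committerStalled = new CountDownLatch(1);
        Thread committer = new Thread(() -> {
            commitLock.lock(); // step 1: the commit takes commitLock
            try {
                synchronized (mergeScheduler) {
                    committerStalled.countDown();
                    try {
                        // step 2: wait() releases the scheduler monitor, but not commitLock
                        while (true) {
                            mergeScheduler.wait();
                        }
                    } catch (InterruptedException e) {
                        // in this sketch only the caller ends the stall, by interrupting
                    }
                }
            } finally {
                commitLock.unlock();
            }
        });
        committer.setDaemon(true);
        committer.start();
        try {
            committerStalled.await();
            // steps 3 and 4: the tragic-event path needs commitLock to roll back
            boolean acquired = commitLock.tryLock(timeoutMillis, TimeUnit.MILLISECONDS);
            if (acquired) {
                commitLock.unlock();
            }
            return acquired;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            committer.interrupt(); // let the stalled thread exit
        }
    }
}
```

Calling `new CommitStallSketch().mergeThreadCanRollback(200)` returns `false`: the rollback side cannot get the lock while the committer is stalled. With intrinsic monitors, as in the real code, there is no timeout, so both threads would block indefinitely.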






[jira] [Updated] (LUCENE-7482) Faster sorted index search for reverse order search

2016-10-06 Thread AMIRAULT Martin (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-7482?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

AMIRAULT Martin updated LUCENE-7482:

Description: 
We are currently using Lucene at my company for our main product.
Our search functionality is quite basic, and the results are always sorted 
by a predefined field. The user can only choose the sort order 
(Asc/Desc).

I am currently investigating using the index sort feature with 
EarlyTerminationSortingCollector. 
It is a shame that searching a sorted index in reverse order has no 
optimization, and I was wondering if it would be possible to make it faster by 
creating a special "ReverseSortingCollector" for this purpose.

I am aware the posting list is designed to always be iterated in the same 
order, so this is not about early-terminating the search but about 
filtering out unneeded documents more efficiently.

If a segment is sorted in reverse order, we just have to delegate collection of 
the last matched documents.

Here is a quick code sample:

{code:title=ReverseSortingCollector.java|borderStyle=solid}
public class ReverseSortingCollector extends FilterCollector {

    /** Sort used to sort the search results */
    protected final Sort sort;
    /** Number of documents to collect in each segment */
    protected final int numDocsToCollect;

    [...]

    /** Buffers of reverse-sorted segments that still have to be flushed */
    private final List<FlushData> flushList = new ArrayList<>();

    private static final class FlushData {
        // ring buffer holding the last matched docIds of one segment
        final int[] buffer;

        // number of docs collected so far; index % buffer.length is the next
        // write slot and, once the buffer has wrapped, the oldest element
        int index;

        final LeafCollector leafCollector;

        FlushData(int[] buffer, LeafCollector leafCollector) {
            this.buffer = buffer;
            this.leafCollector = leafCollector;
        }
    }

    @Override
    public LeafCollector getLeafCollector(LeafReaderContext context) throws IOException {

        // flush previous data if any
        flush();

        LeafReader reader = context.reader();
        Sort segmentSort = reader.getIndexSort();
        // segment is sorted in the reverse order of the search sort
        if (isReverseOrder(sort, segmentSort)) {
            int[] buffer = new int[numDocsToCollect];
            Arrays.fill(buffer, -1);
            FlushData flushData = new FlushData(buffer, in.getLeafCollector(context));
            flushList.add(flushData);
            return new LeafCollector() {
                @Override
                public void setScorer(Scorer scorer) throws IOException {
                }

                @Override
                public void collect(int doc) throws IOException {
                    // we remember the last `numDocsToCollect` documents that matched
                    buffer[flushData.index % buffer.length] = doc;
                    flushData.index++;
                }
            };
        } else {
            return in.getLeafCollector(context);
        }
    }

    // flush the last `numDocsToCollect` collected documents to the delegated Collector
    public void flush() throws IOException {
        for (FlushData flushData : flushList) {
            for (int i = 0; i < flushData.buffer.length; i++) {
                int doc = flushData.buffer[(flushData.index + i) % flushData.buffer.length];
                if (doc != -1) {
                    flushData.leafCollector.collect(doc);
                }
            }
        }
        flushList.clear();
    }

}
{code}

This is especially efficient when used along with TopFieldCollector, as many 
docValue lookups are avoided. 
In my experiment it reduced search time by up to 90%.

Note 1: Does not support paging.
Note 2: The current implementation is probably not thread safe.
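
The ring-buffer part of the idea can be tried in isolation, without any Lucene types. The sketch below is an assumption-laden simplification: `LastDocsBuffer` is a hypothetical name, docIds are plain `int` values, and `flush()` returns a list instead of forwarding to a delegate `LeafCollector`.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-alone version of the per-segment ring buffer.
class LastDocsBuffer {
    private final int[] buffer;
    private int count; // number of docIds seen since the last flush

    LastDocsBuffer(int capacity) {
        if (capacity <= 0) {
            throw new IllegalArgumentException("capacity must be positive");
        }
        buffer = new int[capacity];
        Arrays.fill(buffer, -1); // -1 marks an empty slot; real docIds are >= 0
    }

    /** Remembers a matching docId, overwriting the oldest one when full. */
    void collect(int doc) {
        buffer[count % buffer.length] = doc;
        count++;
    }

    /** Returns the remembered docIds, oldest first, and resets the buffer. */
    List<Integer> flush() {
        List<Integer> out = new ArrayList<>();
        for (int i = 0; i < buffer.length; i++) {
            int doc = buffer[(count + i) % buffer.length];
            if (doc != -1) {
                out.add(doc);
            }
        }
        Arrays.fill(buffer, -1);
        count = 0;
        return out;
    }
}
```

With a capacity of 3, collecting 2, 5, 7, 11, 13 and then flushing yields 7, 11, 13 in that order; collecting only two docIds yields just those two. Memory stays bounded by the capacity no matter how many documents match in the segment.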




[jira] [Updated] (LUCENE-7482) Faster sorted index search for reverse order search

2016-10-06 Thread AMIRAULT Martin (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-7482?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

AMIRAULT Martin updated LUCENE-7482:

Description: 

We are currently using Lucene at my company for our main product.
Our search functionality is quite basic, and the results are always sorted 
by a predefined field. The user can only choose the sort order 
(Asc/Desc).

I am currently investigating using the index sort feature with 
EarlyTerminationSortingCollector. 
It is a shame that searching a sorted index in reverse order has no 
optimization, and I was wondering if it would be possible to make it faster by 
creating a special "ReverseSortingCollector" for this purpose.

I am aware the posting list is designed to always be iterated in the same 
order, so this is not about early-terminating the search but about 
filtering out unneeded documents more efficiently.

If a segment is sorted in reverse order, we just have to delegate collection of 
the last matched documents.

Here is a quick code sample:

{code:title=ReverseSortingCollector.java|borderStyle=solid}
public class ReverseSortingCollector extends FilterCollector {

    /** Sort used to sort the search results */
    protected final Sort sort;
    /** Number of documents to collect in each segment */
    protected final int numDocsToCollect;

    [...]

    /** Buffers of reverse-sorted segments that still have to be flushed */
    private final List<FlushData> flushList = new ArrayList<>();

    private static final class FlushData {
        // ring buffer holding the last matched docIds of one segment
        final int[] buffer;

        // number of docs collected so far; index % buffer.length is the next
        // write slot and, once the buffer has wrapped, the oldest element
        int index;

        final LeafCollector leafCollector;

        FlushData(int[] buffer, LeafCollector leafCollector) {
            this.buffer = buffer;
            this.leafCollector = leafCollector;
        }
    }

    @Override
    public LeafCollector getLeafCollector(LeafReaderContext context) throws IOException {

        // flush previous data if any
        flush();

        LeafReader reader = context.reader();
        Sort segmentSort = reader.getIndexSort();
        // segment is sorted in the reverse order of the search sort
        if (isReverseOrder(sort, segmentSort)) {
            int[] buffer = new int[numDocsToCollect];
            Arrays.fill(buffer, -1);
            FlushData flushData = new FlushData(buffer, in.getLeafCollector(context));
            flushList.add(flushData);
            return new LeafCollector() {
                @Override
                public void setScorer(Scorer scorer) throws IOException {
                }

                @Override
                public void collect(int doc) throws IOException {
                    // we remember the last `numDocsToCollect` documents that matched
                    buffer[flushData.index % buffer.length] = doc;
                    flushData.index++;
                }
            };
        } else {
            return in.getLeafCollector(context);
        }
    }

    // flush the last `numDocsToCollect` collected documents to the delegated Collector
    public void flush() throws IOException {
        for (FlushData flushData : flushList) {
            for (int i = 0; i < flushData.buffer.length; i++) {
                int doc = flushData.buffer[(flushData.index + i) % flushData.buffer.length];
                if (doc != -1) {
                    flushData.leafCollector.collect(doc);
                }
            }
        }
        flushList.clear();
    }

}
{code}

This is especially efficient when used along with TopFieldCollector, as many 
docValue lookups are avoided. 
In my experiment it reduced search time by up to 90%.

Note 1: Does not support paging.
Note 2: The current implementation is probably not thread safe.




[jira] [Reopened] (LUCENE-7482) Faster sorted index search for reverse order search

2016-10-06 Thread AMIRAULT Martin (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-7482?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

AMIRAULT Martin reopened LUCENE-7482:
-

> Faster sorted index search for reverse order search
> ---
>
> Key: LUCENE-7482
> URL: https://issues.apache.org/jira/browse/LUCENE-7482
> Project: Lucene - Core
>  Issue Type: New Feature
>Reporter: AMIRAULT Martin
>Priority: Minor
>
> We are currently using Lucene at my company for our main product.
> Our search functionality is quite basic, and the results are always sorted 
> by a predefined field. The user can only choose the sort order 
> (Asc/Desc).
> I am currently investigating using the index sort feature with 
> EarlyTerminationSortingCollector. 
> It is a shame that searching a sorted index in reverse order has no 
> optimization, and I was wondering if it would be possible to make it 
> faster by creating a special "ReverseSortingCollector" for this purpose.
> I am aware the posting list is designed to always be iterated in the same 
> order, so this is not about early-terminating the search but about 
> filtering out unneeded documents more efficiently.
> If a segment is sorted in reverse order, we can easily work out the docId 
> from which documents should be collected.
> Here is a quick code sample:
> {code:title=ReverseSortingCollector.java|borderStyle=solid}
> public class ReverseSortingCollector extends FilterCollector {
>     /** Sort used to sort the search results */
>     protected final Sort sort;
>     /** Number of documents to collect in each segment */
>     protected final int numDocsToCollect;
> [...]
>     @Override
>     public LeafCollector getLeafCollector(LeafReaderContext context) throws IOException {
>         LeafReader reader = context.reader();
>         Sort segmentSort = reader.getIndexSort();
>         // segment is sorted in the reverse order of the search sort
>         if (isReverseOrder(sort, segmentSort)) {
>             // here we can easily work out the docNum from which we should collect
>             long collectFrom = context.reader().numDocs() - numDocsToCollect;
>             return new FilterLeafCollector(in.getLeafCollector(context)) {
>                 @Override
>                 public void collect(int doc) throws IOException {
>                     if (doc >= collectFrom) { // only delegate docIds from collectFrom onwards
>                         super.collect(doc);
>                     }
>                 }
>             };
>         } else {
>             return in.getLeafCollector(context);
>         }
>     }
> }
> {code}
> This is especially efficient when used along with TopFieldCollector, as many 
> docValue lookups are avoided. 
> In my experiment it reduced search time by 90%.
> However, I was wondering whether it is correct, as my knowledge of Lucene is 
> still quite limited.
> In particular, is it correct to assume that LeafReader docIds always span from 
> 0 to LeafReader.numDocs()?
> Note: Does not support paging. It could eventually be implemented by providing 
> a way to look up the docId to start from, based on the last document 
> collected (e.g. for LongPoint, querying the docId closest to the previously 
> returned value...)






[jira] [Commented] (LUCENE-7482) Faster sorted index search for reverse order search

2016-10-06 Thread AMIRAULT Martin (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-7482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15554014#comment-15554014
 ] 

AMIRAULT Martin commented on LUCENE-7482:
-

A more correct implementation: keep the last 'numDocsToCollect' docIds 
collected for each LeafCollector and flush them to the delegated LeafCollectors 
once the search is finished. I will experiment a bit more (with less than 100% 
of documents matching this time!) and post a new proposal.

> Faster sorted index search for reverse order search
> ---
>
> Key: LUCENE-7482
> URL: https://issues.apache.org/jira/browse/LUCENE-7482
> Project: Lucene - Core
>  Issue Type: New Feature
>Reporter: AMIRAULT Martin
>Priority: Minor
>
> We are currently using Lucene at my company for our main product.
> Our search functionality is quite basic, and the results are always sorted 
> by a predefined field. The user can only choose the sort order 
> (Asc/Desc).
> I am currently investigating using the index sort feature with 
> EarlyTerminationSortingCollector. 
> It is a shame that searching a sorted index in reverse order has no 
> optimization, and I was wondering if it would be possible to make it 
> faster by creating a special "ReverseSortingCollector" for this purpose.
> I am aware the posting list is designed to always be iterated in the same 
> order, so this is not about early-terminating the search but about 
> filtering out unneeded documents more efficiently.
> If a segment is sorted in reverse order, we can easily work out the docId 
> from which documents should be collected.
> Here is a quick code sample:
> {code:title=ReverseSortingCollector.java|borderStyle=solid}
> public class ReverseSortingCollector extends FilterCollector {
>     /** Sort used to sort the search results */
>     protected final Sort sort;
>     /** Number of documents to collect in each segment */
>     protected final int numDocsToCollect;
> [...]
>     @Override
>     public LeafCollector getLeafCollector(LeafReaderContext context) throws IOException {
>         LeafReader reader = context.reader();
>         Sort segmentSort = reader.getIndexSort();
>         // segment is sorted in the reverse order of the search sort
>         if (isReverseOrder(sort, segmentSort)) {
>             // here we can easily work out the docNum from which we should collect
>             long collectFrom = context.reader().numDocs() - numDocsToCollect;
>             return new FilterLeafCollector(in.getLeafCollector(context)) {
>                 @Override
>                 public void collect(int doc) throws IOException {
>                     if (doc >= collectFrom) { // only delegate docIds from collectFrom onwards
>                         super.collect(doc);
>                     }
>                 }
>             };
>         } else {
>             return in.getLeafCollector(context);
>         }
>     }
> }
> {code}
> This is especially efficient when used along with TopFieldCollector, as many 
> docValue lookups are avoided. 
> In my experiment it reduced search time by 90%.
> However, I was wondering whether it is correct, as my knowledge of Lucene is 
> still quite limited.
> In particular, is it correct to assume that LeafReader docIds always span from 
> 0 to LeafReader.numDocs()?
> Note: Does not support paging. It could eventually be implemented by providing 
> a way to look up the docId to start from, based on the last document 
> collected (e.g. for LongPoint, querying the docId closest to the previously 
> returned value...)






[jira] [Closed] (LUCENE-7482) Faster sorted index search for reverse order search

2016-10-06 Thread AMIRAULT Martin (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-7482?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

AMIRAULT Martin closed LUCENE-7482.
---
Resolution: Invalid

Sorry, I just realized that my implementation assumed that all documents 
match, which is usually not the case.
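
A small worked example shows why that assumption matters. The sketch below is illustrative only: `ThresholdVsLastN` and its methods are hypothetical, and matches are given as an array of docIds in increasing order rather than coming from a real query.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical comparison of the docId-threshold idea with "last n matches".
class ThresholdVsLastN {

    /** Threshold approach: keep only matches whose docId is >= numDocs - n. */
    static List<Integer> byThreshold(int[] matches, int numDocs, int n) {
        int collectFrom = numDocs - n;
        List<Integer> out = new ArrayList<>();
        for (int doc : matches) {
            if (doc >= collectFrom) {
                out.add(doc);
            }
        }
        return out;
    }

    /** What a top-n search needs: the last n matches, whatever their docIds are. */
    static List<Integer> lastN(int[] matches, int n) {
        List<Integer> out = new ArrayList<>();
        for (int i = Math.max(0, matches.length - n); i < matches.length; i++) {
            out.add(matches[i]);
        }
        return out;
    }
}
```

In a segment of 10 documents with n = 3, if every document matches, both methods return docIds 7, 8, 9. If only docIds 1, 4, 6, 9 match, the threshold version returns just 9, while the last three matches are 4, 6, 9, so two hits are silently dropped.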


> Faster sorted index search for reverse order search
> ---
>
> Key: LUCENE-7482
> URL: https://issues.apache.org/jira/browse/LUCENE-7482
> Project: Lucene - Core
>  Issue Type: New Feature
>Reporter: AMIRAULT Martin
>Priority: Minor
>
> We are currently using Lucene at my company for our main product.
> Our search functionality is quite basic, and the results are always sorted 
> by a predefined field. The user can only choose the sort order 
> (Asc/Desc).
> I am currently investigating using the index sort feature with 
> EarlyTerminationSortingCollector. 
> It is a shame that searching a sorted index in reverse order has no 
> optimization, and I was wondering if it would be possible to make it 
> faster by creating a special "ReverseSortingCollector" for this purpose.
> I am aware the posting list is designed to always be iterated in the same 
> order, so this is not about early-terminating the search but about 
> filtering out unneeded documents more efficiently.
> If a segment is sorted in reverse order, we can easily work out the docId 
> from which documents should be collected.
> Here is a quick code sample:
> {code:title=ReverseSortingCollector.java|borderStyle=solid}
> public class ReverseSortingCollector extends FilterCollector {
>     /** Sort used to sort the search results */
>     protected final Sort sort;
>     /** Number of documents to collect in each segment */
>     protected final int numDocsToCollect;
> [...]
>     @Override
>     public LeafCollector getLeafCollector(LeafReaderContext context) throws IOException {
>         LeafReader reader = context.reader();
>         Sort segmentSort = reader.getIndexSort();
>         // segment is sorted in the reverse order of the search sort
>         if (isReverseOrder(sort, segmentSort)) {
>             // here we can easily work out the docNum from which we should collect
>             long collectFrom = context.reader().numDocs() - numDocsToCollect;
>             return new FilterLeafCollector(in.getLeafCollector(context)) {
>                 @Override
>                 public void collect(int doc) throws IOException {
>                     if (doc >= collectFrom) { // only delegate docIds from collectFrom onwards
>                         super.collect(doc);
>                     }
>                 }
>             };
>         } else {
>             return in.getLeafCollector(context);
>         }
>     }
> }
> {code}
> This is especially efficient when used along with TopFieldCollector, as many 
> docValue lookups are avoided. 
> In my experiment it reduced search time by 90%.
> However, I was wondering whether it is correct, as my knowledge of Lucene is 
> still quite limited.
> In particular, is it correct to assume that LeafReader docIds always span from 
> 0 to LeafReader.numDocs()?
> Note: Does not support paging. It could eventually be implemented by providing 
> a way to look up the docId to start from, based on the last document 
> collected (e.g. for LongPoint, querying the docId closest to the previously 
> returned value...)






[jira] [Updated] (LUCENE-7482) Faster sorted index search for reverse order search

2016-10-06 Thread AMIRAULT Martin (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-7482?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

AMIRAULT Martin updated LUCENE-7482:

Description: 
We are currently using Lucene at my company for our main product.
Our search functionality is quite basic, and the results are always sorted 
by a predefined field. The user can only choose the sort order 
(Asc/Desc).

I am currently investigating using the index sort feature with 
EarlyTerminationSortingCollector. 
It is a shame that searching a sorted index in reverse order has no 
optimization, and I was wondering if it would be possible to make it faster by 
creating a special "ReverseSortingCollector" for this purpose.

I am aware the posting list is designed to always be iterated in the same 
order, so this is not about early-terminating the search but about 
filtering out unneeded documents more efficiently.

If a segment is sorted in reverse order, we can easily work out the docId from 
which documents should be collected.

Here is a quick code sample:

{code:title=ReverseSortingCollector.java|borderStyle=solid}
public class ReverseSortingCollector extends FilterCollector {

    /** Sort used to sort the search results */
    protected final Sort sort;
    /** Number of documents to collect in each segment */
    protected final int numDocsToCollect;

    [...]

    @Override
    public LeafCollector getLeafCollector(LeafReaderContext context) throws IOException {
        LeafReader reader = context.reader();
        Sort segmentSort = reader.getIndexSort();
        // segment is sorted in the reverse order of the search sort
        if (isReverseOrder(sort, segmentSort)) {

            // here we can easily work out the docNum from which we should collect
            long collectFrom = context.reader().numDocs() - numDocsToCollect;

            return new FilterLeafCollector(in.getLeafCollector(context)) {
                @Override
                public void collect(int doc) throws IOException {
                    if (doc >= collectFrom) { // only delegate docIds from collectFrom onwards
                        super.collect(doc);
                    }
                }
            };
        } else {
            return in.getLeafCollector(context);
        }
    }

}
{code}

This is especially efficient when used along with TopFieldCollector, as many 
docValue lookups are avoided. 
In my experiment it reduced search time by 90%.

However, I was wondering whether it is correct, as my knowledge of Lucene is 
still quite limited.
In particular, is it correct to assume that LeafReader docIds always span from 
0 to LeafReader.numDocs()?


Note: Does not support paging. It could eventually be implemented by providing a 
way to look up the docId to start from, based on the last document collected 
(e.g. for LongPoint, querying the docId closest to the previously returned value...)




[jira] [Updated] (LUCENE-7482) Faster sorted index search for reverse order search

2016-10-06 Thread AMIRAULT Martin (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-7482?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

AMIRAULT Martin updated LUCENE-7482:

Description: 
We are currently using Lucene at my company for our main product.
Our search functionality is quite basic, and the results are always sorted 
by a predefined field. The user can only choose the sort order 
(Asc/Desc).

I am currently investigating using the index sort feature with 
EarlyTerminationSortingCollector. 
It is a shame that searching a sorted index in reverse order has no 
optimization, and I was wondering if it would be possible to make it faster by 
creating a special "ReverseSortingCollector" for this purpose.

I am aware the posting list is designed to always be iterated in the same 
order, so this is not about early-terminating the search but about 
filtering out unneeded documents more efficiently.

If a segment is sorted in reverse order, we can easily work out the docId from 
which documents should be collected.

Here is a quick code sample:

{code:title=ReverseSortingCollector.java|borderStyle=solid}
public class ReverseSortingCollector extends FilterCollector {

    /** Sort used to sort the search results */
    protected final Sort sort;
    /** Number of documents to collect in each segment */
    protected final int numDocsToCollect;

    [...]

    @Override
    public LeafCollector getLeafCollector(LeafReaderContext context) throws IOException {
        LeafReader reader = context.reader();
        Sort segmentSort = reader.getIndexSort();
        // segment is sorted in the reverse order of the search sort
        if (isReverseOrder(sort, segmentSort)) {

            // here we can easily work out the docNum from which we should collect
            long collectFrom = context.reader().numDocs() - numDocsToCollect;

            return new FilterLeafCollector(in.getLeafCollector(context)) {
                @Override
                public void collect(int doc) throws IOException {
                    if (doc >= collectFrom) { // only delegate docIds from collectFrom onwards
                        super.collect(doc);
                    }
                }
            };
        } else {
            return in.getLeafCollector(context);
        }
    }

}
{code}

This is especially efficient when used along with TopFieldCollector, as many 
docValue lookups are avoided. 
In my experiment it reduced search time by 90%.

However, I was wondering whether it is correct, as my knowledge of Lucene is 
still quite limited.
In particular, is it correct to assume that LeafReader docIds always span from 
0 to LeafReader.numDocs()?


Note: Does not support paging. It could eventually be implemented by providing a 
way to look up the docId to start from, based on the last document collected 
(e.g. for LongPoint, querying the docId closest to the previously returned value...)




[jira] [Created] (LUCENE-7482) Faster sorted index search for reverse order search

2016-10-06 Thread AMIRAULT Martin (JIRA)
AMIRAULT Martin created LUCENE-7482:
---

 Summary: Faster sorted index search for reverse order search
 Key: LUCENE-7482
 URL: https://issues.apache.org/jira/browse/LUCENE-7482
 Project: Lucene - Core
  Issue Type: New Feature
Reporter: AMIRAULT Martin
Priority: Minor


We are currently using Lucene here in my company for our main product.
Our search functionality is quite basic and the results are always sorted 
by a predefined field. The user is only able to choose the sort order 
(Asc/Desc).

I am currently investigating using the index sort feature with 
EarlyTerminationSortingCollector. 
It is quite a shame that searching a sorted index in reverse order does not 
benefit from any optimization, and I was wondering if it would be possible to 
make it faster by creating a special "ReverseSortingCollector" for this purpose.

I am aware the posting list is designed to always be iterated in the same 
order, so this is not about early-terminating the search but about 
filtering out unneeded documents more efficiently.

If a segment is sorted in reverse order, we can easily work out the docId from 
which documents should be collected.

Here is a quick code sample:

{quote}
public class ReverseSortingCollector extends FilterCollector {

  /** Sort used to sort the search results */
  protected final Sort sort;
  /** Number of documents to collect in each segment */
  protected final int numDocsToCollect;

  [...]

  @Override
  public LeafCollector getLeafCollector(LeafReaderContext context) throws IOException {
    LeafReader reader = context.reader();
    Sort segmentSort = reader.getIndexSort();
    if (isReverseOrder(sort, segmentSort)) { // segment is sorted in reverse order than the search sort

      // Here we can easily work out the docNum from which we should collect
      long collectFrom = context.reader().numDocs() - numDocsToCollect;

      return new FilterLeafCollector(in.getLeafCollector(context)) {
        @Override
        public void collect(int doc) throws IOException {
          if (doc >= collectFrom) { // only delegates
            super.collect(doc);
          }
        }
      };
    } else {
      return in.getLeafCollector(context);
    }
  }

}
{quote}

This is especially efficient when used along with TopFieldCollector, as many 
docValues lookups are avoided. 
In my experiment it reduced search time by 90%.

However, I was wondering if it is correct, as my knowledge of Lucene is still 
quite limited.
In particular, is it correct to assume that the docIds of a LeafReader always 
span from 0 to LeafReader.numDocs()?


Note: this does not support paging. It could eventually be implemented by 
providing a way to look up the docId to start from, based on the last document 
collected (e.g. for LongPoint, querying the docId closest to the previously 
returned value...)
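
To make the cutoff arithmetic concrete, here is a minimal sketch in plain Java 
that does not use the Lucene API. The class and method names are hypothetical. 
It assumes the segment's docIds run from 0 to maxDoc - 1: in Lucene, 
numDocs() excludes deleted documents while maxDoc() is one greater than the 
largest possible docId, so the two differ once a segment has deletions. It 
also assumes that every document in the tail of the segment matches the query; 
otherwise matching documents with lower docIds would be skipped.

```java
public class ReverseCutoffSketch {

    /**
     * First docId to collect so that only the last numDocsToCollect
     * docIds of a segment pass the filter. Hypothetical helper.
     */
    static int collectFrom(int maxDoc, int numDocsToCollect) {
        // Clamp at 0 for segments smaller than the requested count.
        return Math.max(0, maxDoc - numDocsToCollect);
    }

    /** Mirrors the doc >= collectFrom test in the collector above. */
    static boolean shouldCollect(int doc, int maxDoc, int numDocsToCollect) {
        return doc >= collectFrom(maxDoc, numDocsToCollect);
    }

    public static void main(String[] args) {
        // Segment with docIds 0..99, keep the last 10.
        System.out.println(collectFrom(100, 10));
        System.out.println(shouldCollect(89, 100, 10));
        System.out.println(shouldCollect(90, 100, 10));
    }
}
```

Clamping at zero means a segment with fewer documents than requested is simply 
collected in full.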








[jira] [Commented] (SOLR-6931) We should do a limited retry when using HttpClient.

2015-01-15 Thread Lindsay Martin (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14279206#comment-14279206
 ] 

Lindsay Martin commented on SOLR-6931:
--

Is it possible to apply this to the 4.10.x branch?  This would help us out with 
https://issues.apache.org/jira/browse/SOLR-6983

> We should do a limited retry when using HttpClient.
> ---
>
> Key: SOLR-6931
> URL: https://issues.apache.org/jira/browse/SOLR-6931
> Project: Solr
>  Issue Type: Bug
>Reporter: Mark Miller
>Assignee: Mark Miller
> Fix For: 5.0, Trunk
>
> Attachments: SOLR-6931.patch, SOLR-6931.patch, SOLR-6931.patch
>
>







[jira] [Created] (SOLR-6983) SocketExceptions no longer trigger retries when processing distributed updates

2015-01-14 Thread Lindsay Martin (JIRA)
Lindsay Martin created SOLR-6983:


 Summary: SocketExceptions no longer trigger retries when 
processing distributed updates
 Key: SOLR-6983
 URL: https://issues.apache.org/jira/browse/SOLR-6983
 Project: Solr
  Issue Type: Bug
  Components: SolrCloud
Affects Versions: 4.7
Reporter: Lindsay Martin


Our production Solr cluster frequently places replicas into 
leader-initiated recovery when a "java.net.SocketException: Connection 
reset" is thrown while processing distributed updates.

This problem surfaced after upgrading from Solr 4.6.1 to Solr 4.10.2. In the 
old version, a retry was attempted whenever a SocketException was encountered 
while a leader was updating a replica. After the upgrade to Solr 4.10.2, this 
retry no longer occurs.
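
For illustration only, the kind of limited retry described above could look 
roughly like the following sketch. It is not the Solr or HttpClient 
implementation: the class name, method name and maxRetries parameter are all 
hypothetical, and it assumes the request is safe to send more than once.

```java
import java.net.SocketException;
import java.util.concurrent.Callable;

public class LimitedRetrySketch {

    /**
     * Runs the request, retrying only on SocketException and at most
     * maxRetries extra times. Any other exception propagates immediately.
     */
    static <T> T callWithRetry(Callable<T> request, int maxRetries) throws Exception {
        if (maxRetries < 0) {
            throw new IllegalArgumentException("maxRetries must be >= 0");
        }
        SocketException last = null;
        for (int attempt = 0; attempt <= maxRetries; attempt++) {
            try {
                return request.call();
            } catch (SocketException e) {
                last = e; // e.g. "Connection reset": try again
            }
        }
        // All attempts failed with a SocketException: surface the last one.
        throw last;
    }
}
```

Bounding the number of attempts keeps a persistently broken connection from 
looping forever, while still absorbing an occasional connection reset.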

Here is an example stacktrace:
{noformat}
2015-01-11 09:38:00.913 [updateExecutor-1-thread-35734] ERROR org.apache.solr.update.StreamingSolrServers  – error
java.net.SocketException: Connection reset
    at java.net.SocketInputStream.read(SocketInputStream.java:196)
    at java.net.SocketInputStream.read(SocketInputStream.java:122)
    at org.apache.http.impl.io.AbstractSessionInputBuffer.fillBuffer(AbstractSessionInputBuffer.java:160)
    at org.apache.http.impl.io.SocketInputBuffer.fillBuffer(SocketInputBuffer.java:84)
    at org.apache.http.impl.io.AbstractSessionInputBuffer.readLine(AbstractSessionInputBuffer.java:273)
    at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:140)
    at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:57)
    at org.apache.http.impl.io.AbstractMessageParser.parse(AbstractMessageParser.java:260)
    at org.apache.http.impl.AbstractHttpClientConnection.receiveResponseHeader(AbstractHttpClientConnection.java:283)
    at org.apache.http.impl.conn.DefaultClientConnection.receiveResponseHeader(DefaultClientConnection.java:251)
    at org.apache.http.impl.conn.ManagedClientConnectionImpl.receiveResponseHeader(ManagedClientConnectionImpl.java:197)
    at org.apache.http.protocol.HttpRequestExecutor.doReceiveResponse(HttpRequestExecutor.java:271)
    at org.apache.http.protocol.HttpRequestExecutor.execute(HttpRequestExecutor.java:123)
    at org.apache.http.impl.client.DefaultRequestDirector.tryExecute(DefaultRequestDirector.java:682)
    at org.apache.http.impl.client.DefaultRequestDirector.execute(DefaultRequestDirector.java:486)
    at org.apache.http.impl.client.AbstractHttpClient.doExecute(AbstractHttpClient.java:863)
    at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:82)
    at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:106)
    at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:57)
    at org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrServer$Runner.run(ConcurrentUpdateSolrServer.java:233)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
2015-01-11 09:38:00.917 [qtp268575911-3616964] WARN  org.apache.solr.update.processor.DistributedUpdateProcessor  – Error sending update
java.net.SocketException: Connection reset
    at java.net.SocketInputStream.read(SocketInputStream.java:196)
    at java.net.SocketInputStream.read(SocketInputStream.java:122)
    at org.apache.http.impl.io.AbstractSessionInputBuffer.fillBuffer(AbstractSessionInputBuffer.java:160)
    at org.apache.http.impl.io.SocketInputBuffer.fillBuffer(SocketInputBuffer.java:84)
    at org.apache.http.impl.io.AbstractSessionInputBuffer.readLine(AbstractSessionInputBuffer.java:273)
    at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:140)
    at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:57)
    at org.apache.http.impl.io.AbstractMessageParser.parse(AbstractMessageParser.java:260)
    at org.apache.http.impl.AbstractHttpClientConnection.receiveResponseHeader(AbstractHttpClientConnection.java:283)
    at org.apache.http.impl.conn.DefaultClientConnection.receiveResponseHeader(DefaultClientConnection.java:251)
    at org.apache.http.impl.conn.ManagedClientConnectionImpl.receiveResponseHeader(ManagedClientConnectionImpl.java:197)
    at org.apache.http.protocol.HttpRequestExecutor.doReceiveResponse(HttpRequestExecutor.java:271)
    at org.apache.http.protocol.HttpRequestExecutor.execute(HttpRequestExecutor.java:123)
    at org.apache.http.impl.client.DefaultRequestDirector.tryExecute(DefaultRequestDirector.

[jira] [Created] (PYLUCENE-27) JCC should be able to create sdist archives

2013-10-31 Thread Martin (JIRA)
Martin created PYLUCENE-27:
--

 Summary: JCC should be able to create sdist archives
 Key: PYLUCENE-27
 URL: https://issues.apache.org/jira/browse/PYLUCENE-27
 Project: PyLucene
  Issue Type: Wish
 Environment: jcc-svn-head
Reporter: Martin


I was not able to create a complete source distribution (complete in the sense 
that one can compile and install the desired wrapper from it).

I've tried the following calls:
  python -m jcc --jar foo  --egg-info --extra-setup-arg sdist
and
 python -m jcc --jar foo --extra-setup-arg sdist

Both create archives only containing the egg-info and setup.py but no source 
code at all.

I really need this feature for my testing environment with tox, since this 
heavily depends on the sdist feature.

thanks,
best,
Martin





[jira] [Created] (SOLR-2790) DataImportHandler last_index_time does not update on delta-imports

2011-09-22 Thread Greg Martin (JIRA)
DataImportHandler last_index_time does not update on delta-imports
--

 Key: SOLR-2790
 URL: https://issues.apache.org/jira/browse/SOLR-2790
 Project: Solr
  Issue Type: Bug
  Components: contrib - DataImportHandler
Affects Versions: 3.4
 Environment: Windows 7, Java version 1.6.0_26
Reporter: Greg Martin
 Fix For: 3.4


When a full-import is run using the DataImportHandler, the last_index_time is 
updated, but it is not updated when a delta-import is run.  The same issue is 
reported here: 
http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201104.mbox/%3CBANLkTi=cunkz26aj8wcyfp7ujbjpnw6...@mail.gmail.com%3E

Note that the DataImportHandler entry on the wiki states that the 
last_index_time should update on both delta-import and full-import.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira






[jira] [Commented] (SOLR-1262) DIH needs support for callable statements

2011-04-25 Thread Joachim Martin (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13024930#comment-13024930
 ] 

Joachim Martin commented on SOLR-1262:
--

I have been told by my Oracle DBA that DIH's use of Statement vs Prepared 
Statement is causing serious problems on the database side.  There is a 
performance gain by not having to re-parse a prepared statement, but more 
importantly, each Statement that needs to be re-parsed takes up space in the 
cache.  If you have repeating related entities (e.g. Author->Books[]), each 
related query is a unique statement.

Many developers, myself included, would never consider writing a database app 
without Prepared Statements for performance reasons.  I think it's even more 
important in a batch update situation where you are running N additional 
related entity queries.

I like the syntax of MyBatis' mapped statements: 

select field1, field2 from related_table where entity_id = #{id, 
jdbcType=NUMERIC}
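
As a rough sketch of how such a mapped statement could be turned into 
something a JDBC PreparedStatement accepts, the code below rewrites 
MyBatis-style #{...} placeholders into positional ? markers and records the 
parameter names in order. This is not DIH or MyBatis code: the class name, 
the regular expression and the idea of DIH accepting this syntax are 
assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class PlaceholderSketch {

    // Matches #{name} or #{name, jdbcType=NUMERIC}; group 1 is the name.
    private static final Pattern PLACEHOLDER =
        Pattern.compile("#\\{\\s*([^,}\\s]+)[^}]*\\}");

    /** SQL text with every placeholder replaced by a positional "?". */
    final String sql;
    /** Parameter names, in the order their "?" markers appear. */
    final List<String> parameterNames;

    private PlaceholderSketch(String sql, List<String> parameterNames) {
        this.sql = sql;
        this.parameterNames = parameterNames;
    }

    static PlaceholderSketch parse(String template) {
        Matcher m = PLACEHOLDER.matcher(template);
        List<String> names = new ArrayList<>();
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            names.add(m.group(1));
            m.appendReplacement(out, "?");
        }
        m.appendTail(out);
        return new PlaceholderSketch(out.toString(), Collections.unmodifiableList(names));
    }
}
```

The resulting sql string could then be passed once to 
java.sql.Connection.prepareStatement, with a value bound for each recorded 
name per parent row, so the database parses the related-entity query only once.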



> DIH needs support for callable statements 
> --
>
> Key: SOLR-1262
> URL: https://issues.apache.org/jira/browse/SOLR-1262
> Project: Solr
>  Issue Type: Improvement
>  Components: contrib - DataImportHandler
>Affects Versions: 1.3
> Environment: linux
> mysql
>Reporter: Abdul Chaudhry
>Assignee: Noble Paul
>Priority: Minor
> Fix For: Next
>
>
> During an indexing run we noticed that we were spending a lot of time 
> creating and tearing down queries in MySQL.
> The queries we are using are complex and involve joins spanning 
> multiple tables.
> We should support prepared statements in the data import handler via the 
> data-config.xml file, for those databases that support prepared statements.
> We could add a new attribute to the entity element in dataConfig (say, 
> pquery or preparedQuery), then pass the prepared statement and have values 
> filled in by the actual queries for each row using a placeholder, like a ? 
> or something else.
> I would probably start by hacking class JdbcDataSource to try a test, but was 
> wondering if anyone had experienced this or had any suggestions, or if there 
> is something in the works that I missed. I couldn't find any other bugs 
> mentioning using prepared statements for performance.




[jira] Issue Comment Edited: (SOLR-1656) XInclude's are resolved relative CWD, not instance dir

2010-07-14 Thread Joachim Martin (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1656?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12888541#action_12888541
 ] 

Joachim Martin edited comment on SOLR-1656 at 7/14/10 5:14 PM:
---

I'm having a similar problem with using XInclude to import a transformers.js 
script file in DataImportHandler:

http://www.w3.org/2001/XInclude";>



At runtime, the XML parser looks for this file in my solr directory, not next 
to my db-data-config.xml (in the core/conf directory).

Having the source for script transformers in a separate js file allows me to 
use an IDE to check syntax, etc., which is very helpful.

[I'm assuming this is the same problem, but not sure]
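
For what it's worth, here is a small standalone sketch of the underlying JAXP 
behaviour, assuming the JDK's built-in parser. It is not Solr code and the 
class and file names are only illustrative. When a document is parsed from a 
stream, setting a system ID on the InputSource gives the parser a base URI, so 
a relative XInclude href resolves next to the config file rather than against 
the working directory.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class XIncludeBaseSketch {

    /** Parses a stream with XInclude on, resolving hrefs relative to configFile. */
    static Document parse(InputStream in, Path configFile) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true); // XInclude processing needs namespaces
        factory.setXIncludeAware(true);
        DocumentBuilder builder = factory.newDocumentBuilder();
        InputSource source = new InputSource(in);
        // The system ID is used as the base URI for relative hrefs.
        source.setSystemId(configFile.toUri().toString());
        return builder.parse(source);
    }

    /** Writes a script file into a temp dir and includes it as text. */
    static String demo() throws Exception {
        Path dir = Files.createTempDirectory("xinclude-sketch");
        Path config = dir.resolve("db-data-config.xml"); // only used as base URI
        Files.write(dir.resolve("transformers.js"),
            "function f(row) { return row; }".getBytes(StandardCharsets.UTF_8));
        String xml = "<dataConfig xmlns:xi=\"http://www.w3.org/2001/XInclude\">"
            + "<script><xi:include href=\"transformers.js\" parse=\"text\"/></script>"
            + "</dataConfig>";
        try (InputStream in = new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8))) {
            Document doc = parse(in, config);
            return doc.getElementsByTagName("script").item(0).getTextContent();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(demo());
    }
}
```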

> XInclude's are resolved relative CWD, not instance dir
> --
>
> Key: SOLR-1656
> URL: https://issues.apache.org/jira/browse/SOLR-1656
> Project: Solr
>  Issue Type: Bug
>Affects Versions: 1.4
>Reporter: Hoss Man
> Attachments: 
> SOLR-1656_Support_SAX_SystemId_via_wrapping_InputStream.patch
>
>
> As noted on the mailing list, when an XInclude in a config file references a 
> relative path, it is resolved relative to the CWD of the servlet container, 
> and not the instanceDir of the core...
>  
> http://old.nabble.com/using-Xinclude-with-multi-core-to26548400.html#a26548400

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.





[jira] Commented: (SOLR-1656) XInclude's are resolved relative CWD, not instance dir

2010-07-14 Thread Joachim Martin (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1656?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12888541#action_12888541
 ] 

Joachim Martin commented on SOLR-1656:
--

I'm having a similar problem with using XInclude to import a transformers.js 
script file in DataImportHandler:

http://www.w3.org/2001/XInclude";>



At runtime, the XML parser looks for this file in my solr directory, not next 
to my db-data-config.xml (in the core/conf directory).

Having the source for script transformers in a separate js file allows me to 
use an IDE to check syntax, etc., which is very helpful.

> XInclude's are resolved relative CWD, not instance dir
> --
>
> Key: SOLR-1656
> URL: https://issues.apache.org/jira/browse/SOLR-1656
> Project: Solr
>  Issue Type: Bug
>Affects Versions: 1.4
>Reporter: Hoss Man
> Attachments: 
> SOLR-1656_Support_SAX_SystemId_via_wrapping_InputStream.patch
>
>
> As noted on the mailing list, when an XInclude in a config file references a 
> relative path, it is resolved relative to the CWD of the servlet container, 
> and not the instanceDir of the core...
>  
> http://old.nabble.com/using-Xinclude-with-multi-core-to26548400.html#a26548400
