Re: Intuition check

2007-11-10 Thread Yonik Seeley
On Nov 10, 2007 8:48 PM, Chris Hostetter <[EMAIL PROTECTED]> wrote:
> : I've since considered trying out a SortedIntSet since they would be
> : both smaller, and usable in skipTo.
>
> If you think it's worth doing, then it probably is.

It's complicated...
- SortedIntSet would be about 44% smaller on average
- random lookups would be significantly slower (but it's unclear how
many random lookups need to be done)
- intersection of 2 SortedIntSets of near equal size should be slightly faster
- intersection of 2 SortedIntSets of different sizes will be slower
- intersection of a SortedIntSet with a BitSet will be slightly faster
- intersection of a small uncached int list with a large SortedIntSet
will be slower (think facet.enum.cache.minDf)
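
For illustration only (this is not Solr code): a minimal sketch of the two
intersection styles compared above, assuming a SortedIntSet is nothing more
than an ascending, duplicate-free int[] of doc ids. The class and method
names are hypothetical.

```java
import java.util.Arrays;

public class SortedIntersectSketch {

    /** Linear merge of two ascending, duplicate-free arrays: O(a.length + b.length). */
    static int[] intersect(int[] a, int[] b) {
        int[] out = new int[Math.min(a.length, b.length)];
        int i = 0, j = 0, n = 0;
        while (i < a.length && j < b.length) {
            if (a[i] < b[j]) {
                i++;
            } else if (a[i] > b[j]) {
                j++;
            } else {              // same doc id in both sets
                out[n++] = a[i];
                i++;
                j++;
            }
        }
        return Arrays.copyOf(out, n);
    }

    /** Small list against a large sorted set: one binary search per small entry. */
    static int[] intersectSmallLarge(int[] small, int[] large) {
        int[] out = new int[small.length];
        int n = 0;
        for (int doc : small) {
            if (Arrays.binarySearch(large, doc) >= 0) {
                out[n++] = doc;
            }
        }
        return Arrays.copyOf(out, n);
    }

    public static void main(String[] args) {
        int[] a = {1, 4, 7, 9, 12};
        int[] b = {2, 4, 9, 10, 12, 15};
        System.out.println(Arrays.toString(intersect(a, b)));           // [4, 9, 12]
        System.out.println(Arrays.toString(intersectSmallLarge(new int[] {9, 11}, b))); // [9]
    }
}
```

In the small-vs-large case every probe costs a binary search where a hash set
would cost a single lookup, which is the shape of the "different sizes" cost
noted in the list.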

Anyway, this is just something to keep in mind...  there are bigger
fish to fry right now.

-Yonik


Re: Intuition check

2007-11-10 Thread Chris Hostetter

: >   2) "here is an 'fq'" ... in which we get the DocSet and add it to the
: >  main query if it's small.
: 
: One issue is that HashDocSet would need to first be sorted, but that
: should hopefully be relatively quick.

Oh...  hmm, yeah i thought DocIterators already guaranteed order.  but 
either way, the DocSets we are dealing with should be small, so hopefully 
the sorting is fast.

: I've since considered trying out a SortedIntSet since they would be
: both smaller, and usable in skipTo.

If you think it's worth doing, then it probably is. 

(over the years i've seen more than enough evidence that "yonik with his 
performance hat on" is just as much of an expert in his field as "yonik 
with his thread safety hat on")



-Hoss



Re: Intuition check

2007-11-10 Thread Yonik Seeley
On Nov 8, 2007 7:34 PM, Chris Hostetter <[EMAIL PROTECTED]> wrote:
>   2) "here is an 'fq'" ... in which we get the DocSet and add it to the
>  main query if it's small.

One issue is that HashDocSet would need to first be sorted, but that
should hopefully be relatively quick.
Background: HashDocSet was developed at a time when Lucene didn't
deliver docs in sorted order anyway... it would have required extra
time to sort, and lookups would be slower (binary search vs hash).
I've since considered trying out a SortedIntSet since they would be
both smaller, and usable in skipTo.
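
A rough sketch of the idea, not Solr or Lucene code: assuming the doc ids are
held in a plain int[], sorting them once makes both a binary-search lookup and
a forward-only skipTo possible. The class name, the -1 "exhausted" return
value, and the skipTo signature here are all hypothetical simplifications.

```java
import java.util.Arrays;

public class SortedDocsSketch {
    private final int[] docs; // ascending doc ids
    private int pos = -1;     // index of the current doc, -1 before the first call

    private SortedDocsSketch(int[] sortedDocs) {
        this.docs = sortedDocs;
    }

    /** The one-time sort step a hash-based set would need before it can be iterated in order. */
    static SortedDocsSketch fromUnsorted(int[] ids) {
        int[] copy = Arrays.copyOf(ids, ids.length);
        Arrays.sort(copy);
        return new SortedDocsSketch(copy);
    }

    /** Random lookup: O(log n) binary search instead of a hash probe. */
    boolean exists(int doc) {
        return Arrays.binarySearch(docs, doc) >= 0;
    }

    /**
     * Advances past the current position to the first doc id >= target.
     * Returns that doc id, or -1 once the set is exhausted.
     */
    int skipTo(int target) {
        int from = Math.min(pos + 1, docs.length);
        int idx = Arrays.binarySearch(docs, from, docs.length, target);
        pos = idx >= 0 ? idx : -idx - 1; // not found: use the insertion point
        return pos < docs.length ? docs[pos] : -1;
    }

    public static void main(String[] args) {
        SortedDocsSketch set = fromUnsorted(new int[] {40, 3, 21, 8, 15});
        System.out.println(set.skipTo(10)); // 15
        System.out.println(set.skipTo(21)); // 21
        System.out.println(set.skipTo(50)); // -1
        System.out.println(set.exists(8));  // true
    }
}
```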

-Yonik


[jira] Updated: (SOLR-386) Add configuration to specify SolrHighlighter implementation

2007-11-10 Thread Tricia Williams (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-386?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tricia Williams updated SOLR-386:
-

Attachment: SOLR-386-SolrHighlighter.patch

Updated patch to work with recent changes made to SolrCore.  Should apply 
against a clean trunk again.  No further changes.

> Add configuration to specify SolrHighlighter implementation
> ---
>
> Key: SOLR-386
> URL: https://issues.apache.org/jira/browse/SOLR-386
> Project: Solr
>  Issue Type: Improvement
>  Components: highlighter
>Affects Versions: 1.3
>Reporter: Eli Levine
> Attachments: SOLR-386-SolrHighlighter.patch, 
> SOLR-386-SolrHighlighter.patch
>
>
> It would be great if SolrCore allowed the highlighter class to be 
> configurable.  A good way would be to add a +class+ attribute to the 
>  element in solrconfig.xml, similar to how the RequestHandler 
> instance is initialized in SolrCore.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



Re: Post-SOLR215/SOLR350 singleton issue

2007-11-10 Thread Ryan McKinley


> perhaps.  it depends largely on what the long term goals of multi-core
> support are ... if we're striving for "dynamically" creating "solr contexts" 
> independently of "servlet contexts" then the current behavior is probably 
> ideal ... we're protecting the plugins of each SolrCore from 
> corrupting each other.


agreed.  In the current setup, the only classes shared across all 
SolrCore instances are the RequestDispatcher and the MultiCore registry. 
 If you reload a core, all the static initialization stuff should also 
be reloaded.


*perhaps* adding a 'context' to the MultiCore registry would be a good 
way to let cores communicate with one another.


ryan


Re: SolrConfig.Initializable

2007-11-10 Thread Chris Hostetter

: dooh - I was hoping Chris may take this one up as he started the thread on
: changing the init method :)
: 
: If this is not an issue for Chris, I can take a look sometime in the next few

sorry ... i didn't mean to suggest that i had a specific need to change 
the method ... i was approaching it more from the "because of multi-core, 
the init signature and semantics will change from 1.2 to 1.3; so what do 
we *really* want init methods to look like moving forward?" angle (as opposed 
to the init signature currently in the trunk that was made to be the minimum 
necessary changes to support both multi-cores and the existing factories)


-Hoss



Re: Post-SOLR215/SOLR350 singleton issue

2007-11-10 Thread Chris Hostetter

: I don't know if this applies in your case but there is now one class loader
: per solr core instance (more exactly per config; it used to be a static 
: before).

: If your connection pool is a static member of a core handler, each core will
: create the handler through different class loaders - thus different
: instances of the static member.

i didn't understand the original issue, but based on Ryan's reply I think 
Henrib is right on the money.

: Internally, it could be better (aka easier to use) to create one class
: loader per instance dir instead of per core.

perhaps.  it depends largely on what the long term goals of multi-core
support are ... if we're striving for "dynamically" creating "solr contexts" 
independently of "servlet contexts" then the current behavior is probably 
ideal ... we're protecting the plugins of each SolrCore from 
corrupting each other.

if however we want SolrCores to be able to "communicate" and interact with 
each other ... well, then a class loader per solr webapp might make sense.

honestly: i'm not sure i understand what "instance dir" means in a 
multi-core world ... do all cores share an instance dir?  how do configs 
and dataDirs work in that case?


-Hoss



[jira] Commented: (SOLR-284) Parsing Rich Document Types

2007-11-10 Thread Juri Kuehn (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12541535
 ] 

Juri Kuehn commented on SOLR-284:
-

Hi Eric, thank you for this handler, works like a charm!
I need to use non-numeric ids, which are fine with solr but are rejected by 
RichDocumentRequestHandler. I'm not familiar with the solr code; I patched 
RichDocumentRequestHandler.java to not convert the id to int, which hasn't caused 
trouble so far:

{code:title=RichDocumentRequestHandler.java.patch}
Index: RichDocumentRequestHandler.java
===
--- RichDocumentRequestHandler.java (revision 0)
+++ RichDocumentRequestHandler.java (working copy)
@@ -133,7 +133,7 @@
String streamFieldname;
String[] fieldnames;
SchemaField[] fields;
-   int id;
+   String id;
  
final AddUpdateCommand templateAdd;
 
@@ -153,7 +153,7 @@
String fn = params.get(FIELDNAMES);
fieldnames = fn != null ? commaSplit.split(fn,-1) : null;

-   id = params.getInt(ID);
+   id = params.get(ID);
 
templateAdd = new AddUpdateCommand();
templateAdd.allowDups = false;
@@ -202,7 +202,7 @@
 * @param desc
 *TODO
 */
-   void doAdd(int id, String text, DocumentBuilder builder, AddUpdateCommand template)
+   void doAdd(String id, String text, DocumentBuilder builder, AddUpdateCommand template)
throws IOException {
 
  // first, create the lucene document
@@ -225,7 +225,7 @@
  handler.addDoc(template);
}
 
-   void addDoc(int id, String text) throws IOException {
+   void addDoc(String id, String text) throws IOException {
templateAdd.indexedId = null;
doAdd(id, text, builder, templateAdd);
}
{code}

Tests were ok, maybe you can apply it to your sources.

Best regards,
Juri

> Parsing Rich Document Types
> ---
>
> Key: SOLR-284
> URL: https://issues.apache.org/jira/browse/SOLR-284
> Project: Solr
>  Issue Type: New Feature
>  Components: update
>Affects Versions: 1.3
>Reporter: Eric Pugh
> Fix For: 1.3
>
> Attachments: libs.zip, rich.patch, source.zip, test-files.zip, 
> test.zip
>
>
> I have developed a RichDocumentRequestHandler based on the CSVRequestHandler 
> that supports streaming a PDF, Word, PowerPoint, or Excel document into 
> Solr.
> There is a wiki page with information here: 
> http://wiki.apache.org/solr/UpdateRichDocuments
>  




Re: Post-SOLR215/SOLR350 singleton issue

2007-11-10 Thread Henrib

I don't know if this applies in your case but there is now one class loader
per solr core instance (more exactly per config; it used to be a static 
before).
If your connection pool is a static member of a core handler, each core will
create the handler through different class loaders - thus different
instances of the static member.
Which raises the question of "how do I share objects between handlers
created by different cores then?" - and I don't have an answer to that yet...
Internally, it could be better (aka easier to use) to create one class
loader per instance dir instead of per core.
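
To make the effect concrete, here is a small stand-alone sketch (not Solr
code; all names are hypothetical). It assumes a per-core loader that defines
the class itself rather than delegating to its parent, and that the class
file can be read as a resource from the parent loader. Each loader then gets
its own copy of the class and therefore its own static "singleton".

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

public class PerCoreLoaderDemo {

    /** Stand-in for a "singleton" connection pool held in a static field. */
    public static class Pool {
        public static final Object INSTANCE = new Object();
    }

    /** Hypothetical per-core loader: defines Pool itself, delegates everything else. */
    static class CoreLoader extends ClassLoader {
        CoreLoader(ClassLoader parent) {
            super(parent);
        }

        @Override
        protected Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
            if (!name.equals(Pool.class.getName())) {
                return super.loadClass(name, resolve);
            }
            synchronized (getClassLoadingLock(name)) {
                Class<?> c = findLoadedClass(name);
                if (c == null) {
                    byte[] bytes = readClassBytes(name);
                    c = defineClass(name, bytes, 0, bytes.length);
                }
                if (resolve) {
                    resolveClass(c);
                }
                return c;
            }
        }

        private byte[] readClassBytes(String name) throws ClassNotFoundException {
            String path = name.replace('.', '/') + ".class";
            try (InputStream in = getParent().getResourceAsStream(path)) {
                if (in == null) {
                    throw new ClassNotFoundException(name);
                }
                ByteArrayOutputStream out = new ByteArrayOutputStream();
                byte[] buf = new byte[4096];
                for (int n; (n = in.read(buf)) != -1; ) {
                    out.write(buf, 0, n);
                }
                return out.toByteArray();
            } catch (IOException e) {
                throw new ClassNotFoundException(name, e);
            }
        }
    }

    /** Returns the static INSTANCE as seen through the given loader. */
    static Object poolSeenBy(ClassLoader loader) throws Exception {
        Class<?> c = Class.forName(Pool.class.getName(), true, loader);
        return c.getField("INSTANCE").get(null);
    }

    /** True only if two separate "core" loaders end up with the same static instance. */
    static boolean sharedAcrossLoaders() {
        try {
            ClassLoader parent = PerCoreLoaderDemo.class.getClassLoader();
            return poolSeenBy(new CoreLoader(parent)) == poolSeenBy(new CoreLoader(parent));
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("shared across core loaders: " + sharedAcrossLoaders());
    }
}
```

When compiled to class files and run, the two loaders are expected to report
different instances. A loader that delegated to a common parent for Pool
would hand both "cores" the same one.
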
Hope this helps
Henrib


Walter Ferrara-2 wrote:
> 
> I'm currently using the latest solr + SOLR350 in a multicore solr.
> I have a ConnectionPool which uses a singleton design; it has been tested
> and it really works as a singleton.
> Every core has a handler, responsible for updates, that will eventually
> use that connection pool.
> 
> The problem is: every core re-instantiates the connection pool (this seems to
> happen with a test singleton too), as would happen if every core
> resided in a different JVM. This happens when the handler is called (not
> when it is inited), at different times; inside the same core it works as a
> singleton, but other cores, when called, will simply make a new one.
> 
> Is there a way to bypass that, and to use singletons in a "singleton"
> fashion between cores?
> Btw: Cores do see MultiCore as a singleton, i.e. all handlers see the
> same object, so I could get around this problem using MultiCore to get
> cores..., but I would like to understand why cores don't seem to
> share singletons. I may be wrong, but this used to work with (old)
> Henri's solr215 patch.
> 
> Thanks,
> Walter
> 
> 
> 
