Re: Embedded SOLR - Commit issue

2007-12-01 Thread Ryan McKinley

Sunny Bassan wrote:

I have implemented the embedded Solr approach for indexing database
records. I am indexing approximately 10 million records, querying and
indexing 20,000 records at a time. Each record is added to the
updateHandler via the updateHandler.addDoc() function; once all 20,000
records have been added, a commit() call is made. The issue is that after
this commit call I don't see any changes to the index. Even after an
optimize call, which is made after all records have been added to the
index, I don't see any changes. In order to see changes I
have to manually bounce the Solr webapp in Tomcat. Is this the designed
behaviour? If not, what can I do in order to see changes after commit
calls are made? Thanks.
 


How are you searching?  Using standard HTTP search?  If so, just send 
your commit with the regular <commit/> update message.


If you aren't using standard HTTP searching, it is a bit difficult to 
say what you are doing wrong.  (EmbeddedSolr is powerful, but it leaves 
you a lot of rope)
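If the searching goes over HTTP, the commit can be sent the same way. A minimal sketch with the plain JDK, no SolrJ; the update URL is an assumption for a default Solr install, and the class/method names here are made up for illustration:

```java
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;

public class CommitSender {

    // The standard XML update message that tells Solr to commit.
    static String commitMessage() {
        return "<commit/>";
    }

    // POST the commit message to Solr's update handler; returns the HTTP status.
    static int sendCommit(String updateUrl) throws Exception {
        HttpURLConnection conn = (HttpURLConnection) new URL(updateUrl).openConnection();
        conn.setDoOutput(true);
        conn.setRequestMethod("POST");
        conn.setRequestProperty("Content-Type", "text/xml; charset=UTF-8");
        try (OutputStream out = conn.getOutputStream()) {
            out.write(commitMessage().getBytes("UTF-8"));
        }
        return conn.getResponseCode(); // 200 means Solr accepted the commit
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical default URL; adjust host/port/webapp path to your install.
        System.out.println(sendCommit("http://localhost:8983/solr/update"));
    }
}
```

After a commit sent this way, new searchers are opened and HTTP searches see the added documents.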


ryan



DynamicField and FacetFields..

2007-12-01 Thread Jeryl Cook
Question:
I need to add data to Solr dynamically, so I do not have a "predefined" list of 
field names...

so I use the dynamicField option in the schema and match the appropriate datatype
in my schema.xml:

<dynamicField name="*_s" type="string" indexed="true" stored="true"/>

Then, programmatically, my code does:
...
document.addField( dynamicFieldName + "_s", dynamicFieldValue, 10 ); 
facetFieldNames.put( dynamicFieldName + "_s", null ); // TODO: use copyField
server.add( document, true );
server.commit();

When I attempt to get the facet results to display:
SolrQuery query = new SolrQuery();
query.setQuery( "*:*" );
query.setFacetLimit( 10 ); // TODO
Iterator facetsIt = facetFieldNames.entrySet().iterator();
while ( facetsIt.hasNext() ) {
    Entry entry = (Entry) facetsIt.next();
    String facetName = (String) entry.getKey();
    query.addFacetField( facetName );
}
 
QueryResponse rsp = server.query( query );
List facetFieldList = rsp.getFacetFields();
assertNotNull( facetFieldList );

My facetFieldList is null. Of course, if I addFacetField with "id" it
works, because I define it in the schema.xml.

Is this just something that is not implemented, or am I missing something?
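For reference, the naming convention above can be factored out. A small sketch of the pure part (the helper names are made up; with SolrJ these names would then be registered via query.addFacetField(...), which also turns faceting on for the request):

```java
import java.util.ArrayList;
import java.util.List;

public class DynamicFacets {

    // Map an arbitrary field name onto the "*_s" dynamicField pattern.
    static String toDynamicName(String base) {
        return base + "_s";
    }

    // Collect the facet field names to register on the query.
    static List<String> facetFields(List<String> baseNames) {
        List<String> out = new ArrayList<>();
        for (String base : baseNames) {
            out.add(toDynamicName(base));
        }
        return out;
    }
}
```

For example, facetFields(Arrays.asList("author", "genre")) yields ["author_s", "genre_s"], which can be added to the query in a loop.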

Thanks.



Jeryl Cook 



/^\ Pharaoh /^\ 

http://pharaohofkush.blogspot.com/ 



"..Act your age, and not your shoe size.."

-Prince(1986)

> Date: Fri, 30 Nov 2007 21:23:59 -0500
> From: [EMAIL PROTECTED]
> To: solr-user@lucene.apache.org
> Subject: Re: Solr Highlighting, word index
> 
> It's good you already have the data because if you somehow got it from
> some sort of calculations I'd have to tell my product manager that
> the feature he wanted that I told him couldn't be done with our data
> was possible after all ...
> 
> About page breaks:
> 
> Another approach to paging is to index a special page token with an
> increment of 0 from the last word of the page. Say you have the following:
> last ctrl-l first. Then index last, $$$ at an increment of 0 then first.
> 
> You can then quite quickly calculate the pages by using
> termdocs/termenum on your special token and count.
> 
> Which approach you use depends upon whether you want span and/or
> phrase queries to match across page boundaries. If you use an increment as
> Mike suggests, matching "last first"~3 won't work. It just depends upon
> how you want to match across the page break.
> 
> Best
> Erick
> 
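A sketch of the counting idea Erick describes, in plain Java (the "$$$" marker and the in-memory token list are assumptions for illustration; in Lucene itself you would read the marker's per-document frequency from the index instead):

```java
import java.util.List;

public class PageCounter {

    // Special token indexed at position increment 0 at each page break.
    static final String PAGE_MARKER = "$$$";

    // A document containing N page-break markers spans N + 1 pages.
    static int countPages(List<String> tokens) {
        int markers = 0;
        for (String t : tokens) {
            if (PAGE_MARKER.equals(t)) {
                markers++;
            }
        }
        return markers + 1;
    }
}
```

For the example in the mail, countPages(Arrays.asList("last", "$$$", "first")) gives 2 pages.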
> On Nov 30, 2007 4:37 PM, Mike Klaas <[EMAIL PROTECTED]> wrote:
> 
> > On 30-Nov-07, at 1:02 PM, Owens, Martin wrote:
> >
> > >
> > > Hello everyone,
> > >
> > > We're working to replace the old Linux version of dtSearch with
> > > Lucene/Solr, using the http requests for our perl side and java for
> > > the indexing.
> > >
> > > The functionality that is causing the most problems is the
> > > highlighting since we're not storing the text in solr (only
> > > indexing) and we need to highlight an image file (ocr) so what we
> > > really need is to request from solr the word indexes of the
> > > matches, we then tie this up to the ocr image and create html boxes
> > > to do the highlighting.
> >
> > This isn't possible with Solr out-of-the-box.  Also, the usual
> > methods for highlighting won't work because Solr typically re-
> > analyzes the raw text to find the appropriate highlighting points.
> > However, it shouldn't be too hard to come up with a custom solution.
> > You can tell lucene to store token offsets using TermVectors
> > (configurable via schema.xml).  Then you can customize the request
> > handler to return the token offsets (and/or positions) by retrieving
> > the TVs.
> >
> > > The text is also multi-page; each page is separated by Ctrl-L page
> > > breaks. Should we handle the paging ourselves, or can Solr tell us
> > > which page the match happened on too?
> >
> > Again, not automatically.  However, if you wrote an analyzer that
> > bumped up the position increment of tokens every time a new page was
> > found (to, say, the next multiple of 1000), then you can infer the
> > matching page from the token position.
> >
> > cheers,
> > -Mike
> >
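The position arithmetic Mike suggests can be sketched as follows, assuming each page is padded out to the next multiple of 1000 positions (the class and method names are made up for illustration):

```java
public class PagePositions {

    // Each page starts at a multiple of this stride; assumes pages
    // contain fewer than 1000 tokens.
    static final int PAGE_STRIDE = 1000;

    // When a page break is seen at position pos, the analyzer would bump
    // the next token's position to the next multiple of the stride.
    static int nextPageStart(int pos) {
        return ((pos / PAGE_STRIDE) + 1) * PAGE_STRIDE;
    }

    // Recover the (1-based) page number from a matched token's position.
    static int pageOf(int position) {
        return position / PAGE_STRIDE + 1;
    }
}
```

So a hit at token position 1003 would be reported as page 2, without storing the text.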


RE: DynamicField and FacetFields..

2007-12-01 Thread Jeryl Cook
Fixed, I had a typo... you may want to delete my post (I want to :P).

Jeryl Cook  
> From: [EMAIL PROTECTED]
> To: solr-user@lucene.apache.org
> Subject: DynamicField and  FacetFields..
> Date: Sat, 1 Dec 2007 14:21:12 -0500


solr ubuntu and tomcat

2007-12-01 Thread Yousef Ourabi
Has anyone else had any trouble running Solr on Ubuntu with the apt-installed 
Tomcat? (Not a download from apache.org.)

I'm having a bear of a time.

On Debian Etch I managed to get Solr working by setting TOMCAT_SECURITY=no in 
/etc/default/tomcat5.5 

The same solr.xml (Context) on Ubuntu fails with a NoClassDefFoundError, as if 
I had not set solr/home in the context -- however, both the solr/home and the 
docBase are correct (i.e. they exist, owned by tomcat55.adm with liberal 
777 permissions).

Any thoughts? Anyone else have a similar experience?
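For comparison, a typical Tomcat context fragment for this setup looks roughly like the following (the paths are assumptions; adjust docBase and the solr/home value to the actual install):

```xml
<Context docBase="/var/lib/tomcat5.5/webapps/solr.war" debug="0" crossContext="true">
  <!-- JNDI entry Solr reads to locate its home directory (schema, config, data) -->
  <Environment name="solr/home" type="java.lang.String" value="/var/lib/solr" override="true"/>
</Context>
```

If the Environment entry is missing or unreadable under the security manager, Solr fails at startup much as described above.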

Thanks.
Yousef