Re: dismax query parser crash on double dash

2008-06-03 Thread Bram de Jong
On Mon, Jun 2, 2008 at 11:15 PM, Sean Timm <[EMAIL PROTECTED]> wrote:
> It seems that the DisMaxRequestHandler tries hard to handle any query that
> the user can throw at it.

That's exactly why I was reporting it... :-)


 - Bram


Re: dismax query parser crash on double dash

2008-06-03 Thread Grant Ingersoll

+1.  Fault tolerance good.  ParseExceptions bad.

Can you open a JIRA issue for it?  If you feel you see the problem, a  
patch would be great, too.


-Grant

On Jun 2, 2008, at 5:15 PM, Sean Timm wrote:

It seems that the DisMaxRequestHandler tries hard to handle any  
query that the user can throw at it.


From http://wiki.apache.org/solr/DisMaxRequestHandler:
"Quotes can be used to group phrases, and +/- can be used to denote  
mandatory and optional clauses ... but all other Lucene query parser  
special characters are escaped to simplify the user experience.  The  
handler takes responsibility for building a good query from the  
user's input" [...] "any query containing an odd number of quote  
characters is evaluated as if there were no quote characters at all."


Would it be outside the scope of the DisMaxRequestHandler to also  
handle improper use of +/-?  There are a couple of other cases where  
a user query could fail to parse.  Basically they all boil down to a  
+ or - operator not being followed by a term.  A few examples of  
queries that fail:


chocolate cookie -
chocolate -+cookie
chocolate --cookie
chocolate - - cookie
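
One possible client-side workaround, until the handler deals with this itself,
is to drop any + or - that isn't actually attached to a term before the query
is sent to Solr.  A rough Java sketch (illustrative only -- the method name and
rules are made up here, this is not something Solr provides):

  // Drop any "+" or "-" that is not immediately followed by a term character
  // or a quote, so "chocolate --cookie" becomes "chocolate -cookie" and
  // "chocolate cookie -" becomes "chocolate cookie".
  public static String stripDanglingOperators(String q) {
      StringBuilder out = new StringBuilder(q.length());
      for (int i = 0; i < q.length(); i++) {
          char c = q.charAt(i);
          if (c == '+' || c == '-') {
              char next = (i + 1 < q.length()) ? q.charAt(i + 1) : ' ';
              if (Character.isLetterOrDigit(next) || next == '"') {
                  out.append(c);   // operator is attached to something usable
              }                    // otherwise drop it silently
          } else {
              out.append(c);
          }
      }
      return out.toString().trim();
  }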

-Sean

Grant Ingersoll wrote:

See http://wiki.apache.org/solr/DisMaxRequestHandler

Namely, "-" is the prohibited operator, thus, -- really is  
meaningless.  You either need to escape them or remove them.


-Grant

On Jun 2, 2008, at 7:14 AM, Bram de Jong wrote:


hello all,


just a small note to say that the dismax query parser crashes on:

q = "apple -- pear"

I'm running through a stored batch of my users' searches and it went
down on the double dash :)


- Bram

--
http://freesound.iua.upf.edu
http://www.smartelectronix.com
http://www.musicdsp.org


--
Grant Ingersoll
http://www.lucidimagination.com

Lucene Helpful Hints:
http://wiki.apache.org/lucene-java/BasicsOfPerformance
http://wiki.apache.org/lucene-java/LuceneFAQ

--
Grant Ingersoll
http://www.lucidimagination.com

Lucene Helpful Hints:
http://wiki.apache.org/lucene-java/BasicsOfPerformance
http://wiki.apache.org/lucene-java/LuceneFAQ

Re: dismax query parser crash on double dash

2008-06-03 Thread Sean Timm
I can take a stab at this.  I need to see why SOLR-502 isn't working for 
Otis first though.


-Sean

Bram de Jong wrote:

On Tue, Jun 3, 2008 at 1:26 PM, Grant Ingersoll <[EMAIL PROTECTED]> wrote:
  

+1.  Fault tolerance good.  ParseExceptions bad.

Can you open a JIRA issue for it?  If you feel you see the problem, a patch
would be great, too.



https://issues.apache.org/jira/browse/SOLR-589

I hope the bug report is detailed enough.
As I have no experience whatsoever with Java, me writing a patch would
be a Bad Idea (TM)


 - Bram
  


Re: dismax query parser crash on double dash

2008-06-03 Thread Bram de Jong
On Tue, Jun 3, 2008 at 1:26 PM, Grant Ingersoll <[EMAIL PROTECTED]> wrote:
> +1.  Fault tolerance good.  ParseExceptions bad.
>
> Can you open a JIRA issue for it?  If you feel you see the problem, a patch
> would be great, too.

https://issues.apache.org/jira/browse/SOLR-589

I hope the bug report is detailed enough.
As I have no experience whatsoever with Java, me writing a patch would
be a Bad Idea (TM)


 - Bram


Re: dismax query parser crash on double dash

2008-06-03 Thread Bram de Jong
On Tue, Jun 3, 2008 at 3:51 PM, Sean Timm <[EMAIL PROTECTED]> wrote:
> I can take a stab at this.  I need to see why SOLR-502 isn't working for
> Otis first though.

I slightly "enhanced" my script so it would only do the strange
searches my users have done in the past... (i.e. things with more than
just numbers and letters), and I found two more:

" and ""

i.e. one double quote and two double quotes

I'll add it to the ticket.

 - bram


RE: Issuing queries during analysis?

2008-06-03 Thread Dallan Quass
> Grant Ingersoll wrote:
> 
> How often does your collection change or get updated?
> 
> You could also have a slight alternative, which is to create 
> a real small and simple Lucene index that contains your 
> translations and then do it pre-indexing.  The code for such 
> a searcher is quite simple, albeit it isn't Solr.
> 
> Otherwise, you'd have to hack the SolrResourceLoader to 
> recognize your Analyzer as being SolrCoreAware, but, geez, I 
> don't know what the full ramifications of that would be, so 
> caveat emptor.


> Mike Klaas wrote:
>
> Perhaps you could separate the problem, putting this info in 
> separate index or solr core.

This sounds like the best approach.  I've written a special searcher that
handles standardization requests for multiple places in one http call and it
was pretty straightforward.  That's what I love about SOLR, it's *so* easy
to write plugins for.

Thank-you for your suggestions!

--dallan



Ideas on how to implement "sponsored results"

2008-06-03 Thread climbingrose
Hi all,

I'm trying to implement "sponsored results" in Solr search results similar
to that of Google. We index products from various sites and would like to
allow certain sites to promote their products. My approach is to query a
slave instance to get sponsored results for user queries in addition to the
normal search results. This part is easy. However, since the number of
products indexed for each site can be very different (anywhere from a
hundred to tens of thousands of products), we need a way to fairly
distribute the sponsored results among sites.

My initial thought is to use the field collapsing patch to collapse the
search results on the siteId field. You can imagine that this will create a
series of "buckets" of results, each bucket representing results from one
site. After that, 2 or 3 buckets will be selected at random, and from each of
those I will randomly pick one or two results. However, since I want these
sponsored results to be relevant to the user's query, I'd only want to
consider the first 30 results in each bucket.
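
As an illustration of that selection step, a rough sketch (the helper and
types here are made up for the example; swap SolrDocument for whatever result
object you actually get back):

  // buckets: collapsed results keyed by siteId, each list already sorted by relevance.
  public static List<SolrDocument> pickSponsored(Map<String, List<SolrDocument>> buckets,
                                                 int sitesToPick, int docsPerSite) {
      List<String> siteIds = new ArrayList<String>(buckets.keySet());
      Collections.shuffle(siteIds);   // every site gets an equal chance of being picked
      List<SolrDocument> sponsored = new ArrayList<SolrDocument>();
      for (String siteId : siteIds.subList(0, Math.min(sitesToPick, siteIds.size()))) {
          List<SolrDocument> docs = buckets.get(siteId);
          // only draw from the first 30 (most relevant) documents of the bucket
          List<SolrDocument> top = new ArrayList<SolrDocument>(
              docs.subList(0, Math.min(30, docs.size())));
          Collections.shuffle(top);
          sponsored.addAll(top.subList(0, Math.min(docsPerSite, top.size())));
      }
      return sponsored;
  }

Since the random selection happens outside Solr, a page refresh naturally
shows different sponsored hits while the underlying collapsed result set can
still be served from the Solr caches.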

Obviously, it's desirable that if the user refreshes the page, new sponsored
results will be displayed. On the other hand, I also want to have the
advantages of Solr cache.

What would be the best way to implement this functionality? Thanks.

Cheers,
Cuong


Re: Ideas on how to implement "sponsored results"

2008-06-03 Thread Alexander Ramos Jardim
Cuong,

I have implemented sponsored words for a client. I don't know whether my
approach can help you, but I will describe it and let you decide.

I have an index containing product entries, in which I created a field called
sponsored words. What I do is boost this field, so that when these words are
matched in the query, those products appear first in my results.

2008/6/3 climbingrose <[EMAIL PROTECTED]>:

> Hi all,
>
> I'm trying to implement "sponsored results" in Solr search results similar
> to that of Google. We index products from various sites and would like to
> allow certain sites to promote their products. My approach is to query a
> slave instance to get sponsored results for user queries in addition to the
> normal search results. This part is easy. However, since the number of
> products indexed for each sites can be very different (100, 1000, 1 or
> 6 products), we need a way to fairly distribute the sponsored results
> among sites.
>
> My initial thought is utilising field collapsing patch to collapse the
> search results on siteId field. You can imagine that this will create a
> series of "buckets of results", each bucket representing results from a
> site. After that, 2 or 3 buckets will randomly be selected from which I
> will
> randomly select one or two results from. However, since I want these
> sponsored results to be relevant to user queries, I'd like only want to
> have
> the first 30 results in each buckets.
>
> Obviously, it's desirable that if the user refreshes the page, new
> sponsored
> results will be displayed. On the other hand, I also want to have the
> advantages of Solr cache.
>
> What would be the best way to implement this functionality? Thanks.
>
> Cheers,
> Cuong
>



-- 
Alexander Ramos Jardim


sp.dictionary.threshold parm of spell checker seems unresponsive

2008-06-03 Thread Ronald K. Braun
I'm playing around with the spell checker on the 1.3 nightly build and
don't see any effect from changes to "sp.dictionary.threshold" in
terms of dictionary size.  A value of 0.0 seems to create a dictionary
of the same size and content as a value of 0.9.  (I'd expect a very
small dictionary in the latter case.)  I think sp.dictionary.threshold
is a float parameter, but maybe I'm misunderstanding?

And just to be sure, I assume I can alter this parameter prior to
issuing the "rebuild" command to build the dictionary -- I don't need to
reindex termSourceField between changes?

My solrconfig.xml has this definition for the handler:



30
0.5

spell
dictionary
0.9


And schema.xml in case that is somehow relevant:
















Any advice?  I'd definitely like to tighten up the dictionary but it
appears to always include terms regardless of their frequency in the
source content.

Thanks,

Ron


Problems using multicore

2008-06-03 Thread Alexander Ramos Jardim
Hello,

I am getting problems running Solr-1.3-trunk with multicores.

My multicore.xml file is:





I have solr.home pointing to the directory containing it.

All the involved directories exist. There's a conf directory containing the
schema.xml and solrconfig.xml of each core in their respective core
directories.

When I try to run solr I get:
An error occurred during activation of changes, please see the log for
details.  [image: Message icon - Error] [HTTP:101216]Servlet: "SolrServer"
failed to preload on startup in Web application: "apache-solr-1.3-dev.war".
org.apache.solr.common.SolrException: error creating core at
org.apache.solr.core.SolrCore.getSolrCore(SolrCore.java:306) at
org.apache.solr.servlet.SolrServlet.init(SolrServlet.java:46) at
javax.servlet.GenericServlet.init(GenericServlet.java:241) at
weblogic.servlet.internal.StubSecurityHelper$ServletInitAction.run(StubSecurityHelper.java:282)
at
weblogic.security.acl.internal.AuthenticatedSubject.doAs(AuthenticatedSubject.java:321)
at weblogic.security.service.SecurityManager.runAs(Unknown Source) at
weblogic.servlet.internal.StubSecurityHelper.createServlet(StubSecurityHelper.java:63)
at
weblogic.servlet.internal.StubLifecycleHelper.createOneInstance(StubLifecycleHelper.java:58)
at
weblogic.servlet.internal.StubLifecycleHelper.(StubLifecycleHelper.java:48)
at
weblogic.servlet.internal.ServletStubImpl.prepareServlet(ServletStubImpl.java:507)
at
weblogic.servlet.internal.WebAppServletContext.preloadServlet(WebAppServletContext.java:1853)
at
weblogic.servlet.internal.WebAppServletContext.loadServletsOnStartup(WebAppServletContext.java:1830)
at
weblogic.servlet.internal.WebAppServletContext.preloadResources(WebAppServletContext.java:1750)
at
weblogic.servlet.internal.WebAppServletContext.start(WebAppServletContext.java:2909)
at
weblogic.servlet.internal.WebAppModule.startContexts(WebAppModule.java:973)
at weblogic.servlet.internal.WebAppModule.start(WebAppModule.java:361) at
weblogic.application.internal.flow.ModuleStateDriver$3.next(ModuleStateDriver.java:204)
at
weblogic.application.utils.StateMachineDriver.nextState(StateMachineDriver.java:26)
at
weblogic.application.internal.flow.ModuleStateDriver.start(ModuleStateDriver.java:60)
at
weblogic.application.internal.flow.ScopedModuleDriver.start(ScopedModuleDriver.java:200)
at
weblogic.application.internal.flow.ModuleListenerInvoker.start(ModuleListenerInvoker.java:117)
at
weblogic.application.internal.flow.ModuleStateDriver$3.next(ModuleStateDriver.java:204)
at
weblogic.application.utils.StateMachineDriver.nextState(StateMachineDriver.java:26)
at
weblogic.application.internal.flow.ModuleStateDriver.start(ModuleStateDriver.java:60)
at
weblogic.application.internal.flow.StartModulesFlow.activate(StartModulesFlow.java:27)
at
weblogic.application.internal.BaseDeployment$2.next(BaseDeployment.java:635)
at
weblogic.application.utils.StateMachineDriver.nextState(StateMachineDriver.java:26)
at
weblogic.application.internal.BaseDeployment.activate(BaseDeployment.java:212)
at
weblogic.application.internal.DeploymentStateChecker.activate(DeploymentStateChecker.java:154)
at
weblogic.deploy.internal.targetserver.AppContainerInvoker.activate(AppContainerInvoker.java:80)
at
weblogic.deploy.internal.targetserver.operations.AbstractOperation.activate(AbstractOperation.java:566)
at
weblogic.deploy.internal.targetserver.operations.ActivateOperation.activateDeployment(ActivateOperation.java:136)
at
weblogic.deploy.internal.targetserver.operations.ActivateOperation.doCommit(ActivateOperation.java:104)
at
weblogic.deploy.internal.targetserver.operations.AbstractOperation.commit(AbstractOperation.java:320)
at
weblogic.deploy.internal.targetserver.DeploymentManager.handleDeploymentCommit(DeploymentManager.java:816)
at
weblogic.deploy.internal.targetserver.DeploymentManager.activateDeploymentList(DeploymentManager.java:1223)
at
weblogic.deploy.internal.targetserver.DeploymentManager.handleCommit(DeploymentManager.java:434)
at
weblogic.deploy.internal.targetserver.DeploymentServiceDispatcher.commit(DeploymentServiceDispatcher.java:161)
at
weblogic.deploy.service.internal.targetserver.DeploymentReceiverCallbackDeliverer.doCommitCallback(DeploymentReceiverCallbackDeliverer.java:181)
at
weblogic.deploy.service.internal.targetserver.DeploymentReceiverCallbackDeliverer.access$100(DeploymentReceiverCallbackDeliverer.java:12)
at
weblogic.deploy.service.internal.targetserver.DeploymentReceiverCallbackDeliverer$2.run(DeploymentReceiverCallbackDeliverer.java:67)
at
weblogic.work.SelfTuningWorkManagerImpl$WorkAdapterImpl.run(SelfTuningWorkManagerImpl.java:464)
at weblogic.work.ExecuteThread.execute(ExecuteThread.java:200) at
weblogic.work.ExecuteThread.run(ExecuteThread.java:172) Caused by:
java.lang.RuntimeException: Can't find resource 'solrconfig.xml' in
classpath or '/var/opt/subacatalog/core/conf/',
cwd=/home/bea/bea102/wlserver_10.0/samples/domains/wl_server at
org.apache.solr.core.SolrResourceLoader.openResource(SolrRes

Re: sp.dictionary.threshold parm of spell checker seems unresponsive

2008-06-03 Thread Otis Gospodnetic
Ron,

It might be better for you to look at the SOLR-572 issue in Solr's JIRA and use the
patch provided there with the Solr trunk.


Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch


- Original Message 
> From: Ronald K. Braun <[EMAIL PROTECTED]>
> To: solr-user@lucene.apache.org
> Sent: Tuesday, June 3, 2008 1:29:01 PM
> Subject: sp.dictionary.threshold parm of spell checker seems unresponsive
> 
> I'm playing around with the spell checker on 1.3 nightly build and
> don't see any effect on changes to the "sp.dictionary.threshold" in
> terms of dictionary size.  A value of 0.0 seems to create a dictionary
> of the same size and content as a value of 0.9.  (I'd expect a very
> small dictionary in the latter case.)  I think sp.dictionary.threshold
> is a float parameter, but maybe I'm misunderstanding?
> 
> And just to be sure, I assume I can alter this parameter prior to
> issue the "rebuild" command to build the dictionary -- I don't need to
> reindex termSourceField between changes?
> 
> My solrconfig.xml has this definition for the handler:
> 
> 
> class="solr.SpellCheckerRequestHandler" startup="lazy">
> 
> 30
> 0.5
> 
> spell
> dictionary
> 0.9
> 
> 
> And schema.xml in case that is somehow relevant:
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
> multiValued="true" omitNorms="true" />
> 
> Any advice?  I'd definitely like to tighten up the dictionary but it
> appears to always include terms regardless of their frequency in the
> source content.
> 
> Thanks,
> 
> Ron



Re: dismax query parser crash on double dash

2008-06-03 Thread Otis Gospodnetic
Bram,
You will slowly discover various characters and tokens that "don't work" with 
DisMax.  They "don't work" because they are "special" - they are a part of the 
query grammar and have special meanings.  Have you tried escaping those 
characters in your application before sending the query to Solr?  Escaping is
done with backslashes.  I bet that's in the Lucene FAQ on Lucene's Wiki.
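
For example, something along these lines in the client code (a sketch only --
adjust the character set to whatever your query parser actually treats as
special):

  // Backslash-escape the characters the Lucene query parser treats as special.
  public static String escapeLuceneSpecials(String s) {
      final String special = "+-!(){}[]^\"~*?:\\&|";
      StringBuilder sb = new StringBuilder(s.length() * 2);
      for (int i = 0; i < s.length(); i++) {
          char c = s.charAt(i);
          if (special.indexOf(c) >= 0) {
              sb.append('\\');
          }
          sb.append(c);
      }
      return sb.toString();
  }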


Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch


- Original Message 
> From: Bram de Jong <[EMAIL PROTECTED]>
> To: solr-user@lucene.apache.org
> Sent: Tuesday, June 3, 2008 11:15:06 AM
> Subject: Re: dismax query parser crash on double dash
> 
> On Tue, Jun 3, 2008 at 3:51 PM, Sean Timm wrote:
> > I can take a stab at this.  I need to see why SOLR-502 isn't working for
> > Otis first though.
> 
> I slightly "enhanced" my script so it would only do the strange
> searches my users have done in the past... (i.e things with more than
> just numbers and letters), and I found two more:
> 
> " and ""
> 
> i.e. one double quote and two double quotes
> 
> I'll add it to the ticket.
> 
> - bram



Solrj + Multicore

2008-06-03 Thread Alexander Ramos Jardim
Is there a way to access a specific core via Solrj?
Sorry, but I couldn't find anything on the wiki or Google.

-- 
Alexander Ramos Jardim


Re: Solrj + Multicore

2008-06-03 Thread Erik Hatcher


On Jun 3, 2008, at 3:52 PM, Alexander Ramos Jardim wrote:

Is there a way to access a specific core via Solrj


Yes, depending on which SolrServer implementation:

  SolrServer server = new CommonsHttpSolrServer("http://localhost:8983/solr/")


-or-

  SolrServer server = new EmbeddedSolrServer(solrCore)

Erik




Re: Solrj + Multicore

2008-06-03 Thread Alexander Ramos Jardim
Well,

This way I connect to my server
new CommonsHttpSolrServer("http://localhost:8983/solr/?core=idxItem")

This way I don't connect:
new CommonsHttpSolrServer("http://localhost:8983/solr/idxItem")

As you can obviously see, I can't use the first way because it produces
wrong requests like
http://localhost:8983/solr/?core=idxItem/update?wt=xml&version=2.2

and I end up getting exceptions like these.

org.apache.solr.common.SolrException: Can_not_find_core_idxItemupdatewtxml

Can_not_find_core_idxItemupdatewtxml

request: http://localhost:8983/solr/?core=idxItem/update?wt=xml&version=2.2
at
org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:308)
at
org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:152)
at
org.apache.solr.client.solrj.request.UpdateRequest.process(UpdateRequest.java:220)
at org.apache.solr.client.solrj.SolrServer.add(SolrServer.java:51)
at org.apache.solr.client.solrj.SolrServer.add(SolrServer.java:55)
...

I would like to point out that I am using the solr-1.3 trunk.


2008/6/3 Erik Hatcher <[EMAIL PROTECTED]>:

>
> On Jun 3, 2008, at 3:52 PM, Alexander Ramos Jardim wrote:
>
>> Is there a way to access a specific core via Solrj
>>
>
> Yes, depending on which SolrServer implementation:
>
>  SolrServer server = new CommonsHttpSolrServer("
> http://localhost:8983/solr/")
>
> -or-
>
>  SolrServer server = new EmbeddedSolrServer(solrCore)
>
>Erik
>
>
>


-- 
Alexander Ramos Jardim


Re: solr slave configuration help

2008-06-03 Thread Gaku Mak

Hi Yonik and others,

We ended up adding 2 additional GB of physical RAM (4GB total now)
and set the Java heap to 3GB, so the OS should have 1GB to play with.  The slave
servers are now a lot more responsive, even during replication and with
autowarming turned on (not too aggressive, though).

It also seems that using the serial GC makes the server more responsive
during replication (while the new searcher is being registered), unless I'm missing
something in the GC configuration.
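
For reference, the kind of JVM options being discussed here (values are just
the ones from this thread; adjust the heap size and launcher to your own
setup):

  java -Xmx3g -XX:+UseSerialGC -jar start.jar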

Thanks!

-Gaku


Yonik Seeley wrote:
> 
> On Sun, Jun 1, 2008 at 5:20 AM, Gaku Mak <[EMAIL PROTECTED]> wrote:
> [...]
>> I also have some test script to query against the slave server; however,
>> whenever during snapinstall, OOM would occur and the server is not very
>> responsive (even with autowarm disabled).  After a while (like couple
>> minutes), the server can respond again.  Is this expected?
> 
> Not really expected, no.
> Is the server unresponsive to a single search request (i.e. it takes a
> long time to complete)?
> Are you load testing, or just trying single requests?
> 
>> I have set the heap size to 1.5GB out of the 2GB physical ram.  Any help
>> is
>> appreciated.  Thanks!
> 
> Try a smaller heap.
> The OS needs memory to cache the Lucene index structures too (Lucene
> does very little caching and depends on the OS to do it for good
> performance).
> 
> 
> -Yonik
> 
> 

-- 
View this message in context: 
http://www.nabble.com/solr-slave-configuration-help-tp17583642p17636257.html
Sent from the Solr - User mailing list archive at Nabble.com.



RE: How to describe 2 entities in dataConfig for the DataImporter?

2008-06-03 Thread Julio Castillo
Hi Noble,
I had forgotten to also list comboId as a uniqueKey in the schema.xml file.
But that didn't make a difference.
It still complained about the "Document [null] missing required field: id"
for each row of the outer entity it ran into.

If you look at the debug output of the entity:pets (see below in the original
message), the query looks like this:
"SELECT id,name,birth_date,type_id FROM pets WHERE owner_id='owners-1'

This is where the problem lies: the owner_id in the pets table is
currently a number and thus will not match the modified combo id generated
for the owners' id column.

So, somehow, I need to be able to either strip the 'owners-' prefix before
comparing, or prepend the same prefix to the pets.owner_id value prior to
comparing.

Thanks

** julio

-Original Message-
From: Noble Paul നോബിള്‍ नोब्ळ् [mailto:[EMAIL PROTECTED] 
Sent: Monday, June 02, 2008 9:20 PM
To: solr-user@lucene.apache.org
Subject: Re: How to describe 2 entities in dataConfig for the DataImporter?

hi Julio,
delete my previous response. In your schema , 'id' is the uniqueKey.
make  'comboid' the unique key. Because that is the target field name coming
out of the entity 'owners'

--Noble

On Tue, Jun 3, 2008 at 9:46 AM, Noble Paul നോബിള്‍ नोब्ळ्
<[EMAIL PROTECTED]> wrote:
> The field 'id' is repeated for pet also rename it to something else 
> say query="SELECT id,name,birth_date,type_id FROM pets WHERE 
> owner_id='${owners.id}'"
>   parentDeltaQuery="SELECT id FROM owners WHERE 
> id=${pets.owner_id}">
>   
> 
>
> --Noble
>
> On Tue, Jun 3, 2008 at 3:28 AM, Julio Castillo <[EMAIL PROTECTED]>
wrote:
>> Shalin,
>> I experimented with it, and the null pointer exception has been taken 
>> care of. Thank you.
>>
>> I have a different problem now. I believe it is a 
>> syntax/specification problem.
>>
>> When importing data, I got the following exceptions:
>> SEVERE: Exception while adding:
>> SolrInputDocumnt[{comboId=comboId(1.0)={owners-9},
>> userName=userName(1.0)={[David, Schroeder]}}]
>>
>> org.apache.solr.common.SolrException: Document [null] missing 
>> required
>> field: id
>>at
>>
org.apache.solr.update.DocumentBuilder.toDocument(DocumentBuilder.java:289)
>>at
>> org.apache.solr.handler.dataimport.DataImportHandler$1.upload(DataImp
>> ortHand
>> ler.java:263)
>>...
>>
>> The problem arises the moment I try to include nested entities (e.g. 
>> pets -the problem does not occur if I don't use the transformer, but 
>> I have to use the transformer because other unrelated entities also have
id's).
>> My data config file looks as follows.
>>
>> 
>>  
>>>query="select id,first_name,last_name FROM owners"
>>transformer="TemplateTransformer">
>>
>>
>>
>>
>>>query="SELECT id,name,birth_date,type_id FROM pets 
>> WHERE owner_id='${owners.id}'"
>>parentDeltaQuery="SELECT id FROM owners WHERE 
>> id=${pets.owner_id}">
>>
>>
>>
>>
>>
>>  
>> 
>>
>> The debug output of the data import looks as follows:
>>
>> 
>> - 
>>  - 
>>- 
>>  select id,first_name,last_name FROM owners
>>  0:0:0.15
>>  --- row #1-
>>  1
>>  George
>>  Franklin
>>  -
>>  - 
>>-
>>owners-1
>>George
>>Franklin
>>-
>>- 
>>  SELECT id,name,birth_date,type_id FROM 
>> pets WHERE owner_id='owners-1'
>>  0:0:0.0
>>  
>>  
>>  
>> + 
>> 
>>
>> Thanks again
>>
>> ** julio
>>
>>
>> -Original Message-
>> From: Shalin Shekhar Mangar [mailto:[EMAIL PROTECTED]
>> Sent: Saturday, May 31, 2008 10:26 AM
>> To: solr-user@lucene.apache.org
>> Subject: Re: How to describe 2 entities in dataConfig for the
DataImporter?
>>
>> Hi Julio,
>>
>> I've fixed the bug, can you please replace the exiting 
>> TemplateTransformer.java in the SOLR-469.patch and use the attached 
>> TemplateTransformer.java file. We'll add the changes to our next patch.
>> Sorry for all the trouble.
>>
>>> On Sat, May 31, 2008 at 12:24 AM, Julio Castillo 
>> <[EMAIL PROTECTED]> wrote:
>>> julio,
>>> Looks like it is a bug.
>>> We can give u a new TemplateTransformer.java which we will 
>>> incorporate in the next patch --Noble
>>>
>>> On Sat, May 31, 2008 at 12:24 AM, Julio Castillo 
>>> <[EMAIL PROTECTED]> wrote:
 I'm sorry Shalin, but I still get the same Null Pointer exception.
 This is my complete dataconfig.xml (I remove the parallel entity to 
 narrow down the scope of the problem).
 
  
>>>query="select id as idAlias,first_name,last_name FROM vets"
deltaQuery="SELECT id as idAlias FROM vets WHERE 
 last_modified > '${dataimporter.last_index_time}'"

Re: Ideas on how to implement "sponsored results"

2008-06-03 Thread climbingrose
Hi Alexander,

Thanks for your suggestion. I think my problem is a bit different from
yours. We don't have any sponsored words; instead we have to retrieve sponsored
results directly from the index. This is because a site can have 60,000
products, which makes it hard to insert/update keywords. I can live with that by
issuing a separate query to fetch sponsored results. My problem is how to
distribute sponsored results equally between sites so that each site will
have an opportunity to show its sponsored results no matter how many
products it has. For example, if site A has 60,000 products and site B has
only 2,000, then sponsored products from site B will have a very small chance
of being displayed.


On Wed, Jun 4, 2008 at 2:56 AM, Alexander Ramos Jardim <
[EMAIL PROTECTED]> wrote:

> Cuong,
>
> I have implemented sponsored words for a client. I don't know if my working
> can help you but I will expose it and let you decide.
>
> I have an index containing products entries that I created a field called
> sponsored words. What I do is to boost this field , so when these words are
> matched in the query that products appear first on my result.
>
> 2008/6/3 climbingrose <[EMAIL PROTECTED]>:
>
> > Hi all,
> >
> > I'm trying to implement "sponsored results" in Solr search results
> similar
> > to that of Google. We index products from various sites and would like to
> > allow certain sites to promote their products. My approach is to query a
> > slave instance to get sponsored results for user queries in addition to
> the
> > normal search results. This part is easy. However, since the number of
> > products indexed for each sites can be very different (100, 1000, 1
> or
> > 6 products), we need a way to fairly distribute the sponsored results
> > among sites.
> >
> > My initial thought is utilising field collapsing patch to collapse the
> > search results on siteId field. You can imagine that this will create a
> > series of "buckets of results", each bucket representing results from a
> > site. After that, 2 or 3 buckets will randomly be selected from which I
> > will
> > randomly select one or two results from. However, since I want these
> > sponsored results to be relevant to user queries, I'd like only want to
> > have
> > the first 30 results in each buckets.
> >
> > Obviously, it's desirable that if the user refreshes the page, new
> > sponsored
> > results will be displayed. On the other hand, I also want to have the
> > advantages of Solr cache.
> >
> > What would be the best way to implement this functionality? Thanks.
> >
> > Cheers,
> > Cuong
> >
>
>
>
> --
> Alexander Ramos Jardim
>



-- 
Regards,

Cuong Hoang


Re: Solrj + Multicore

2008-06-03 Thread Ryan McKinley



This way I don't connect:
new CommonsHttpSolrServer("http://localhost:8983/solr/idxItem")



this is how you need to connect... otherwise nothing will work.

Perhaps we should throw an exception if you initialize a URL that  
contains "?"
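
So, for a multicore setup, something like this should work (a sketch; the core
name is just the one from this thread):

  // one SolrServer instance per core -- the core name is part of the URL path,
  // not a request parameter:
  SolrServer itemServer = new CommonsHttpSolrServer("http://localhost:8983/solr/idxItem");

  // updates and queries then go to .../solr/idxItem/update and
  // .../solr/idxItem/select automatically.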


ryan



Re: How to describe 2 entities in dataConfig for the DataImporter?

2008-06-03 Thread Noble Paul നോബിള്‍ नोब्ळ्
hi julio,
You must create an extra field for 'comboid', because you really need
the 'id' for your sub-entities. Your data-config must look as follows.
The pet also has a field called 'id'; that is not a good idea. Call it
'petid' or something (both in the data-config and in schema.xml). Please make
sure that the field names are unique.



   
   
   
   

   
   
   
   
   
   


On Wed, Jun 4, 2008 at 5:50 AM, Julio Castillo <[EMAIL PROTECTED]> wrote:
> Hi Noble,
> I had forgotten to also list comboId as a uniqueKey in the schema.xml file.
> But that didn't make a difference.
> It still complained about the "Document [null] missing required field: id"
> for each row it ran into of the outer entity.
>
> If you look at the debug output of the entity:pets (see below on original
> message).
> The query looks like this:
> "SELECT id,name,birth_date,type_id FROM pets WHERE owner_id='owners-1'
>
> This is the problem lies, because, the owner_id in the pets table is
> currently a number and thus will not match the modified combo id generated
> for the owners' id column.
>
> So, somehow, I need to be able to either remove the 'owners-' suffix before
> comparing, or append the same suffix to the pets.owner_id value prior to
> comparing.
>
> Thanks
>
> ** julio
>
> -Original Message-
> From: Noble Paul നോബിള്‍ नोब्ळ् [mailto:[EMAIL PROTECTED]
> Sent: Monday, June 02, 2008 9:20 PM
> To: solr-user@lucene.apache.org
> Subject: Re: How to describe 2 entities in dataConfig for the DataImporter?
>
> hi Julio,
> delete my previous response. In your schema , 'id' is the uniqueKey.
> make  'comboid' the unique key. Because that is the target field name coming
> out of the entity 'owners'
>
> --Noble
>
> On Tue, Jun 3, 2008 at 9:46 AM, Noble Paul നോബിള്‍ नोब्ळ्
> <[EMAIL PROTECTED]> wrote:
>> The field 'id' is repeated for pet also rename it to something else
>> say  >   query="SELECT id,name,birth_date,type_id FROM pets WHERE
>> owner_id='${owners.id}'"
>>   parentDeltaQuery="SELECT id FROM owners WHERE
>> id=${pets.owner_id}">
>>   
>> 
>>
>> --Noble
>>
>> On Tue, Jun 3, 2008 at 3:28 AM, Julio Castillo <[EMAIL PROTECTED]>
> wrote:
>>> Shalin,
>>> I experimented with it, and the null pointer exception has been taken
>>> care of. Thank you.
>>>
>>> I have a different problem now. I believe it is a
>>> syntax/specification problem.
>>>
>>> When importing data, I got the following exceptions:
>>> SEVERE: Exception while adding:
>>> SolrInputDocumnt[{comboId=comboId(1.0)={owners-9},
>>> userName=userName(1.0)={[David, Schroeder]}}]
>>>
>>> org.apache.solr.common.SolrException: Document [null] missing
>>> required
>>> field: id
>>>at
>>>
> org.apache.solr.update.DocumentBuilder.toDocument(DocumentBuilder.java:289)
>>>at
>>> org.apache.solr.handler.dataimport.DataImportHandler$1.upload(DataImp
>>> ortHand
>>> ler.java:263)
>>>...
>>>
>>> The problem arises the moment I try to include nested entities (e.g.
>>> pets -the problem does not occur if I don't use the transformer, but
>>> I have to use the transformer because other unrelated entities also have
> id's).
>>> My data config file looks as follows.
>>>
>>> 
>>>  
>>>>>query="select id,first_name,last_name FROM owners"
>>>transformer="TemplateTransformer">
>>> template="owners-${owners.id}"/>
>>>
>>>
>>>
>>>>>query="SELECT id,name,birth_date,type_id FROM pets
>>> WHERE owner_id='${owners.id}'"
>>>parentDeltaQuery="SELECT id FROM owners WHERE
>>> id=${pets.owner_id}">
>>>
>>>
>>>
>>>
>>>
>>>  
>>> 
>>>
>>> The debug output of the data import looks as follows:
>>>
>>> 
>>> - 
>>>  - 
>>>- 
>>>  select id,first_name,last_name FROM owners
>>>  0:0:0.15
>>>  --- row #1-
>>>  1
>>>  George
>>>  Franklin
>>>  -
>>>  - 
>>>-
>>>owners-1
>>>George
>>>Franklin
>>>-
>>>- 
>>>  SELECT id,name,birth_date,type_id FROM
>>> pets WHERE owner_id='owners-1'
>>>  0:0:0.0
>>>  
>>>  
>>>  
>>> + 
>>> 
>>>
>>> Thanks again
>>>
>>> ** julio
>>>
>>>
>>> -Original Message-
>>> From: Shalin Shekhar Mangar [mailto:[EMAIL PROTECTED]
>>> Sent: Saturday, May 31, 2008 10:26 AM
>>> To: solr-user@lucene.apache.org
>>> Subject: Re: How to describe 2 entities in dataConfig for the
> DataImporter?
>>>
>>> Hi Julio,
>>>
>>> I've fixed the bug, can you please replace the exiting
>>> TemplateTransformer.java in the SOLR-469.patch and use the attached
>>> TemplateTransformer.java file. We'll add the changes to our next patch.
>>> Sorry for all the trouble.
>>>
>>> On Sat, May 31, 2008 at 10:31 PM, Noble

Re: How to describe 2 entities in dataConfig for the DataImporter?

2008-06-03 Thread Noble Paul നോബിള്‍ नोब्ळ्
The id in pet should be aliased to 'petid'; because id is coming
from both entities, there is a conflict.

  
  
  
  

  
  
  
  
  
  


On Wed, Jun 4, 2008 at 10:37 AM, Noble Paul നോബിള്‍ नोब्ळ्
<[EMAIL PROTECTED]> wrote:
> hi julio,
> You must create an extra field for 'comboid' because you really need
> the 'id' for your sub-entities. Your data-config must look as follows.
> The pet also has a field called 'id' . It is not a good idea. call it
> 'petid' or something (both in dataconfig and schema.xml). Please make
> sure that the field names are unique .
>
>
>query="select id,first_name,last_name FROM owners"
>   transformer="TemplateTransformer">
>   
>   
>   
>   
>
>  query="SELECT id,name,birth_date,type_id FROM pets WHERE
> owner_id='${owners.id}'"
>   parentDeltaQuery="SELECT id FROM owners WHERE
> id=${pets.owner_id}">
>   
>   
>   
>   
>   
>
>
> On Wed, Jun 4, 2008 at 5:50 AM, Julio Castillo <[EMAIL PROTECTED]> wrote:
>> Hi Noble,
>> I had forgotten to also list comboId as a uniqueKey in the schema.xml file.
>> But that didn't make a difference.
>> It still complained about the "Document [null] missing required field: id"
>> for each row it ran into of the outer entity.
>>
>> If you look at the debug output of the entity:pets (see below on original
>> message).
>> The query looks like this:
>> "SELECT id,name,birth_date,type_id FROM pets WHERE owner_id='owners-1'
>>
>> This is the problem lies, because, the owner_id in the pets table is
>> currently a number and thus will not match the modified combo id generated
>> for the owners' id column.
>>
>> So, somehow, I need to be able to either remove the 'owners-' suffix before
>> comparing, or append the same suffix to the pets.owner_id value prior to
>> comparing.
>>
>> Thanks
>>
>> ** julio
>>
>> -Original Message-
>> From: Noble Paul നോബിള്‍ नोब्ळ् [mailto:[EMAIL PROTECTED]
>> Sent: Monday, June 02, 2008 9:20 PM
>> To: solr-user@lucene.apache.org
>> Subject: Re: How to describe 2 entities in dataConfig for the DataImporter?
>>
>> hi Julio,
>> delete my previous response. In your schema , 'id' is the uniqueKey.
>> make  'comboid' the unique key. Because that is the target field name coming
>> out of the entity 'owners'
>>
>> --Noble
>>
>> On Tue, Jun 3, 2008 at 9:46 AM, Noble Paul നോബിള്‍ नोब्ळ्
>> <[EMAIL PROTECTED]> wrote:
>>> The field 'id' is repeated for pet also rename it to something else
>>> say  >>   query="SELECT id,name,birth_date,type_id FROM pets WHERE
>>> owner_id='${owners.id}'"
>>>   parentDeltaQuery="SELECT id FROM owners WHERE
>>> id=${pets.owner_id}">
>>>   
>>> 
>>>
>>> --Noble
>>>
>>> On Tue, Jun 3, 2008 at 3:28 AM, Julio Castillo <[EMAIL PROTECTED]>
>> wrote:
 Shalin,
 I experimented with it, and the null pointer exception has been taken
 care of. Thank you.

 I have a different problem now. I believe it is a
 syntax/specification problem.

 When importing data, I got the following exceptions:
 SEVERE: Exception while adding:
 SolrInputDocumnt[{comboId=comboId(1.0)={owners-9},
 userName=userName(1.0)={[David, Schroeder]}}]

 org.apache.solr.common.SolrException: Document [null] missing
 required
 field: id
at

>> org.apache.solr.update.DocumentBuilder.toDocument(DocumentBuilder.java:289)
at
 org.apache.solr.handler.dataimport.DataImportHandler$1.upload(DataImp
 ortHand
 ler.java:263)
...

 The problem arises the moment I try to include nested entities (e.g.
 pets -the problem does not occur if I don't use the transformer, but
 I have to use the transformer because other unrelated entities also have
>> id's).
 My data config file looks as follows.

 
  
>>>query="select id,first_name,last_name FROM owners"
transformer="TemplateTransformer">
> template="owners-${owners.id}"/>



>>>query="SELECT id,name,birth_date,type_id FROM pets
 WHERE owner_id='${owners.id}'"
parentDeltaQuery="SELECT id FROM owners WHERE
 id=${pets.owner_id}">





  
 

 The debug output of the data import looks as follows:

 
 - 
  - 
- 
  select id,first_name,last_name FROM owners
  0:0:0.15
  --- row #1-
  1
  George
  Franklin
  -
  - 
-
owners-1
George
Franklin
-
- 
  SELECT id,name,birth_