Re: Problem with simple use of DIH
I'm trying to use DataImportHandler to load my index and having some strange results. I have two tables in my database. DPRODUC contains products and FSKUMAS contains the skus related to each product. This is the data-config I'm using.

<dataConfig>
  <dataSource type="JdbcDataSource"
              driver="com.ibm.as400.access.AS400JDBCDriver"
              url="jdbc:as400:IWAVE;prompt=false;naming=system"
              user="IPGUI"
              password="IPGUI"/>
  <document>
    <entity name="dproduc"
            query="select dprprd, dprdes from dproduc where dprprd like 'F%'">
      <field column="dprprd" name="id" />
      <field column="dprdes" name="name" />
      <entity name="fskumas"
              query="select fsksku, fcoclr, fszsiz, fskret from fskumas where dprprd='${dproduc.DPRPRD}'">
        <field column="fsksku" name="sku" />
        <field column="fcoclr" name="color" />
        <field column="fszsiz" name="size" />
        <field column="fskret" name="price" />
      </entity>
    </entity>
  </document>
</dataConfig>

What is the primary key of dproduc table? If it is dprprd can you try adding pk="dprprd" to entity name="dproduc"?

<entity name="dproduc" pk="dprprd" query="select dprprd, dprdes from dproduc where dprprd like 'F%'">
Re: Clustering Query Solrj
On Dec 27, 2009, at 2:20 AM, Allahbaksh Asadullah wrote:

How do I set Clustering true in Solrj. How do I access clustering result. I am using Solr 1.4.

It isn't a client-side setting, it's a setting when launching Solr. See here for instructions... http://wiki.apache.org/solr/ClusteringComponent

Erik
Re: Problem with simple use of DIH
did you run it w/o the debug?

On Sun, Dec 27, 2009 at 6:31 PM, AHMET ARSLAN iori...@yahoo.com wrote:

What is the primary key of dproduc table? If it is dprprd can you try adding pk="dprprd" to entity name="dproduc"?

--
Noble Paul | Systems Architect | AOL | http://aol.com
Re: Clustering Query Solrj
Hi Erik,

I have already set clustering to true on the server side, but I want to get the clustering result through SolrJ. Just as I can get the facet response, can I get the clustering response (docId and label) through SolrJ?

Regards,
Allahbaksh

On Sun, Dec 27, 2009 at 6:58 PM, Erik Hatcher erik.hatc...@gmail.com wrote:

It isn't a client-side setting, it's a setting when launching Solr. See here for instructions... http://wiki.apache.org/solr/ClusteringComponent

--
Allahbaksh Mohammedali Asadullah, Software Engineering Technology Labs, Infosys Technologies Limited, Electronic City, Hosur Road, Bangalore 560 100, India. (Board: 91-80-28520261 | Extn: 73927 | Direct: 41173927. Fax: 91-80-28520362 | Mobile: 91-9845505322.)
Re: Clustering Query Solrj
Hi Erik, I have already set clustering to true on the server side, but I want to get the clustering result through SolrJ. Just as I can get the facet response, can I get the clustering response (docId and label) through SolrJ?

By SolrJ, do you mean EmbeddedSolrServer? If yes, I think you can enable it with

    System.setProperty("solr.clustering.enabled", "true");

as the first line of your main program. Alternatively, you can enable it by hard-coding it in solrconfig.xml:

    <searchComponent name="clusteringComponent" enable="true" class="org.apache.solr.handler.clustering.ClusteringComponent">

I didn't try it myself, but to query it through SolrServer you can activate it with the qt parameter:

    // solr is an existing SolrServer instance
    ModifiableSolrParams params = new ModifiableSolrParams();
    params.set("qt", "/clustering");
    params.set("q", "apple");
    params.set("carrot.title", "myTitle");
    params.set("clustering", true);
    QueryResponse response = solr.query(params);
    System.out.println("response = " + response);

Hope this helps.
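On the question of getting (docId, label) pairs out of the result: in SolrJ the raw response is reachable through QueryResponse.getResponse(), which returns a NamedList. The sketch below only illustrates walking such a section once you have it. It assumes the clustering section is a list of clusters, each carrying a "labels" list and a "docs" list, as in the sample output on the ClusteringComponent wiki page. Plain Java maps stand in for the NamedList, and the class name, method names and sample values are all hypothetical, so check the layout against your own response before relying on it.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Sketch: flatten a clustering section into "docId -> label" pairs. */
class ClusterPairs {

    /**
     * Assumption: each cluster has a "labels" list and a "docs" list.
     * Emits one "docId -> label" string per (doc, label) combination.
     */
    static List<String> pairs(List<Map<String, List<String>>> clusters) {
        List<String> out = new ArrayList<>();
        for (Map<String, List<String>> cluster : clusters) {
            List<String> labels = cluster.getOrDefault("labels", Collections.emptyList());
            List<String> docs = cluster.getOrDefault("docs", Collections.emptyList());
            for (String doc : docs) {
                for (String label : labels) {
                    out.add(doc + " -> " + label);
                }
            }
        }
        return out;
    }

    /** Made-up sample data standing in for a real Solr response. */
    static List<Map<String, List<String>>> sample() {
        Map<String, List<String>> first = new LinkedHashMap<>();
        first.put("labels", Arrays.asList("Apple"));
        first.put("docs", Arrays.asList("doc1", "doc2"));
        Map<String, List<String>> second = new LinkedHashMap<>();
        second.put("labels", Arrays.asList("Fruit"));
        second.put("docs", Arrays.asList("doc3"));
        return Arrays.asList(first, second);
    }

    public static void main(String[] args) {
        // Print every pair found in the sample clusters.
        pairs(sample()).forEach(System.out::println);
    }
}
```

With a real response the same two nested loops would apply; only the way the lists are fetched from the NamedList differs.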
Re: Problem with simple use of DIH
I did run it without debug and the result was that 0 documents were processed. The problem seems to be with the field tags that I was using to map from the table column names to the schema.xml field names. I switched to using an AS clause in the SQL statement instead and it worked. I think the column names may be case-sensitive, although I haven't proven that to be the case. I did discover that references to column names in the velocity template are case sensitive; ${dproduc.DPRPRD} works and ${dproduc.dprprd} does not.

Thanks,
Jay

2009/12/27 Noble Paul നോബിള് नोब्ळ् noble.p...@corp.aol.com

did you run it w/o the debug?
Re: Clustering Query Solrj
Hi Ahmet,

I was looking for the same. Thanks for your early response.

Warm Regards,
Allahbaksh

On Sun, Dec 27, 2009 at 7:22 PM, AHMET ARSLAN iori...@yahoo.com wrote:

I didn't try it myself, but to query it through SolrServer you can activate it with the qt parameter:

    ModifiableSolrParams params = new ModifiableSolrParams();
    params.set("qt", "/clustering");
    params.set("q", "apple");
    params.set("carrot.title", "myTitle");
    params.set("clustering", true);
    QueryResponse response = solr.query(params);
Re: Using solr with the new TokenStream API
Is there a way to add this jar file to the classpath during Tomcat startup?

You can put your jar file into the $CATALINA_HOME/lib directory.
Re: Enable Clustering for Solr war
Hi, I'm trying to get clustering setup for Solr 1.4 in war mode on tomcat 6. I read the instructions on the wiki, checked out the trunk and got the downloaded libraries. There's no instruction on what to do with them so I copied them to tomcat/lib directory. I set property to enable clustering in tomcat.

You can put them into $solrhome/lib directory. From readme.txt:

lib/
    This directory is optional. If it exists, Solr will load any Jars found in this directory and use them to resolve any plugins specified in your solrconfig.xml or schema.xml (ie: Analyzers, Request Handlers, etc...).

I added these jars to lib and using clustering component without problems:

apache-solr-clustering-1.4.0.jar
carrot2-mini-3.1.0.jar
colt-1.2.0.jar
commons-lang-2.4.jar
ehcache-1.6.2.jar
google-collections-1.0-rc2.jar
jackson-core-asl-0.9.9-6.jar
jackson-mapper-asl-0.9.9-6.jar
log4j-1.2.14.jar
nni-1.0.0.jar
pcj-1.2.jar
simple-xml-1.7.3.jar
Re: Enable Clustering for Solr war
Hey thanks. That worked. On Sun, 2009-12-27 at 13:12 -0800, AHMET ARSLAN wrote: apache-solr-clustering-1.4.0.jar
Re: absolute search
uhm, I am sorry, this is the debug :)

<lst name="debug">
  <str name="rawquerystring">book</str>
  <str name="querystring">book</str>
  <str name="parsedquery">+DisjunctionMaxQuery((name:book)~0.01) ()</str>
  <str name="parsedquery_toString">+(name:book)~0.01 ()</str>
  <lst name="explain">
    <str name="19534">
7.903358 = (MATCH) sum of:
  7.903358 = (MATCH) fieldWeight(name:book in 19533), product of:
    1.0 = tf(termFreq(name:book)=1)
    7.903358 = idf(docFreq=79, maxDocs=79649)
    1.0 = fieldNorm(field=name, doc=19533)
    </str>
    <str name="5925">
3.951679 = (MATCH) sum of:
  3.951679 = (MATCH) fieldWeight(name:book in 5924), product of:
    1.0 = tf(termFreq(name:book)=1)
    7.903358 = idf(docFreq=79, maxDocs=79649)
    0.5 = fieldNorm(field=name, doc=5924)
    </str>
    <str name="5933">
3.951679 = (MATCH) sum of:
  3.951679 = (MATCH) fieldWeight(name:book in 5932), product of:
    1.0 = tf(termFreq(name:book)=1)
    7.903358 = idf(docFreq=79, maxDocs=79649)
    0.5 = fieldNorm(field=name, doc=5932)
    </str>
    <str name="8049">
3.951679 = (MATCH) sum of:
  3.951679 = (MATCH) fieldWeight(name:book in 8048), product of:
    1.0 = tf(termFreq(name:book)=1)
    7.903358 = idf(docFreq=79, maxDocs=79649)
    0.5 = fieldNorm(field=name, doc=8048)
    </str>
    <str name="9358">
3.951679 = (MATCH) sum of:
  3.951679 = (MATCH) fieldWeight(name:book in 9357), product of:
    1.0 = tf(termFreq(name:book)=1)
    7.903358 = idf(docFreq=79, maxDocs=79649)
    0.5 = fieldNorm(field=name, doc=9357)
    </str>
  </lst>
  <str name="QParser">DisMaxQParser</str>
  <null name="altquerystring"/>
  <null name="boostfuncs"/>
  <arr name="filter_queries">
    <str/>
  </arr>
  <arr name="parsed_filter_queries"/>
  <lst name="timing">
    <double name="time">0.0</double>
    <lst name="prepare">
      <double name="time">0.0</double>
      <lst name="org.apache.solr.handler.component.QueryComponent">
        <double name="time">0.0</double>
      </lst>
      <lst name="org.apache.solr.handler.component.FacetComponent">
        <double name="time">0.0</double>
      </lst>
      <lst name="org.apache.solr.handler.component.MoreLikeThisComponent">
        <double name="time">0.0</double>
      </lst>
      <lst name="org.apache.solr.handler.component.HighlightComponent">
        <double name="time">0.0</double>
      </lst>
      <lst name="org.apache.solr.handler.component.StatsComponent">
        <double name="time">0.0</double>
      </lst>
      <lst name="org.apache.solr.handler.component.DebugComponent">
        <double name="time">0.0</double>
      </lst>
    </lst>
    <lst name="process">
      <double name="time">0.0</double>
      <lst name="org.apache.solr.handler.component.QueryComponent">
        <double name="time">0.0</double>
      </lst>
      <lst name="org.apache.solr.handler.component.FacetComponent">
        <double name="time">0.0</double>
      </lst>
      <lst name="org.apache.solr.handler.component.MoreLikeThisComponent">
        <double name="time">0.0</double>
      </lst>
      <lst name="org.apache.solr.handler.component.HighlightComponent">
        <double name="time">0.0</double>
      </lst>
      <lst name="org.apache.solr.handler.component.StatsComponent">
        <double name="time">0.0</double>
      </lst>
      <lst name="org.apache.solr.handler.component.DebugComponent">
        <double name="time">0.0</double>
      </lst>
    </lst>
  </lst>
</lst>

Erick Erickson wrote:

Hmmm, nothing jumps out at me. What does Luke show you is actually in your index in the field in question? And what does adding debugQuery=on to the query show?

On Thu, Dec 24, 2009 at 8:44 PM, Olala hthie...@gmail.com wrote:

Oh, yes, that is my schema config:

<fieldType name="text" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true" />
    <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="1" catenateNumbers="1" catenateAll="0" splitOnCaseChange="1"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.SnowballPorterFilterFactory" language="English" protected="protwords.txt"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true" />
    <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="0" catenateNumbers="0" catenateAll="0" splitOnCaseChange="1"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.SnowballPorterFilterFactory" language="English" protected="protwords.txt"/>
  </analyzer>
</fieldType>

<field name="name" type="text" indexed="true" stored="true" multiValued="true"/>

And my solrconfig.xml for search in dismax:

<requestHandler name="dismax" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="defType">dismax</str>
    <str name="echoParams">explicit</str>
    <float name="tie">0.0</float>
    <str name="qf"> name </str>
    <str name="fl"> name,content </str>
    <str name="mm"> 100% </str>
    <int name="ps">100</int>
    <str
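
The explain entries in the debug output can be checked by hand, which also shows why document 19534 ranks first. The sketch below assumes the default Lucene similarity used by Solr 1.4, where tf is the square root of the term frequency and idf is 1 + ln(maxDocs / (docFreq + 1)). The class and method names are only illustrative. The inputs (termFreq=1, docFreq=79, maxDocs=79649, fieldNorm 1.0 or 0.5) are taken from the explain output.

```java
/** Sketch: reproduce the fieldWeight values shown in the explain output. */
class ExplainMath {

    /** Default-similarity idf: 1 + ln(maxDocs / (docFreq + 1)). */
    static double idf(int docFreq, int maxDocs) {
        return 1.0 + Math.log((double) maxDocs / (docFreq + 1));
    }

    /** Default-similarity tf: square root of the raw term frequency. */
    static double tf(int termFreq) {
        return Math.sqrt(termFreq);
    }

    /** fieldWeight as listed in explain: tf * idf * fieldNorm. */
    static double fieldWeight(int termFreq, int docFreq, int maxDocs, double fieldNorm) {
        return tf(termFreq) * idf(docFreq, maxDocs) * fieldNorm;
    }

    public static void main(String[] args) {
        // Document 19534: fieldNorm 1.0, prints 7.903358
        System.out.printf("%.6f%n", fieldWeight(1, 79, 79649, 1.0));
        // Documents 5925, 5933, 8049, 9358: fieldNorm 0.5, prints 3.951679
        System.out.printf("%.6f%n", fieldWeight(1, 79, 79649, 0.5));
    }
}
```

Since tf and idf are identical for all five hits, the only thing separating them is fieldNorm. Under default length normalization (1 / sqrt(number of terms), stored with lossy precision) a norm of 1.0 suggests a one-term name field and 0.5 a field of roughly four terms.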
Using IDF to find Collocations and SIPs . . ?
I am trying to write a query analyzer to pull:

1. Common phrases (also known as Collocations) within a query
2. Highly unusual phrases (also known as Statistically Improbable Phrases or SIPs) within a query

The Collocations would be similar to facets, except I am also trying to get multi-word phrases as well as single terms. So I suppose I could write something that does a chained query off the facet query, looking for words in proximity. Conceptually (as I understand it) this should just be a question of using the IDF (inverse document frequency, i.e. a measure of how rare the term is across the index).

* Has anyone tried to write an analyzer that looks for the words that typically occur within a given proximity of another word?

The highly unusual phrases, on the other hand, require getting a handle on the IDF, which at present only appears to be available via the explain function of debugging.

* Has anyone written something to go directly after the IDF score only?
* If I do have to go down the path of writing this from scratch, is the org.apache.lucene.search.Similarity class the one to leverage?

Most grateful for any feedback or insights,
Christopher
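
The proximity question can be prototyped outside Solr before committing to a custom analyzer. The sketch below is not a Lucene analyzer and uses no Lucene classes. It assumes whitespace-tokenized text and simply counts ordered word pairs that occur within a fixed window of each other. The class name, method name and sample strings are hypothetical.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

/** Sketch: count word pairs that occur within a fixed window of each other. */
class CollocationSketch {

    /**
     * Counts ordered pairs "a b" where b follows a by at most `window` positions.
     * Tokens are lower-cased and split on whitespace; no stemming or stop words.
     */
    static Map<String, Integer> pairCounts(List<String> docs, int window) {
        Map<String, Integer> counts = new HashMap<>();
        for (String doc : docs) {
            String[] tokens = doc.toLowerCase(Locale.ROOT).trim().split("\\s+");
            for (int i = 0; i < tokens.length; i++) {
                for (int j = i + 1; j < tokens.length && j <= i + window; j++) {
                    counts.merge(tokens[i] + " " + tokens[j], 1, Integer::sum);
                }
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // Made-up documents; a real run would read field values from the index.
        List<String> docs = Arrays.asList("new york city", "New York times", "york new");
        pairCounts(docs, 1).forEach((pair, n) -> System.out.println(pair + " = " + n));
    }
}
```

Frequent pairs are collocation candidates. For SIPs, one option would be to compare each pair's count in the query or document against its count in the whole index, which is the same rarity idea that IDF expresses for single terms.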
Re: Problem with simple use of DIH
The field names are case sensitive. But if the field tags are missing, they are mapped to the corresponding Solr fields in a case-insensitive way. Apparently all the fields come out of your database in ALL CAPS, so you should put the 'column' values in ALL CAPS too.

On Sun, Dec 27, 2009 at 9:03 PM, Jay Fisher jay.l.fis...@gmail.com wrote:

I did run it without debug and the result was that 0 documents were processed. The problem seems to be with the field tags that I was using to map from the table column names to the schema.xml field names. I switched to using an AS clause in the SQL statement instead and it worked. I think the column names may be case-sensitive, although I haven't proven that to be the case. I did discover that references to column names in the velocity template are case sensitive; ${dproduc.DPRPRD} works and ${dproduc.dprprd} does not.

Thanks,
Jay

--
Noble Paul | Systems Architect | AOL | http://aol.com
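
The difference between the two lookups described in this thread can be illustrated with plain maps. This is not DataImportHandler code. It only assumes that the JDBC driver hands back a row keyed by upper-case column names, that an explicit column reference is matched exactly, and that the implicit column-to-field mapping ignores case. The class name and row values are made up.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

/** Sketch: exact vs case-insensitive column lookup on an upper-case row. */
class ColumnLookup {

    /** A row as the driver is assumed to return it: upper-case keys. */
    static Map<String, Object> sampleRow() {
        Map<String, Object> row = new HashMap<>();
        row.put("DPRPRD", "F1001");          // hypothetical product id
        row.put("DPRDES", "Fleece jacket");  // hypothetical description
        return row;
    }

    /** Exact match, like an explicit column="..." or ${dproduc.X} reference. */
    static Object exact(Map<String, Object> row, String column) {
        return row.get(column);
    }

    /** Case-insensitive match, like the implicit mapping when no field tag is given. */
    static Object ignoreCase(Map<String, Object> row, String column) {
        Map<String, Object> folded = new TreeMap<>(String.CASE_INSENSITIVE_ORDER);
        folded.putAll(row);
        return folded.get(column);
    }

    public static void main(String[] args) {
        Map<String, Object> row = sampleRow();
        System.out.println(exact(row, "dprprd"));      // no match: null
        System.out.println(exact(row, "DPRPRD"));      // match: F1001
        System.out.println(ignoreCase(row, "dprprd")); // match: F1001
    }
}
```

This mirrors what Jay observed: lower-case column values in the field tags found nothing, while upper-case names, or no field tags at all, would pick the values up.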