DIH delta import - last modified date

2010-01-19 Thread Yao Ge

I am struggling with the concept of delta import in DIH. According to the
documentation, delta import automatically records the last index timestamp
and makes it available for use in the delta query. However, in many cases
the last_modified timestamp in the database lags behind the current time,
so the last index timestamp is not a good reference for the delta query. Can
I pick a different mechanism to generate last_index_time, for example using
a timestamp computed from the database (such as from a column of a table)?
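One possible workaround, sketched below, is to ignore the built-in ${dataimporter.last_index_time} and drive the delta query from a timestamp tracked in the database itself. This is not from the documentation; the table and column names (item, last_modified, index_log, last_run) are hypothetical:

```xml
<!-- data-config.xml fragment (hypothetical schema): the deltaQuery
     compares against a timestamp the application records in its own
     index_log table instead of ${dataimporter.last_index_time}. -->
<entity name="item" pk="id"
        query="SELECT * FROM item"
        deltaImportQuery="SELECT * FROM item WHERE id='${dataimporter.delta.id}'"
        deltaQuery="SELECT id FROM item
                    WHERE last_modified &gt; (SELECT last_run FROM index_log)"/>
```

The application would then update index_log.last_run itself after each successful import, so clock skew between the database and the indexing host no longer matters.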
-- 
View this message in context: 
http://old.nabble.com/DIH-delta-import---last-modified-date-tp27231449p27231449.html
Sent from the Solr - User mailing list archive at Nabble.com.



DIH - Export to XML

2009-10-30 Thread Yao Ge

For the Data Import Handler, is there a way to dump data to an XML file in
the Solr feed format?
-- 
View this message in context: 
http://old.nabble.com/DIH---Export-to-XML-tp26138213p26138213.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: Google Side-By-Side UI

2009-10-02 Thread Yao Ge

Yes. I think this would be a very helpful tool for tuning search relevancy -
you can run a controlled experiment with your target audience to understand
their responses to parameter changes. We plan to use this feature to
benchmark Lucene/Solr against our in-house commercial search engine - it
will be an interesting test.


Lance Norskog-2 wrote:
 
 http://googleenterprise.blogspot.com/2009/08/compare-enterprise-search-relevance.html
 
 This is really cool, and a version for Solr would help in doing
 relevance experiments. We don't need the select A or B feature, just
 seeing search result sets side-by-side would be great.
 
 -- 
 Lance Norskog
 goks...@gmail.com
 
 

-- 
View this message in context: 
http://www.nabble.com/Google-Side-By-Side-UI-tp25719087p25719806.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: Item Facet

2009-08-07 Thread Yao Ge

Are your product_name* fields numeric fields (integer or float)? 


Dals wrote:
 
 Hi...
 
 Is there any way to group values like shopping.yahoo.com or
 shopper.cnet.com do?
 
 For instance, I have documents like:
 
 doc1 - product_name1 - value1
 doc2 - product_name1 - value2
 doc3 - product_name1 - value3
 doc4 - product_name2 - value4
 doc5 - product_name2 - value5
 doc6 - product_name2 - value6
 
 I'd like to have a result grouping by product name with the value
 range per product. Something like:
 
 product_name1 - (value1 to value3)
 product_name2 - (value4 to value6)
 
 It is not like the current facet because the information is grouped by
 item, not the entire result.
 
 Any idea?
 
 Thanks!
 
 David Lojudice Sobrinho
 
 

-- 
View this message in context: 
http://www.nabble.com/Item-Facet-tp24853669p24865535.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: Limiting facets for huge data - setting indexed=false in schema.xml

2009-07-31 Thread Yao Ge

Having a large number of fields is not the same as having a large number of
facets. Facets are something you display to users as an aid for query
refinement or navigation. There is no way for a user to use 3700 facets at
the same time. So it is more a question of how to determine, at search time,
which facets to fetch based on the user's actions or on a predefined
configuration. I have written an application with some 30 facetable fields
on millions of records, and I also ran into the cost of calculating all
facets, since the server is limited in cache space and in CPU cycles
available for facet calculations. I then asked myself: why display all these
facets regardless of whether the user wants to see them? I changed the
approach to fetch only a minimal set of facets by default and make the rest
of the facet fields available on demand (using AJAX). I was able to
dramatically improve response time by spreading the facet loading over time.
There are still issues with the total facet cache size when you have a large
number of available facets, but you need to realistically evaluate what a
large number of facets means to a user. I don't think a typical user
interface showing more than 10 filters at the same time is any more
effective than starting with a small number of filters and progressively
showing more on demand (hierarchical facets?)
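As a concrete illustration of the two-stage approach described above (host, port, and field names are hypothetical examples, not from the thread):

```
# Initial page load: only a minimal set of default facets
http://localhost:8983/solr/select?q=*:*&facet=true&facet.field=brand&facet.field=category&facet.limit=10

# Later AJAX call, issued only when the user expands another facet panel
http://localhost:8983/solr/select?q=*:*&rows=0&facet=true&facet.field=color
```

The second request sets rows=0 because only the facet counts are needed; the documents were already fetched by the first request.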


Rahul R wrote:
 
 Hello,
 We are trying to get Solr to work for a really huge parts database.
 Details
 of the database
 - 55 million parts
 - Totally 3700 properties (facets). But each record will not have value
 for
 all properties.
 - Most of these facets are defined as dynamic fields within the Solr Index
 
 We were getting really unacceptable response times while doing faceting and
 searches on an index created from this database. With only one user using
 the system, query times are in excess of 1 minute. With more users using
 the system concurrently, response times are even higher.
 
 We thought that by limiting the number of properties that are available
 for
 faceting, the performance can be improved. To test this, we enabled only 6
 properties for faceting by setting indexed=true (in schema.xml) for only
 these properties. All other properties which are defined as dynamic
 properties had indexed=false. The observations after this change :
 
 - Index size reduced by a meagre 5% only
 - Performance did not improve. In fact, during the PSR run we observed that
 it degraded.
 
 My questions:
  - Will reducing the number of facets improve faceting and search
 performance ?
 - Is there a better way to reduce the number of facets ?
 - Will having a large number of properties defined as dynamic fields,
 reduce
 performance ?
 
 Thank you.
 
 Regards
 Rahul
 
 

-- 
View this message in context: 
http://www.nabble.com/Limiting-facets-for-huge-data---setting-indexed%3Dfalse-in-schema.xml-tp24751763p24761778.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: Solr's MLT query call doesn't work

2009-07-08 Thread Yao Ge

A couple of things: your mlt.fl value must be part of fl, and in this case
content_mlt is not included in fl. Also, I think the fl parameter value
needs to be comma separated; try fl=title,author,content_mlt,score
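Putting the two fixes together, the corrected request from the original post would look something like this (same host, port, and parameters as the original):

```
http://localhost:8080/solr/select?q=id:10&mlt=true&mlt.fl=content_mlt&mlt.maxqt=5&mlt.interestingTerms=details&fl=title,author,content_mlt,score
```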

-Yao

SergeyG wrote:
 
 Hi,
 
 Recently, while implementing the MoreLikeThis search, I've run into the
 situation when Solr's mlt query calls don't work. 
 
 More specifically, the following query:
 
 http://localhost:8080/solr/select?q=id:10&mlt=true&mlt.fl=content_mlt&mlt.maxqt=5&mlt.interestingTerms=details&fl=title+author+score
 
 brings back just the doc with id=10 and nothing else. While using the
 GetMethod approach (putting /mlt explicitely into the url), I got back
 some results.
 
 I've been trying to solve this problem for more than a week with no luck.
 If anybody has any hint, please help.
 
 Below, I put logs  outputs from 3 runs: a) Solr; b) GetMethod (/mlt); c)
 GetMethod (/select).
 
 Thanks a lot.
 
 Regards,
 Sergey Goldberg
 
 
 Here're the logs: 
 
 a) Solr (http://localhost:8080/solr/select)
 08.07.2009 15:50:33 org.apache.solr.core.SolrCore execute
 INFO: [] webapp=/solr path=/select
 params={fl=title+author+score&mlt.fl=content_mlt&q=id:10&mlt=true&mlt.interestingTerms=details&mlt.maxqt=5&wt=javabin&version=2.2}
 hits=1 status=0 QTime=172
 
 INFO MLTSearchRequestProcessor:49 - SolrServer url:
 http://localhost:8080/solr
 INFO MLTSearchRequestProcessor:67 - solrQuery
 q=id%3A10&mlt=true&mlt.fl=content_mlt&mlt.maxqt=5&mlt.interestingTerms=details&fl=title+author+score
 INFO MLTSearchRequestProcessor:73 - Number of docs found = 1
 INFO MLTSearchRequestProcessor:77 - title = SG_Book; score = 2.098612
 
 
 b) GetMethod (http://localhost:8080/solr/mlt)
 08.07.2009 16:55:44 org.apache.solr.core.SolrCore execute
 INFO: [] webapp=/solr path=/mlt
 params={fl=title+author+score&mlt.fl=content_mlt&q=id:10&mlt.maxqt=5&mlt.interestingTerms=details} status=0 QTime=15
 
 INFO MLT2SearchRequestProcessor:76 - <?xml version="1.0" encoding="UTF-8"?>
 <response>
 <lst name="responseHeader"><int name="status">0</int><int name="QTime">0</int></lst>
 <result name="match" numFound="1" start="0" maxScore="2.098612">
 <doc><float name="score">2.098612</float><arr name="author"><str>S.G.</str></arr><str name="title">SG_Book</str></doc>
 </result>
 <result name="response" numFound="4" start="0" maxScore="0.28923997">
 <doc><float name="score">0.28923997</float><arr name="author"><str>O. Henry</str><str>S.G.</str></arr><str name="title">Four Million, The</str></doc>
 <doc><float name="score">0.08667877</float><arr name="author"><str>Katherine Mosby</str></arr><str name="title">The Season of Lillian Dawes</str></doc>
 <doc><float name="score">0.07947738</float><arr name="author"><str>Jerome K. Jerome</str></arr><str name="title">Three Men in a Boat</str></doc>
 <doc><float name="score">0.047219563</float><arr name="author"><str>Charles Oliver</str><str>S.G.</str></arr><str name="title">ABC's of Science</str></doc>
 </result>
 <lst name="interestingTerms"><float name="content_mlt:ye">1.0</float><float name="content_mlt:tobin">1.0</float><float name="content_mlt:a">1.0</float><float name="content_mlt:i">1.0</float><float name="content_mlt:his">1.0</float></lst>
 </response>
 
 
 c) GetMethod (http://localhost:8080/solr/select)
 08.07.2009 17:06:45 org.apache.solr.core.SolrCore execute
 INFO: [] webapp=/solr path=/select
 params={fl=title+author+score&mlt.fl=content_mlt&q=id:10&mlt.maxqt=5&mlt.interestingTerms=details} hits=1 status=0 QTime=16
 
 INFO MLT2SearchRequestProcessor:80 - <?xml version="1.0" encoding="UTF-8"?>
 <response>
 <lst name="responseHeader"><int name="status">0</int><int name="QTime">16</int>
 <lst name="params"><str name="fl">title author score</str><str name="mlt.fl">content_mlt</str><str name="q">id:10</str><str name="mlt.maxqt">5</str><str name="mlt.interestingTerms">details</str></lst></lst>
 <result name="response" numFound="1" start="0" maxScore="2.098612">
 <doc><float name="score">2.098612</float><arr name="author"><str>S.G.</str></arr><str name="title">SG_Book</str></doc>
 </result>
 <lst name="debug"><str name="rawquerystring">id:10</str><str name="querystring">id:10</str><str name="parsedquery">id:10</str><str name="parsedquery_toString">id:10</str><lst name="explain"><str name="10">
 2.098612 = (MATCH) weight(id:10 in 3), product of:
   0.9994 = queryWeight(id:10), product of:
 2.0986123 = idf(docFreq=1, numDocs=5)
 0.47650534 = queryNorm
   2.0986123 = (MATCH) fieldWeight(id:10 in 3), product of:
 1.0 = tf(termFreq(id:10)=1)
 2.0986123 = idf(docFreq=1, numDocs=5)
 1.0 = fieldNorm(field=id, doc=3)
 </str></lst><str name="QParser">OldLuceneQParser</str><lst name="timing"><double name="time">16.0</double><lst name="prepare"><double name="time">0.0</double>
 <lst name="org.apache.solr.handler.component.QueryComponent"><double name="time">0.0</double></lst>
 <lst name="org.apache.solr.handler.component.FacetComponent"><double name="time">0.0</double></lst>
 <lst name="org.apache.solr.handler.component.MoreLikeThisComponent"><double name="time">0.0</double></lst>
 <lst name="org.apache.solr.handler.component.HighlightComponent"><double name="time">0.0</double></lst>
 <lst name="org.apache.solr.handler.component.DebugComponent"><double name="time">0.0</double></lst></lst>
 <lst name="process"><double name="time">16.0</double>
 <lst name="org.apache.solr.handler.component.QueryComponent"><double name="time">0.0</double></lst><lst
 

Re: Filtering MoreLikeThis results

2009-07-07 Thread Yao Ge

I am not sure about the parameters for the MLT requestHandler plugin. Can
one of you share the solrconfig.xml entry for MLT? Thanks in advance.
-Yao


Bill Au wrote:
 
 I have been using the StandardRequestHandler (ie /solr/select).  fq does
 work with the MoreLikeThisHandler.  I will switch to use that.  Thanks.
 
 Bill
 
 On Tue, Jul 7, 2009 at 11:02 AM, Marc Sturlese
 marc.sturl...@gmail.comwrote:
 

 At least in trunk, if you request:
 http://localhost:8084/solr/core_A/mlt?q=id:7468365&fq=price:[100 TO 200]
 it will filter the MoreLikeThis results


 Bill Au wrote:
 
  I think fq only works on the main response, not the mlt matches.  I
 found
  a
  couple of related jira:
 
  http://issues.apache.org/jira/browse/SOLR-295
  http://issues.apache.org/jira/browse/SOLR-281
 
  If I am reading them correctly, I should be able to use DisMax and
  MoreLikeThis together.  I will give that a try and report back.
 
  Bill
 
 
  On Tue, Jul 7, 2009 at 4:45 AM, Marc Sturlese
  marc.sturl...@gmail.comwrote:
 
 
  Using MoreLikeThisHandler you can use fq to filter your results. As
 far
  as
  I
  know bq are not allowed.
 
 
  Bill Au wrote:
  
   I have been trying to restrict MoreLikeThis results without any luck
  also.
   In additional to restricting the results, I am also looking to
  influence
   the
   scores similar to the way boost query (bq) works in the
   DisMaxRequestHandler.
  
   I think Solr's MoreLikeThis depends on Lucene's contrib queries
   MoreLikeThis, or at least it used to.  Has anyone looked into
 enhancing
   Solrs' MoreLikeThis to support bq and restricting mlt results?
  
   Bill
  
   On Mon, Jul 6, 2009 at 2:16 PM, Yao Ge yao...@gmail.com wrote:
  
  
   I could not find any support from
   http://wiki.apache.org/solr/MoreLikeThison
   how to restrict MLT results to certain subsets. I passed along a fq
   parameter and it is ignored. Since we can not incorporate the
 filters
  in
   the
   query itself which is used to retrieve the target for similarity
   comparison,
   it appears there is no way to filter MLT results. BTW. I am using
 Solr
   1.3.
   Please let me know if there is way (other than hacking the source
  code)
   to
   do this. Thanks!
   --
   View this message in context:
  
 
 http://www.nabble.com/Filtering-MoreLikeThis-results-tp24360355p24360355.html
   Sent from the Solr - User mailing list archive at Nabble.com.
  
  
  
  
 
  --
  View this message in context:
 
 http://www.nabble.com/Filtering-MoreLikeThis-results-tp24360355p24369257.html
  Sent from the Solr - User mailing list archive at Nabble.com.
 
 
 
 

 --
 View this message in context:
 http://www.nabble.com/Filtering-MoreLikeThis-results-tp24360355p24374996.html
 Sent from the Solr - User mailing list archive at Nabble.com.


 
 

-- 
View this message in context: 
http://www.nabble.com/Filtering-MoreLikeThis-results-tp24360355p24377360.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: Filtering MoreLikeThis results

2009-07-07 Thread Yao Ge

The answer to my own question:
  ...
  <requestHandler name="mlt" class="solr.MoreLikeThisHandler">
    <lst name="defaults"/>
  </requestHandler>
  ...

would work.
-Yao


Yao Ge wrote:
 
 I am not sure about the parameters for MLT the requestHandler plugin. Can
 one of you share the solrconfig.xml entry for MLT? Thanks in advance.
 -Yao
 
 
 Bill Au wrote:
 
 I have been using the StandardRequestHandler (ie /solr/select).  fq does
 work with the MoreLikeThisHandler.  I will switch to use that.  Thanks.
 
 Bill
 
 On Tue, Jul 7, 2009 at 11:02 AM, Marc Sturlese
 marc.sturl...@gmail.comwrote:
 

 At least in trunk, if you request:
 http://localhost:8084/solr/core_A/mlt?q=id:7468365&fq=price:[100 TO 200]
 it will filter the MoreLikeThis results


 Bill Au wrote:
 
  I think fq only works on the main response, not the mlt matches.  I
 found
  a
  couple of related jira:
 
  http://issues.apache.org/jira/browse/SOLR-295
  http://issues.apache.org/jira/browse/SOLR-281
 
  If I am reading them correctly, I should be able to use DisMax and
  MoreLikeThis together.  I will give that a try and report back.
 
  Bill
 
 
  On Tue, Jul 7, 2009 at 4:45 AM, Marc Sturlese
  marc.sturl...@gmail.comwrote:
 
 
  Using MoreLikeThisHandler you can use fq to filter your results. As
 far
  as
  I
  know bq are not allowed.
 
 
  Bill Au wrote:
  
   I have been trying to restrict MoreLikeThis results without any
 luck
  also.
   In additional to restricting the results, I am also looking to
  influence
   the
   scores similar to the way boost query (bq) works in the
   DisMaxRequestHandler.
  
   I think Solr's MoreLikeThis depends on Lucene's contrib queries
   MoreLikeThis, or at least it used to.  Has anyone looked into
 enhancing
   Solrs' MoreLikeThis to support bq and restricting mlt results?
  
   Bill
  
   On Mon, Jul 6, 2009 at 2:16 PM, Yao Ge yao...@gmail.com wrote:
  
  
   I could not find any support from
   http://wiki.apache.org/solr/MoreLikeThison
   how to restrict MLT results to certain subsets. I passed along a
 fq
   parameter and it is ignored. Since we can not incorporate the
 filters
  in
   the
   query itself which is used to retrieve the target for similarity
   comparison,
   it appears there is no way to filter MLT results. BTW. I am using
 Solr
   1.3.
   Please let me know if there is way (other than hacking the source
  code)
   to
   do this. Thanks!
   --
   View this message in context:
  
 
 http://www.nabble.com/Filtering-MoreLikeThis-results-tp24360355p24360355.html
   Sent from the Solr - User mailing list archive at Nabble.com.
  
  
  
  
 
  --
  View this message in context:
 
 http://www.nabble.com/Filtering-MoreLikeThis-results-tp24360355p24369257.html
  Sent from the Solr - User mailing list archive at Nabble.com.
 
 
 
 

 --
 View this message in context:
 http://www.nabble.com/Filtering-MoreLikeThis-results-tp24360355p24374996.html
 Sent from the Solr - User mailing list archive at Nabble.com.


 
 
 
 

-- 
View this message in context: 
http://www.nabble.com/Filtering-MoreLikeThis-results-tp24360355p24380408.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: Faceting with MoreLikeThis

2009-07-07 Thread Yao Ge

Faceting on MLT results requires the use of the MoreLikeThisHandler. The
standard request handler, while it supports MLT via a search component, does
not return facets on MLT results. To enable the MLT handler, add an entry
like the one below to your solrconfig.xml:

  <requestHandler name="mlt" class="solr.MoreLikeThisHandler">
    <lst name="defaults"/>
  </requestHandler>

The query parameter syntax for faceting remains the same as for the standard
request handler.
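With that handler registered, a faceted MLT request might look like the following (host, port, and field names here are hypothetical examples, not from the thread):

```
http://localhost:8983/solr/mlt?q=id:10&mlt.fl=content&facet=true&facet.field=category&facet.mincount=1
```

The facet counts are then computed over the similar documents the handler returns, not over the match for q itself.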

-Yao


Yao Ge wrote:
 
 Does Solr support faceting on MoreLikeThis search results?
 

-- 
View this message in context: 
http://www.nabble.com/Faceting-with-MoreLikeThis-tp24356166p24380459.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: A big question about Solr and SolrJ range query ?

2009-07-07 Thread Yao Ge

Use Solr's filter query parameter fq:
fq=x:[10 TO 100]&fq=y:[20 TO 300]&fl=title
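For the SolrJ side of the question, I believe the same thing is expressed by adding one filter query per range (SolrQuery.addFilterQuery). As a minimal sketch of the HTTP request being built, here it is in Python; the host, port, and field names are just examples:

```python
from urllib.parse import urlencode

def solr_range_query(base_url, q, filters, fl):
    # Each entry in `filters` becomes its own fq parameter. Solr
    # intersects (ANDs) multiple fq clauses, which matches the
    # SQL-style "x in range AND y in range" intent of the question.
    params = [("q", q)] + [("fq", f) for f in filters] + [("fl", fl)]
    return base_url + "/select?" + urlencode(params)

url = solr_range_query(
    "http://localhost:8983/solr",
    "*:*",
    ["x:[10 TO 100]", "y:[20 TO 300]"],
    "title",
)
print(url)
```

A side benefit of using fq instead of folding the ranges into q is that each filter is cached independently in the filter cache, so repeated queries with the same ranges get cheaper.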

-Yao

huenzhao wrote:
 
 Hi all:
 
 Suppose that my index has 3 fields: title, x and y.
 
 I know one range (10 < x < 100) can be queried like this:
 
 http://localhost:8983/solr/select?q=x:[10 TO 100]&fl=title
 
 If I want a two-range query (10 < x < 100 AND 20 < y < 300), like the
 
 SQL (select title where x > 10 and x < 100 and y > 20 and y < 300),
 
 using a Solr range query or SolrJ, I don't know how to implement it.
 Anybody know? Thanks
 
 Email: enzhao...@gmail.com
 
 

-- 
View this message in context: 
http://www.nabble.com/A-big-question-about-Solr-and-SolrJ-range-query---tp24384416p24384540.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: about defaultSearchField

2009-07-07 Thread Yao Ge

Try with fl=* or fl=*,score added to your request string.
-Yao

Yang Lin-2 wrote:
 
 Hi,
 I have some problems.
 For my Solr program, I want to type only the query string and get results
 from all fields that include the query string. But now I can't get any
 result without specifying a field. For example, querying with tina gets
 nothing, but Sentence:tina does.
 
 I have adjusted the *schema.xml* like this:
 
 <fields>
    <field name="CategoryNamePolarity" type="text" indexed="true" stored="true" multiValued="true"/>
    <field name="CategoryNameStrenth" type="text" indexed="true" stored="true" multiValued="true"/>
    <field name="CategoryNameSubjectivity" type="text" indexed="true" stored="true" multiValued="true"/>
    <field name="Sentence" type="text" indexed="true" stored="true" multiValued="true"/>
 
    <field name="allText" type="text" indexed="true" stored="true" multiValued="true"/>
 </fields>
 
 <uniqueKey required="false">Sentence</uniqueKey>
 
 <!-- field for the QueryParser to use when an explicit fieldname is absent -->
 <defaultSearchField>allText</defaultSearchField>
 
 <!-- SolrQueryParser configuration: defaultOperator="AND|OR" -->
 <solrQueryParser defaultOperator="OR"/>
 
 <copyField source="CategoryNamePolarity" dest="allText"/>
 <copyField source="CategoryNameStrenth" dest="allText"/>
 <copyField source="CategoryNameSubjectivity" dest="allText"/>
 <copyField source="Sentence" dest="allText"/>
 
 
 I think the problem is in defaultSearchField, but I don't know how to
 fix
 it. Could anyone help me?
 
 Thanks
 Yang
 
 

-- 
View this message in context: 
http://www.nabble.com/about-defaultSearchField-tp24382105p24384615.html
Sent from the Solr - User mailing list archive at Nabble.com.



Faceting with MoreLikeThis

2009-07-06 Thread Yao Ge

Does Solr support faceting on MoreLikeThis search results?
-- 
View this message in context: 
http://www.nabble.com/Faceting-with-MoreLikeThis-tp24356166p24356166.html
Sent from the Solr - User mailing list archive at Nabble.com.



Filtering MoreLikeThis results

2009-07-06 Thread Yao Ge

I could not find any support at http://wiki.apache.org/solr/MoreLikeThis for
how to restrict MLT results to certain subsets. I passed along an fq
parameter and it was ignored. Since we cannot incorporate the filters in the
query itself, which is used to retrieve the target for the similarity
comparison, it appears there is no way to filter MLT results. BTW, I am
using Solr 1.3. Please let me know if there is a way (other than hacking the
source code) to do this. Thanks!
-- 
View this message in context: 
http://www.nabble.com/Filtering-MoreLikeThis-results-tp24360355p24360355.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: Query Filter fq with OR operator

2009-06-26 Thread Yao Ge

I would like to submit a JIRA issue for this. Can anyone point me to where
to go?
-Yao


Otis Gospodnetic wrote:
 
 
 Brian,
 
 Opening a JIRA issue if it doesn't already exist is the best way.  If you
 can provide a patch, even better!
 
  Otis
 --
 Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
 
 
 
 - Original Message 
 From: brian519 bpear...@desire2learn.com
 To: solr-user@lucene.apache.org
 Sent: Tuesday, June 16, 2009 1:32:41 PM
 Subject: Re: Query Filter fq with OR operator
 
 
 This feature is very important to me .. should I post something on the
 dev
 forum?  Not sure what the proper protocol is for adding a feature to the
 roadmap
 
 Thanks,
 Brian.
 -- 
 View this message in context: 
 http://www.nabble.com/Query-Filter-fq-with-OR-operator-tp23895837p24059181.html
 Sent from the Solr - User mailing list archive at Nabble.com.
 
 
 

-- 
View this message in context: 
http://www.nabble.com/Query-Filter-fq-with-OR-operator-tp23895837p24222170.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: Faceting on text fields

2009-06-11 Thread Yao Ge

FYI. I did a direct integration with Carrot2 via SolrJ, with a separate AJAX
call from the UI that clusters terms from the two text fields over the top
100 hits. It gets comparable performance to the other facets in terms of
response time.

In terms of algorithms, they list two, Lingo and STC, which I don't
recognize. But I think at least one of them may use SVD
(http://en.wikipedia.org/wiki/Singular_value_decomposition).

-Yao


Otis Gospodnetic wrote:
 
 
 I'd call it related (their application in search encourages exploration),
 but also distinct enough to never mix them up.  I think your assessment
 below is correct, although I'm not familiar with the details of Carrot2
 any more (was once), so I can't tell you exactly which algo is used under
 the hood.
 
  Otis
 --
 Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
 
 
 
 - Original Message 
 From: Michael Ludwig m...@as-guides.com
 To: solr-user@lucene.apache.org
 Sent: Wednesday, June 10, 2009 9:41:54 AM
 Subject: Re: Faceting on text fields
 
 Otis Gospodnetic schrieb:
 
  Solr can already cluster top N hits using Carrot2:
  http://wiki.apache.org/solr/ClusteringComponent
 
 Would it be fair to say that clustering as detailed on the page you're
 referring to is a kind of dynamic faceting? The faceting not being done
 based on distinct values of certain fields, but on the presence (and
 frequency) of terms in one field?
 
 The main difference seems to be that with faceting, grouping criteria
 (facets) are known beforehand, while with clustering, grouping criteria
 (the significant terms which create clusters - the cluster keys) have
 yet to be determined. Is that a correct assessment?
 
 Michael Ludwig
 
 
 

-- 
View this message in context: 
http://www.nabble.com/Faceting-on-text-fields-tp23872891p23980124.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: Faceting on text fields

2009-06-11 Thread Yao Ge

BTW, Carrot2 has a very impressive Clustering Workbench (based on Eclipse)
with built-in integration with Solr. If you have a Solr service running, it
is just a matter of pointing the workbench at it. The clustering results and
visualization are amazing (http://project.carrot2.org/download.html).


Yao Ge wrote:
 
 FYI. I did a direct integration with Carrot2 with Solrj with a separate
 Ajax call from UI for top 100 hits to clusters terms in the two text
 fields. It gots comparable performance to other facets in terms of
 response time. 
 
 In terms of algorithms, their listed two Lingo and STC which I don't
 reconize. But I think at least one of them might have used SVD
 (http://en.wikipedia.org/wiki/Singular_value_decomposition).
 
 -Yao
 
 
 Otis Gospodnetic wrote:
 
 
 I'd call it related (their application in search encourages exploration),
 but also distinct enough to never mix them up.  I think your assessment
 below is correct, although I'm not familiar with the details of Carrot2
 any more (was once), so I can't tell you exactly which algo is used under
 the hood.
 
  Otis
 --
 Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
 
 
 
 - Original Message 
 From: Michael Ludwig m...@as-guides.com
 To: solr-user@lucene.apache.org
 Sent: Wednesday, June 10, 2009 9:41:54 AM
 Subject: Re: Faceting on text fields
 
 Otis Gospodnetic schrieb:
 
  Solr can already cluster top N hits using Carrot2:
  http://wiki.apache.org/solr/ClusteringComponent
 
 Would it be fair to say that clustering as detailed on the page you're
 referring to is a kind of dynamic faceting? The faceting not being done
 based on distinct values of certain fields, but on the presence (and
 frequency) of terms in one field?
 
 The main difference seems to be that with faceting, grouping criteria
 (facets) are known beforehand, while with clustering, grouping criteria
 (the significant terms which create clusters - the cluster keys) have
 yet to be determined. Is that a correct assessment?
 
 Michael Ludwig
 
 
 
 
 

-- 
View this message in context: 
http://www.nabble.com/Faceting-on-text-fields-tp23872891p23980959.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: Faceting on text fields

2009-06-10 Thread Yao Ge

Thanks for the insight, Otis. I was not aware of the ClusteringComponent
until now. It is time to move to Solr 1.4.

-Yao

Otis Gospodnetic wrote:
 
 
 Yao,
 
 Solr can already cluster top N hits using Carrot2:
 http://wiki.apache.org/solr/ClusteringComponent
 
 I've also done ugly manual counting of terms in top N hits.  For
 example, look at the right side of this:
 http://www.simpy.com/user/otis/tag/%22machine+learning%22
 
 Something like http://www.sematext.com/product-key-phrase-extractor.html
 could also be used.
 
  Otis
 --
 Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
 
 
 
 - Original Message 
 From: Yao Ge yao...@gmail.com
 To: solr-user@lucene.apache.org
 Sent: Tuesday, June 9, 2009 3:46:13 PM
 Subject: Re: Faceting on text fields
 
 
 Michael,
 
 Thanks for the update! I definitely need to get a 1.4 build see if it
 makes
 a difference.
 
 BTW, maybe instead of using faceting for text
 mining/clustering/visualization purpose, we can build a separate feature
 in
 SOLR for this. Many of commercial search engines I have experiences with
 (Google Search Appliance, Vivisimo etc) provide dynamic term clustering
 based on top N ranked documents (N is a parameter can be configured).
 When
 facet field is highly fragmented (say a text field), the existing set
 intersection based approach might no longer be optimum. Aggregating term
 vectors over top N docs might be more attractive. Another features I can
 really appreciate is to provide search time n-gram term clustering. Maybe
 this might be better suited for spell checker as it just a different
 way
 to display the alternative search terms.
 
 -Yao
 
 
 Michael Ludwig-4 wrote:
  
  Yao Ge schrieb:
  
  The facet query is considerably slower comparing to other facets from
  structured database fields (with highly repeated values). What I found
  interesting is that even after I constrained search results to just a
  few hunderd hits using other facets, these text facets are still very
  slow.
 
  I understand that text fields are not good candidate for faceting as
  it can contain very large number of unique values. However why it is
  still slow after my matching documents is reduced to hundreds? Is it
  because the whole filter is cached (regardless the matching docs) and
  I don't have enough filter cache size to fit the whole list?
  
  Very interesting questions! I think an answer would both require and
  further an understanding of how filters work, which might even lead to
  a more general guideline on when and how to use filters and facets.
  
  Even though faceting appears to have changed in 1.4 vs 1.3, it would
  still be interesting to understand the 1.3 side of things.
  
  Lastly, what I really want to is to give user a chance to visualize
  and filter on top relevant words in the free-text fields. Are there
  alternative to facet field approach? term vectors? I can do client
  side process based on top N (say 100) hits for this but it is my last
  option.
  
  Also a very interesting data mining question! I'm sorry I don't have
 any
  answers for you. Maybe someone else does.
  
  Best,
  
  Michael Ludwig
  
  
 
 -- 
 View this message in context: 
 http://www.nabble.com/Faceting-on-text-fields-tp23872891p23950084.html
 Sent from the Solr - User mailing list archive at Nabble.com.
 
 
 

-- 
View this message in context: 
http://www.nabble.com/Faceting-on-text-fields-tp23872891p23965401.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: Faceting on text fields

2009-06-09 Thread Yao Ge

Michael,

Thanks for the update! I definitely need to get a 1.4 build and see if it
makes a difference.

BTW, maybe instead of using faceting for text
mining/clustering/visualization purposes, we can build a separate feature in
Solr for this. Many of the commercial search engines I have experience with
(Google Search Appliance, Vivisimo, etc.) provide dynamic term clustering
based on the top N ranked documents (N is a configurable parameter). When a
facet field is highly fragmented (say, a text field), the existing
set-intersection-based approach may no longer be optimal. Aggregating term
vectors over the top N docs might be more attractive. Another feature I
would really appreciate is search-time n-gram term clustering; maybe this is
better suited to the spell checker, as it is just a different way to display
alternative search terms.

-Yao


Michael Ludwig-4 wrote:
 
 Yao Ge schrieb:
 
 The facet query is considerably slower comparing to other facets from
 structured database fields (with highly repeated values). What I found
 interesting is that even after I constrained search results to just a
 few hunderd hits using other facets, these text facets are still very
 slow.

 I understand that text fields are not good candidate for faceting as
 it can contain very large number of unique values. However why it is
 still slow after my matching documents is reduced to hundreds? Is it
 because the whole filter is cached (regardless the matching docs) and
 I don't have enough filter cache size to fit the whole list?
 
 Very interesting questions! I think an answer would both require and
 further an understanding of how filters work, which might even lead to
 a more general guideline on when and how to use filters and facets.
 
 Even though faceting appears to have changed in 1.4 vs 1.3, it would
 still be interesting to understand the 1.3 side of things.
 
 Lastly, what I really want to is to give user a chance to visualize
 and filter on top relevant words in the free-text fields. Are there
 alternative to facet field approach? term vectors? I can do client
 side process based on top N (say 100) hits for this but it is my last
 option.
 
 Also a very interesting data mining question! I'm sorry I don't have any
 answers for you. Maybe someone else does.
 
 Best,
 
 Michael Ludwig
 
 




Query Filter fq with OR operator

2009-06-05 Thread Yao Ge

If I want to use the OR operator with multiple query filters, I can do:
fq=popularity:[10 TO *] OR section:0
Is there a more efficient alternative to this?



Faceting on text fields

2009-06-04 Thread Yao Ge

I am indexing a database with over 1 million rows. Two of the fields contain
unstructured text, but the size of each field is limited (256 characters).

I came up with the idea of visualizing the text fields as a text cloud by
turning the two text fields into facets. The font weight and size of each
facet value (word) is derived from the facet counts. I used a simpler field
type so that there is no stemming of these facet values:
    <fieldType name="word" class="solr.TextField" positionIncrementGap="100">
      <analyzer>
        <tokenizer class="solr.WhitespaceTokenizerFactory"/>
        <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
                ignoreCase="true" expand="false"/>
        <filter class="solr.StopFilterFactory" ignoreCase="true"
                words="stopwords.txt"/>
        <filter class="solr.WordDelimiterFilterFactory"
                generateWordParts="0" generateNumberParts="0" catenateWords="1"
                catenateNumbers="1" catenateAll="0"/>
        <filter class="solr.LowerCaseFilterFactory"/>
        <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
      </analyzer>
    </fieldType>

The facet query is considerably slower compared to other facets on
structured database fields (with highly repeated values). What I found
interesting is that even after I constrained the search results to just a
few hundred hits using other facets, these text facets are still very slow.

I understand that text fields are not good candidates for faceting, as they
can contain a very large number of unique values. However, why is it still
slow after my matching documents are reduced to hundreds? Is it because the
whole filter is cached (regardless of the matching docs) and I don't have
enough filter cache size to fit the whole list?

The following is my filterCache setting:
    <filterCache class="solr.LRUCache" size="5120" initialSize="512"
                 autowarmCount="128"/>

Lastly, what I really want is to give the user a chance to visualize and
filter on the top relevant words in the free-text fields. Are there
alternatives to the facet field approach? Term vectors? I can do
client-side processing based on the top N (say 100) hits, but that is my
last option.
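For intuition, this style of faceting can be pictured as intersecting the
set of matching documents with one cached set of documents per distinct
term. This is only a toy model (not Solr's actual implementation), but it
illustrates why cost grows with the number of distinct terms in the field
rather than with the size of the matching set:

```python
def facet_counts(matching_docs, term_index):
    """term_index maps term -> set of doc ids containing it.
    Every distinct term is visited once, so a fragmented free-text
    field stays expensive even when matching_docs is small."""
    return {t: len(docs & matching_docs)
            for t, docs in term_index.items()
            if docs & matching_docs}

index = {"engine": {1, 2, 3}, "idle": {2, 3}, "stalls": {3, 4}}
print(facet_counts({2, 3}, index))
```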



Re: Faceting on text fields

2009-06-04 Thread Yao Ge

Yes. I am using 1.3. When is 1.4 due for release?


Yonik Seeley-2 wrote:
 
 Are you using Solr 1.3?
 You might want to try the latest 1.4 test build - faceting has changed a
 lot.
 
 -Yonik
 http://www.lucidimagination.com
 
 On Thu, Jun 4, 2009 at 12:01 PM, Yao Ge yao...@gmail.com wrote:

 I am index a database with over 1 millions rows. Two of fields contain
 unstructured text but size of each fields is limited (256 characters).

 I come up with an idea to use visualize the text fields using text cloud
 by
 turning the two text fields in facets. The weight of font and size is of
 each facet value (words) derived from the facet counts. I used simpler
 field
 type so that the there is no stemming to these facet values:
    <fieldType name="word" class="solr.TextField"
 positionIncrementGap="100">

      <analyzer>
        <tokenizer class="solr.WhitespaceTokenizerFactory"/>
        <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
 ignoreCase="true" expand="false"/>
        <filter class="solr.StopFilterFactory" ignoreCase="true"
 words="stopwords.txt"/>
        <filter class="solr.WordDelimiterFilterFactory"
 generateWordParts="0" generateNumberParts="0" catenateWords="1"
 catenateNumbers="1" catenateAll="0"/>
        <filter class="solr.LowerCaseFilterFactory"/>
        <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
      </analyzer>
    </fieldType>

 The facet query is considerably slower comparing to other facets from
 structured database fields (with highly repeated values). What I found
 interesting is that even after I constrained search results to just a few
 hunderd hits using other facets, these text facets are still very slow.

 I understand that text fields are not good candidate for faceting as it
 can
 contain very large number of unique values. However why it is still slow
 after my matching documents is reduced to hundreds? Is it because the
 whole
 filter is cached (regardless the matching docs) and I don't have enough
 filter cache size to fit the whole list?

 The following is my filterCahce setting:
     <filterCache class="solr.LRUCache" size="5120" initialSize="512"
 autowarmCount="128"/>

 Lastly, what I really want to is to give user a chance to visualize and
 filter on top relevant words in the free-text fields. Are there
 alternative
 to facet field approach? term vectors? I can do client side process based
 on
 top N (say 100) hits for this but it is my last option.


 
 




spell checking

2009-06-02 Thread Yao Ge

Can someone help by providing a tutorial-like introduction on how to get
spell checking working in Solr? It appears many steps are required before
the spell-checking functions can be used. It also appears that a dictionary
(a list of correctly spelled words) is required to set up the spell
checker. Can anyone validate my impression?

Thanks.



Re: spell checking

2009-06-02 Thread Yao Ge

Yes, I did. I was not able to grasp the concept of making spell checking
work.
For example, the wiki page says a spell check index needs to be built, but
does not say how to do it. Does Solr build the index out of thin air? Is
the index built from the main index, or from a dictionary or word list?

Please help.


Grant Ingersoll-6 wrote:
 
 Have you gone through: http://wiki.apache.org/solr/SpellCheckComponent
 
 
 On Jun 2, 2009, at 8:50 AM, Yao Ge wrote:
 

 Can someone help providing a tutorial like introduction on how to get
 spell-checking work in Solr. It appears many steps are requires  
 before the
 spell-checkering functions can be used. It also appears that a  
 dictionary (a
 list of correctly spelled words) is required to setup the spell  
 checker. Can
 anyone validate my impression?

 Thanks.

 
 --
 Grant Ingersoll
 http://www.lucidimagination.com/
 
 Search the Lucene ecosystem (Lucene/Solr/Nutch/Mahout/Tika/Droids)  
 using Solr/Lucene:
 http://www.lucidimagination.com/search
 
 
 




Re: spell checking

2009-06-02 Thread Yao Ge

Sorry for not being able to get my point across.

I know the syntax that leads to an index build for spell checking. I
actually ran the command and saw some additional files created in the
data\spellchecker1 directory. What I don't understand is what is in there,
as I cannot get Solr to make spell suggestions using the query structure
documented in the wiki.

Can anyone tell me what happens when the default spell check index is
built? In my case, I used copyField to copy a couple of text fields into a
field called "spell". These fields are the original text; they are the ones
with typos that I need to run spell check on. But how can these original
data be used as a base for spell checking? How does Solr know which words
are correctly spelled?

   <field name="tech_comment" type="text" indexed="true" stored="true"
          multiValued="true"/>
   <field name="cust_comment" type="text" indexed="true" stored="true"
          multiValued="true"/>
   ...
   <field name="spell" type="textSpell" indexed="true" stored="true"
          multiValued="true"/>
   ...
   <copyField source="tech_comment" dest="spell"/>
   <copyField source="cust_comment" dest="spell"/>



Yao Ge wrote:
 
 Can someone help providing a tutorial like introduction on how to get
 spell-checking work in Solr. It appears many steps are requires before the
 spell-checkering functions can be used. It also appears that a dictionary
 (a list of correctly spelled words) is required to setup the spell
 checker. Can anyone validate my impression?
 
 Thanks.
 




Re: spell checking

2009-06-02 Thread Yao Ge

Excellent. Now everything makes sense to me. :-)

The spell checking suggestion is the closest variant of the user input that
actually exists in the main index. The so-called "correction" is relative
to the indexed text, so there is no need for a brute-force list of all
correctly spelled words. Maybe we should call these "alternative search
terms" or "suggested search terms" instead of spell checking; the current
name is misleading, as there is no right or wrong in spelling, only popular
(term frequency?) alternatives.

Thanks for the insight.
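A minimal model of that idea: suggest the indexed term closest to the
user's input, breaking ties toward the more frequent term. This is not
Lucene's actual implementation (which prefilters candidates with n-grams
before measuring distance), and the words and frequencies below are
invented:

```python
import difflib

def suggest(word, term_freqs, cutoff=0.6):
    """term_freqs: term -> frequency in the main index.
    Returns the most popular term that is spelled similarly to
    `word`, or None if nothing in the index is close enough."""
    candidates = difflib.get_close_matches(word, term_freqs, n=5,
                                           cutoff=cutoff)
    if not candidates:
        return None
    # prefer the more popular alternative among similar spellings
    return max(candidates, key=lambda t: term_freqs[t])

freqs = {"engine": 120, "engineer": 4, "noise": 80}
print(suggest("enginee", freqs))
```

This makes the "no dictionary needed" point concrete: the only vocabulary
is whatever is already indexed.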


Otis Gospodnetic wrote:
 
 
 Hello,
 
 In short, the assumption behind this type of SC is that the text in the
 main index is (mostly) correctly spelled.  When the SC finds query
 terms that are close in spelling to words indexed in SC, it offers
 spelling suggestions/correction using those presumably correctly spelled
 terms (there are other parameters that control the exact behaviour, but
 this is the idea)
 
 Solr (Lucene's spellchecker, which Solr uses under the hood, actually)
 turn the input text (values from those fields you copy to the spell field)
 into so called n-grams.  You can see that if you open up the SC index with
 something like Luke.  Please see
 http://wiki.apache.org/jakarta-lucene/SpellChecker .
 
 Otis
 --
 Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
 
 
 
 - Original Message 
 From: Yao Ge yao...@gmail.com
 To: solr-user@lucene.apache.org
 Sent: Tuesday, June 2, 2009 5:34:07 PM
 Subject: Re: spell checking
 
 
 Sorry for not be able to get my point across.
 
 I know the syntax that leads to a index build for spell checking. I
 actually
 run the command saw some additional file created in data\spellchecker1
 directory. What I don't understand is what is in there as I can not trick
 Solr to make spell suggestions based on the documented query structure in
 wiki. 
 
 Can anyone tell me what happened after when the default spell check is
 built? In my case, I used copyField to copy a couple of text fields into
 a
 field called spell. These fields are the original text, they are the
 ones
 with typos that I need to run spell check on. But how can these original
 data be used as a base for spell checking? How does Solr know what are
 correctly spelled words?
 
   
 multiValued=true/
   
 multiValued=true/
...
   
 multiValued=true/
...
   
   
 
 
 
 Yao Ge wrote:
  
  Can someone help providing a tutorial like introduction on how to get
  spell-checking work in Solr. It appears many steps are requires before
 the
  spell-checkering functions can be used. It also appears that a
 dictionary
  (a list of correctly spelled words) is required to setup the spell
  checker. Can anyone validate my impression?
  
  Thanks.
  
 
 
 
 




Query Boost Functions

2009-05-18 Thread Yao Ge

I have a field named last-modified that I would like to use in the bf
(Boost Functions) parameter:
recip(rord(last-modified),1,1000,1000) in the DisMaxRequestHandler.
However, the Solr query parser complains about the syntax of the formula. I
think it is related to the hyphen in the field name. I have tried adding
single and double quotes around the field name, but that didn't help.

Can field names contain hyphens in boost functions? If so, how? If not,
where do I find the restrictions on special characters in field names?
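For reference, Solr's recip function computes a/(m*x+b), so
recip(rord(last-modified),1,1000,1000) gives recently modified documents
(small reverse ordinal) a boost near 1.0 that decays toward 0 for older
ones. A quick numeric sketch using the constants from the query above:

```python
def recip(x, m=1, a=1000, b=1000):
    """Solr's recip(x,m,a,b) = a / (m*x + b)."""
    return a / (m * x + b)

# rord(last-modified) is small for the most recently modified docs
# and grows for older ones, so the boost decays with age:
for rord in (1, 1000, 10000):
    print(rord, recip(rord))
```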
 
-Yao




Re: Solr Shard - Strange results

2009-05-18 Thread Yao Ge

Maybe you want to try the docNumber field with type "string" and see if it
makes a difference.


CB-PO wrote:
 
 I'm not quite sure what logs you are talking about, but in the
 tomcat/logs/catalina.out logs, i found the following [note, i can't
 copy/paste, so i am typing up a summary]:
 
 I execute command: 
 localhost:8080/bravo/select?q=fred&rows=102&start=0&shards=localhost:8080/alpha,localhost:8080/bravo
  
 In this example, alpha has 27 instances of fred, while bravo has 0.
 
 Then in the catalina.out:
 
 -There is the request for the command i sent, shards parameters and all. 
 it has the proper queryString.
 -Then I see the two requests sent to the shards, apha and bravo.  These
 two requests weave between each other until they are finished:
  INFO: REQUEST URI =/alpha/select
  INFO: REQUEST URI =/bravo/select
   The parameters have changed to:
  
 wt=javabin&fsv=true&version=2.2&fl=docNumber,score&q=fred&rows=102&isShard=true&start=0
 
 -Then 2 INFO's scroll across:
 INFO: [] webapp=/bravo path=/select
 params={wt=javabin&fsv=true&version=2.2&fl=docNumber,score&q=fred&rows=102&isShard=true&start=0}
 hits=0 status=0 QTime=1
 INFO: [] webapp=/alpha path=/select
 params={wt=javabin&fsv=true&version=2.2&fl=docNumber,score&q=fred&rows=102&isShard=true&start=0}
 hits=27 status=0 QTime=1
 **Note, hits=27
 
 -Then i see some octet-streams being transferred, with status 200, so
 those are OK.
 
 -The i see something peculiar:
   It calls alpha with the following parameters: 
 wt=javabin&version=2.2&ids=ABC-1353,ABC-408,ABC-1355,ABC-1824,ABC-1354,FRED-ID-27,55&q=fred&rows=102&parameter=isShard=true&start=0
 
 Performing this query on my own (without the wt=javabin) gives me
 numFound=2, the result-set I get back from the overarching query.  
 Changing it to rows=10, it gives me numFound=2, and 2 doc's.  This is
 not the strange functionality I was seeing with the overarching query and
 the mis-matched numfound and doc's.
 
 This does beg the question.. why did it add:
 ids=ABC-1353,ABC-408,ABC-1355,ABC-1824,ABC-1354,FRED-ID-27,55 to the
 query?  They are the format that would be under docNumber, if that helps.. 
 Any thoughts?  I will do some research on those particular ID numbered
 docs, in the mean time.
 
 Here's the configuration information.  I only posted the difference from
 the default files in the solr/example/solr/conf
 
 [solrconfig.xml]
 <config>
   <dataDir>${solr.data.dir:/data/indices/bravo/solr/data}</dataDir>
 
   <requestHandler name="/dataimport"
 class="org.apache.solr.handler.dataimport.DataImportHandler">
     <lst name="defaults">
       <str name="config">/data/indices/bravo/solr/conf/data-config.xml</str>
     </lst>
   </requestHandler>
 </config>
 
 
 [schema.xml]
 <schema>
   <fields>
     <field name="docNumber" type="text" indexed="true" stored="true" />
     <field name="column1" type="text" indexed="true" stored="true" />
     <field name="column2" type="text" indexed="true" stored="true" />
     <field name="column3" type="text" indexed="true" stored="true" />
     <field name="column4" type="text" indexed="true" stored="true" />
     <field name="column5" type="text" indexed="true" stored="true" />
     <field name="column6" type="text" indexed="true" stored="true" />
     <field name="column7" type="text" indexed="true" stored="true" />
     <field name="column8" type="text" indexed="true" stored="true" />
     <field name="column9" type="text" indexed="true" stored="true" />
   </fields>
   <uniqueKey>docNumber</uniqueKey>
   <defaultSearchField>column2</defaultSearchField>
 </schema>
 
 
 [data-config.xml]
 <dataConfig>
   <dataSource type="JdbcDataSource" driver="com.metamatrix.jdbc.MMDriver"
 url="jdbc:metamatrix:b...@mms://hostname:port" user="username"
 password="password"/>
   <document name="DOC_NAME">
     <entity name="ENT_NAME" query="select * from ASDF.TABLE">
       <field column="TABLE_COL_NO" name="docNumber" />
       <field column="TABLE_COL_1" name="column1" />
       <field column="TABLE_COL_2" name="column2" />
       <field column="TABLE_COL_3" name="column3" />
       <field column="TABLE_COL_4" name="column4" />
       <field column="TABLE_COL_5" name="column5" />
       <field column="TABLE_COL_6" name="column6" />
       <field column="TABLE_COL_7" name="column7" />
       <field column="TABLE_COL_8" name="column8" />
       <field column="TABLE_COL_9" name="column9" />
     </entity>
   </document>
 </dataConfig>
 
 
 
 
 
 Yonik Seeley-2 wrote:
 
 On Fri, May 15, 2009 at 4:11 PM, CB-PO charles.bush...@gmail.com wrote:
 Yeah, the first thing I thought of was that perhaps there was something
 wrong
 with the uniqueKey and they were clashing between the indexes, however
 upon
 visual inspection of the data the field we are using as the unique key
 in
 each of the indexes is grossly different between the two databases, so
 
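The ids=... parameter in the logs above is consistent with Solr's
two-phase distributed search: phase one asks each shard only for unique
keys and scores (the fl=docNumber,score request), the coordinator merges
those into a global ranking, and phase two fetches the stored fields for
the winning ids from their shards. A toy sketch of the merge step (the
scores below are invented; the ids are borrowed from the log):

```python
def merge_shard_results(shard_hits, rows):
    """Phase 1: each shard returns (doc_id, score) pairs.
    The coordinator merges by score and keeps the global top `rows`;
    phase 2 would then fetch stored fields for just those ids."""
    merged = [hit for hits in shard_hits for hit in hits]
    merged.sort(key=lambda h: h[1], reverse=True)
    return [doc_id for doc_id, _ in merged[:rows]]

alpha = [("ABC-1353", 2.1), ("ABC-408", 1.7)]
bravo = [("FRED-ID-27", 1.9)]
print(merge_shard_results([alpha, bravo], rows=2))
```

This also shows why duplicate unique keys across shards cause mismatched
numFound vs. returned docs: the counts are summed per shard, but the
merged id list collapses on retrieval.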

DataImportHandler Template Transformer

2009-05-18 Thread Yao Ge

It took me a while to understand that, to use the TemplateTransformer
(http://lucene.apache.org/solr/api/org/apache/solr/handler/dataimport/TemplateTransformer.html),
none of the variables used in the template (e.g. ${e.firstName},
${e.lastName}, etc.) can contain null values. I hope the parser can do a
better job explaining this. Also, it would be nice to simply pad null
values with a blank string. Should this be considered an enhancement?
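The suggested behavior — substituting a blank string for missing or null
values instead of failing — could look roughly like this (a stand-in
sketch of template substitution, not TemplateTransformer's actual code;
the row data is invented):

```python
import re

def fill_template(template, row):
    """Replace ${entity.field} placeholders with row values;
    pad missing/null values with "" instead of raising an error."""
    def repl(match):
        value = row.get(match.group(1))
        return "" if value is None else str(value)
    return re.sub(r"\$\{([^}]+)\}", repl, template)

row = {"e.firstName": "Yao", "e.lastName": None}
print(fill_template("${e.firstName} ${e.lastName}", row))
```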