How to rank an exact match higher?

2012-03-05 Thread Tommy Chheng
(title:new york in 1), product of: 1.0 = tf(phraseFreq=1.0) 1.1890697 = idf(title: new=2 york=2) 1.0 = fieldNorm(field=title, doc=1) </str> </lst> I posted my solrconfig/schema here: https://gist.github.com/1984052 -- Tommy Chheng

Re: Solr with Scala

2012-02-06 Thread Tommy Chheng
... Calissa yapar... -- View this message in context: http://lucene.472066.n3.nabble.com/Solr-with-Scala-tp3718539p3718539.html Sent from the Solr - User mailing list archive at Nabble.com. -- Tommy Chheng

Re: phrase auto-complete with suggester component

2012-01-25 Thread Tommy Chheng
-- Tommy Chheng

phrase auto-complete with suggester component

2012-01-24 Thread Tommy Chheng
:
<fieldType name="text_auto" class="solr.TextField">
  <analyzer>
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
<field name="title_autocomplete" type="text_auto" indexed="true" stored="false" multiValued="false" />
-- Tommy

Re: phrase auto-complete with suggester component

2012-01-24 Thread Tommy Chheng
#a3264740 which contains the solution to your problem. -- View this message in context: http://lucene.472066.n3.nabble.com/phrase-auto-complete-with-suggester-component-tp3685572p3685730.html -- Tommy Chheng

Re: snapshot-4.0 and maven

2010-10-26 Thread Tommy Chheng
You can use maven-assembly-plugin's jar-with-dependencies to build a single jar with all its dependencies: http://stackoverflow.com/questions/574594/how-can-i-create-an-executable-jar-with-dependencies-using-maven @tommychheng On 10/19/10 6:53 AM, Matt Mitchell wrote: Hey thanks Tommy. To be more

Re: snapshot-4.0 and maven

2010-10-18 Thread Tommy Chheng
Once you have built the Solr 4.0 jar, you can use mvn's install command like this: mvn install:install-file -DgroupId=org.apache -DartifactId=solr -Dpackaging=jar -Dversion=4.0-SNAPSHOT -Dfile=solr-4.0-SNAPSHOT.jar -DgeneratePom=true @tommychheng On 10/18/10 7:28 PM, Matt Mitchell wrote: I'd

Re: DIH - deleting documents, high performance (delta) imports, and passing parameters

2010-08-30 Thread Tommy Chheng
Thanks for the section on passing parameters to the DIH config. I'm going to try the parameter passing to let the DIH index different DBs based on the system environment (local dev machine or production machine). @tommychheng Programmer and UC Irvine Graduate Student Find a great grad school
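A minimal sketch of the idea above, assuming the DIH config reads request parameters through `${dataimporter.request.<name>}` placeholders. The parameter names `db_host` and `db_name`, the database values, and the handler URL are hypothetical; `APP_ENV` is the environment variable mentioned elsewhere in this listing.

```python
import os
from urllib.parse import urlencode

# Hypothetical per-environment database settings (not from the thread).
DATABASES = {
    "development": {"db_host": "localhost", "db_name": "app_dev"},
    "production": {"db_host": "db.example.com", "db_name": "app_prod"},
}


def dataimport_url(base_url, env=None):
    """Build a full-import request whose extra parameters depend on APP_ENV."""
    env = env or os.environ.get("APP_ENV", "development")
    # The extra parameters would be read in the DIH config as
    # ${dataimporter.request.db_host} and ${dataimporter.request.db_name}.
    params = {"command": "full-import", **DATABASES[env]}
    return base_url + "?" + urlencode(params)


print(dataimport_url("http://localhost:8983/solr/dataimport", "production"))
```

The same Solr config can then be deployed unchanged to both machines, with only the request differing.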

Re: specifying the doc id in clustering component

2010-08-20 Thread Tommy Chheng
Yes, that's the approach I'm taking right now. I look up the doc ids in the result set to find the matching document. I can live with the manual lookup; I wanted to see if it would be possible to pick a custom field to represent the document in the docs array. Thanks for contributing

changable DIH datasource based on environment variables

2010-08-17 Thread Tommy Chheng
I defined my DIH datasource in solrconfig.xml. Is there a way to define two sets of data sources and use one based on the current system's environment variable (e.g. APP_ENV=production or APP_ENV=development)? I run the DIH on my local machine and remote server. They use different MySQL

specifying the doc id in clustering component

2010-08-14 Thread Tommy Chheng
I'm using the clustering component with Solr 1.4. The response is given by the id field in the docs array like: labels:[Devices], docs:[200066, 195650, 204850, Is there a way to change the doc label to be another field? I couldn't find this option in

Re: DIH and multivariable fields problems

2010-08-06 Thread Tommy Chheng
For multi-valued fields using the DIH, I use group_concat with the RegexTransformer's splitBy, e.g.: <entity dataSource="grad_schools" query="SELECT group_concat(professors.name separator '|') as university_professors FROM professors WHERE
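The effect of pairing group_concat with a regex split can be sketched outside Solr. This is only an illustration of the data flow, not DIH code: `split_by` is a hypothetical helper and the professor names are made-up sample data.

```python
import re


def split_by(value, pattern):
    """Split one concatenated column value into a list of values,
    roughly what a regex-based split on the column would produce."""
    if value is None:
        return []
    return [part for part in re.split(pattern, value) if part]


# One row as the SQL query would return it: all names joined with '|'.
row = {"university_professors": "Ada Lovelace|Alan Turing|Grace Hopper"}

# The pipe is a regex metacharacter, so it has to be escaped in the pattern.
row["university_professors"] = split_by(row["university_professors"], r"\|")
print(row["university_professors"])
```

The target field must be declared multi-valued in the schema for the resulting list to be indexed as separate values.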

Re: Design questions/Schema Help

2010-07-26 Thread Tommy Chheng
Alternatively, have you considered storing (or I should say indexing) the search logs with Solr? This lets you text search across your search queries. You can perform time range queries with Solr as well. @tommychheng Programmer and UC Irvine Graduate Student Find a great grad school based

DIH stalling, how to debug?

2010-07-22 Thread Tommy Chheng
Hi, When I run my DIH script, it says it's busy but the Total Requests made to DataSource and Total Rows Fetched remain unchanged at 4 and 6. It hasn't reported a failure. How can I debug what is blocking the DIH? -- @tommychheng Programmer and UC Irvine Graduate Student Find a great grad

Re: DIH stalling, how to debug?

2010-07-22 Thread Tommy Chheng
OK, it was a runaway SQL query that wasn't using an index. @tommychheng Programmer and UC Irvine Graduate Student Find a great grad school based on research interests: http://gradschoolnow.com On 7/22/10 4:26 PM, Tommy Chheng wrote: Hi, When I run my DIH script, it says it's busy

Re: csv response writer

2010-07-14 Thread Tommy Chheng
, so no future released version until then. Have you tested it out? Any feedback we should incorporate? When I can carve out some time over the next week or so I'll review and commit if there are no issues brought up. Erik On Jul 13, 2010, at 3:42 PM, Tommy Chheng wrote: Hi, Which

csv response writer

2010-07-13 Thread Tommy Chheng
Hi, Which upcoming version of Solr is the CSV response writer set to be included in? https://issues.apache.org/jira/browse/SOLR-1925 -- @tommychheng Programmer and UC Irvine Graduate Student Find a great grad school based on research interests: http://gradschoolnow.com

Re: csv response writer

2010-07-13 Thread Tommy Chheng
no future released version until then. Have you tested it out? Any feedback we should incorporate? When I can carve out some time over the next week or so I'll review and commit if there are no issues brought up. Erik On Jul 13, 2010, at 3:42 PM, Tommy Chheng wrote: Hi, Which next version

Re: Query modification

2010-07-02 Thread Tommy Chheng
Hi, I actually did something similar on http://researchwatch.net/. If you search for "stanford university solar", it will process the query by tagging "stanford university" to the organization field. I created a query component class and altered the query string like this (in Scala but
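The original Scala code is cut off in this listing, so the following is only a sketch of the rewriting step described above, written in Python rather than as a Solr component. The function name `tag_organizations` and the list of known organizations are hypothetical; how the names are detected is not shown in the thread.

```python
import re


def tag_organizations(query, organizations, field="organization"):
    """Rewrite known organization names in a query into field:"phrase" clauses."""
    if not organizations:
        return query
    # Longest names first so a longer name wins over a shorter one it contains.
    names = sorted(organizations, key=len, reverse=True)
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(name) for name in names) + r")\b",
        re.IGNORECASE,
    )
    # A single pass avoids re-matching text inside an already tagged phrase.
    return pattern.sub(lambda m: f'{field}:"{m.group(1)}"', query)


print(tag_organizations("stanford university solar", ["stanford university"]))
```

In a Solr component, the rewritten string would replace the incoming query before it reaches the query parser.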

Re: Query modification

2010-07-02 Thread Tommy Chheng
@tommychheng Programmer and UC Irvine Graduate Student Find a great grad school based on research interests: http://gradschoolnow.com On 7/2/10 3:26 PM, caman wrote: And what did you use for entity detection? GATE,openNLP? Do you mind sharing that please? From: Tommy Chheng-2 [via Lucene

dismax and AND as the default operator

2010-06-17 Thread Tommy Chheng
I'm using the dismax request handler and want to set the default operator to AND. Using the standard handler, I could just use q.op or defaultOperator in the schema, but this doesn't work with the dismax request handler. For example, if I call solr/select/?q=fuel+cell, I want Solr to
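The replies to this question are not part of this listing. One commonly used approach, offered here as an assumption rather than as the thread's answer, is the dismax `mm` (minimum should match) parameter: with `mm=100%`, every query term must match. The base URL and function name below are hypothetical.

```python
from urllib.parse import urlencode


def dismax_all_terms_url(base_url, user_query):
    """Build a dismax request that requires every query term to match."""
    params = {
        "defType": "dismax",  # select the dismax query parser
        "q": user_query,
        "mm": "100%",  # all optional clauses must match
    }
    return base_url + "?" + urlencode(params)


print(dismax_all_terms_url("http://localhost:8983/solr/select", "fuel cell"))
```

Note that the percent sign is URL-encoded as `%25` in the resulting request.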

readonly access for all host except for localhost

2010-05-10 Thread Tommy Chheng
. -- Tommy Chheng Programmer and UC Irvine Graduate Student Twitter @tommychheng http://tommy.chheng.com

Re: use a solr-built index with lucene?

2010-04-09 Thread Tommy Chheng
I was thinking of the reverse case: from Solr to Lucene. Lucene doesn't use a schema.xml. Tommy Chheng Programmer and UC Irvine Graduate Student Twitter @tommychheng http://tommy.chheng.com On 4/9/10 12:15 AM, Paul Libbrecht wrote: This looks like an interesting avenue for a smooth transition

Re: Drill down a solr result set by facets

2010-03-29 Thread Tommy Chheng
Try adding quotes to your query: DepartmentName:Chemistry+fSponsor:\"US Cancer/Diabetic Research Institute\" Otherwise the parser will split on whitespace. Tommy Chheng Programmer and UC Irvine Graduate Student Twitter @tommychheng http://tommy.chheng.com On 3/29/10 8:49 AM, Dhanushka Samarakoon wrote
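A small sketch of the quoting advice above: wrap a multi-word value in double quotes so the parser treats it as one phrase, then let the URL encoder handle spaces and slashes. The helper name `quoted_clause` and the use of a separate `fq` parameter are illustrative choices, not taken from the thread.

```python
from urllib.parse import urlencode


def quoted_clause(field, value):
    """Return field:"value" with backslashes and double quotes escaped."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'{field}:"{escaped}"'


params = [
    ("q", "DepartmentName:Chemistry"),
    ("fq", quoted_clause("fSponsor", "US Cancer/Diabetic Research Institute")),
]
print(urlencode(params))
```

Without the quotes, only the first word would be searched in fSponsor and the remaining words would go to the default field.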

Re: document categorization using solr?

2010-03-25 Thread Tommy Chheng
/ClusteringComponent *I think the component is meant to support small quantities of documents. For supervised solutions (or larger-scale unsupervised solutions), Mahout could be a good start, as it can use the Solr index. Tommy Chheng Programmer and UC Irvine Graduate Student Twitter @tommychheng http

Re: keyword query tokenizer

2010-03-25 Thread Tommy Chheng
Multi-field search is one reason for doing the tokenizing in the parser. Imagine your query was name:bob content:climate. The parser can tokenize the query into name:bob and content:climate and pass each to its own analyzer. Tommy Chheng Programmer and UC Irvine Graduate Student Twitter
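The point above can be illustrated with a toy splitter: the query is broken into per-field clauses first, so that each value can then be handed to the analyzer configured for its field. This is a deliberately simplified sketch (no quotes, operators, or escaping), and the `default_field` name is an assumption.

```python
def split_fielded_query(query, default_field="text"):
    """Split a whitespace-separated query into (field, value) clauses."""
    clauses = []
    for token in query.split():
        field, sep, value = token.partition(":")
        if sep:
            clauses.append((field, value))
        else:
            # No explicit field: fall back to the default search field.
            clauses.append((default_field, token))
    return clauses


for field, value in split_fielded_query("name:bob content:climate"):
    # A real parser would now run the analyzer for this field on the value.
    print(field, value)
```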

phrase segmentation plugin in component, analyzer, filter or parser?

2010-03-23 Thread Tommy Chheng
Is the SearchComponent the right class to extend for this type of logic? I picked the component because it was the one place where I could get access to overwrite the whole query string. Or is it better design to write it as an analyzer, tokenizer, filter or parser plugin? -- Tommy Chheng Programmer and UC Irvine

trimfilterfactory on string fieldtype?

2010-03-18 Thread Tommy Chheng
<fieldType name="string" class="solr.StrField" sortMissingLast="true" omitNorms="true"> <analyzer type="index"> <filter class="solr.TrimFilterFactory" /> </analyzer> </fieldType> -- Tommy Chheng Programmer and UC Irvine Graduate Student Twitter @tommychheng http://tommy.chheng.com

Re: XML data in solr field

2010-03-16 Thread Tommy Chheng
Do you have the option of just importing each xml node as a field/value when you add the document? That'll let you do the search easily. If you need to store the raw XML, you can use an extra field. Tommy Chheng Programmer and UC Irvine Graduate Student Twitter @tommychheng http

Re: DIH field options

2010-03-12 Thread Tommy Chheng
="my_field" column="static_value_not_from_db"/> </entity> </document> Tommy Chheng-4 wrote: The wiki page has most of the info you need: http://wiki.apache.org/solr/DataImportHandler To use multi-value fields, your schema.xml must define them with multiValued="true" On 3/11/10 10:58 PM

Re: How to get Term Positions?

2010-03-12 Thread Tommy Chheng
I contributed a little reward to whoever can complete this task too http://nextsprocket.com/tasks/solr-1337-spans-and-payloads-query-support-asf-jira Feel free to contribute to the reward if you need this done too! Tommy Chheng Programmer and UC Irvine Graduate Student Twitter @tommychheng

Re: DIH field options

2010-03-11 Thread Tommy Chheng
add a static multi-value field? <field name="category_ids" values="123, 456"/> Is there any documentation on all the options for the field tag in data-config.xml? Thanks for the help -- Tommy Chheng Programmer and UC Irvine Graduate Student Twitter @tommychheng http://tommy.chheng.com

Re: persistent cache

2010-02-12 Thread Tommy Chheng
One solution is to add the persistent cache with memcache at the application layer. -- Tommy Chheng Programmer and UC Irvine Graduate Student Twitter @tommychheng http://tommy.chheng.com On 2/12/10 5:19 AM, Tim Terlegård wrote: 2010/2/12 Shalin Shekhar Mangar <shalinman...@gmail.com>: 2010
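A rough sketch of caching at the application layer, as suggested above. A plain dict stands in for the memcache client here, and `QueryCache`, `fetch`, and the fake search function are all hypothetical names. A real deployment would swap the dict for a client library and add expiry.

```python
import json


class QueryCache:
    """Cache-aside wrapper: look in the store first, query Solr on a miss."""

    def __init__(self, fetch, store=None):
        self.fetch = fetch  # function that actually queries Solr
        self.store = {} if store is None else store  # stand-in for memcache

    def get(self, params):
        # Sort the parameters so equivalent requests share one cache key.
        key = json.dumps(sorted(params.items()))
        if key not in self.store:
            self.store[key] = self.fetch(params)
        return self.store[key]


calls = []


def fake_fetch(params):
    """Pretend Solr call that records how often it is invoked."""
    calls.append(params)
    return {"numFound": 1}


cache = QueryCache(fake_fetch)
cache.get({"q": "solr"})
cache.get({"q": "solr"})  # served from the store, no second fetch
print(len(calls))
```

Because the store lives outside Solr, cached results survive a Solr restart, which is the persistence the thread asks about.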

DataImportHandlerException for custom DIH Transformer

2010-02-07 Thread Tommy Chheng
Transformer class defines a two parameter method: transformRow(Map<String, Object> row, Context context)? -- Tommy Chheng Programmer and UC Irvine Graduate Student Twitter @tommychheng http://tommy.chheng.com

Re: Using solr to store data

2010-02-03 Thread Tommy Chheng
to failure. :) -- Tommy Chheng Developer UC Irvine Graduate Student http://tommy.chheng.com On 2/3/10 5:41 PM, AJ Asver wrote: Hi all, I work on search at Scoopler.com, a real-time search engine which uses Solr. We currently use Solr for indexing but then fetch data from our couchdb cluster using

filter querying working on dynamic int fields but not dynamic string fields?

2010-01-20 Thread Tommy Chheng
I'm having trouble doing a filter query on a string field. Any ideas why it's working on dynamic int fields but not dynamic string fields? ex. http://localhost:8983/solr/select?indent=on&version=2.2&q=climate - correct

Re: filter querying working on dynamic int fields but not dynamic string fields?

2010-01-20 Thread Tommy Chheng
Thanks, quoting it fixed it. I'm also going to strip the leading/trailing whitespace at index time. Tommy On 1/20/10 1:47 PM, Erik Hatcher wrote: On Jan 20, 2010, at 4:27 PM, Tommy Chheng wrote: I'm having trouble doing a filter query on a string field. Any ideas why it's working

Re: Facet query help

2009-10-12 Thread Tommy Chheng
OK, so fq != facet.query; I thought it was an alias. I'm trying your suggestion fq=Memory_s:1 GB and now it's returning zero documents even though there is one document that has tommy and Memory_s:1 GB, as seen in the original pastie (http://pastie.org/650932). I tried the fq query body with

Facet query help

2009-10-11 Thread Tommy Chheng
The dummy data set is composed of 6 docs. My query is set for 'tommy' with the facet query of Memory_s:1+GB http://lh:8983/solr/select/?facet=true&facet.field=CPU_s&facet.field=Memory_s&facet.field=Video+Card_s&wt=ruby&facet.query=Memory_s:1+GB&q=tommy&indent=on However, in the response
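Request URLs like the one above are easy to get wrong by hand, especially with repeated parameters and spaces in field names. The sketch below builds the same kind of request from a list of pairs. The function name `facet_url` and the `localhost:8983` base URL are assumptions.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def facet_url(base_url, q, facet_fields, facet_queries):
    """Build a faceted select request; repeated keys are kept as pairs."""
    params = [("q", q), ("facet", "true"), ("wt", "ruby"), ("indent", "on")]
    params += [("facet.field", name) for name in facet_fields]
    params += [("facet.query", fq) for fq in facet_queries]
    # urlencode escapes spaces and colons and joins the pairs with '&'.
    return base_url + "?" + urlencode(params)


url = facet_url(
    "http://localhost:8983/solr/select/",
    "tommy",
    ["CPU_s", "Memory_s", "Video Card_s"],
    ["Memory_s:1 GB"],
)
print(url)
```

Using a list of pairs rather than a dict is what allows `facet.field` to appear more than once.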