Re: mysolr python client

2011-12-01 Thread Marco Martinez
Done!

Marco Martínez Bautista
http://www.paradigmatecnologico.com
Avenida de Europa, 26. Ática 5. 3ª Planta
28224 Pozuelo de Alarcón
Tel.: 91 352 59 42


2011/12/1 Marc SCHNEIDER 

> Hi Marco,
>
> Great! Maybe you can add it on the Solr wiki? (
> http://wiki.apache.org/solr/IntegratingSolr).
>
> Regards,
> Marc.
>
> On Thu, Dec 1, 2011 at 10:42 AM, Jens Grivolla  wrote:
>
> > On 11/30/2011 05:40 PM, Marco Martinez wrote:
> >
> >> For anyone interested, recently I've been using a new Solr client for
> >> Python. It's easy and pretty well documented. If you're interested its
> >> site
> >> is: http://mysolr.redtuna.org/
> >>
> >
> > Do you know what advantages it has over pysolr or solrpy? On the page it
> > only says "mysolr was born to be a fast and easy-to-use client for Apache
> > Solr’s API and because existing Python clients didn’t fulfill these
> > conditions."
> >
> > Thanks,
> > Jens
> >
> >
>


mysolr python client

2011-11-30 Thread Marco Martinez
Hi all,

For anyone interested, recently I've been using a new Solr client for
Python. It's easy and pretty well documented. If you're interested its site
is: http://mysolr.redtuna.org/

bye!

Marco Martínez Bautista
http://www.paradigmatecnologico.com
Avenida de Europa, 26. Ática 5. 3ª Planta
28224 Pozuelo de Alarcón
Tel.: 91 352 59 42


Re: Error Instantiating QParserPlugin

2011-10-20 Thread Marco Martinez
It seems that the problem is the QParserPlugin2 class.

Marco Martínez Bautista
http://www.paradigmatecnologico.com
Avenida de Europa, 26. Ática 5. 3ª Planta
28224 Pozuelo de Alarcón
Tel.: 91 352 59 42


2011/10/20 

> hi,
> while to create customized query parser plugin for solr 3.2. I got the
> Instantiating error.As mentioned at various places I created two
> classes 1) MyQParserPlugin extends QParserPlugin2) MyQParser extends
> QParser
> org.apache.solr.common.SolrException: Error Instantiating QParserPlugin,
> MyQParserPlugin is not a org.apache.solr.search.QParserPlugin
>at org.apache.solr.core.SolrCore.createInstance(SolrCore.java:428)
>at
> org.apache.solr.core.SolrCore.createInitInstance(SolrCore.java:448)
>at org.apache.solr.core.SolrCore.initPlugins(SolrCore.java:1548)
>at org.apache.solr.core.SolrCore.initPlugins(SolrCore.java:1542)
>at org.apache.solr.core.SolrCore.initPlugins(SolrCore.java:1575)
>at org.apache.solr.core.SolrCore.initQParsers(SolrCore.java:1492)
>at org.apache.solr.core.SolrCore.<init>(SolrCore.java:558)
>at org.apache.solr.core.CoreContainer.create(CoreContainer.java:463)
>at org.apache.solr.core.CoreContainer.load(CoreContainer.java:316)
>at org.apache.solr.core.CoreContainer.load(CoreContainer.java:207)
>at
> org.apache.solr.core.CoreContainer$Initializer.initialize(CoreContainer.java:130)
>at
> org.apache.solr.servlet.SolrDispatchFilter.init(SolrDispatchFilter.java:94)
>at
> org.mortbay.jetty.servlet.FilterHolder.doStart(FilterHolder.java:97)
>at
> org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:50)
>at
> org.mortbay.jetty.servlet.ServletHandler.initialize(ServletHandler.java:713)
>at org.mortbay.jetty.servlet.Context.startContext(Context.java:140)
>at
> org.mortbay.jetty.webapp.WebAppContext.startContext(WebAppContext.java:1282)
>at
> org.mortbay.jetty.handler.ContextHandler.doStart(ContextHandler.java:518)
>at
> org.mortbay.jetty.webapp.WebAppContext.doStart(WebAppContext.java:499)
>at
> org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:50)
>at
> org.mortbay.jetty.handler.HandlerCollection.doStart(HandlerCollection.java:152)
>at
> org.mortbay.jetty.handler.ContextHandlerCollection.doStart(ContextHandlerCollection.java:156)
>at
> org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:50)
>at
> org.mortbay.jetty.handler.HandlerCollection.doStart(HandlerCollection.java:152)
>at
> org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:50)
>at
> org.mortbay.jetty.handler.HandlerWrapper.doStart(HandlerWrapper.java:130)
>at org.mortbay.jetty.Server.doStart(Server.java:224)
>at
> org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:50)
>at org.mortbay.xml.XmlConfiguration.main(XmlConfiguration.java:985)
>at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
>at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
>at java.lang.reflect.Method.invoke(Unknown Source)
>at org.mortbay.start.Main.invokeMain(Main.java:194)
>at org.mortbay.start.Main.start(Main.java:534)
>at org.mortbay.start.Main.start(Main.java:441)
>at org.mortbay.start.Main.main(Main.java:119)
> Any idea about whats going on??
> Thanks Karan


Re: Controlling the order of partial matches based on the position

2011-10-18 Thread Marco Martinez
Hi,

I would use a custom function query that uses termPositions to calculate the
order of the values in the field to accomplish your requirements.

Marco Martínez Bautista
http://www.paradigmatecnologico.com
Avenida de Europa, 26. Ática 5. 3ª Planta
28224 Pozuelo de Alarcón
Tel.: 91 352 59 42


2011/10/18 aronitin 

> Guys,
>
> It's been almost a week but there are no replies to the question that I
> posted.
>
> If its a small problem and already answered somewhere, please point me to
> that post. Otherwise please suggest any pointer to handle the requirement
> mentioned in the question,
>
> Nitin
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/Controlling-the-order-of-partial-matches-based-on-the-position-tp3413867p3429823.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>


Re: Solr scraping: Nutch and other alternatives.

2011-10-18 Thread Marco Martinez
Hi Luis,

Have you tried the copyField function with custom analyzers and tokenizers?

bye,

Marco Martínez Bautista
http://www.paradigmatecnologico.com
Avenida de Europa, 26. Ática 5. 3ª Planta
28224 Pozuelo de Alarcón
Tel.: 91 352 59 42


2011/10/18 Luis Cappa Banda 

> Hello everyone.
>
> I've been thinking about a way to retrieve information from a domain (for
> example, http://www.ign.com) to process and index. My idea is to use Solr
> as
> a searcher. I'm familiarized with Apache Nutch and I know that the latest
> version has a gateway to Solr to retrieve and index information with it. I
> tried it and it worked fine, but it's a little bit complex to develop
> plugins to process info and index it in a new field desired. Perhaps one of
> you have tried another (and better) alternative to data mine web
> information. Which is your recommendation? Can you give me any scraping
> suggestion?
>
> Thank you very much.
>
> Luis Cappa.
>


Re: PositionIncrement gap and multi-valued fields.

2011-08-09 Thread Marco Martinez
Hi Luis,

As far as I know, the position increment gap only affects some queries,
such as phrase queries that use a slop. The position increment gap does
not affect Lucene's similarity scoring formula:

score(q,d) = coord(q,d) · queryNorm(q) · ∑_{t in q} ( tf(t in d) · idf(t)^2 · t.getBoost() · norm(t,d) )

(Lucene Practical Scoring Function)

The first two factors normalize the query. Inside the summation, the first
two factors are related to the frequency of the term, in the document and in
the index; the third is the boost of the term in the query; and the last one
encapsulates a few index-time boost and length factors. The length factor is
calculated from the number of terms, and the position increment gap does not
create more tokens, so this factor does not affect the score either.

But if you use, for example, a multivalued field with a position increment
gap of 100 and run a query with a slop of less than 100, you prevent
matches across two separate values of this field, e.g.:

q=test:"A B"~99

doc1

A
B

You don't get any match for this doc, but if you run the query q=test:"A
B"~101 you will get doc1 as a match.
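
The effect of the gap can be sketched with a small simulation. This is a
simplified model in Python, not Lucene code: it assumes whitespace
tokenization, assumes that the first token of each new value is placed
gap + 1 positions after the last token of the previous value, and only handles
an in-order two-term phrase. The function names are made up for the
illustration.

```python
def token_positions(values, gap):
    """Assign a position to every whitespace token of a multi-valued field.

    Simplified model: inside a value positions grow by 1; between two values
    the position jumps by an extra `gap`.
    """
    positions = []          # list of (token, position)
    pos = -1
    for i, value in enumerate(values):
        if i > 0:
            pos += gap      # extra distance between two values
        for token in value.split():
            pos += 1
            positions.append((token, pos))
    return positions


def phrase_matches(positions, first, second, slop):
    """True if `second` follows `first` with at most `slop` extra positions."""
    firsts = [p for t, p in positions if t == first]
    seconds = [p for t, p in positions if t == second]
    return any(0 <= s - f - 1 <= slop for f in firsts for s in seconds)


doc1 = ["A", "B"]                       # two values of the field "test"
with_gap = token_positions(doc1, gap=100)
no_gap = token_positions(doc1, gap=0)

print(with_gap)                                   # A at 0, B at 101
print(phrase_matches(with_gap, "A", "B", 99))     # False
print(phrase_matches(with_gap, "A", "B", 101))    # True
print(len(with_gap) == len(no_gap))               # True: same number of tokens
```

The last line reflects the point about the length factor: the gap moves
positions apart but does not add tokens.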


Bye!


Marco Martínez Bautista
http://www.paradigmatecnologico.com
Avenida de Europa, 26. Ática 5. 3ª Planta
28224 Pozuelo de Alarcón
Tel.: 91 352 59 42


2011/8/8 Luis Cappa Banda 

> Hello!
>
> I have a doubt about the behaviour of searching over field types that have
> positionIncrementGap defined. For example, supose that:
>
>
>   1. We have a field called "test" defined as multi-valued and white space
>   tokenized.
>   2. The index has an single document with a "test" value:
>
> 
> TEST1
> 
> 
> AAA BBB
> 
> 
> CCC DDD
> 
> 
> EEE FFF
> 
> 
> TEST2
> 
>
>
> I read that positionIncrementGap defines the virtual space between the last
> token of one field instance and the first token of the next instance
> (source:
>
> http://lucene.472066.n3.nabble.com/positionIncrementGap-in-schema-xml-td488338.html
> ).
> When it says "last token of one field instance" means that is the last
> token
> of the first entry from the multi-valued content? In our example before it
> will be "TEST1".
>
> Anyway, I've been doing some tests modifying the positionIncrementGap value
> with high values and low values. Can anybody explain me with detail which
> implications has in Solr scoring algorythm an upper and a lower value? I
> would like to understand how this value affects matching results in fields
> and also calculating the final score (maybe more gap implies more spaces
> and
> a worst score when the value matches, etc.).
>
> Thank you for reading so far!
>


Re: embeded solrj doesn't refresh index

2011-07-20 Thread Marco Martinez
You should send a commit to your embedded Solr.

Marco Martínez Bautista
http://www.paradigmatecnologico.com
Avenida de Europa, 26. Ática 5. 3ª Planta
28224 Pozuelo de Alarcón
Tel.: 91 352 59 42


2011/7/20 Jianbin Dai 

> Hi,
>
>
>
> I am using embedded solrj. After I add new doc to the index, I can see the
> changes through solr web, but not from embedded solrj. But after I restart
> the embedded solrj, I do see the changes. It works as if there was a cache.
> Anyone knows the problem? Thanks.
>
>
>
> Jianbin
>
>


Re: term positions performance

2011-07-20 Thread Marco Martinez
Also, I developed this query as a function query. I wonder whether doing it
as a normal query would increase the performance.

Marco Martínez Bautista
http://www.paradigmatecnologico.com
Avenida de Europa, 26. Ática 5. 3ª Planta
28224 Pozuelo de Alarcón
Tel.: 91 352 59 42


2011/7/20 Marco Martinez 

> Hi,
>
> I am developing a new query term proximity and i am using the term
> positions to get the positions of each term. I want to know if there is any
> clues to increase the perfomance of using term positions, in index time o in
> query time, all my fields that i am applying the term positions are indexed.
>
> Thanks in advance,
>
> Marco Martínez Bautista
> http://www.paradigmatecnologico.com
> Avenida de Europa, 26. Ática 5. 3ª Planta
> 28224 Pozuelo de Alarcón
> Tel.: 91 352 59 42
>


term positions performance

2011-07-20 Thread Marco Martinez
Hi,

I am developing a new term proximity query and I am using the term positions
to get the positions of each term. I want to know if there are any hints to
increase the performance of using term positions, at index time or at query
time. All the fields to which I am applying the term positions are indexed.

Thanks in advance,

Marco Martínez Bautista
http://www.paradigmatecnologico.com
Avenida de Europa, 26. Ática 5. 3ª Planta
28224 Pozuelo de Alarcón
Tel.: 91 352 59 42


Re: function queries scope

2011-06-07 Thread Marco Martinez
Thanks, but it's not what I'm looking for, because the BoostQParserPlugin
multiplies the score of the query by the function queries defined in the b
param of the BoostQParserPlugin, and I can't use edismax because we have
our own qparser. It seems that I have to code another qparser.


Thanks Yonik anyway,

Marco Martínez Bautista
http://www.paradigmatecnologico.com
Avenida de Europa, 26. Ática 5. 3ª Planta
28224 Pozuelo de Alarcón
Tel.: 91 352 59 42


2011/6/7 Yonik Seeley 

> One way is to use the boost qparser:
>
> http://search-lucene.com/jd/solr/org/apache/solr/search/BoostQParserPlugin.html
> q={!boost b=productValueField}shops in madrid
>
> Or you can use the edismax parser which has a "boost" parameter that
> does the same thing:
> defType=edismax&q=shops in madrid&boost=productValueField
>
>
> -Yonik
> http://www.lucidimagination.com
>
>
> On Tue, Jun 7, 2011 at 6:53 AM, Marco Martinez
>  wrote:
> > Hi,
> >
> > I need to use the function queries operations with the score of a given
> > query, but only in the docset that i get from the query and i dont know
> if
> > this is possible.
> >
> > Example:
> >
> > q=shops in madridreturns  1 docs  with a specific score for each
> doc
> >
> > but now i need to do some stuff like
> >
> > q=sum(product(2,query(shops in madrid),productValueField) but this will
> be
> > return all the docs in my index.
> >
> >
> > I know that i can do it via filter queries, ex,
> q=sum(product(2,query(shops
> > in madrid),productValueField)&fq=shops in madrid but this will do the
> query
> > two times and i dont want this because the performance is important to
> our
> > application.
> >
> >
> > Is there other approach to accomplished that=
> >
> >
> > Thanks in advance,
> >
> > Marco Martínez Bautista
> > http://www.paradigmatecnologico.com
> > Avenida de Europa, 26. Ática 5. 3ª Planta
> > 28224 Pozuelo de Alarcón
> > Tel.: 91 352 59 42
> >
>


function queries scope

2011-06-07 Thread Marco Martinez
Hi,

I need to use function query operations with the score of a given
query, but only on the docset that I get from the query, and I don't know if
this is possible.

Example:

q=shops in madrid returns 1 docs with a specific score for each doc

but now I need to do something like

q=sum(product(2,query(shops in madrid),productValueField) but this will
return all the docs in my index.


I know that I can do it via filter queries, e.g. q=sum(product(2,query(shops
in madrid),productValueField)&fq=shops in madrid but this runs the query
twice, and I don't want that because performance is important to our
application.


Is there another approach to accomplish that?


Thanks in advance,

Marco Martínez Bautista
http://www.paradigmatecnologico.com
Avenida de Europa, 26. Ática 5. 3ª Planta
28224 Pozuelo de Alarcón
Tel.: 91 352 59 42


Re: function query apply only in the subset of the query

2011-04-13 Thread Marco Martinez
It seems that it is a problem with my own query. Now I need to investigate
whether there is something different between a normal query and my
implementation of the query, because if you use it alone, it works properly.

Thanks,

Marco Martínez Bautista
http://www.paradigmatecnologico.com
Avenida de Europa, 26. Ática 5. 3ª Planta
28224 Pozuelo de Alarcón
Tel.: 91 352 59 42


2011/4/13 Marco Martinez 

> No, this query returns a few more documents than if a do it by lucene query
> parser. I'm going to generate another query parser that send a simple term
> query and see what is the output, when i have it, i will inform in the mail.
>
>
> Marco Martínez Bautista
> http://www.paradigmatecnologico.com
> Avenida de Europa, 26. Ática 5. 3ª Planta
> 28224 Pozuelo de Alarcón
> Tel.: 91 352 59 42
>
>
> 2011/4/12 Yonik Seeley 
>
>> On Tue, Apr 12, 2011 at 10:25 AM, Marco Martinez
>>  wrote:
>> > Thanks but I tried this and I saw that this work in a standard scenario,
>> but
>> > in my query i use a my own query parser and it seems that they dont
>> doing
>> > the AND and returns all the docs in the index:
>> >
>> > My query:
>> > _query_:"{!bm25}car" AND _val_:marketValue -> 67000 docs returned
>>
>> This would seem to point to your generated query {!bm25}car
>> matching all docs for some reason?
>>
>> -Yonik
>> http://www.lucenerevolution.org -- Lucene/Solr User Conference, May
>> 25-26, San Francisco
>>
>
>


Re: function query apply only in the subset of the query

2011-04-13 Thread Marco Martinez
No, this query returns a few more documents than if I do it with the Lucene
query parser. I'm going to write another query parser that sends a simple term
query and see what the output is; when I have it, I will report back on the
list.

Marco Martínez Bautista
http://www.paradigmatecnologico.com
Avenida de Europa, 26. Ática 5. 3ª Planta
28224 Pozuelo de Alarcón
Tel.: 91 352 59 42


2011/4/12 Yonik Seeley 

> On Tue, Apr 12, 2011 at 10:25 AM, Marco Martinez
>  wrote:
> > Thanks but I tried this and I saw that this work in a standard scenario,
> but
> > in my query i use a my own query parser and it seems that they dont doing
> > the AND and returns all the docs in the index:
> >
> > My query:
> > _query_:"{!bm25}car" AND _val_:marketValue -> 67000 docs returned
>
> This would seem to point to your generated query {!bm25}car
> matching all docs for some reason?
>
> -Yonik
> http://www.lucenerevolution.org -- Lucene/Solr User Conference, May
> 25-26, San Francisco
>


Re: function query apply only in the subset of the query

2011-04-12 Thread Marco Martinez
Thanks, but I tried this and I saw that it works in a standard scenario. In
my query, however, I use my own query parser, and it seems that it doesn't
apply the AND and returns all the docs in the index:

My query:
_query_:"{!bm25}car" AND _val_:marketValue -> 67000 docs returned


Solr query parser
car AND _val_:marketValue -> 300 docs returned


Thanks,


Marco Martínez Bautista
http://www.paradigmatecnologico.com
Avenida de Europa, 26. Ática 5. 3ª Planta
28224 Pozuelo de Alarcón
Tel.: 91 352 59 42


2011/4/12 Erik Hatcher 

> Try using AND (or set q.op):
>
>   q=car+AND+_val_:marketValue
>
> On Apr 12, 2011, at 07:11 , Marco Martinez wrote:
>
> > Hi everyone,
> >
> > My situation is the next, I need to sum the value of a field to the score
> to
> > the docs returned in the query, but not to all the docs, example:
> >
> > q=car returns 3 docs
> >
> > 1-
> > name=car ford
> > marketValue=1
> > score=1.3
> >
> > 2-
> > name=car citroen
> > marketValue=2
> > score=1.3
> >
> > 3-
> > name=car mercedes
> > marketValue=0.5
> > score=1.3
> >
> > but if want to sum the marketValue to the score, my returned list is the
> > next:
> >
> > q=car+_val_:marketValue
> >
> > 1-
> > name=bus
> > marketValue=5
> > score=5
> >
> > 2-
> > name=car citroen
> > marketValue=2
> > score=3.3
> >
> > 3-
> > name=car ford
> > marketValue=1
> > score=2.3
> >
> > 4-
> > name=car mercedes
> > marketValue=0.5
> > score=1.8
> >
> >
> > Its possible to apply the function query only to the documents returned
> in
> > the first query?
> >
> >
> > Thanks in advance,
> >
> > Marco Martínez Bautista
> > http://www.paradigmatecnologico.com
> > Avenida de Europa, 26. Ática 5. 3ª Planta
> > 28224 Pozuelo de Alarcón
> > Tel.: 91 352 59 42
>
>


function query apply only in the subset of the query

2011-04-12 Thread Marco Martinez
Hi everyone,

My situation is the following: I need to add the value of a field to the score
of the docs returned by the query, but not to all the docs. Example:

q=car returns 3 docs

1-
name=car ford
marketValue=1
score=1.3

2-
name=car citroen
marketValue=2
score=1.3

3-
name=car mercedes
marketValue=0.5
score=1.3

but if I want to add the marketValue to the score, my returned list is the
following:

q=car+_val_:marketValue

1-
name=bus
marketValue=5
score=5

2-
name=car citroen
marketValue=2
score=3.3

3-
name=car ford
marketValue=1
score=2.3

4-
name=car mercedes
marketValue=0.5
score=1.8


Is it possible to apply the function query only to the documents returned by
the first query?
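
The difference between the two result lists can be reproduced with a toy
model. A minimal Python sketch, assuming the four example documents above, a
fixed text score of 1.3 for every document that contains "car", and a
_val_:marketValue clause that matches every document and contributes its field
value; none of this is Solr code. With the clauses combined as optional (OR),
the "bus" document gets in through the function clause alone; requiring the
text clause (AND) keeps only the "car" documents while still adding
marketValue to their score.

```python
docs = [
    {"name": "car ford", "marketValue": 1.0},
    {"name": "car citroen", "marketValue": 2.0},
    {"name": "car mercedes", "marketValue": 0.5},
    {"name": "bus", "marketValue": 5.0},
]

TEXT_SCORE = 1.3   # assumed score of the text clause for every matching doc


def search(docs, term, require_term):
    """Score docs as text score (if the term matches) plus marketValue."""
    results = []
    for doc in docs:
        matches = term in doc["name"].split()
        if require_term and not matches:
            continue                      # AND: the text clause is mandatory
        score = (TEXT_SCORE if matches else 0.0) + doc["marketValue"]
        results.append((doc["name"], round(score, 2)))
    return sorted(results, key=lambda r: r[1], reverse=True)


print(search(docs, "car", require_term=False))  # bus is included and ranks first
print(search(docs, "car", require_term=True))   # only the three car docs
```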


Thanks in advance,

Marco Martínez Bautista
http://www.paradigmatecnologico.com
Avenida de Europa, 26. Ática 5. 3ª Planta
28224 Pozuelo de Alarcón
Tel.: 91 352 59 42


Re: White space in facet values

2010-12-22 Thread Marco Martinez
Try copying the values (with copyField) to a string field.
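
Put together with the quoted suggestion, the request might look like this. A
minimal Python sketch: the base URL is an assumption for a local instance, and
Product is assumed to be (or to be copied into) an untokenized string field so
that each facet value keeps its whitespace. The double quotes make the filter
match the whole value instead of being parsed as Product:Electric plus a
separate term Guitar.

```python
from urllib.parse import urlencode

BASE = "http://localhost:8983/solr/select"   # assumed local Solr instance


def facet_filter(field, value):
    """Build an fq clause that treats a value with whitespace as one term."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return '{}:"{}"'.format(field, escaped)


params = [
    ("q", "*:*"),
    ("facet", "true"),
    ("facet.field", "Product"),
    ("fq", facet_filter("Product", "Electric Guitar")),
]
print(BASE + "?" + urlencode(params))
```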

Marco Martínez Bautista
http://www.paradigmatecnologico.com
Avenida de Europa, 26. Ática 5. 3ª Planta
28224 Pozuelo de Alarcón
Tel.: 91 352 59 42


2010/12/22 Peter Karich 

>
>
> you should try fq=Product:"Electric Guitar"
>
>
> > How do I handle facet values that contain whitespace? Say I have a field
> "Product" that I want to facet on. A value for "Product" could be "Electric
> Guitar". How should I handle the white space in "Electric Guitar" during
> indexing? What about when I apply the constraint fq=Product:Electric Guitar?
>
> --
> http://jetwick.com open twitter search
>
>


Re: Different Results..

2010-12-22 Thread Marco Martinez
We need more information about the analyzers and tokenizers of the
default field of your search.

Marco Martínez Bautista
http://www.paradigmatecnologico.com
Avenida de Europa, 26. Ática 5. 3ª Planta
28224 Pozuelo de Alarcón
Tel.: 91 352 59 42


2010/12/22 satya swaroop 

> Hi All,
> i am getting different results when i used with some escape keys..
> for example:::
> 1) when i use this request
>http://localhost:8080/solr/select?q=erlang!ericson
>   the result obtained is
>   
>
> 2) when the request is
> http://localhost:8080/solr/select?q=erlang/ericson
>the result is
>  
>
>
> My query here is, do solr consider both the queries differently and what do
> it consider for !,/ and all other escape characters.
>
>
> Regards,
> satya
>


Re: Solr search speed very low

2010-08-25 Thread Marco Martinez
You should use the solr.WhitespaceTokenizerFactory tokenizer in your field
type to get your terms indexed. Once you have indexed the data, you don't
need to use the * in your queries, which is a heavy query for Solr.
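
The reason is that a tokenized field is stored as an inverted index from terms
to documents, so a plain term lookup is cheap, while a leading wildcard such
as *Hello* has to be checked against every term in the index. A minimal Python
sketch of that idea, assuming plain whitespace splitting (no lowercasing, as
with the whitespace tokenizer) and using the example sentences from the quoted
question; it is an illustration, not how Solr is implemented internally.

```python
from collections import defaultdict

sentences = ["Hello all", "Hello World", "I say Hello to all"]

# Index time: split each value on whitespace and map term -> document ids.
index = defaultdict(set)
for doc_id, text in enumerate(sentences):
    for term in text.split():
        index[term].add(doc_id)


def term_query(term):
    """Single dictionary lookup: cost does not depend on the number of terms."""
    return sorted(index.get(term, set()))


def wildcard_query(fragment):
    """Leading-wildcard style match: every indexed term has to be inspected."""
    hits = set()
    for term, doc_ids in index.items():
        if fragment in term:
            hits |= doc_ids
    return sorted(hits)


print(term_query("Hello"))       # [0, 1, 2]
print(wildcard_query("Hello"))   # [0, 1, 2], but after scanning all terms
```

Because the split does not lowercase, term_query("hello") finds nothing in
this sketch; a lowercasing filter in the field type would be needed for
case-insensitive matches.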

Marco Martínez Bautista
http://www.paradigmatecnologico.com
Avenida de Europa, 26. Ática 5. 3ª Planta
28224 Pozuelo de Alarcón
Tel.: 91 352 59 42


2010/8/25 Andrey Sapegin 

> Dear ladies and gentlemen.
>
> I'm newbie with Solr, I didn't find an aswer in wiki, so I'm writing here.
>
> I'm analysing Solr performance and have 1 problem. *Search time is about
> 7-10 seconds per query.*
>
> I have a *.csv 5Gb-database with about 15 fields and 1 key field (record
> number). I uploaded it to Solr without any problem using curl. This database
> contains information about books and I'm intrested in keyword search using
> one of the fields (not a key field). I mean that if I search, for example,
> for word "Hello", I expect response with sentences containing "Hello":
> "Hello all"
> "Hello World"
> "I say Hello to all"
> etc.
>
> I tested it from console using time command and curl:
>
> /usr/bin/time -o test_results/time_solr -a curl "
> http://localhost:8983/solr/select/?q=itemname:*$query*&version=2.2&start=0&rows=10&indent=on"
> -6 2>&1 >> test_results/response_solr
>
> So, my query is *itemname:*$query**. 'Itemname' - is the name of field.
> $query - is a bash variable containing only 1 word. All works fine.
> *But unfortunately, search time is about 7-10 seconds per query.* For
> example, Sphinx spent only about 0.3 second per query.
> If I use only $query, without stars (*), I receive answer pretty fast, but
> only exact matches.
> And I want to see any sentence containing my $query in the response. Thats
> why I'm using stars.
>
> NOW THE QUESTION.
> Is my query syntax correct (*field:*word**) for keyword search)? Why
> response time is so big? Can I reduce search time?
>
> Thank You in advance,
> Kind Regards,
>
> Andrey Sapegin,
> Software Developer,
>
> Unister GmbH
> Barfußgässchen 11 | 04109 Leipzig
>
> andrey.sape...@unister-gmbh.de 
> www.unister.de 
>
>


Re: Search Results optimization

2010-08-13 Thread Marco Martinez
You can use a higher boost for stapler to accomplish your requirement.

Marco Martínez Bautista
http://www.paradigmatecnologico.com
Avenida de Europa, 26. Ática 5. 3ª Planta
28224 Pozuelo de Alarcón
Tel.: 91 352 59 42


2010/8/13 Hasnain 

>
> Hi All,
>
> My question is related to search results, I want to customize my query so
> that for query "stapler hammer", I should get results for all items
> containing word "stapler" first and then results containing hammer, right
> now results are mixing up, I want them sorted, i.e. all results of stapler
> on top and hammer on bottom not mixed, I havent changed any configuration
> files...
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/Search-Results-optimization-tp1129374p1129374.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>


Re: index pdf files

2010-08-12 Thread Marco Martinez
To help you, we need the description of your fields in your schema.xml and
the query that you run when you search for a single word.

Marco Martínez Bautista
http://www.paradigmatecnologico.com
Avenida de Europa, 26. Ática 5. 3ª Planta
28224 Pozuelo de Alarcón
Tel.: 91 352 59 42


2010/8/12 Ma, Xiaohui (NIH/NLM/LHC) [C] 

> I wrote a simple java program to import a pdf file. I can get a result when
> I do search *:* from admin page. I get nothing if I search a word. I wonder
> if I did something wrong or miss set something.
>
> Here is part of result I get when do *:* search:
> *
> - 
> - 
>  Hristovski D
>  
> - 
>  application/pdf
>  
> - 
>  microarray analysis, literature-based discovery, semantic
> predications, natural language processing
>  
> - 
>  Thu Aug 12 10:58:37 EDT 2010
>  
> - 
>  Combining Semantic Relations and DNA Microarray Data for Novel
> Hypotheses Generation Combining Semantic Relations and DNA Microarray Data
> for Novel Hypotheses Generation Dimitar Hristovski, PhD,1 Andrej
> Kastrin,2...
> *
> Please help me out if anyone has experience with pdf files. I really
> appreciate it!
>
> Thanks so much,
>
>


Re: custom scoring phrase queries

2010-06-18 Thread Marco Martinez
Hi Otis,

In the end I wrote my own function query that gives a higher score if the
value is at the start of the field. But is it possible to tell Solr to use
SpanFirstQuery without coding? I think I have read that it is not possible.

Thanks,


Marco Martínez Bautista
http://www.paradigmatecnologico.com
Avenida de Europa, 26. Ática 5. 3ª Planta
28224 Pozuelo de Alarcón
Tel.: 91 352 59 42


2010/6/18 Otis Gospodnetic 

> Marco,
>
> I don't think there is anything in Solr to do that (is there?), but you
> could do it with some coding if you combined the "regular query" with
> SpanFirstQuery with bigger boost:
>
>
> http://search-lucene.com/jd/lucene/org/apache/lucene/search/spans/SpanFirstQuery.html
>
> Oh, here are some examples and at the bottom you will see exactly what I
> suggested above:
>
>
> http://search-lucene.com/c/Lucene:/src/java/org/apache/lucene/search/spans/package.html||SpanFirstQuery
>
> Otis
> 
> Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
> Lucene ecosystem search :: http://search-lucene.com/
>
>
>
> - Original Message 
> > From: Marco Martinez 
> > To: solr-user@lucene.apache.org
> > Sent: Fri, June 18, 2010 4:34:45 AM
> > Subject: custom scoring phrase queries
> >
> > Hi,
> >
> > I want to know if it's possible to get a higher score in a phrase query
> > when the match is on the left side of the field. For example:
> >
> > doc1=name:stores peter john
> > doc2=name:peter john stores
> > doc3=name:peter john something
> >
> > if you do a search with name="peter john" the result set I want to get is:
> >
> > doc2
> > doc3
> > doc1
> >
> > because the terms peter john are on the left side of the field and they
> > get a higher score.
> >
> > Thanks in advance,
> >
> > Marco Martínez Bautista
> > http://www.paradigmatecnologico.com
> > Avenida de Europa, 26. Ática 5. 3ª Planta
> > 28224 Pozuelo de Alarcón
> > Tel.: 91 352 59 42
>


custom scoring phrase queries

2010-06-18 Thread Marco Martinez
Hi,

I want to know if it's possible to get a higher score in a phrase query when
the match is on the left side of the field. For example:


doc1=name:stores peter john
doc2=name:peter john stores
doc3=name:peter john something

If you search with name="peter john", the result set I want to get is:

doc2
doc3
doc1

because the terms peter john are on the left side of the field and they get
a higher score.
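
One way to picture the desired ranking is a score that decreases with the
position at which the phrase starts. A minimal Python sketch, assuming
whitespace tokens and a made-up scoring rule 1 / (1 + start); the function
names are hypothetical and this is neither Solr's function query API nor
SpanFirstQuery, only the ordering idea.

```python
def phrase_start(field_value, phrase):
    """Return the token position where the phrase starts, or None."""
    tokens = field_value.split()
    words = phrase.split()
    for i in range(len(tokens) - len(words) + 1):
        if tokens[i:i + len(words)] == words:
            return i
    return None


def left_side_score(field_value, phrase):
    """Higher score the closer the phrase is to the start of the field."""
    start = phrase_start(field_value, phrase)
    return 0.0 if start is None else 1.0 / (1 + start)


docs = {
    "doc1": "stores peter john",
    "doc2": "peter john stores",
    "doc3": "peter john something",
}

ranked = sorted(docs, key=lambda d: left_side_score(docs[d], "peter john"),
                reverse=True)
print(ranked)   # ['doc2', 'doc3', 'doc1']
```

doc2 and doc3 tie in this sketch because the phrase starts at position 0 in
both; they keep their input order only because the sort is stable.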

Thanks in advance,


Marco Martínez Bautista
http://www.paradigmatecnologico.com
Avenida de Europa, 26. Ática 5. 3ª Planta
28224 Pozuelo de Alarcón
Tel.: 91 352 59 42


Re: Distributed Search doesn't response the result set

2010-06-08 Thread Marco Martinez
Is there a way to let "ID" not be "indexed" in solr?


If I am not wrong, this is not possible if you want distributed search,
because Solr uses the ids internally to build the correct page of results in a
distributed search. When you run a distributed search (e.g. over two
shards), two searches are fired in parallel and their results are merged to
get the correct sort order. After these steps, Solr fetches the documents
(with a search by id) from the corresponding shards to retrieve the other
fields that you have requested in the search.
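
The two phases described above can be sketched as follows. A minimal Python
simulation, assuming two in-memory shards and made-up documents and scores; it
is not Solr's implementation, but it shows why the unique id has to be
searchable: the second phase looks documents up by id on the shard that
returned them.

```python
shards = {
    "shard1": {"89": {"score": 2.0, "type": "product"},
               "90": {"score": 0.7, "type": "product"}},
    "shard2": {"91": {"score": 1.5, "type": "product"},
               "92": {"score": 0.2, "type": "product"}},
}


def phase_one(shard_docs, rows):
    """Each shard returns only (id, score) for its own top `rows` hits."""
    hits = [(doc_id, doc["score"]) for doc_id, doc in shard_docs.items()]
    return sorted(hits, key=lambda h: h[1], reverse=True)[:rows]


def distributed_search(shards, rows):
    # Phase 1: query every shard and merge the (id, score) lists.
    merged = []
    for name, shard_docs in shards.items():
        merged += [(doc_id, score, name)
                   for doc_id, score in phase_one(shard_docs, rows)]
    top = sorted(merged, key=lambda h: h[1], reverse=True)[:rows]

    # Phase 2: fetch the stored fields by id from the shard that owns each hit.
    return [dict(id=doc_id, **shards[name][doc_id]) for doc_id, _, name in top]


for doc in distributed_search(shards, rows=3):
    print(doc)
```

If the lookup by id in the second phase cannot find anything, the merged hit
count is still reported but no documents come back.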

Marco Martínez Bautista
http://www.paradigmatecnologico.com
Avenida de Europa, 26. Ática 5. 3ª Planta
28224 Pozuelo de Alarcón
Tel.: 91 352 59 42


2010/6/8 Scott Zhang 

> Hi. Markus.
>
> Thanks for replying.
>
> I figured out the reason this afternoon. Sorry for not following up on this
> list. I posted it onto dev list because I think it is a BUG.
>
>
>
> 
> I finally know why it doesn't return the result.
>
> When I created the index, I set "id" field as "Stored" but not "indexed"
> because I don't see the reason to index the id.
> Then in schema.xml, I found I have to set "ID" as "indexed" but actually it
> is not.
>
> Not sure how solr is implemented internally. But without set id as
> "indexed", the distributed search doesn't work. I tried rebuild a test
> index
> with set ID as Indexed. Then let solr use that index and distributed search
> works.
>
> Is there a way to let "ID" not be "indexed" in solr?
>
> ===
>
>
>
>
> On Tue, Jun 8, 2010 at 7:38 PM,  wrote:
>
> > did  you send a commit after the last doc posted to solr?
> >
> > > -Ursprüngliche Nachricht-
> > > Von: Scott Zhang [mailto:macromars...@gmail.com]
> > > Gesendet: Dienstag, 8. Juni 2010 08:30
> > > An: solr-user@lucene.apache.org
> > > Betreff: Re: Distributed Search doesn't response the result set
> > >
> > > Hi. All.
> > > I am still testing. I think I am approaching the truth.
> > > Now confirmed:
> > > the doc in my existing lucene indexes, when search with
> > > distributed search,
> > > none of them are returned. But the docs inserted from solr
> > > post.jar are
> > > returned successfully.
> > > Don't know why. looks the lucene docs has some difference
> > > from solr's
> > > lucene.
> > > And my situation is, I already have 72 indexes folders
> > > which occupy lots
> > > of disk and repost them to solr will take very long time, so
> > > I have to stick
> > > with my existing index. Is there a solution for this?
> > >
> > > Thanks.
> > > Regards.
> > >
> > > On Tue, Jun 8, 2010 at 2:02 PM, Scott Zhang
> > >  wrote:
> > >
> > > > Hi. All.
> > > >   I tried with the default solr example plus my own
> > > config/schema file. I
> > > > post test document into solr manually. Then test the
> > > distributed search and
> > > > it works. Then I switch to my existing l*ucene index, and
> > > it d*oesn't
> > > > work.  So I am wondering is that the reason, when solr use
> > > lucene index,
> > > > then it can't be distributed searched?
> > > >
> > > >Welcome anyone help.
> > > >
> > > > Thanks.
> > > > Regards.
> > > > Scott
> > > >
> > > >
> > > > On Mon, Jun 7, 2010 at 4:48 PM, Scott Zhang
> > > wrote:
> > > >
> > > >> Is there a possibility caused by I am using my own lucene indexes.
> > > >> Not the one created by solr itself?
> > > >>
> > > >>
> > > >> Regards
> > > >> Scott
> > > >>
> > > >>
> > > >> On Mon, Jun 7, 2010 at 4:24 PM, Scott Zhang
> > > wrote:
> > > >>
> > > >>> Hi.
> > > >>> I tried URL:
> > > >>>
> > > http://localhost:8983/solr/select?shards=localhost:8983/solr,l
> > > ocalhost:7574/solr&indent=true&q=marship&rows=10
> > > >>>  Got:
> > > >>> 
> > > >>> -
> > > >>> 
> > > >>> 0
> > > >>> 16
> > > >>&g

Re: Distributed Search doesn't response the result set

2010-06-07 Thread Marco Martinez
Try adding the rows parameter to your request. I guess that in your
solrconfig you have configured the default rows to 0 in your default request
handler.
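
A minimal sketch of what I mean, in Python. The hosts and the query are the ones from Scott's mail; the function name is only for illustration. It builds the distributed request with an explicit rows value, so a default of 0 in the handler cannot hide the documents:

```python
from urllib.parse import urlencode


def distributed_query_url(host, shards, query, rows=10):
    """Build a distributed /select URL that always carries an explicit rows value."""
    params = {
        "shards": ",".join(shards),
        "q": query,
        "rows": rows,  # explicit, so the handler default is not used
        "indent": "true",
    }
    # safe=":/," keeps the shard list readable instead of percent-encoding it
    return "http://%s/solr/select?%s" % (host, urlencode(params, safe=":/,"))


url = distributed_query_url(
    "localhost:8983",
    ["localhost:8983/solr", "localhost:7574/solr"],
    "marship",
)
print(url)
```

If the documents are still missing with rows set, the cause is somewhere else and the default rows value can be ruled out.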

Marco Martínez Bautista
http://www.paradigmatecnologico.com
Avenida de Europa, 26. Ática 5. 3ª Planta
28224 Pozuelo de Alarcón
Tel.: 91 352 59 42


2010/6/7 Scott Zhang 

> Thanks for replying.
>
> Here is the part of my schema.xml:
> I only have 4 fields in my document.
>
> 
>
>required="true" />
>required="true"/>
>   
>   
>
>
>
>
>   
>   
>   
>   
>   
>   
>   
>   
>
>   
>   
>   
>   
>   
>   
>
>   
>
>   
>multiValued="true"/>
>
>   
>
>
>
>  
>
>  id
>
>
> I am running 2 instances as tutorial shows: one on 8983. Another one is on
> 7574.
> When I search on 8983:
> URL:
>
> http://localhost:8983/solr/select/?q=marship&version=2.2&start=0&rows=10&indent=on
> I got:
>
> 
> -
> 
> 89
> product
> 
> -
> 
> 90
> product
> 
> ..
>
>
> when I search on 7574:
> URL:
>
> http://localhost:7574/solr/select/?q=marship&version=2.2&start=0&rows=10&indent=on
> I got:
> 
> -
> 
> 89
> product
> 
> -
> 
> 90
> product
> 
> -
> 
> 91
> product
> 
> 
>
> As they are using 2 copies of same lucene indexes. the result is same.
> Then I use
> URL:
>
> http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=marship
> I got:
> 
> -
> 
> 0
> 31
> -
> 
> true
> marship
> localhost:8983/solr,localhost:7574/solr
> 
> 
> 
> 
>
> Note the numFound is 14.
> When I try URL:
>
> http://localhost:8983/solr/select?shards=localhost:8983/solr/&indent=true&q=marship
> The numFound="7" but still nothing returned.
>
> URL:
>
> http://localhost:8983/solr/select?shards=localhost:7574/solr/&indent=true&q=marship
> return numFound="7" too. And the result has nothing.
>
> Please help.
>
> Thanks.
> Regards.
> Scott
>
>
> On Mon, Jun 7, 2010 at 3:47 PM, Marco Martinez <
> mmarti...@paradigmatecnologico.com> wrote:
>
> > Hi Scott,
> >
> > We need more information about your request, can you put the query that
> you
> > are doing to the servers.
> >
> > Marco Martínez Bautista
> > http://www.paradigmatecnologico.com
> > Avenida de Europa, 26. Ática 5. 3ª Planta
> > 28224 Pozuelo de Alarcón
> > Tel.: 91 352 59 42
> >
> >
> > 2010/6/7 Scott Zhang 
> >
> > > Hi. All.
> > >   I am trying to use solr to search over 2 lucene indexes.  I am
> > following
> > > the solr tutorial and test the distributed search example. It works.
> > >   Then I am using my own lucene indexes. Search in each solr instance
> > works
> > > and return the expected result. But when I do distributed search using
> > > "shards". It only return the "numFound"=14. But the result contain
> > nothing.
> > >Don't know why. Can Any one help? Thanks.
> > >
> >
>


Re: Distributed Search doesn't response the result set

2010-06-07 Thread Marco Martinez
Hi Scott,

We need more information about your request. Can you post the query that you
are sending to the servers?

Marco Martínez Bautista
http://www.paradigmatecnologico.com
Avenida de Europa, 26. Ática 5. 3ª Planta
28224 Pozuelo de Alarcón
Tel.: 91 352 59 42


2010/6/7 Scott Zhang 

> Hi. All.
>   I am trying to use solr to search over 2 lucene indexes.  I am following
> the solr tutorial and test the distributed search example. It works.
>   Then I am using my own lucene indexes. Search in each solr instance works
> and return the expected result. But when I do distributed search using
> "shards". It only return the "numFound"=14. But the result contain nothing.
>Don't know why. Can Any one help? Thanks.
>


Re: solr.solr.home

2010-05-27 Thread Marco Martinez
Hi,

When you start Tomcat you can specify the property as a JVM option; it will be
something like -Dsolr.solr.home=path/to/your/solr/home. For example, on
Linux: JAVA_OPTS="-Dsolr.solr.home=path/to/your/solr/home" ./startup.sh



Marco Martínez Bautista
http://www.paradigmatecnologico.com
Avenida de Europa, 26. Ática 5. 3ª Planta
28224 Pozuelo de Alarcón
Tel.: 91 352 59 42


2010/5/27 Antonello Mangone 

> But where I have to write this command ???
>
> System.setProperty("solr.solr.home",
> > "whateverpathyou'dliketosetonyourfilesystem");
> >
> > Claudio
> >
>


Re: Any realtime indexing plugin available for SOLR

2010-05-26 Thread Marco Martinez
Maybe this will help you

http://snaprojects.jira.com/wiki/display/ZOIE/Zoie+Solr+Plugin

Marco Martínez Bautista
http://www.paradigmatecnologico.com
Avenida de Europa, 26. Ática 5. 3ª Planta
28224 Pozuelo de Alarcón
Tel.: 91 352 59 42


2010/5/26 bbarani 

>
> Hi,
>
> Sorry if I am asking this question again in this forum..
>
> Is there any plugin which I can use to do a realtime indexing?
>
> I have a requirement where we have an application which sits on top of SQL
> server DB and updates happen on day to day basis. Users would like to see
> the changes made to the DB immediately in the search results. I am thinking
> of using JMS queue for achieving this, but before that I just want to check
> if anyone has implemented similar kind of requirement before?
>
> Any help / suggestions would be greatly appreciated.
>
> Thanks,
> bb
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/Any-realtime-indexing-plugin-available-for-SOLR-tp845026p845026.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>


Re: Storing RandomSortField

2010-05-19 Thread Marco Martinez
Hi Alexandre,

I am not totally sure about this, but the random sort field is only used to
sort your search results randomly, and you have to pass different values to
get different orderings. It only applies at search time, so no value is
indexed. You will find more information here:
http://lucene.apache.org/solr/api/org/apache/solr/schema/RandomSortField.html
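
A small sketch of how the "different values" are usually passed, in Python. It assumes your schema has a dynamic field such as random_* whose type is RandomSortField; that pattern and the function name are assumptions, not something from Alexandre's schema:

```python
from urllib.parse import urlencode


def random_sort_params(query, seed, prefix="random_"):
    """Return query parameters that sort by a seed-specific random field."""
    # prefix is an assumed dynamicField pattern (random_*) of type RandomSortField;
    # changing the seed changes the field name and therefore the ordering
    return {"q": query, "sort": "%s%d desc" % (prefix, seed)}


params = random_sort_params("*:*", 1234)
print(urlencode(params))
```

The ordering is derived from the field name at search time, which is why there is nothing stored to display in the results.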

Marco Martínez Bautista
http://www.paradigmatecnologico.com
Avenida de Europa, 26. Ática 5. 3ª Planta
28224 Pozuelo de Alarcón
Tel.: 91 352 59 42


2010/5/18 Alexandre Rocco 

> Hi guys,
>
> Is there any way to make a RandomSortField be stored?
> I'm trying to do it for debugging purposes,
> My intention is to take a look at the values that are stored there to
> determine the sorting that is being applied to the results.
>
> I tried to make it a stored field as:
> 
>
> And also tried to create another text field, copying the result from the
> random field like this:
> 
> 
>
> Neither of the approaches worked.
> Is there any restriction on this kind of field that prevents it from being
> displayed in the results?
>
> Thanks,
> Alexandre
>


Re: disable caches in real time

2010-05-19 Thread Marco Martinez
Hi Chris,

Thank you for your answer.

I've always understood that when you do a commit (replication does one), a new
searcher is opened, and you lose performance (queries per second) while the
caches are regenerated. I don't think I explained my situation correctly
before: with my setup I want to avoid this loss of performance in an
environment with frequent updates.

Marco Martínez Bautista
http://www.paradigmatecnologico.com
Avenida de Europa, 26. Ática 5. 3ª Planta
28224 Pozuelo de Alarcón
Tel.: 91 352 59 42


2010/5/18 Chris Hostetter 

> : I want to know if there is any approach to disable caches in a specific
> core
> : from a multicore server.
>
> only via the config.
>
> : I have a multicore server where the core0 will be listen to the queries
> and
> : other core (core1) that will be replicated from a master server. Once the
> : replication has been done, i will swap the cores. My point is that i want
> to
> : disable the caches in the core that is in charge of the replication to
> save
> : memory in the machine.
>
> that seems bizarrely complicated -- replication can work against a "live"
> core, no need to do the swap yourself, the replicationHandler takes care
> of this for you transparently (ie: you have one core, replicating from a
> master -- the old index will be searched by users, and have caches, and
> when the new version of the index is ready, the replication handler will
> swap the *index* in that core (but the core itself never changes) ... it
> can even autowarm the caches on the new index for you before the swap if
> you configure it that way.
>
> -Hoss
>
>


Re: Multifaceting on multivalued field

2010-05-18 Thread Marco Martinez
Hi,

This exception is thrown when the field does not exist in your index, but
here it happens because of an error in your query syntax: !{ex=cars}cars
should be {!ex=cars}cars, with the exclamation mark inside the braces.
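
For reference, a sketch in Python of how the tagged filter and the excluding facet go together. The field name cars is from Peter's mail; the value "volvo" and the function name are made up for the example:

```python
from urllib.parse import urlencode, parse_qs


def multiselect_facet_params(field, value, tag=None):
    """Build parameters for a facet that ignores its own tagged filter."""
    tag = tag or field
    return [
        ("q", "*:*"),
        # the filter is tagged so the facet can exclude it
        ("fq", "{!tag=%s}%s:%s" % (tag, field, value)),
        ("facet", "true"),
        # the local params go inside the braces, starting with "!"
        ("facet.field", "{!ex=%s}%s" % (tag, field)),
    ]


params = multiselect_facet_params("cars", "volvo")
query_string = urlencode(params)
print(query_string)
```

Building the parameters in code and letting the library encode them avoids typing the braces by hand in the URL.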



Marco Martínez Bautista
http://www.paradigmatecnologico.com
Avenida de Europa, 26. Ática 5. 3ª Planta
28224 Pozuelo de Alarcón
Tel.: 91 352 59 42


2010/5/18 Peter Karich 

> Hi all,
>
> I read about multifaceting [1] and tried it for myself. With
> multifaceting I would like to conserve the number of documents for the
> 'un-facetted case'. This works nice with normal fields, but I get an
> exception [2] if I apply this on a multivalued field.
> Is this a bug or logical :-) ? If the latter one is the case, would
> anybody help me to understand this?
>
> Regards,
> Peter.
>
> [1]
>
> http://www.craftyfella.com/2010/01/faceting-and-multifaceting-syntax-in.html
>
> [2]
> org.apache.solr.common.SolrException: undefined field !{ex=cars}cars
>at org.apache.solr.schema.IndexSchema.getField(IndexSchema.java:1077)
>at
> org.apache.solr.request.SimpleFacets.getTermCounts(SimpleFacets.java:226)
>at
>
> org.apache.solr.request.SimpleFacets.getFacetFieldCounts(SimpleFacets.java:283)
>at
> org.apache.solr.request.SimpleFacets.getFacetCounts(SimpleFacets.java:166)
>at
>
> org.apache.solr.handler.component.FacetComponent.process(FacetComponent.java:72)
>at
>
> org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:195)
>at
>
> org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
>at org.apache.solr.core.SolrCore.execute(SolrCore.java:1316)
>at
>
> org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:336)
>at
>
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:239)
>
>


Re: Targeting two fields with the same query or one field gathering contents from both ?

2010-05-17 Thread Marco Martinez
No, the equivalent would be:

- A: (the lazy fox) *OR* B: (the lazy fox)
- C: (the lazy fox)


Imagine the situation where B does not contain 'the lazy fox': with the AND
you get 0 results even though 'the lazy fox' is in A and C.
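
A toy illustration in plain Python (this is not Solr code; the matching function is a simplification that only checks that every query word is present in the field):

```python
def matches(text, query="the lazy fox"):
    """Return True when every word of the query appears in the text."""
    words = text.lower().split()
    return all(term in words for term in query.split())


doc = {"A": "the lazy fox sleeps", "B": "a quick brown dog"}
# C holds what a copyField from A and B would collect
doc["C"] = doc["A"] + " " + doc["B"]

and_hit = matches(doc["A"]) and matches(doc["B"])
or_hit = matches(doc["A"]) or matches(doc["B"])
c_hit = matches(doc["C"])
print(and_hit, or_hit, c_hit)  # False True True
```

The AND version misses the document, while the OR version and the query on C both find it.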

Marco Martínez Bautista
http://www.paradigmatecnologico.com
Avenida de Europa, 26. Ática 5. 3ª Planta
28224 Pozuelo de Alarcón
Tel.: 91 352 59 42


2010/5/17 Xavier Schepler 

> Hey,
>
> let's say  I have :
>
> - a field named A with specific contents
>
> - a field named B with specific contents
>
> - a field named C witch contents only from A and B added with copyField.
>
> Are those queries equivalents in terms of performance :
>
> - A: (the lazy fox) AND B: (the lazy fox)
> - C: (the lazy fox)
>
> ??
>
> Thanks,
>
> Xavier
>
>
>
>


Re: disable caches in real time

2010-05-17 Thread Marco Martinez
Any suggestions?

I have thought of having two configurations per server and reloading each core
with the appropriate config file, but I would prefer another solution if
possible.

Thanks,

Marco Martínez Bautista
http://www.paradigmatecnologico.com
Avenida de Europa, 26. Ática 5. 3ª Planta
28224 Pozuelo de Alarcón
Tel.: 91 352 59 42


2010/5/14 Marco Martinez 

> Hi,
>
> I want to know if there is any approach to disable caches in a specific
> core from a multicore server.
>
> My situation is the next:
>
> I have a multicore server where the core0 will be listen to the queries and
> other core (core1) that will be replicated from a master server. Once the
> replication has been done, i will swap the cores. My point is that i want to
> disable the caches in the core that is in charge of the replication to save
> memory in the machine.
>
> Any suggestions will be appreciated.
>
> Thanks in advance,
>
>
> Marco Martínez Bautista
> http://www.paradigmatecnologico.com
> Avenida de Europa, 26. Ática 5. 3ª Planta
> 28224 Pozuelo de Alarcón
> Tel.: 91 352 59 42
>


disable caches in real time

2010-05-14 Thread Marco Martinez
Hi,

I want to know if there is any way to disable the caches in a specific core
of a multicore server.

My situation is the following:

I have a multicore server where core0 will serve the queries and another
core (core1) will be replicated from a master server. Once the replication
has finished, I will swap the cores. My point is that I want to disable the
caches in the core that is in charge of the replication to save memory on
the machine.
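
For the swap step, a sketch in Python of the CoreAdmin request I have in mind. The base URL and the admin path are assumptions (the path depends on the adminPath configured in solr.xml), and the function name is only illustrative:

```python
from urllib.parse import urlencode


def swap_cores_url(base_url, core, other, admin_path="/admin/cores"):
    """Build the CoreAdmin SWAP request for two cores."""
    # admin_path is an assumption; it must match adminPath in solr.xml
    params = urlencode({"action": "SWAP", "core": core, "other": other})
    return "%s%s?%s" % (base_url.rstrip("/"), admin_path, params)


url = swap_cores_url("http://localhost:8983/solr", "core0", "core1")
print(url)
```

This only covers the swap; it does nothing about the caches of the core that receives the replication.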

Any suggestions will be appreciated.

Thanks in advance,


Marco Martínez Bautista
http://www.paradigmatecnologico.com
Avenida de Europa, 26. Ática 5. 3ª Planta
28224 Pozuelo de Alarcón
Tel.: 91 352 59 42


Re: Question on pf (Phrase Fields)

2010-05-13 Thread Marco Martinez
I don't know if this solution meets your requirements, but you can use
fq for the query when there is only "foo" and q when you search for more terms.
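
Roughly like this on the client side, as a Python sketch. The field name "text" and the use of dismax with qf/pf for the multi-term case are assumptions about your setup:

```python
def build_params(user_input, field="text"):
    """Choose fq for a single term and q (with pf) for several terms."""
    terms = user_input.split()
    if not terms:
        raise ValueError("empty query")
    if len(terms) == 1:
        # a filter query restricts the results but does not affect the score
        return {"q": "*:*", "defType": "lucene", "fq": "%s:%s" % (field, terms[0])}
    # with more terms the phrase field boost can take part in scoring
    return {"q": user_input, "defType": "dismax", "qf": field, "pf": field}


print(build_params("foo"))
print(build_params("foo bar"))
```

With a single term all matching documents get the same score, so you may want to add your own sort in that case.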

Marco Martínez Bautista
http://www.paradigmatecnologico.com
Avenida de Europa, 26. Ática 5. 3ª Planta
28224 Pozuelo de Alarcón
Tel.: 91 352 59 42


2010/5/13 Blargy 

>
> Is there any way to configure this so it only takes effect if you match more
> than one word?
>
> For example if I search for: "foo" it should have no effect on scoring, but
> if I search for "foo bar" then it should.
>
> Is this possible? Thanks
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/Question-on-pf-Phrase-Fields-tp815095p815095.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>


Re: multivalue fields logic required

2010-05-12 Thread Marco Martinez
You should do a preprocessing step before indexing the data with that logic:
split your document into as many documents as there are values in your
multivalue field, and set principalFlag:T on the first one.
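
A sketch of that preprocessing in Python, using the values from your example. The row_id field is a hypothetical unique key that I added so the rows do not overwrite each other; adapt it to your own schema:

```python
def expand_document(doc, multi_fields=("dept", "city")):
    """Split one multivalued document into one flat document per value."""
    count = len(doc[multi_fields[0]])
    flat = []
    for i in range(count):
        # copy the single-valued fields (id, name) into every row
        row = {k: v for k, v in doc.items() if k not in multi_fields}
        for field in multi_fields:
            row[field] = doc[field][i]
        row["row_id"] = "%s-%d" % (doc["id"], i)  # hypothetical unique key
        row["principalFlag"] = "T" if i == 0 else "F"
        flat.append(row)
    return flat


source = {
    "id": "1",
    "name": "name of emp",
    "dept": ["student1", "student2", "student3"],
    "city": ["city1", "city2", "city3"],
}
docs = expand_document(source)
for d in docs:
    print(d)
```

The number of rows comes from the length of the multivalued field, so it does not matter how many values each source document has.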

Marco Martínez Bautista
http://www.paradigmatecnologico.com
Avenida de Europa, 26. Ática 5. 3ª Planta
28224 Pozuelo de Alarcón
Tel.: 91 352 59 42


2010/5/12 Jonty Rhods 

> hi Marco,
>
> Thanks for quick reply..
> I have another doubt: In 2nd solution: How to set flag for duplicate value.
> because I am not sure about the no fo duplicate rows (it could be random
> no..)
> so how can I set the flag..
> thank
>
> On Wed, May 12, 2010 at 12:59 PM, Marco Martinez <
> mmarti...@paradigmatecnologico.com> wrote:
>
> > Hi,
> >
> > 2º solution:
> >
> > Not use multiValue fields, instead use two single fields, in your example
> > will be:
> >
> > doc1:
> > dept: student1
> > city: city1
> > principalFlag:T
> > doc2:
> > dept: student2
> > city: city2
> > principalFlag:F
> >
> > So, if you search without specify any city or dept, you should put
> > princiaplFlag:T for no get duplicate on your response. And if you specify
> a
> > city or a dept, there is no need to specify the principalFlag because you
> > will only get the result that match with your fields (you dont get
> > duplicates).
> >
> > 3º solution:
> >
> > Do a postprocessing to eleminate the fields in your response that you
> dont
> > need, i mean, get only the city and the dept that should be in the query
> > response.
> >
> > Hope this will help
> >
> >
> >
> > Marco Martínez Bautista
> > http://www.paradigmatecnologico.com
> > Avenida de Europa, 26. Ática 5. 3ª Planta
> > 28224 Pozuelo de Alarcón
> > Tel.: 91 352 59 42
> >
> >
> > 2010/5/12 Jonty Rhods 
> >
> > > Hi Marco,
> > >
> > > I am trying to patch for collapse component support (till now no
> luck)..
> > > In mean time I would like to know the 2nd and 3rd option you mentioned
> > > (logic in solrj)..
> > >
> > > with regards
> > >
> > > On Thu, May 6, 2010 at 2:36 PM, Marco Martinez <
> > > mmarti...@paradigmatecnologico.com> wrote:
> > >
> > > > Hi Jonty,
> > > >
> > > > I think you have three possible solutions:
> > > >
> > > >
> > > >   1. Use the collapse component with your name field for not have any
> > > >   duplicates documents.
> > > >   2. Create a simple logic in your index with flags, like one flag to
> > > >   determine the first element of the same document (in your example
> you
> > > > will
> > > >   have three differents documents and the fist one wiill have this
> > > > flag=true).
> > > >   If the search only have name, you will have to set this flag to
> true,
> > > if
> > > >   not, the dept or the student will be defined and you will have one
> > > > document
> > > >   returned.
> > > >   3. Do a post-processing of your data.
> > > >
> > > > Maybe you will have more solutions but these are what i have thought
> > > right
> > > > now.
> > > >
> > > > Regards,
> > > >
> > > >
> > > > Marco Martínez Bautista
> > > > http://www.paradigmatecnologico.com
> > > > Avenida de Europa, 26. Ática 5. 3ª Planta
> > > > 28224 Pozuelo de Alarcón
> > > > Tel.: 91 352 59 42
> > > >
> > > >
> > > > 2010/5/6 Jonty Rhods 
> > > >
> > > > > thanks
> > > > >
> > > > > :General solution is to index 3 different SolrDocument in your
> > example.
> > > > id
> > > > > and name fields will repeat themselves. All fields will be
> > > single-valued.
> > > > >
> > > > > if I am indexing 3 different field then if user is searching by
> name
> > +
> > > > dept
> > > > > then it will return duplicate value.. is there any other best
> > possible
> > > > > way..?
> > > > >
> > > > > thanks
> > > > > On Thu, May 6, 2010 at 1:34 PM, Ahmet Arslan 
> > > wrote:
> > > > >
> > > > > >
> > > > > > > recently I start to work on solr, So I am still very new to
> > > > > > > use solr. Sorry
> > >

Re: multivalue fields logic required

2010-05-12 Thread Marco Martinez
Hi,

2nd solution:

Do not use multiValued fields; instead use two single-valued fields. In your
example it will be:

doc1:
dept: student1
city: city1
principalFlag:T
doc2:
dept: student2
city: city2
principalFlag:F

So, if you search without specifying any city or dept, you should add
principalFlag:T so that you do not get duplicates in your response. If you do
specify a city or a dept, there is no need to specify the principalFlag,
because you will only get the results that match your fields (you don't get
duplicates).

3rd solution:

Do a postprocessing step to eliminate the fields in your response that you
don't need; I mean, keep only the city and the dept that should be in the
query response.
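
A sketch of the 3rd solution in Python. It relies on what Ahmet said, that the order of values in multivalued fields is preserved, and it assumes exact string values for city; the function name is only illustrative:

```python
def keep_matching_values(doc, matched_city):
    """Keep only the city that matched and the dept at the same position."""
    position = doc["city"].index(matched_city)
    trimmed = dict(doc)
    trimmed["city"] = [doc["city"][position]]
    trimmed["dept"] = [doc["dept"][position]]
    return trimmed


response_doc = {
    "id": "1",
    "name": "name of emp",
    "dept": ["student1", "student2", "student3"],
    "city": ["city1", "city2", "city3"],
}
result = keep_matching_values(response_doc, "city2")
print(result)
```

This runs on the client after Solr returns the document, so the index itself stays multivalued.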

Hope this will help



Marco Martínez Bautista
http://www.paradigmatecnologico.com
Avenida de Europa, 26. Ática 5. 3ª Planta
28224 Pozuelo de Alarcón
Tel.: 91 352 59 42


2010/5/12 Jonty Rhods 

> Hi Marco,
>
> I am trying to patch for collapse component support (till now no luck)..
> In mean time I would like to know the 2nd and 3rd option you mentioned
> (logic in solrj)..
>
> with regards
>
> On Thu, May 6, 2010 at 2:36 PM, Marco Martinez <
> mmarti...@paradigmatecnologico.com> wrote:
>
> > Hi Jonty,
> >
> > I think you have three possible solutions:
> >
> >
> >   1. Use the collapse component with your name field for not have any
> >   duplicates documents.
> >   2. Create a simple logic in your index with flags, like one flag to
> >   determine the first element of the same document (in your example you
> > will
> >   have three differents documents and the fist one wiill have this
> > flag=true).
> >   If the search only have name, you will have to set this flag to true,
> if
> >   not, the dept or the student will be defined and you will have one
> > document
> >   returned.
> >   3. Do a post-processing of your data.
> >
> > Maybe you will have more solutions but these are what i have thought
> right
> > now.
> >
> > Regards,
> >
> >
> > Marco Martínez Bautista
> > http://www.paradigmatecnologico.com
> > Avenida de Europa, 26. Ática 5. 3ª Planta
> > 28224 Pozuelo de Alarcón
> > Tel.: 91 352 59 42
> >
> >
> > 2010/5/6 Jonty Rhods 
> >
> > > thanks
> > >
> > > :General solution is to index 3 different SolrDocument in your example.
> > id
> > > and name fields will repeat themselves. All fields will be
> single-valued.
> > >
> > > if I am indexing 3 different field then if user is searching by name +
> > dept
> > > then it will return duplicate value.. is there any other best possible
> > > way..?
> > >
> > > thanks
> > > On Thu, May 6, 2010 at 1:34 PM, Ahmet Arslan 
> wrote:
> > >
> > > >
> > > > > recently I start to work on solr, So I am still very new to
> > > > > use solr. Sorry
> > > > > if I am logically wrong.
> > > > > I have two table, parent and referenced (child).
> > > > >
> > > > > for that I set multivalue field following is my schema
> > > > > details
> > > > >   > > > > stored="true" required="true"
> > > > > />
> > > > >
> > > > >
> > > > > > > > > indexed="true" stored="true"/>
> > > > >
> > > > > > > > > indexed="true" stored="true"
> > > > > multiValued="true"/>
> > > > > > > > > indexed="true" stored="true"
> > > > > multiValued="true"/>
> > > > >
> > > > > indexed data details:
> > > > >
> > > > > 
> > > > >
> > > > >   
> > > > > student1
> > > > > student2
> > > > > student3
> > > > >   
> > > > >
> > > > >   
> > > > > city1
> > > > > city2
> > > > > city3
> > > > >   
> > > > >  1
> > > > >
> > > > >  
> > > > >name of emp
> > > > >   
> > > > >
> > > > > 
> > > > >
> > > > > now my question is :
> > > > > When user is searching by city2 then I want to return
> > > > > employee2 and their id
> > > > > (for multi value field).
> > > > > something like:
> > > > >
> > > > > 
> > > > >
> > > > >   
> > > > >
> > > > > student2
> > > > >
> > > > >   
> > > > >
> > > > >   
> > > > >
> > > > > city2
> > > > >
> > > > >   
> > > > >  1
> > > > >
> > > > >  
> > > > >name of emp
> > > > >   
> > > > >
> > > > > 
> > > > >
> > > >
> > > > I had a similar need before. AFAIK you cannot do it with multivalued
> > > > fields. The indexing order is preserved in multivalued field. May be
> > you
> > > can
> > > > post-process returned fields and capture correct position of matched
> > city
> > > > field, and use this index to display correct dept value. But this is
> > easy
> > > if
> > > > you are using string or integer type for city and dept.
> > > >
> > > > General solution is to index 3 different SolrDocument in your
> example.
> > id
> > > > and name fields will repeat themselves. All fields will be
> > single-valued.
> > > >
> > > >
> > > >
> > > >
> > > >
> > >
> >
>


Re: JTeam Spatial Plugin

2010-05-12 Thread Marco Martinez
Hi,


You can use localsolr (http://www.gissearch.com/localsolr), which supports
sharding, if you need this feature.



Marco Martínez Bautista
http://www.paradigmatecnologico.com
Avenida de Europa, 26. Ática 5. 3ª Planta
28224 Pozuelo de Alarcón
Tel.: 91 352 59 42


2010/5/11 Jean-Sebastien Vachon 

> Hi,
>
> Thanks for your suggestion but I received more information about this issue
> from one of the JTeam's developer and he told me that
> my problem was caused by the plugin not supporting sharding at this time.
>
> In my case, I noticed that individual shards were computing the distance
> through the geo_distance field.
> However, the "master" Solr instance controlling the shards was kind of
> losing this information from the lack of support for shards.
>
> For now there is no quick work around that I know of.
>
> Later,
>
> On 2010-05-11, at 2:54 PM, Michael wrote:
>
> > Try using "geo_distance" in the return fields.
> >
> > On Thu, Apr 29, 2010 at 9:26 AM, Jean-Sebastien Vachon
> >  wrote:
> >> Hi All,
> >>
> >> I am using JTeam's Spatial Plugin RC3 to perform spatial searches on my
> index and it works great. However, I can't seem to get it to return the
> computed distances.
> >>
> >> My query component is run before the geoDistanceComponent and the
> distanceField is set to "distance"
> >> Fields for lat/long are defined as well and the different tiers field
> are in the results. Increasing the radius cause the number of matches to
> increase so I guess that my setup is working...
> >>
> >> Here is sample query and its output (I removed some of the fields to
> keep it short):
> >>
> >>
> /select?passkey=sample&q={!spatial%20lat=40.27%20long=-76.29%20radius=22%20calc=arc}title:engineer&wt=json&indent=on&fl=*,distance
> >>
> >> 
> >>
> >> {
> >>  "responseHeader":{
> >>  "status":0,
> >>  "QTime":69,
> >>  "params":{
> >>"fl":"*,distance",
> >>"indent":"on",
> >>"q":"{!spatial lat=40.27 long=-76.29 radius=22
> calc=arc}title:engineer",
> >>"wt":"json"}},
> >>  "response":{"numFound":223,"start":0,"docs":[
> >>{
> >>
> >> "title":"Electrical Engineer",
> >>"long":-76.3054962158203,
> >> "lat":40.037899017334,
> >> "_tier_9":-3.004,
> >> "_tier_10":-6.0008,
> >> "_tier_11":-12.0016,
> >> "_tier_12":-24.0031,
> >> "_tier_13":-47.0061,
> >> "_tier_14":-93.00122,
> >> "_tier_15":-186.00243,
> >> "_tier_16":-372.00485},
> >> }}
> >>
> >> This output suggests to me that everything is in place. Anyone knows how
> to fetch the computed distance? I tried adding the field 'distance' to my
> list of fields but it didn't work
> >>
> >> Thanks
> >>
>
>


Re: hi to everyone

2010-05-06 Thread Marco Martinez
See this page
http://wiki.apache.org/solr/UpdateXmlMessages#Updating_a_Data_Record_via_curl
and the Solr tutorial
http://lucene.apache.org/solr/tutorial.html (maybe you can use the
post.jar).

Marco Martínez Bautista
http://www.paradigmatecnologico.com
Avenida de Europa, 26. Ática 5. 3ª Planta
28224 Pozuelo de Alarcón
Tel.: 91 352 59 42


2010/5/6 Antonello Mangone 

> Ok, you're right :D
>
> I explain my situation ...
>
> I have solr locally on my machine
>
> */home/antonello/solrtest*
>
> inside the folder solrtest I have:
>
> |_ build
> |_ build.xml
> |_ CHANGES.txt
> |_ client
> |_ common-build.xml
> |_ contrib
> |_ dist
> |_ docs
> |_ etc
> |_ lib
> |_ LICENSE.txt
> |_ logs
> |_ multicore
>|_ bandb
>|_ conf
>|_ schema.xml
>|_ solrconfig.xml
>|_ data
>|_ index
>|_ segments_1
>|_ segments.gen
>|_ solr.xml
> |_ NOTICE.txt
> |_ README.txt
> |_ src
> |_ start.jar
> |_ start_multicore.sh
> |_ webapps
>
>
> I have also xml files in another place and I would like to add these xml
> files to the bandb core.
> Is there a command to add an xml file to a particular core, imagining we
> can
> have an indefinite number of cores ?
>
>
>
>
>
> 2010/5/6 Marco Martinez 
>
> > You should specify the core in your request, like
> > http://localhost:8080/solr/*core0*/update?...  where /solr/ is your
> > webapp and 'core0' is the name of the core.
> >
> > Marco Martínez Bautista
> > http://www.paradigmatecnologico.com
> > Avenida de Europa, 26. Ática 5. 3ª Planta
> > 28224 Pozuelo de Alarcón
> > Tel.: 91 352 59 42
> >
> >
> > 2010/5/6 Antonello Mangone 
> >
> > > Hi to everyone, my name is Antonello Mangone and I'm a new user of Solr
> > > (this is the 4th day :D).
> > > I'm just a novice and i would like to make a question ...
> > >
> > > I'm using solr in multicore way but i don't understad how to add xml
> > > documents to a particular core ...
> > > Can someone help me ???
> > >
> > > Antonello
> > >
> >
>


Re: hi to everyone

2010-05-06 Thread Marco Martinez
You should specify the core in your request, like
http://localhost:8080/solr/core0/update?... where /solr/ is your
webapp and 'core0' is the name of the core.
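
For example, a minimal sketch in Python that prepares the request for one core. The sample document and the function name are made up, and commit=true is added so the document becomes searchable right away; the request is only built here, not sent:

```python
import urllib.request


def build_update_request(xml_bytes, core, base_url="http://localhost:8080/solr"):
    """Prepare a POST of an XML update message to one specific core."""
    url = "%s/%s/update?commit=true" % (base_url.rstrip("/"), core)
    return urllib.request.Request(
        url,
        data=xml_bytes,
        headers={"Content-Type": "text/xml; charset=utf-8"},
        method="POST",
    )


xml = b'<add><doc><field name="id">1</field></doc></add>'
req = build_update_request(xml, "core0")
print(req.get_method(), req.full_url)
# To actually send it: urllib.request.urlopen(req)
```

Only the core name in the path changes from one core to another, so the same function works for any number of cores.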

Marco Martínez Bautista
http://www.paradigmatecnologico.com
Avenida de Europa, 26. Ática 5. 3ª Planta
28224 Pozuelo de Alarcón
Tel.: 91 352 59 42


2010/5/6 Antonello Mangone 

> Hi to everyone, my name is Antonello Mangone and I'm a new user of Solr
> (this is the 4th day :D).
> I'm just a novice and I would like to ask a question ...
>
> I'm using solr in multicore way but I don't understand how to add xml
> documents to a particular core ...
> Can someone help me ???
>
> Antonello
>


Re: multivalue fields logic required

2010-05-06 Thread Marco Martinez
Hi Jonty,

I think you have three possible solutions:


   1. Use the collapse component on your name field so that you do not get
   any duplicate documents.
   2. Create a simple logic in your index with flags, like one flag to
   mark the first element of the same document (in your example you will
   have three different documents and the first one will have this flag=true).
   If the search only has a name, you will have to set this flag to true; if
   not, the dept or the student will be defined and you will have one document
   returned.
   3. Do a post-processing of your data.

Maybe there are more solutions, but these are the ones I have thought of right
now.
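
To make option 2 more concrete, a sketch in Python of the query side. The flag field name principalFlag and the function name are only suggestions; the other field names are from your example:

```python
def build_filters(name=None, dept=None, city=None):
    """Build the query clauses, adding the flag only for name-only searches."""
    filters = []
    if name:
        filters.append('name:"%s"' % name)
    if dept:
        filters.append('dept:"%s"' % dept)
    if city:
        filters.append('city:"%s"' % city)
    if not dept and not city:
        # without dept or city every row of the same employee would match,
        # so keep only the row marked as the first one
        filters.append("principalFlag:T")
    return filters


print(" AND ".join(build_filters(name="name of emp")))
print(" AND ".join(build_filters(name="name of emp", city="city2")))
```

When dept or city is given, only one row per employee matches anyway, so the flag is left out.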

Regards,


Marco Martínez Bautista
http://www.paradigmatecnologico.com
Avenida de Europa, 26. Ática 5. 3ª Planta
28224 Pozuelo de Alarcón
Tel.: 91 352 59 42


2010/5/6 Jonty Rhods 

> thanks
>
> :General solution is to index 3 different SolrDocument in your example. id
> and name fields will repeat themselves. All fields will be single-valued.
>
> if I am indexing 3 different field then if user is searching by name + dept
> then it will return duplicate value.. is there any other best possible
> way..?
>
> thanks
> On Thu, May 6, 2010 at 1:34 PM, Ahmet Arslan  wrote:
>
> >
> > > recently I start to work on solr, So I am still very new to
> > > use solr. Sorry
> > > if I am logically wrong.
> > > I have two table, parent and referenced (child).
> > >
> > > for that I set multivalue field following is my schema
> > > details
> > >   > > stored="true" required="true"
> > > />
> > >
> > >
> > > > > indexed="true" stored="true"/>
> > >
> > > > > indexed="true" stored="true"
> > > multiValued="true"/>
> > > > > indexed="true" stored="true"
> > > multiValued="true"/>
> > >
> > > indexed data details:
> > >
> > > 
> > >
> > >   
> > > student1
> > > student2
> > > student3
> > >   
> > >
> > >   
> > > city1
> > > city2
> > > city3
> > >   
> > >  1
> > >
> > >  
> > >name of emp
> > >   
> > >
> > > 
> > >
> > > now my question is :
> > > When user is searching by city2 then I want to return
> > > employee2 and their id
> > > (for multi value field).
> > > something like:
> > >
> > > 
> > >
> > >   
> > >
> > > student2
> > >
> > >   
> > >
> > >   
> > >
> > > city2
> > >
> > >   
> > >  1
> > >
> > >  
> > >name of emp
> > >   
> > >
> > > 
> > >
> >
> > I had a similar need before. AFAIK you cannot do it with multivalued
> > fields. The indexing order is preserved in multivalued field. May be you
> can
> > post-process returned fields and capture correct position of matched city
> > field, and use this index to display correct dept value. But this is easy
> if
> > you are using string or integer type for city and dept.
> >
> > General solution is to index 3 different SolrDocument in your example. id
> > and name fields will repeat themselves. All fields will be single-valued.
> >
> >
> >
> >
> >
>


Re: synonym filter problem for string or phrase

2010-05-03 Thread Marco Martinez
Hi Ranveer,

I don't see any stemming filter in the configuration of the field
'text_sync'. Also, you have the synonym filter at query time and not at index
time; maybe that is your problem.


Regards,


Marco Martínez Bautista
http://www.paradigmatecnologico.com
Avenida de Europa, 26. Ática 5. 3ª Planta
28224 Pozuelo de Alarcón
Tel.: 91 352 59 42


2010/4/30 Jonty Rhods 

> On 4/29/10 8:50 PM, Marco Martinez wrote:
>
> Hi Ranveer,
>
> If you don't specify a field type in the q parameter, the search will be
> done searching in your default search field defined in the solrconfig.xml,
> its your default field a text_sync field?
>
> Regards,
>
> Marco Martínez Bautista
> http://www.paradigmatecnologico.com
> Avenida de Europa, 26. Ática 5. 3ª Planta
> 28224 Pozuelo de Alarcón
> Tel.: 91 352 59 42
>
>
> 2010/4/29 Ranveer 
>
>
>
> Hi,
>
> I am trying to configure synonym filter.
> my requirement is:
> when user searching by phrase like "what is solr user?" then it should be
> replace with "solr user".
> something like : what is solr user? =>  solr user
>
> My schema for particular field is:
>
>  positionIncrementGap="100">
> 
> 
> 
>
> 
> 
> 
> 
> 
>  ignoreCase="true" expand="true"
> tokenizerFactory="KeywordTokenizerFactory"/>
>
> 
> 
>
> it seems working fine while trying by analysis.jsp but not by url
> http://localhost:8080/solr/core0/select?q="what is solr user?"
> or
> http://localhost:8080/solr/core0/select?q=what is solr user?
>
> Please guide me to achieve the desired result.
>
>
>
>
>
>
> Hi Marco,
> thanks.
> yes my default search field is text_sync.
> I am getting result now but not as I expect.
> following is my synonym.txt
>
> what is bone cancer=>bone cancer
> what is bone cancer?=>bone cancer
> what is of bone cancer=>bone cancer
> what is symptom of bone cancer=>bone cancer
> what is symptoms of bone cancer=>bone cancer
>
> With the above I get results for every synonym except the last one, "what is
> symptoms of bone cancer=>bone cancer".
> I think stemming is why I am not getting the expected result. However, when I
> check the same input in analysis.jsp, it gives the expected result. I am
> confused.
> I would also like to know the best approach to configure synonyms for my
> requirement.
>
> thanks
> with regards
>
> Hi,
>
> I am also facing the same type of problem.
> I am a newbie, please help.
>
> thanks
> Jonty
>


Re: synonym filter problem for string or phrase

2010-04-29 Thread Marco Martinez
Hi Ranveer,

If you don't specify a field in the q parameter, the search is run
against the default search field defined in the solrconfig.xml. Is your
default field a text_sync field?
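
To rule out the default field, the query can name the field explicitly and be URL-encoded before it is sent. A minimal sketch in Python, assuming the select URL from your mail; the field name `question` is a placeholder for whatever field uses the text_sync type in your schema:

```python
from urllib.parse import urlencode

# Select handler URL taken from the original post.
BASE_URL = "http://localhost:8080/solr/core0/select"


def select_url(query, field=None):
    """Build a select URL; if `field` is given, search that field
    explicitly instead of relying on the default search field."""
    q = f"{field}:{query}" if field else query
    # urlencode escapes the quotes, spaces and '?' inside the query.
    return BASE_URL + "?" + urlencode({"q": q})


# Phrase query against the default search field.
print(select_url('"what is solr user?"'))

# Same phrase against a named field ("question" is a hypothetical name).
print(select_url('"what is solr user?"', field="question"))
```

Encoding matters here because the raw URLs contain spaces, double quotes and a literal "?", which a browser or HTTP client may otherwise handle in different ways.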

Regards,

Marco Martínez Bautista


2010/4/29 Ranveer 

> Hi,
>
> I am trying to configure synonym filter.
> my requirement is:
> when a user searches by a phrase like "what is solr user?", it should be
> replaced with "solr user".
> something like : what is solr user? => solr user
>
> My schema for particular field is:
>
>  positionIncrementGap="100">
> 
> 
> 
>
> 
> 
> 
> 
> 
>  ignoreCase="true" expand="true" tokenizerFactory="KeywordTokenizerFactory"/>
> 
> 
>
> It seems to work fine when I try it in analysis.jsp, but not through the URL
> http://localhost:8080/solr/core0/select?q="what is solr user?"
> or
> http://localhost:8080/solr/core0/select?q=what is solr user?
>
> Please guide me to achieve the desired result.
>
>


Re: Facet count problem

2010-04-19 Thread Marco Martinez
Hi Ranveer,

The error in the facet counts is caused by the tokenized field that you are
using. If you want to facet on the whole string, use a fieldType that doesn't
split the field value into tokens, such as the string field.
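
To make the difference concrete, here is a small Python simulation (not Solr code). It assumes a very simplified analyzer that only lowercases and splits on whitespace; a real text field also applies stop words, word delimiters and stemming, so the actual tokens can differ. The sample values are made up.

```python
from collections import Counter


def facet_counts(values, tokenized):
    """Count facet values over a list of field values, one per document.

    tokenized=True mimics faceting on a text field: every distinct token
    of a value becomes its own facet entry.
    tokenized=False mimics a string field: the whole value is one entry.
    """
    counts = Counter()
    for value in values:
        if tokenized:
            # Each distinct token counts once per document.
            counts.update(set(value.lower().split()))
        else:
            counts[value] += 1
    return counts


# Hypothetical field values, one per document.
values = ["Bone Cancer", "Lung Cancer", "Bone Cancer"]

print(facet_counts(values, tokenized=True))   # facets on single tokens
print(facet_counts(values, tokenized=False))  # facets on whole strings
```

With the tokenized variant the facet list contains "cancer", "bone" and "lung" instead of the two original strings, which is the kind of unexpected count described in this thread.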

Regards,

Marco Martínez Bautista


2010/4/19 Ranveer Kumar 

> Hi Erick,
>
> My schema configuration is following.
>
>
>  
>  
>
>
>
>
>
>ignoreCase="true"
>words="stopwords.txt"
>enablePositionIncrements="true"
>/>
> generateWordParts="1" generateNumberParts="1" catenateWords="1"
> catenateNumbers="1" catenateAll="0" splitOnCaseChange="1"/>
>
> protected="protwords.txt"/>
>  
>  
>  
>  
>
>
>
>   
> ignoreCase="true" expand="true"/>
>ignoreCase="true"
>words="stopwords.txt"
>enablePositionIncrements="true"
>/>
> generateWordParts="1" generateNumberParts="1" catenateWords="0"
> catenateNumbers="0" catenateAll="0" splitOnCaseChange="1"/>
>
> protected="protwords.txt"/>
>  
>
>
>
> 
>
> 
>  
>
>
>
>
>
> On Mon, Apr 19, 2010 at 6:22 AM, Erick Erickson wrote:
>
> > Can we see the actual field definitions from your schema file?
> > Ahmet's question is vital and is best answered if you'll
> > copy/paste the relevant configuration entries. But based
> > on what you *have* posted, I'd guess you're trying to
> > facet on tokenized fields, which is not recommended.
> >
> > You might take a look at:
> > http://wiki.apache.org/solr/UsingMailingLists, it'll help you
> > frame your questions in a manner that gets you your
> > answers as fast as possible.
> >
> > Best
> > Erick
> >
> > On Sun, Apr 18, 2010 at 12:59 PM, Ranveer Kumar wrote:
> >
> > > I am using text for type, which is static. For example: type is a field
> > > and I am using type for categorization. For news I am using news as the
> > > type and for blog I am using blog. type is a text field.
> > >
> > > On Apr 17, 2010 8:38 PM, "Ahmet Arslan"  wrote:
> > >
> > > > > I am facing problem to get facet result count. I must be wrong
> > > > > somewhere. I am getting proper ...
> > > > Are you faceting on a tokenized field? What is the fieldType of your
> > > > field?
> > >
> >
>


Re: Replication process on Master/Slave slowing down slave read/search performance

2010-04-09 Thread Marco Martinez
Hi Marcin,

This happens because, after each replication, all the caches are rebuilt since
the index has changed, so search performance drops. You can change your
architecture to a multicore one to reduce the impact of replication: use two
cores, one that receives the replication and another that serves searches.
When the replication is done, swap the cores, so the core serving searches
always has its caches up to date.
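
The swap itself can be triggered through the CoreAdmin handler with action=SWAP. A minimal sketch in Python, assuming the cores are named `search` and `replica` (hypothetical names), that Solr runs at http://localhost:8080/solr, and that the admin path is /admin/cores (it depends on the adminPath setting in solr.xml):

```python
from urllib.parse import urlencode
from urllib.request import urlopen


def swap_url(solr_base, live_core, staging_core, admin_path="/admin/cores"):
    """Build the CoreAdmin SWAP URL that exchanges the two named cores."""
    params = urlencode({"action": "SWAP", "core": live_core, "other": staging_core})
    return f"{solr_base.rstrip('/')}{admin_path}?{params}"


def swap_cores(solr_base, live_core, staging_core):
    """Send the SWAP request; returns True on an HTTP 200 response."""
    with urlopen(swap_url(solr_base, live_core, staging_core)) as response:
        return response.status == 200


if __name__ == "__main__":
    # Only prints the URL; call swap_cores() once replication has finished.
    print(swap_url("http://localhost:8080/solr", "search", "replica"))
```

How the script learns that replication has finished is left out here; that part depends on your setup.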

Regards


Marco Martínez Bautista


2010/4/9 Marcin 

> Hi guys,
>
> I have noticed that the Master/Slave replication process slows down slave
> read/search performance while replication is in progress.
>
>
> please help
> cheers
>


Re: Solr query parser doesn't invoke analyzer for simple term query?

2010-03-17 Thread Marco Martinez
Hello,

You can see what happens with this search (which analyzers are used for this
field and what their output is) on the analysis page of the default Solr web
page. I assume you are using the same analyzers and tokenizers for indexing
and searching on this field in your schema.

Regards,


Marco Martínez Bautista



2010/3/17 Teruhiko Kurosaka 

> It seems that Solr's query parser doesn't pass a single term query
> to the Analyzer for the field. For example, if I give it
> 2001年 (year 2001 in Japanese), the searcher returns 0 hits
> but if I quote them with double-quotes, it returns hits.
> In this experiment, I configured schema.xml so that
> the field in question will use the morphological Analyzer
> my company makes that is capable of splitting 2001年
> into two tokens 2001 and 年.  I am guessing that this
> Analyzer is called ONLY IF the term is a phrase.
> Is my observation correct?
>
> If so, is there any configuration parameter that I can tweak
> to force any query for the text fields be processed by
> the Analyzer?
>
> One might ask why users won't put space between 2001 and 年.
> Well if they are clearly two separate words, people do that.
> But 年 works more like a suffix in this case, and in many
> Japanese speakers' minds, 2001年 seems like one token, so
> many people won't.  (Remember that Japanese doesn't use spaces
> in normal writing.)  Forcing the use of the Analyzer would also
> be useful for compound word handling, which is often desirable
> for languages like German.
>
> 
> Teruhiko "Kuro" Kurosaka
> RLP + Lucene & Solr = powerful search for global contents
>
>