Re: Regarding Solr Faceting on the query response.

2014-01-31 Thread Jérôme Étévé
On 30 January 2014 17:35, Kuchekar  wrote:
> Hi Mikhail,
>
>  I would like my faceting to run only on the result set returned
> (that is, only on the numFound documents), rather than on the whole index.

As far as I know, unless you define filter tagging and exclusion, this
is the default facet behaviour.

Are you sure no such things are defined?

Can you send your full solr query?
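
For reference, a minimal sketch of the two cases in Python. The helper name
build_select_params and the tag name "co" are made up for illustration; the
parameters themselves (facet, facet.field, facet.mincount, and the {!tag=...}
and {!ex=...} local params) are standard Solr ones. Without tagging, facet
counts cover only the documents matching q and fq. With a tagged filter that
the facet excludes, counts ignore that filter, which could explain seeing
other companies in the facet.

```python
from urllib.parse import urlencode


def build_select_params(q, facet_field, tagged_filter=None):
    """Build Solr /select parameters for a single-field facet request.

    tagged_filter is an optional (tag, filter_query) pair. When given, the
    filter is tagged and the facet excludes it, so facet counts are computed
    as if that filter were absent.
    """
    params = [("q", q), ("facet", "true"), ("facet.mincount", "1")]
    if tagged_filter is None:
        # Default behaviour: facet only over the documents matching q.
        params.append(("facet.field", facet_field))
    else:
        tag, filter_query = tagged_filter
        params.append(("fq", "{!tag=%s}%s" % (tag, filter_query)))
        params.append(("facet.field", "{!ex=%s}%s" % (tag, facet_field)))
    return params


# Plain request: the company facet should list only companies in the results.
plain = build_select_params("company:Apple AND technologies:java", "company")

# Tagged request: the company filter is excluded when counting the facet.
tagged = build_select_params("technologies:java", "company",
                             ("co", "company:Apple"))

print(urlencode(plain))
print(urlencode(tagged))
```

If the query actually being sent looks like the second form, the extra
company values are expected.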

J.


> In the example, even when I specify the query 'company:Apple' .. it gives
> me faceted results for other companies. This means that it is querying
> against the whole index, rather than just the result set.
>
> Using facet.mincount=1 will give me facet values whose count is at least
> 1, but retrieving all the distinct values (Apple, Bose, Chevron, ..Oracle..)
> of the facet field (company) will again query the whole index.
>
> What I would like to do is ... facet only on the resultset.
>
> i.e. my query (q= company:Apple AND technologies:java ) should return, only
> the facet details about 'Apple' since that is only present in the results
> set. But it provides me the list of other Company Names ... which makes me
> believe that it is querying the whole index to get the distinct value for
> the company..
>
> "docs": [
>   { "id": "ABC123", "company": [ "APPLE" ] },
>   { "id": "ABC1234", "company": [ "APPLE" ] },
>   { "id": "ABC1235", "company": [ "APPLE" ] },
>   { "id": "ABC1236", "company": [ "APPLE" ] } ] },
> "facet_counts": {
>   "facet_queries": { "p_company:ucsf\n": 1 },
>   "facet_fields": { "company": [ "APPLE", 4, ] },
>   "facet_dates": {},
>   "facet_ranges": {} }
>
>
>  Thanks.
> Kuchekar, Nilesh
>
>
> On Thu, Jan 30, 2014 at 2:13 AM, Mikhail Khludnev <
> mkhlud...@griddynamics.com> wrote:
>
>> Hello
>> Do you mean setting
>> http://wiki.apache.org/solr/SimpleFacetParameters#facet.mincount to 1, or
>> do you want to facet only on the returned page (rows) instead of the full
>> result set (numFound)?
>>
>>
>> On Thu, Jan 30, 2014 at 6:24 AM, Nilesh Kuchekar
>> wrote:
>>
>> > Yeah it's a typo... I meant company:Apple
>> >
>> > Thanks
>> > Nilesh
>> >
>> > > On Jan 29, 2014, at 8:59 PM, Alexandre Rafalovitch wrote:
>> > >
>> > >> On Thu, Jan 30, 2014 at 3:43 AM, Kuchekar wrote:
>> > >> company=Apple
>> > > Did you mean company:Apple ?
>> > >
>> > > Otherwise, that could be the issue.
>> > >
>> > > Regards,
>> > >   Alex.
>> > >
>> > >
>> > > Personal website: http://www.outerthoughts.com/
>> > > LinkedIn: http://www.linkedin.com/in/alexandrerafalovitch
>> > > - Time is the quality of nature that keeps events from happening all
>> > > at once. Lately, it doesn't seem to be working.  (Anonymous  - via GTD
>> > > book)
>> >
>>
>>
>>
>> --
>> Sincerely yours
>> Mikhail Khludnev
>> Principal Engineer,
>> Grid Dynamics
>>



-- 
Jerome Eteve
+44(0)7738864546
http://www.eteve.net/


Re: facet.maxcount ?

2013-07-23 Thread Jérôme Étévé
Thanks!


On 23 July 2013 12:19, Markus Jelsma  wrote:
> Eeh, here's the other one: https://issues.apache.org/jira/browse/SOLR-1712
>
>
> -Original message-
>> From:Markus Jelsma 
>> Sent: Tuesday 23rd July 2013 13:18
>> To: solr-user@lucene.apache.org
>> Subject: RE: facet.maxcount ?
>>
>> Hi - No but there are two unresolved issues about this topic:
>> https://issues.apache.org/jira/browse/SOLR-4411
>> https://issues.apache.org/jira/browse/SOLR-4411
>>
>> Cheers
>>
>> -Original message-
>> > From:Jérôme Étévé 
>> > Sent: Tuesday 23rd July 2013 12:58
>> > To: solr-user@lucene.apache.org
>> > Subject: facet.maxcount ?
>> >
>> > Hi all happy Solr users!
>> >
>> > I was wondering if it's possible to have some sort of facet.maxcount 
>> > equivalent?
>> >
>> > In short, that would exclude from the facet any term (or query) that
>> > matches at least facet.maxcount times.
>> >
>> > That facet.maxcount would probably significantly improve the
>> > performance of requests of the type:
>> >
>> > I want the facet values with zero result.
>> >
>> > Is it a mad idea or does it make some sort of sense?
>> >
>> > Cheers,
>> >
>> > Jerome.
>> >
>> > --
>> > Jerome Eteve
>> > +44(0)7738864546
>> > http://www.eteve.net/
>> >
>>



-- 
Jerome Eteve
+44(0)7738864546
http://www.eteve.net/


facet.maxcount ?

2013-07-23 Thread Jérôme Étévé
Hi all happy Solr users!

I was wondering if it's possible to have some sort of facet.maxcount equivalent?

In short, that would exclude from the facet any term (or query) that
matches at least facet.maxcount times.

That facet.maxcount would probably significantly improve the
performance of requests of the type:

I want the facet values with zero result.

Is it a mad idea or does it make some sort of sense?
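
To make the intent concrete, here is a client-side approximation in Python.
It is only a sketch and does nothing for performance, since Solr still
computes every count. It assumes the request was sent with facet.mincount=0
and facet.limit=-1, and that the response uses Solr's default flat JSON
layout for facet_fields (value, count, value, count, ...). The function name
and the sample values are hypothetical.

```python
def facet_values_below(flat_counts, maxcount):
    """Keep facet values whose count is strictly below maxcount.

    flat_counts is the flat [value, count, value, count, ...] list found
    under facet_counts.facet_fields.<field> in a Solr JSON response.
    """
    values = flat_counts[0::2]   # even positions hold the facet values
    counts = flat_counts[1::2]   # odd positions hold their counts
    return [value for value, count in zip(values, counts) if count < maxcount]


# Illustrative data only, not taken from a real index.
company_facet = ["APPLE", 4, "BOSE", 0, "ORACLE", 0]

# maxcount=1 keeps exactly the facet values with zero results.
zero_hits = facet_values_below(company_facet, 1)
print(zero_hits)
```

A server-side facet.maxcount would let Solr skip returning the excluded
terms instead of filtering them afterwards like this.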

Cheers,

Jerome.

-- 
Jerome Eteve
+44(0)7738864546
http://www.eteve.net/


Re: Querying only for "+" character causes org.apache.lucene.queryParser.ParseException

2013-04-23 Thread Jérôme Étévé
If you want to allow your users to search for '+', you can also define
'+' as a regular ALPHA character:

In config:

delimiter_types.txt:

#
# We let +, #, *, & and @ be part of normal words.
# This is to treat c++, c#, c* and R&D as words.
#
+ => ALPHA
 # => ALPHA
* => ALPHA
& => ALPHA
@ => ALPHA

Then in your solr.WordDelimiterFilterFactory,
use types="delimiter_types.txt"


You'll then be able to let your users search for + as part of a word.

If you want to allow them to search for just '+', a little hacking is
necessary in your client code. Personally, I just double-quote the query
if it's only one character long. It can't do any harm, and since it turns your
single + into "+", the parser will treat it as a token (rather than as
part of the query syntax).

Provided you're using the edismax parser, it should be just fine for any
other queries, like '+ foo' , 'foo +', '++' ...
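
A minimal sketch of that client-side quoting, in Python. The function name
is hypothetical, and escaping a lone double quote or backslash is an extra
assumption on top of the rule described above.

```python
def prepare_user_query(raw):
    """Wrap a one-character query in double quotes before sending it to Solr.

    Longer queries are returned unchanged (apart from trimming whitespace),
    on the assumption that the edismax parser copes with them.
    """
    query = raw.strip()
    if len(query) != 1:
        return query
    if query in ('"', "\\"):
        # Assumption: a lone quote or backslash must be escaped so that the
        # resulting phrase stays well formed.
        return '"\\' + query + '"'
    return '"' + query + '"'


print(prepare_user_query("+"))
print(prepare_user_query("c++"))
```

The result still has to be URL-encoded before it goes into the q parameter.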


J.


On 23 April 2013 15:09, Jorge Luis Betancourt Gonzalez
wrote:

> Hi Kai:
>
> Thanks for your reply. From what I've understood, this logic must be
> included in my application. Would it be possible, for instance, to use some
> regular expression at query time in my schema to reject a query that
> contains only these characters? For instance, + and + would be good
> cases to catch.
>
> Thanks in advance!
>
> - Original message -
> From: "Kai Becker" 
> To: solr-user@lucene.apache.org
> Sent: Tuesday, 23 April 2013 9:48:26
> Subject: Re: Querying only for "+" character causes
> org.apache.lucene.queryParser.ParseException
>
> Hi,
>
> you need to escape that char in search terms.
> Special chars are + - ! ( ) { } [ ] ^ " ~ * ? : \ / at the moment.
>
> The %2B is just the url encoding, but it will still be a + for Solr, so
> just put a \ in front of the chars I mentioned.
>
> Cheers,
> Kai
>
> On 23.04.2013 at 15:41, Jorge Luis Betancourt Gonzalez wrote:
>
> > Hi!
> >
> > Currently I'm working on a basic search engine. During some tests a
> > problem was detected: if a user searches for only the "+" or "-" term, or
> > the "+" string, it causes an exception in my application. The problem is
> > caused by an org.apache.lucene.queryParser.ParseException in Solr. I get
> > the same response if I search for the + term from the Solr admin
> > interface. From what I've seen, the "+" character gets encoded into "%2B",
> > which causes the exception. Is there any way of escaping these characters
> > so they behave like any other character, or at least to get no response
> > in these cases?
> >
> > I'm using solr 3.6.2, deployed in tomcat7.
> >
> > Greetings!
> > http://www.uci.cu
>



-- 
Jerome Eteve
+44(0)7738864546
http://www.eteve.net/


Re: Memory Impact of a New Core

2013-04-23 Thread Jérôme Étévé
Thanks!

Yeah, I know about the caching/commit things.

My question is more about the impact of the "pure" creation of a Solr core,
independently of its "usage" memory requirements (like caches and stuff).

From the experiments I did using JMX, it's not measurable, but I might be
wrong.



On 23 April 2013 12:25, Guido Medina  wrote:

> I'm not an expert, but to some extent I think it will come down to a few
> factors:
>
>  * How much data is being cached per core.
>  * If memory is an issue and you still want performance, I/O with a low
>    cache could be an issue (SSDs?)
>  * Soft commits, which imply an open searcher per soft commit (and per
>    core), will depend on caches.
>
> I do believe that in the end it will be a direct result of your caching and
> I/O. If all you care about is performance, caching (memory) could be replaced
> with faster I/O, though soft commits will be sensitive to memory due to their
> nature (they depend on caching/memory and low I/O usage).
>
> Hope I made sense; I probably tried too many points of view in a single
> idea.
>
> Guido.
>
>
> On 23/04/13 11:50, Jérôme Étévé wrote:
>
>> Hi all,
>>
>> We've got quite a lot of (mostly small) solr cores in our Solr instance.
>> They all share the same solrconfig.xml and schema.xml (only the data
>> differs).
>>
>> I'm wondering how far I can go in terms of number of cores. CPU is not an
>> issue, but memory could be.
>>
>> Any idea/guideline about the impact of a new Solr core in a Solr instance?
>>
>> Thanks!
>>
>> Jerome.
>>
>>
>


-- 
Jerome Eteve
+44(0)7738864546
http://www.eteve.net/


Memory Impact of a New Core

2013-04-23 Thread Jérôme Étévé
Hi all,

We've got quite a lot of (mostly small) solr cores in our Solr instance.
They all share the same solrconfig.xml and schema.xml (only the data
differs).

I'm wondering how far I can go in terms of number of cores. CPU is not an
issue, but memory could be.

Any idea/guideline about the impact of a new Solr core in a Solr instance?

Thanks!

Jerome.

-- 
Jerome Eteve
+44(0)7738864546
http://www.eteve.net/


multicore file creation order

2011-06-27 Thread Jérôme Étévé
Hi,

When one issues a command like

admin/cores?action=CREATE&name=blabla&instanceDir=...&dataDir=../../foobar

what gets created first on disk?

Is it the new solr.xml or the new data directory?

Cheers,

Jerome.

-- 
Jerome Eteve.

http://sigstp.blogspot.com/
http://twitter.com/jeteve