Re: Solr groups not matching with terms in a field

2015-01-15 Thread Ahmet Arslan
Hi Naresh,

I have never grouped on a tokenised field and I am not sure it makes sense to 
do so.

Reading back through the ref guide, it says this about the group.field parameter:

"The name of the field by which to group results. The field must be 
single-valued, and either be indexed or a field type that has a value source 
and works in a function query, such as ExternalFileField. It must also be a 
string-based field, such as StrField or TextField"


https://cwiki.apache.org/confluence/display/solr/Result+Grouping

Therefore, it should be single-valued. P.S. Don't be confused by the TextField 
type; for example, it can produce a single token when used with the keyword tokenizer.
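A minimal sketch of the field type Ahmet describes: a TextField whose analyzer uses the keyword tokenizer, so the whole value is indexed as a single token and is safe to group on. The type and field names here are illustrative, not taken from the thread.

```xml
<!-- Hypothetical schema.xml fragment: a TextField that behaves like a
     (lowercased) string, suitable for the group.field parameter -->
<fieldType name="string_lowercase" class="solr.TextField">
  <analyzer>
    <!-- KeywordTokenizerFactory emits the entire field value as one token -->
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

<field name="group_key" type="string_lowercase" indexed="true"
       stored="true" multiValued="false"/>
```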

Ahmet

On Friday, January 16, 2015 4:43 AM, Naresh Yadav  wrote:
Hi ahmet,

If you observe the output, ngroups is 1 and only one group, P1, is returned.
But my expectation is that it should return three groups (P1, L1, L2), as my
field is tokenized on whitespace.

Please correct me if I am wrong.


On 1/15/15, Ahmet Arslan  wrote:
>
>
> Hi Naresh,
>
> Everything looks correct, what is the problem here?
>
> If you want to see more than one document per group, there is a parameter
> for that which defaults to 1.
>
> Ahmet
>
>
>
> On Thursday, January 15, 2015 9:02 AM, Naresh Yadav 
> wrote:
> Hi all,
>
> I had done following configuration to test Solr grouping concept.
>
> solr version :  4.6.1 (tried in latest version 4.10.3 also)
> Schema : http://www.imagesup.net/?di=10142124357616
> Solrj code to insert docs :http://www.imagesup.net/?di=10142124381116
> Response Group's :  http://www.imagesup.net/?di=1114212438351
> Response Terms' : http://www.imagesup.net/?di=614212438580
>
> Please let me know if I am doing something wrong here


Re: SolrCloud - Enable SSL

2015-01-15 Thread Hrishikesh Gadre
OK. I think I have figured this out.

https://issues.apache.org/jira/browse/SOLR-5610
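For reference, the JIRA above concerns cluster-wide properties. In Solr versions that include this feature, the Collections API exposes a CLUSTERPROP action that updates the property without hand-editing clusterstate.json; a hedged sketch (host, port, and availability depend on your version):

```shell
# Hypothetical invocation: set the cluster-wide urlScheme property via the
# Collections API CLUSTERPROP action (verify support in your Solr version).
curl "http://localhost:8983/solr/admin/collections?action=CLUSTERPROP&name=urlScheme&val=https"
```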

On Thu, Jan 15, 2015 at 6:00 PM, Hrishikesh Gadre 
wrote:

> Hi,
>
> If we need to enable SSL configuration for an existing Solr cluster
> (hosting one or more collections), do we need to manually update the
> clusterstate.json file? Or is there any API available which would serve the
> purpose?
>
> As per the Solr wiki, we need to set the urlScheme property to https
>
>
> https://cwiki.apache.org/confluence/display/solr/Enabling+SSL#EnablingSSL-SolrCloud
>
> Thanks
> Hrishikesh
>
>


Snippets sorting in SOLR is not working correctly

2015-01-15 Thread Behzad Qureshi
Hi,

I have posted a question on Stack Exchange related to highlighted snippets
sorting. Multiple snippets returned against a single document are not in
sorted order.

Thanks in advance.

-- 

Regards,

Behzad Qureshi


How to select the correct number of Shards in SolrCloud

2015-01-15 Thread Manohar Sripada
Hi All,

My setup is as follows. There are 16 nodes in my SolrCloud, with 4 CPU cores
on each Solr node VM. Each has 64 GB of RAM, out of which I have allocated
32 GB to Solr. I have a collection which contains around 100 million docs,
which I created with 64 shards, replication factor 2, and 8 shards per node.
Each shard gets around 1.6 million documents.

The reason I created 64 shards is that there are 4 CPU cores on each VM, so
while querying I can make use of all the CPU cores. On average, Solr QTime
is around 500 ms here.

In my other discussion, Erick suggested that I might be over-sharding, so I
tried reducing the number of shards to 32 and then 16. To my surprise, it
started performing better: QTime came down to 300 ms (for 32 shards) and
100 ms (for 16 shards). I haven't tested with filters and facets here yet,
but the simple search queries showed a lot of improvement.

So, why does a smaller number of shards perform better? Is it because there
are fewer posting lists to search, or fewer merges happening? And how do I
determine the correct number of shards?

Thanks,
Manohar


Re: OutOfMemoryError for PDF document upload into Solr

2015-01-15 Thread Dan Davis
Why re-write all the document conversion in Java? ;)  Tika is very slow, and
a 5 GB PDF is very big.

If you have a lot of PDFs like that, try pdftotext in HTML and UTF-8 output
mode. The HTML mode captures some metadata that would otherwise be lost.
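As a sketch of the pdftotext mode mentioned above (flag names from poppler-utils; verify against your build):

```shell
# HTML-wrapped text output with document metadata, forced UTF-8 encoding.
# input.pdf / output.html are placeholders.
pdftotext -htmlmeta -enc UTF-8 input.pdf output.html
```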


If you need to go faster still, you can also write some code linked directly
against the poppler library.

Before you jump down my throat about Tika being slow - I wrote a PDF indexer
that ran at 36 MB/s per core.   Different indexer, all C, lots of
setjmp/longjmp.   But fast...



On Thu, Jan 15, 2015 at 1:54 PM,  wrote:

> Siegfried and Michael Thank you for your replies and help.
>
> -Original Message-
> From: Siegfried Goeschl [mailto:sgoes...@gmx.at]
> Sent: Thursday, January 15, 2015 3:45 AM
> To: solr-user@lucene.apache.org
> Subject: Re: OutOfMemoryError for PDF document upload into Solr
>
> Hi Ganesh,
>
> you can increase the heap size but parsing a 4 GB PDF document will very
> likely consume A LOT OF memory - I think you need to check if that large
> PDF can be parsed at all :-)
>
> Cheers,
>
> Siegfried Goeschl
>
> On 14.01.15 18:04, Michael Della Bitta wrote:
> > Yep, you'll have to increase the heap size for your Tomcat container.
> >
> > http://stackoverflow.com/questions/6897476/tomcat-7-how-to-set-initial
> > -heap-size-correctly
> >
> > Michael Della Bitta
> >
> > Senior Software Engineer
> >
> > o: +1 646 532 3062
> >
> > appinions inc.
> >
> > “The Science of Influence Marketing”
> >
> > 18 East 41st Street
> >
> > New York, NY 10017
> >
> > t: @appinions | g+: plus.google.com/appinions
> > w: appinions.com
> >
> > On Wed, Jan 14, 2015 at 12:00 PM,  wrote:
> >
> >> Hello,
> >>
> >> Can someone pass on the hints to get around following error? Is there
> >> any Heap Size parameter I can set in Tomcat or in Solr webApp that
> >> gets deployed in Solr?
> >>
> >> I am running Solr webapp inside Tomcat on my local machine which has
> >> RAM of 12 GB. I have PDF document which is 4 GB max in size that
> >> needs to be loaded into Solr
> >>
> >>
> >>
> >>
> >> Exception in thread "http-apr-8983-exec-6" java.lang.OutOfMemoryError: Java heap space
> >>  at java.util.AbstractCollection.toArray(Unknown Source)
> >>  at java.util.ArrayList.<init>(Unknown Source)
> >>  at
> >> org.apache.pdfbox.cos.COSDocument.getObjects(COSDocument.java:518)
> >>  at
> org.apache.pdfbox.cos.COSDocument.close(COSDocument.java:575)
> >>  at
> org.apache.pdfbox.pdfparser.PDFParser.parse(PDFParser.java:254)
> >>  at
> org.apache.pdfbox.pdmodel.PDDocument.load(PDDocument.java:1238)
> >>  at
> org.apache.pdfbox.pdmodel.PDDocument.load(PDDocument.java:1203)
> >>  at
> org.apache.tika.parser.pdf.PDFParser.parse(PDFParser.java:111)
> >>  at
> >> org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:242)
> >>  at
> >> org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:242)
> >>  at
> >> org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:120)
> >>  at
> >>
> org.apache.solr.handler.extraction.ExtractingDocumentLoader.load(ExtractingDocumentLoader.java:219)
> >>  at
> >>
> org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:74)
> >>  at
> >>
> org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
> >>  at
> >>
> org.apache.solr.core.RequestHandlers$LazyRequestHandlerWrapper.handleRequest(RequestHandlers.java:246)
> >>  at org.apache.solr.core.SolrCore.execute(SolrCore.java:1967)
> >>  at
> >>
> org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:777)
> >>  at
> >>
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:418)
> >>  at
> >>
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:207)
> >>  at
> >>
> org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:241)
> >>  at
> >>
> org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:208)
> >>  at
> >>
> org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:220)
> >>  at
> >>
> org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:122)
> >>  at
> >>
> org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:170)
> >>  at
> >>
> org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:103)
> >>  at
> >>
> org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:950)
> >>  at
> >>
> org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:116)
> >>  at
> >>
> org.apache.catalina.connector.CoyoteAdapter

Re: Solr groups not matching with terms in a field

2015-01-15 Thread Naresh Yadav
Hi ahmet,

If you observe the output, ngroups is 1 and only one group, P1, is returned.
But my expectation is that it should return three groups (P1, L1, L2), as my
field is tokenized on whitespace.

Please correct me if I am wrong.

On 1/15/15, Ahmet Arslan  wrote:
>
>
> Hi Naresh,
>
> Everything looks correct, what is the problem here?
>
> If you want to see more than one document per group, there is a parameter
> for that which defaults to 1.
>
> Ahmet
>
>
>
> On Thursday, January 15, 2015 9:02 AM, Naresh Yadav 
> wrote:
> Hi all,
>
> I had done following configuration to test Solr grouping concept.
>
> solr version :  4.6.1 (tried in latest version 4.10.3 also)
> Schema : http://www.imagesup.net/?di=10142124357616
> Solrj code to insert docs :http://www.imagesup.net/?di=10142124381116
> Response Group's :  http://www.imagesup.net/?di=1114212438351
> Response Terms' : http://www.imagesup.net/?di=614212438580
>
> Please let me know if I am doing something wrong here


SolrCloud - Enable SSL

2015-01-15 Thread Hrishikesh Gadre
Hi,

If we need to enable SSL configuration for an existing Solr cluster
(hosting one or more collections), do we need to manually update the
clusterstate.json file? Or is there any API available which would serve the
purpose?

As per the Solr wiki, we need to set the urlScheme property to https

https://cwiki.apache.org/confluence/display/solr/Enabling+SSL#EnablingSSL-SolrCloud

Thanks
Hrishikesh


Does DocValues improve Grouping performance ?

2015-01-15 Thread Shamik Bandopadhyay
Hi,

Does use of DocValues provide any performance improvement for grouping?
I've looked into a blog post which mentions improving grouping performance
through DocValues.

https://lucidworks.com/blog/fun-with-docvalues-in-solr-4-2/

Right now, group-by queries (which I sadly can't avoid) have become a huge
bottleneck: they carry an overhead of 60-70% compared to the same query sans
group-by. Unfortunately, I'm not able to use CollapsingQParserPlugin, as it
doesn't have support similar to the "group.facet" feature.

My understanding of DocValues is that it's intended for faceting and
sorting. Just wondering if anyone has tried DocValues for grouping and seen
any improvements?

-Thanks,
Shamik


Re: Collection shard name

2015-01-15 Thread Erick Erickson
By definition, all replicas in a shard should be identical. So Solr is
doing exactly what I'd expect: since you've created two cores each
belonging to shard1 (because of the parameter shard=shard1), updates will
go to both, exactly as they should. The name parameter only lets you
distinguish between separate replicas of the _same_ shard.

I'd actually recommend that you don't use the admin/cores API at all
for collections and use the collections API instead, see:
https://cwiki.apache.org/confluence/display/solr/Collections+API. Note
that in the upper-left you can download the full PDF for 4.10.

Best,
Erick

On Thu, Jan 15, 2015 at 12:12 PM, kuttan palliyalil
 wrote:
> Erick & Shawn,
> I am using Solr 4.10.2.
> Here is the create command, keeping shard= the same and changing name=:
> http://''/solr/admin/cores?action=CREATE&name=shard1_1&collection=collectionName&shard=shard1&collection.configName=collconf
> http://''/solr/admin/cores?action=CREATE&name=shard1_2&collection=collectionName&shard=shard1&collection.configName=collconf
> Regards,
> Rajesh
>
>  On Thursday, January 15, 2015 3:01 PM, Shawn Heisey 
>  wrote:
>
>
>  On 1/15/2015 12:24 PM, Erick Erickson wrote:
>> How do you get all the shards named identically in the first place?
>
> What I have always heard is that this is what Solr 4.0.0 did when
> creating a collection -- all the cores ended up with the same name as
> the collection.  I have never used 4.0.0, so I cannot claim any
> first-hand knowledge.  Supporting more than one shard per node would not
> be possible with that naming method.
>
> Thanks,
> Shawn
>
>
>
>


Re: Collection shard name

2015-01-15 Thread Shawn Heisey
On 1/15/2015 12:59 PM, Shawn Heisey wrote:
> On 1/15/2015 12:24 PM, Erick Erickson wrote:
>> How do you get all the shards named identically in the first place?
> What I have always heard is that this is what Solr 4.0.0 did when
> creating a collection -- all the cores ended up with the same name as
> the collection.  I have never used 4.0.0, so I cannot claim any
> first-hand knowledge.  Supporting more than one shard per node would not
> be possible with that naming method.

One other possibility, and have no first-hand experience with this
either, is this might happen when converting from non-cloud to cloud
using the bootstrap_conf system property.

Thanks,
Shawn



Re: Conditions in function query

2015-01-15 Thread shamik
This one worked.

if(termfreq(Source,'A'),sum(Likes,3),if(termfreq(Source,'B'),sum(Likes,3),0))
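For context, an expression like this is typically supplied through the (e)dismax bf parameter so it acts as an additive boost; a hedged sketch, with the core name and query as placeholders:

```shell
# Hypothetical request: apply the conditional function as an additive boost.
curl "http://localhost:8983/solr/collection1/select" \
  --data-urlencode "q=some query" \
  --data-urlencode "defType=edismax" \
  --data-urlencode "bf=if(termfreq(Source,'A'),sum(Likes,3),if(termfreq(Source,'B'),sum(Likes,3),0))"
```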



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Conditions-in-Boost-function-query-tp4179687p4179906.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Collection shard name

2015-01-15 Thread kuttan palliyalil
Erick & Shawn, 
I am using Solr 4.10.2. 
Here is the create command, keeping shard= the same and changing name=:
http://''/solr/admin/cores?action=CREATE&name=shard1_1&collection=collectionName&shard=shard1&collection.configName=collconf
http://''/solr/admin/cores?action=CREATE&name=shard1_2&collection=collectionName&shard=shard1&collection.configName=collconf
Regards,
Rajesh

 On Thursday, January 15, 2015 3:01 PM, Shawn Heisey  
wrote:
   

 On 1/15/2015 12:24 PM, Erick Erickson wrote:
> How do you get all the shards named identically in the first place?

What I have always heard is that this is what Solr 4.0.0 did when
creating a collection -- all the cores ended up with the same name as
the collection.  I have never used 4.0.0, so I cannot claim any
first-hand knowledge.  Supporting more than one shard per node would not
be possible with that naming method.

Thanks,
Shawn



   

Re: Collection shard name

2015-01-15 Thread Shawn Heisey
On 1/15/2015 12:24 PM, Erick Erickson wrote:
> How do you get all the shards named identically in the first place?

What I have always heard is that this is what Solr 4.0.0 did when
creating a collection -- all the cores ended up with the same name as
the collection.  I have never used 4.0.0, so I cannot claim any
first-hand knowledge.  Supporting more than one shard per node would not
be possible with that naming method.

Thanks,
Shawn



Re: Collection shard name

2015-01-15 Thread Erick Erickson
How do you get all the shards named identically in the first place?

Best
Erick

On Thu, Jan 15, 2015 at 8:49 AM, kuttan palliyalil
 wrote:
> When the shard names in a collection are the same across the nodes, then
> posting to the collection gets the data posted to all the shards instead of
> distributing it, i.e. all the shards have the same data, similar to
> replicas. Is this expected?
> Regards,
> Rajesh
>
>


RE: OutOfMemoryError for PDF document upload into Solr

2015-01-15 Thread Ganesh.Yadav
Siegfried and Michael Thank you for your replies and help.

-Original Message-
From: Siegfried Goeschl [mailto:sgoes...@gmx.at] 
Sent: Thursday, January 15, 2015 3:45 AM
To: solr-user@lucene.apache.org
Subject: Re: OutOfMemoryError for PDF document upload into Solr

Hi Ganesh,

you can increase the heap size but parsing a 4 GB PDF document will very likely 
consume A LOT OF memory - I think you need to check if that large PDF can be 
parsed at all :-)

Cheers,

Siegfried Goeschl

On 14.01.15 18:04, Michael Della Bitta wrote:
> Yep, you'll have to increase the heap size for your Tomcat container.
>
> http://stackoverflow.com/questions/6897476/tomcat-7-how-to-set-initial
> -heap-size-correctly
>
> Michael Della Bitta
>
> Senior Software Engineer
>
> o: +1 646 532 3062
>
> appinions inc.
>
> “The Science of Influence Marketing”
>
> 18 East 41st Street
>
> New York, NY 10017
>
> t: @appinions | g+: plus.google.com/appinions
> w: appinions.com
>
> On Wed, Jan 14, 2015 at 12:00 PM,  wrote:
>
>> Hello,
>>
>> Can someone pass on the hints to get around following error? Is there 
>> any Heap Size parameter I can set in Tomcat or in Solr webApp that 
>> gets deployed in Solr?
>>
>> I am running Solr webapp inside Tomcat on my local machine which has 
>> RAM of 12 GB. I have PDF document which is 4 GB max in size that 
>> needs to be loaded into Solr
>>
>>
>>
>>
>> Exception in thread "http-apr-8983-exec-6" java.lang.OutOfMemoryError: Java heap space
>>  at java.util.AbstractCollection.toArray(Unknown Source)
>>  at java.util.ArrayList.<init>(Unknown Source)
>>  at
>> org.apache.pdfbox.cos.COSDocument.getObjects(COSDocument.java:518)
>>  at org.apache.pdfbox.cos.COSDocument.close(COSDocument.java:575)
>>  at org.apache.pdfbox.pdfparser.PDFParser.parse(PDFParser.java:254)
>>  at org.apache.pdfbox.pdmodel.PDDocument.load(PDDocument.java:1238)
>>  at org.apache.pdfbox.pdmodel.PDDocument.load(PDDocument.java:1203)
>>  at org.apache.tika.parser.pdf.PDFParser.parse(PDFParser.java:111)
>>  at
>> org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:242)
>>  at
>> org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:242)
>>  at
>> org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:120)
>>  at
>> org.apache.solr.handler.extraction.ExtractingDocumentLoader.load(ExtractingDocumentLoader.java:219)
>>  at
>> org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:74)
>>  at
>> org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
>>  at
>> org.apache.solr.core.RequestHandlers$LazyRequestHandlerWrapper.handleRequest(RequestHandlers.java:246)
>>  at org.apache.solr.core.SolrCore.execute(SolrCore.java:1967)
>>  at
>> org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:777)
>>  at
>> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:418)
>>  at
>> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:207)
>>  at
>> org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:241)
>>  at
>> org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:208)
>>  at
>> org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:220)
>>  at
>> org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:122)
>>  at
>> org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:170)
>>  at
>> org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:103)
>>  at
>> org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:950)
>>  at
>> org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:116)
>>  at
>> org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:421)
>>  at
>> org.apache.coyote.http11.AbstractHttp11Processor.process(AbstractHttp11Processor.java:1070)
>>  at
>> org.apache.coyote.AbstractProtocol$AbstractConnectionHandler.process(AbstractProtocol.java:611)
>>  at
>> org.apache.tomcat.util.net.AprEndpoint$SocketProcessor.doRun(AprEndpoint.java:2462)
>>  at
>> org.apache.tomcat.util.net.AprEndpoint$SocketProcessor.run(AprEndpoin
>> t.java:2451)
>>
>>
>> Thanks
>> Ganesh
>>
>>
>



Re: Occasionally getting error in solr suggester component.

2015-01-15 Thread Michael Sokolov
That sounds like a good approach to me.  Of course it depends how often 
you commit, and what your tolerance is for delay in having suggestions 
appear, but it sounds as if you have a good understanding of the 
tradeoffs there.


-Mike

On 1/15/15 10:31 AM, Dhanesh Radhakrishnan wrote:

Hi,
 From Solr 4.7 onwards, the implementation of this Suggester is changed. The
old SpellChecker-based search component is replaced with a new suggester
that utilizes the Lucene suggester module. The latest Solr download is
preconfigured with this new suggester.
I'm using Solr 4.10, and suggestions are based on the /suggest handler
instead of /spell.
So what I did is change buildOnCommit to false.
It's not good to rebuild the index on every commit; instead, I would like to
build the index at a certain time period, say every hour.
The lookup data will be built only when requested by the URL parameter
suggest.build=true

"http://localhost:8983/solr/ha/suggest?suggest.build=true"

So this will rebuild the index again and the changes will reflect in the
suggester.

There are certain pros and cons to this. The issue is that changes will be
reflected only at a certain time interval, here 1 hour. The advantage is
that we can avoid rebuilding the index on every commit or optimize.

Is this the right way, or is there anything I missed?

Regards
dhanesh s.r




On Thu, Jan 15, 2015 at 3:20 AM, Michael Sokolov <
msoko...@safaribooksonline.com> wrote:


did you build the spellcheck index using spellcheck.build as described
here: https://cwiki.apache.org/confluence/display/solr/Spell+Checking ?

-Mike


On 01/14/2015 07:19 AM, Dhanesh Radhakrishnan wrote:


Hi,
Thanks for the reply.
As you mentioned in the previous mail I changed buildOnCommit=false in
solrConfig.
After that change, suggestions are not working.
In Solr 4.7 introduced a new approach based on a dedicated
SuggestComponent
I'm using that component to build suggestions and lookup implementation is
"AnalyzingInfixLookupFactory"
Is there any work around ??




On Wed, Jan 14, 2015 at 12:47 AM, Michael Sokolov <
msoko...@safaribooksonline.com> wrote:

  I think you are probably getting bitten by one of the issues addressed in

LUCENE-5889

I would recommend against using buildOnCommit=true - with a large index
this can be a performance-killer.  Instead, build the index yourself
using
the Solr spellchecker support (spellcheck.build=true)

-Mike


On 01/13/2015 10:41 AM, Dhanesh Radhakrishnan wrote:

  Hi all,

I am experiencing a problem in Solr SuggestComponent
Occasionally solr suggester component throws an  error like

Solr failed:
{"responseHeader":{"status":500,"QTime":1},"error":{"msg":"suggester
was
not built","trace":"java.lang.IllegalStateException: suggester was not
built\n\tat
org.apache.lucene.search.suggest.analyzing.AnalyzingInfixSuggester.
lookup(AnalyzingInfixSuggester.java:368)\n\tat
org.apache.lucene.search.suggest.analyzing.AnalyzingInfixSuggester.
lookup(AnalyzingInfixSuggester.java:342)\n\tat
org.apache.lucene.search.suggest.Lookup.lookup(Lookup.java:240)\n\tat
org.apache.solr.spelling.suggest.SolrSuggester.
getSuggestions(SolrSuggester.java:199)\n\tat
org.apache.solr.handler.component.SuggestComponent.
process(SuggestComponent.java:234)\n\tat
org.apache.solr.handler.component.SearchHandler.handleRequestBody(
SearchHandler.java:218)\n\tat
org.apache.solr.handler.RequestHandlerBase.handleRequest(
RequestHandlerBase.java:135)\n\tat
org.apache.solr.core.RequestHandlers$LazyRequestHandlerWrapper.
handleRequest(RequestHandlers.java:246)\n\tat
org.apache.solr.core.SolrCore.execute(SolrCore.java:1967)\n\tat
org.apache.solr.servlet.SolrDispatchFilter.execute(
SolrDispatchFilter.java:777)\n\tat
org.apache.solr.servlet.SolrDispatchFilter.doFilter(
SolrDispatchFilter.java:418)\n\tat
org.apache.solr.servlet.SolrDispatchFilter.doFilter(
SolrDispatchFilter.java:207)\n\tat
org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(
ApplicationFilterChain.java:243)\n\tat
org.apache.catalina.core.ApplicationFilterChain.doFilter(
ApplicationFilterChain.java:210)\n\tat
org.apache.catalina.core.StandardWrapperValve.invoke(
StandardWrapperValve.java:225)\n\tat
org.apache.catalina.core.StandardContextValve.invoke(
StandardContextValve.java:123)\n\tat
org.apache.catalina.core.StandardHostValve.invoke(
StandardHostValve.java:168)\n\tat
org.apache.catalina.valves.ErrorReportValve.invoke(
ErrorReportValve.java:98)\n\tat
org.apache.catalina.valves.AccessLogValve.invoke(
AccessLogValve.java:927)\n\tat
org.apache.catalina.valves.RemoteIpValve.invoke(
RemoteIpValve.java:680)\n\tat
org.apache.catalina.core.StandardEngineValve.invoke(
StandardEngineValve.java:118)\n\tat
org.apache.catalina.connector.CoyoteAdapter.service(
CoyoteAdapter.java:407)\n\tat
org.apache.coyote.http11.AbstractHttp11Processor.process(
AbstractHttp11Processor.java:1002)\n\tat
org.apache.coyote.AbstractProtocol$AbstractConnectionHandler.
process(AbstractProtocol.java:579)\n\tat
org.apache.tomcat.util.net.JIoEndpoint$SocketProcessor.
run(JIoEndpoi

Re: Easiest way to embed solr in a desktop application

2015-01-15 Thread Shawn Heisey
On 1/15/2015 8:06 AM, Robert Krüger wrote:
> I was considering the programmatic Jetty option but then I read that Solr 5
> no longer supports being run with an external servlet container but maybe
> they still support programmatic jetty use in some way. atm I am using solr
> 4.x, so this would work. No idea if this gets messy classloader-wise in any
> way.
>
> I have been using exactly the approach you described in the past, i.e. I
> built a really, really simple swing dialogue to input queries and display
> results in a table but was just guessing that the built-in ui was far
> superior but maybe I should just live with it for the time being.

Right now with 5.0 the distinction you're talking about is semantics. 
We will no longer *ship* a war ... but for the immediate future, when
you run Solr, you will still be running jetty, which will then run Solr
as a webapp.  It might still be possible to build a war using the source
code, at least for the immediate future.  I don't know if there will be
a war that you can find within the binary Solr 5.0 download ... I
haven't been involved with the packaging.  The webapp might be fully
exploded rather than packaged in a war.

The bin/solr script that we are using now will handle all the details of
finding and running jetty.  At some point, hopefully soon, Solr will
actually own the network layer, so you (or the bin/solr script) will
actually start Solr as an application.  When we reach that point, Jetty
might be the transport method, but if that's the case, it will be fully
embedded inside Solr.  Currently the Jetty bits in the Solr download are
100% identical to what you can download from eclipse.org.

Thanks,
Shawn



Re: Easiest way to embed solr in a desktop application

2015-01-15 Thread Alexandre Rafalovitch
On 15 January 2015 at 09:53, Ahmet Arslan  wrote:
> http://sourcesense.github.io/solr-packager/

Does this work with modern Solr? It seems to be a 4-year-old project with
no recent updates. Even the parent company seems 'quiet', but it looks
interesting in general.

Regards,
   Alex.

Sign up for my Solr resources newsletter at http://www.solr-start.com/


Collection shard name

2015-01-15 Thread kuttan palliyalil
When the shard names in a collection are the same across the nodes, then
posting to the collection gets the data posted to all the shards instead of
distributing it, i.e. all the shards have the same data, similar to
replicas. Is this expected?
Regards,
Rajesh

 


Re: Occasionally getting error in solr suggester component.

2015-01-15 Thread Dhanesh Radhakrishnan
Hi,
From Solr 4.7 onwards, the implementation of this Suggester is changed. The
old SpellChecker-based search component is replaced with a new suggester
that utilizes the Lucene suggester module. The latest Solr download is
preconfigured with this new suggester.
I'm using Solr 4.10, and suggestions are based on the /suggest handler
instead of /spell.
So what I did is change buildOnCommit to false.
It's not good to rebuild the index on every commit; instead, I would like to
build the index at a certain time period, say every hour.
The lookup data will be built only when requested by the URL parameter
suggest.build=true

"http://localhost:8983/solr/ha/suggest?suggest.build=true"

So this will rebuild the index again and the changes will reflect in the
suggester.
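One way to get the hourly rebuild described above is a cron entry that hits the build URL; a sketch (host, port, and the core name "ha" taken from the URL above, otherwise hypothetical):

```shell
# Hypothetical crontab line: rebuild the suggester at the top of every hour.
0 * * * * curl -s "http://localhost:8983/solr/ha/suggest?suggest.build=true" > /dev/null
```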

There are certain pros and cons to this. The issue is that changes will be
reflected only at a certain time interval, here 1 hour. The advantage is
that we can avoid rebuilding the index on every commit or optimize.

Is this the right way, or is there anything I missed?

Regards
dhanesh s.r




On Thu, Jan 15, 2015 at 3:20 AM, Michael Sokolov <
msoko...@safaribooksonline.com> wrote:

> did you build the spellcheck index using spellcheck.build as described
> here: https://cwiki.apache.org/confluence/display/solr/Spell+Checking ?
>
> -Mike
>
>
> On 01/14/2015 07:19 AM, Dhanesh Radhakrishnan wrote:
>
>> Hi,
>> Thanks for the reply.
>> As you mentioned in the previous mail I changed buildOnCommit=false in
>> solrConfig.
>> After that change, suggestions are not working.
>> In Solr 4.7 introduced a new approach based on a dedicated
>> SuggestComponent
>> I'm using that component to build suggestions and lookup implementation is
>> "AnalyzingInfixLookupFactory"
>> Is there any work around ??
>>
>>
>>
>>
>> On Wed, Jan 14, 2015 at 12:47 AM, Michael Sokolov <
>> msoko...@safaribooksonline.com> wrote:
>>
>>  I think you are probably getting bitten by one of the issues addressed in
>>> LUCENE-5889
>>>
>>> I would recommend against using buildOnCommit=true - with a large index
>>> this can be a performance-killer.  Instead, build the index yourself
>>> using
>>> the Solr spellchecker support (spellcheck.build=true)
>>>
>>> -Mike
>>>
>>>
>>> On 01/13/2015 10:41 AM, Dhanesh Radhakrishnan wrote:
>>>
>>>  Hi all,

 I am experiencing a problem in Solr SuggestComponent
 Occasionally solr suggester component throws an  error like

 Solr failed:
 {"responseHeader":{"status":500,"QTime":1},"error":{"msg":"suggester
 was
 not built","trace":"java.lang.IllegalStateException: suggester was not
 built\n\tat
 org.apache.lucene.search.suggest.analyzing.AnalyzingInfixSuggester.
 lookup(AnalyzingInfixSuggester.java:368)\n\tat
 org.apache.lucene.search.suggest.analyzing.AnalyzingInfixSuggester.
 lookup(AnalyzingInfixSuggester.java:342)\n\tat
 org.apache.lucene.search.suggest.Lookup.lookup(Lookup.java:240)\n\tat
 org.apache.solr.spelling.suggest.SolrSuggester.
 getSuggestions(SolrSuggester.java:199)\n\tat
 org.apache.solr.handler.component.SuggestComponent.
 process(SuggestComponent.java:234)\n\tat
 org.apache.solr.handler.component.SearchHandler.handleRequestBody(
 SearchHandler.java:218)\n\tat
 org.apache.solr.handler.RequestHandlerBase.handleRequest(
 RequestHandlerBase.java:135)\n\tat
 org.apache.solr.core.RequestHandlers$LazyRequestHandlerWrapper.
 handleRequest(RequestHandlers.java:246)\n\tat
 org.apache.solr.core.SolrCore.execute(SolrCore.java:1967)\n\tat
 org.apache.solr.servlet.SolrDispatchFilter.execute(
 SolrDispatchFilter.java:777)\n\tat
 org.apache.solr.servlet.SolrDispatchFilter.doFilter(
 SolrDispatchFilter.java:418)\n\tat
 org.apache.solr.servlet.SolrDispatchFilter.doFilter(
 SolrDispatchFilter.java:207)\n\tat
 org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(
 ApplicationFilterChain.java:243)\n\tat
 org.apache.catalina.core.ApplicationFilterChain.doFilter(
 ApplicationFilterChain.java:210)\n\tat
 org.apache.catalina.core.StandardWrapperValve.invoke(
 StandardWrapperValve.java:225)\n\tat
 org.apache.catalina.core.StandardContextValve.invoke(
 StandardContextValve.java:123)\n\tat
 org.apache.catalina.core.StandardHostValve.invoke(
 StandardHostValve.java:168)\n\tat
 org.apache.catalina.valves.ErrorReportValve.invoke(
 ErrorReportValve.java:98)\n\tat
 org.apache.catalina.valves.AccessLogValve.invoke(
 AccessLogValve.java:927)\n\tat
 org.apache.catalina.valves.RemoteIpValve.invoke(
 RemoteIpValve.java:680)\n\tat
 org.apache.catalina.core.StandardEngineValve.invoke(
 StandardEngineValve.java:118)\n\tat
 org.apache.catalina.connector.CoyoteAdapter.service(
 CoyoteAdapter.java:407)\n\tat
 org.apache.coyote.http11.AbstractHttp11Processor.process(
 AbstractHttp11Processor.java:1002)\n\tat
 org.apache.coyote.AbstractProtocol$AbstractConnectionHandler.
 process(AbstractProtoc

DisMax search on field only if it exists otherwise fall-back to another

2015-01-15 Thread Neil Prosser
Hopefully this question makes sense.

At the moment I'm using a DisMax query which looks something like the
following (massively cut-down):

?defType=dismax
&q=some query
&qf=field_one^0.5 field_two^1.0

I've got some localisation work coming up where I'd like to use the value
of one, sparsely populated field if it exists, falling back to another if
it doesn't (rather than duplicating some default value for all territories).

Using the standard query parser I understand I can do the following to get
this behaviour:

?q=if(exists(field_one),(field_one:some query),(field_three:some query))

However, I don't know how I would go about using DisMax for this type of
fallback.

Has anyone tried to do this sort of thing before? Is there a way to use
functions within the qf parameter?
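Spelled out as request parameters, the standard-parser fallback described above might look like the following (a sketch only; `qq1`/`qq2` are arbitrary parameter names invented for the example, while `if`, `exists` and `query($param)` are Solr function queries):

```
?q={!func}if(exists(field_one),query($qq1),query($qq2))
&qq1=field_one:(some query)
&qq2=field_three:(some query)
```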


Re: Easiest way to embed solr in a desktop application

2015-01-15 Thread Robert Krüger
I was considering the programmatic Jetty option but then I read that Solr 5
no longer supports being run in an external servlet container, though maybe
they still support programmatic Jetty use in some way. At the moment I am
using Solr 4.x, so this would work. No idea if this gets messy
classloader-wise in any way.

I have been using exactly the approach you described in the past, i.e. I
built a really, really simple Swing dialog to input queries and display
results in a table, but was just guessing that the built-in UI was far
superior. Maybe I should just live with it for the time being.

On Thu, Jan 15, 2015 at 3:56 PM, Erik Hatcher 
wrote:

> It’d certainly be easiest to just embed Jetty into your application.  You
> don’t need to have Jetty as a separate process, you could launch it through
> its friendly Java API, configured to use solr.war.
>
> If all you needed was to make HTTP(-like) queries to Solr instead of the
> full admin UI, your application could stick to using EmbeddedSolrServer and
> also provide a UI that takes in a Solr query string (or builds one up) and
> then sends it to the embedded Solr and displays the result.
>
> Erik
>
> > On Jan 15, 2015, at 9:44 AM, Robert Krüger  wrote:
> >
> > Hi Andrea,
> >
> > you are assuming correctly. It is a local, non-distributed index that is
> > only accessed by the containing desktop application. Do you know if there
> > is a possibility to run the Solr admin UI on top of an embedded instance
> > somehow?
> >
> > Thanks a lot,
> >
> > Robert
> >
> > On Thu, Jan 15, 2015 at 3:17 PM, Andrea Gazzarini  >
> > wrote:
> >
> >> Hi Robert,
> >> I've used the EmbeddedSolrServer in a scenario like that and I never had
> >> problems.
> >> I assume you're talking about a standalone application, where the whole
> >> index resides locally and you don't need any cluster / cloud /
> distributed
> >> feature.
> >>
> >> I think the usage of EmbeddedSolrServer is discouraged in a
> (distributed)
> >> service scenario, because it is a direct connection to a SolrCore
> >> instance...but this is not a problem in the situation you described (as
> far
> >> as I know)
> >>
> >> Best,
> >> Andrea
> >>
> >>
> >> On 01/15/2015 03:10 PM, Robert Krüger wrote:
> >>
> >>> Hi,
> >>>
> >>> I have been using an embedded instance of solr in my desktop
> application
> >>> for a long time and it works fine. At the time when I made that
> decision
> >>> (vs. firing up a solr web application within my swing application) I
> got
> >>> the impression embedded use is somewhat unsupported and I should expect
> >>> problems.
> >>>
> >>> My first question is, is this still the case now (4 years later), that
> >>> embedded solr is discouraged?
> >>>
> >>> The one limitation I am running into is that I cannot use the solr
> admin
> >>> UI
> >>> for debugging purposes (mainly for running queries). Is there any other
> >>> way
> >>> to do this other than no longer using embedded solr and
> programmatically
> >>> firing up a web application (e.g. using jetty)? Should I do the latter
> >>> anyway?
> >>>
> >>> Any insights/advice greatly appreciated.
> >>>
> >>> Best regards,
> >>>
> >>> Robert
> >>>
> >>>
> >>
> >
> >
> > --
> > Robert Krüger
> > Managing Partner
> > Lesspain GmbH & Co. KG
> >
> > www.lesspain-software.com
>
>


-- 
Robert Krüger
Managing Partner
Lesspain GmbH & Co. KG

www.lesspain-software.com


Re: Easiest way to embed solr in a desktop application

2015-01-15 Thread Shawn Heisey
On 1/15/2015 7:44 AM, Robert Krüger wrote:
> you are assuming correctly. It is a local, non-distributed index that
> is only accessed by the containing desktop application. Do you know
> if there is a possibility to run the Solr admin UI on top of an
> embedded instance somehow?

To have the admin UI, you must have a webserver for a browser to connect
to.  A servlet container (which is a webserver) is required to run Solr
in the traditional way.  If you run it embedded, you give that up.

One of the reasons given by users for running the embedded server is
that the user does not want to deal with the overhead of HTTP.  That
overhead should be fairly minimal on a LAN, especially if that LAN is
gigabit or faster.  A few milliseconds makes very little difference when
it comes to user experience.

Thanks,
Shawn


Re: Easiest way to embed solr in a desktop application

2015-01-15 Thread Robert Krüger
Hi Ahmet,

at first glance, I'm not sure. Need to look at it more carefully.

Thanks,

Robert

On Thu, Jan 15, 2015 at 3:53 PM, Ahmet Arslan 
wrote:

> Hi Robert,
>
> Never used by myself but is solr-packager useful in your case?
>
> http://sourcesense.github.io/solr-packager/
>
> Ahmet
>
>
> On Thursday, January 15, 2015 4:45 PM, Robert Krüger 
> wrote:
> Hi Andrea,
>
> you are assuming correctly. It is a local, non-distributed index that is
> only accessed by the containing desktop application. Do you know if there
> is a possibility to run the Solr admin UI on top of an embedded instance
> somehow?
>
> Thanks a lot,
>
> Robert
>
> On Thu, Jan 15, 2015 at 3:17 PM, Andrea Gazzarini 
> wrote:
>
> > Hi Robert,
> > I've used the EmbeddedSolrServer in a scenario like that and I never had
> > problems.
> > I assume you're talking about a standalone application, where the whole
> > index resides locally and you don't need any cluster / cloud /
> distributed
> > feature.
> >
> > I think the usage of EmbeddedSolrServer is discouraged in a (distributed)
> > service scenario, because it is a direct connection to a SolrCore
> > instance...but this is not a problem in the situation you described (as
> far
> > as I know)
> >
> > Best,
> > Andrea
> >
> >
> > On 01/15/2015 03:10 PM, Robert Krüger wrote:
> >
> >> Hi,
> >>
> >> I have been using an embedded instance of solr in my desktop application
> >> for a long time and it works fine. At the time when I made that decision
> >> (vs. firing up a solr web application within my swing application) I got
> >> the impression embedded use is somewhat unsupported and I should expect
> >> problems.
> >>
> >> My first question is, is this still the case now (4 years later), that
> >> embedded solr is discouraged?
> >>
> >> The one limitation I am running into is that I cannot use the solr admin
> >> UI
> >> for debugging purposes (mainly for running queries). Is there any other
> >> way
> >> to do this other than no longer using embedded solr and programmatically
> >> firing up a web application (e.g. using jetty)? Should I do the latter
> >> anyway?
> >>
> >> Any insights/advice greatly appreciated.
> >>
> >> Best regards,
>
> >>
> >> Robert
> >>
> >>
> >
>
>
> --
> Robert Krüger
> Managing Partner
> Lesspain GmbH & Co. KG
>
> www.lesspain-software.com
>



-- 
Robert Krüger
Managing Partner
Lesspain GmbH & Co. KG

www.lesspain-software.com


solr autosuggest to stop/filter suggesting the phrases that ends with stopwords

2015-01-15 Thread Rajesh Hazari
Hi Folks,

Solr Version 4.7+

Do we have any analyzer, filter, or plugin in Solr to stop suggesting
phrases that end with stopwords?

For example, if the suggestions are as below for the query
http://localhost.com/solr/suggest?q=jazz+a

"suggestion": [
"jazz and",
"jazz at",
"jazz at lincoln",
"jazz at lincoln center",
"jazz artists",
"jazz and classic"
]

Is there any config or solution to remove only the "*jazz at*" and "*jazz and*"
phrases, so that the final suggestion response looks more sensible?

"suggestion": [
"jazz at lincoln",
"jazz at lincoln center",
"jazz artists",
"jazz and classic"
]

Google does this intelligently :)

I have tested with StopFilterFactory and SuggestStopFilter, both of which
filter all stop terms in the phrases no matter where they appear.

Do I have to come up with a custom plugin or some kind of phrase filter to
do this in Solr?

I am on the way to designing a SuggestStopPhraseFilter and its factory,
modeled on the existing SuggestStopFilter, to use in my schema.

Or do we have any existing plugin or feature that I can use or leverage?
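Until such a filter exists, the trimming described above can also be sketched client-side, after the suggest response comes back. This is a hypothetical helper, not an existing Solr class; the class name, method name and stopword list are made up for the example:

```java
import java.util.*;

public class SuggestPhraseTrimmer {

    // Drop suggestions whose final token is a stopword ("jazz at", "jazz and"),
    // while keeping phrases that merely contain one ("jazz at lincoln").
    static List<String> trim(List<String> suggestions, Set<String> stopwords) {
        List<String> kept = new ArrayList<>();
        for (String s : suggestions) {
            String[] tokens = s.trim().split("\\s+");
            String last = tokens[tokens.length - 1].toLowerCase(Locale.ROOT);
            if (!stopwords.contains(last)) {
                kept.add(s);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        Set<String> stop = new HashSet<>(Arrays.asList("a", "an", "and", "at", "the"));
        List<String> in = Arrays.asList("jazz and", "jazz at", "jazz at lincoln",
                "jazz at lincoln center", "jazz artists", "jazz and classic");
        System.out.println(trim(in, stop));
        // [jazz at lincoln, jazz at lincoln center, jazz artists, jazz and classic]
    }
}
```

A filter inside the suggester's analysis chain would be the cleaner long-term fix, but a post-filter like this needs no schema change.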
*Thanks,*
*Rajesh.*


Re: Easiest way to embed solr in a desktop application

2015-01-15 Thread Erik Hatcher
It’d certainly be easiest to just embed Jetty into your application.  You don’t 
need to have Jetty as a separate process, you could launch it through its 
friendly Java API, configured to use solr.war.

If all you needed was to make HTTP(-like) queries to Solr instead of the full 
admin UI, your application could stick to using EmbeddedSolrServer and also 
provide a UI that takes in a Solr query string (or builds one up) and then 
sends it to the embedded Solr and displays the result.
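A rough sketch of that launch (untested; paths and port are illustrative, and the exact API depends on the Jetty version on your classpath):

```java
// Embed Jetty in-process and deploy solr.war (Jetty 8/9-style API).
System.setProperty("solr.solr.home", "/path/to/solr/home"); // where the cores live
Server server = new Server(8983);
server.setHandler(new WebAppContext("/path/to/solr.war", "/solr"));
server.start();
// The admin UI would then be reachable at http://localhost:8983/solr/
```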

Erik

> On Jan 15, 2015, at 9:44 AM, Robert Krüger  wrote:
> 
> Hi Andrea,
> 
> you are assuming correctly. It is a local, non-distributed index that is
> only accessed by the containing desktop application. Do you know if there
> is a possibility to run the Solr admin UI on top of an embedded instance
> somehow?
> 
> Thanks a lot,
> 
> Robert
> 
> On Thu, Jan 15, 2015 at 3:17 PM, Andrea Gazzarini 
> wrote:
> 
>> Hi Robert,
>> I've used the EmbeddedSolrServer in a scenario like that and I never had
>> problems.
>> I assume you're talking about a standalone application, where the whole
>> index resides locally and you don't need any cluster / cloud / distributed
>> feature.
>> 
>> I think the usage of EmbeddedSolrServer is discouraged in a (distributed)
>> service scenario, because it is a direct connection to a SolrCore
>> instance...but this is not a problem in the situation you described (as far
>> as I know)
>> 
>> Best,
>> Andrea
>> 
>> 
>> On 01/15/2015 03:10 PM, Robert Krüger wrote:
>> 
>>> Hi,
>>> 
>>> I have been using an embedded instance of solr in my desktop application
>>> for a long time and it works fine. At the time when I made that decision
>>> (vs. firing up a solr web application within my swing application) I got
>>> the impression embedded use is somewhat unsupported and I should expect
>>> problems.
>>> 
>>> My first question is, is this still the case now (4 years later), that
>>> embedded solr is discouraged?
>>> 
>>> The one limitation I am running into is that I cannot use the solr admin
>>> UI
>>> for debugging purposes (mainly for running queries). Is there any other
>>> way
>>> to do this other than no longer using embedded solr and programmatically
>>> firing up a web application (e.g. using jetty)? Should I do the latter
>>> anyway?
>>> 
>>> Any insights/advice greatly appreciated.
>>> 
>>> Best regards,
>>> 
>>> Robert
>>> 
>>> 
>> 
> 
> 
> -- 
> Robert Krüger
> Managing Partner
> Lesspain GmbH & Co. KG
> 
> www.lesspain-software.com



Replicating external field files under Windows

2015-01-15 Thread Rafał Kuć
Hello!

I have a slight problem with replication and maybe someone has had the
same experience and knows whether there is a solution. I have a Windows
based Solr installation where I use an external field type - two fields,
two external files containing data. The deployment is a standard
master/slave setup. The problem is the replication. As we know,
replication can copy the index files and the configuration files. On
Linux systems I used relative paths to the conf/ dir and I was
able to copy the external field files as well. On Windows I can't, even
though the paths seem to be OK.

Does anyone have similar experience with Windows, replication, and external
field type files?

-- 
Regards,
 Rafał Kuć
 Sematext :: http://sematext.com/ :: Solr - Lucene - ElasticSearch



Re: Easiest way to embed solr in a desktop application

2015-01-15 Thread Ahmet Arslan
Hi Robert,

Never used by myself but is solr-packager useful in your case?

http://sourcesense.github.io/solr-packager/ 

Ahmet


On Thursday, January 15, 2015 4:45 PM, Robert Krüger  
wrote:
Hi Andrea,

you are assuming correctly. It is a local, non-distributed index that is
only accessed by the containing desktop application. Do you know if there
is a possibility to run the Solr admin UI on top of an embedded instance
somehow?

Thanks a lot,

Robert

On Thu, Jan 15, 2015 at 3:17 PM, Andrea Gazzarini 
wrote:

> Hi Robert,
> I've used the EmbeddedSolrServer in a scenario like that and I never had
> problems.
> I assume you're talking about a standalone application, where the whole
> index resides locally and you don't need any cluster / cloud / distributed
> feature.
>
> I think the usage of EmbeddedSolrServer is discouraged in a (distributed)
> service scenario, because it is a direct connection to a SolrCore
> instance...but this is not a problem in the situation you described (as far
> as I know)
>
> Best,
> Andrea
>
>
> On 01/15/2015 03:10 PM, Robert Krüger wrote:
>
>> Hi,
>>
>> I have been using an embedded instance of solr in my desktop application
>> for a long time and it works fine. At the time when I made that decision
>> (vs. firing up a solr web application within my swing application) I got
>> the impression embedded use is somewhat unsupported and I should expect
>> problems.
>>
>> My first question is, is this still the case now (4 years later), that
>> embedded solr is discouraged?
>>
>> The one limitation I am running into is that I cannot use the solr admin
>> UI
>> for debugging purposes (mainly for running queries). Is there any other
>> way
>> to do this other than no longer using embedded solr and programmatically
>> firing up a web application (e.g. using jetty)? Should I do the latter
>> anyway?
>>
>> Any insights/advice greatly appreciated.
>>
>> Best regards,

>>
>> Robert
>>
>>
>


-- 
Robert Krüger
Managing Partner
Lesspain GmbH & Co. KG

www.lesspain-software.com


Re: Easiest way to embed solr in a desktop application

2015-01-15 Thread Robert Krüger
Hi Andrea,

you are assuming correctly. It is a local, non-distributed index that is
only accessed by the containing desktop application. Do you know if there
is a possibility to run the Solr admin UI on top of an embedded instance
somehow?

Thanks a lot,

Robert

On Thu, Jan 15, 2015 at 3:17 PM, Andrea Gazzarini 
wrote:

> Hi Robert,
> I've used the EmbeddedSolrServer in a scenario like that and I never had
> problems.
> I assume you're talking about a standalone application, where the whole
> index resides locally and you don't need any cluster / cloud / distributed
> feature.
>
> I think the usage of EmbeddedSolrServer is discouraged in a (distributed)
> service scenario, because it is a direct connection to a SolrCore
> instance...but this is not a problem in the situation you described (as far
> as I know)
>
> Best,
> Andrea
>
>
> On 01/15/2015 03:10 PM, Robert Krüger wrote:
>
>> Hi,
>>
>> I have been using an embedded instance of solr in my desktop application
>> for a long time and it works fine. At the time when I made that decision
>> (vs. firing up a solr web application within my swing application) I got
>> the impression embedded use is somewhat unsupported and I should expect
>> problems.
>>
>> My first question is, is this still the case now (4 years later), that
>> embedded solr is discouraged?
>>
>> The one limitation I am running into is that I cannot use the solr admin
>> UI
>> for debugging purposes (mainly for running queries). Is there any other
>> way
>> to do this other than no longer using embedded solr and programmatically
>> firing up a web application (e.g. using jetty)? Should I do the latter
>> anyway?
>>
>> Any insights/advice greatly appreciated.
>>
>> Best regards,
>>
>> Robert
>>
>>
>


-- 
Robert Krüger
Managing Partner
Lesspain GmbH & Co. KG

www.lesspain-software.com


Re: Solr groups not matching with terms in a field

2015-01-15 Thread Ahmet Arslan


Hi Naresh,

Everything looks correct, what is the problem here?

If you want to see more than one document per group, there is a parameter for 
that (group.limit), which defaults to 1.

Ahmet



On Thursday, January 15, 2015 9:02 AM, Naresh Yadav  
wrote:
Hi all,

I had done following configuration to test Solr grouping concept.

solr version :  4.6.1 (tried in latest version 4.10.3 also)
Schema : http://www.imagesup.net/?di=10142124357616
Solrj code to insert docs :http://www.imagesup.net/?di=10142124381116
Response Group's :  http://www.imagesup.net/?di=1114212438351
Response Terms' : http://www.imagesup.net/?di=614212438580

Please let me know if I am doing something wrong here.


Re: Core deletion

2015-01-15 Thread phiroc
I duplicated an existing core, deleted the data directory and core.properties, 
updated solrconfig.xml and schema.xml, and loaded the new core in Solr's Admin 
Panel.

The logs contain a few 'index locked' errors:


solr.log:INFO  - 2015-01-15 14:43:09.492; 
org.apache.solr.core.CorePropertiesLocator; Found core inytapdf0 in 
/archives/solr/example/solr/inytapdf0/
solr.log:INFO  - 2015-01-15 14:49:17.685; 
org.apache.solr.core.CorePropertiesLocator; Found core inytapdf0 in 
/archives/solr/example/solr/inytapdf0/
solr.log.1:INFO  - 2015-01-05 18:08:13.253; 
org.apache.solr.core.CorePropertiesLocator; Found core inytapdf0 in 
/archives/solr/example/solr/inytapdf0/
solr.log.1:ERROR - 2015-01-05 18:08:17.467; org.apache.solr.core.CoreContainer; 
Error creating core [inytapdf0]: Index locked for write for core inytapdf0
solr.log.1:org.apache.solr.common.SolrException: Index locked for write for 
core inytapdf0
solr.log.1:Caused by: org.apache.lucene.store.LockObtainFailedException: Index 
locked for write for core inytapdf0
solr.log.1:INFO  - 2015-01-06 09:19:32.125; 
org.apache.solr.core.CorePropertiesLocator; Found core inytapdf0 in 
/archives/solr/example/solr/inytapdf0/
solr.log.1:ERROR - 2015-01-06 09:19:35.305; org.apache.solr.core.CoreContainer; 
Error creating core [inytapdf0]: Index locked for write for core inytapdf0
solr.log.1:org.apache.solr.common.SolrException: Index locked for write for 
core inytapdf0
solr.log.1:Caused by: org.apache.lucene.store.LockObtainFailedException: Index 
locked for write for core inytapdf0


Philippe




- Mail original -
De: "Dominique Bejean" 
À: solr-user@lucene.apache.org
Envoyé: Jeudi 15 Janvier 2015 11:46:43
Objet: Re: Core deletion

Hi,

Is there something in the Solr logs at startup that can explain the deletion?

How were the cores created? Using the Cores API?

Dominique
http://www.eolya.fr


2015-01-14 17:43 GMT+01:00 :

>
>
> Hello,
>
> I am running SOLR 4.10.0 on Tomcat 8.
>
> The solr.xml file in
> .../apache-tomcat-8.0.15_solr_8983/conf/Catalina/localhost looks like this:
>
>
> <Context docBase="…" crossContext="true">
>   <Environment name="solr/home" type="java.lang.String"
>    value="/archives/solr/example/solr" override="true"/>
> </Context>
>
> My SOLR instance contains four cores, including one whose instanceDir and
> dataDir have the following values:
>
>
> instanceDir:/archives/solr/example/solr/indexapdf0/
> dataDir:/archives/indexpdf0/data/
>
> Strangely enough, every time I restart Tomcat, this core's data, [and only
> this core's data,] get deleted, which is pretty annoying.
>
> How can I prevent it?
>
> Many thanks.
>
> Philippe
>
>
>
>
>
>
>
>
>
>
>
>
>


Re: Easiest way to embed solr in a desktop application

2015-01-15 Thread Andrea Gazzarini

Hi Robert,
I've used the EmbeddedSolrServer in a scenario like that and I never had 
problems.
I assume you're talking about a standalone application, where the whole 
index resides locally and you don't need any cluster / cloud / 
distributed feature.


I think the usage of EmbeddedSolrServer is discouraged in a 
(distributed) service scenario, because it is a direct connection to a 
SolrCore instance...but this is not a problem in the situation you 
described (as far as I know)


Best,
Andrea

On 01/15/2015 03:10 PM, Robert Krüger wrote:

Hi,

I have been using an embedded instance of solr in my desktop application
for a long time and it works fine. At the time when I made that decision
(vs. firing up a solr web application within my swing application) I got
the impression embedded use is somewhat unsupported and I should expect
problems.

My first question is, is this still the case now (4 years later), that
embedded solr is discouraged?

The one limitation I am running into is that I cannot use the solr admin UI
for debugging purposes (mainly for running queries). Is there any other way
to do this other than no longer using embedded solr and programmatically
firing up a web application (e.g. using jetty)? Should I do the latter
anyway?

Any insights/advice greatly appreciated.

Best regards,

Robert





Easiest way to embed solr in a desktop application

2015-01-15 Thread Robert Krüger
Hi,

I have been using an embedded instance of solr in my desktop application
for a long time and it works fine. At the time when I made that decision
(vs. firing up a solr web application within my swing application) I got
the impression embedded use is somewhat unsupported and I should expect
problems.

My first question is, is this still the case now (4 years later), that
embedded solr is discouraged?

The one limitation I am running into is that I cannot use the solr admin UI
for debugging purposes (mainly for running queries). Is there any other way
to do this other than no longer using embedded solr and programmatically
firing up a web application (e.g. using jetty)? Should I do the latter
anyway?

Any insights/advice greatly appreciated.

Best regards,

Robert


groups inside groups

2015-01-15 Thread Dmitry Kan
Hi solr users & developers,

Is it possible to group inside a group? The first level is a group query; the
second level is a single-valued field of each document inside the first-level
group, with counts.

That is, the trick is that the second level should contain facet counts on the
values of that single-valued field.

To illustrate:

First level: query:

Field1:[* TO *] AND Field2:[* TO *]

group.field = UserId

Result:
'grouped'=>{
'UserId'=>{
  'matches'=>22154,
  'groups'=>[]},
'Field1:[* TO *] AND Field2:[* TO *]'=>{
  'matches'=>22154,
  'doclist'=>{'numFound'=>1282,'start'=>0,'docs'=>[
  {
'UserId'=>255},
  {
'UserId'=>3042},
  {
'UserId'=>3428},
  {
'UserId'=>255},
  {
'UserId'=>3042},
  {
'UserId'=>3042},
  {
'UserId'=>68},
  {
'UserId'=>68},
  {
'UserId'=>68},
  {
'UserId'=>68}]
 }}}


Desired output:

'grouped'=>{
'UserId'=>{
  'matches'=>22154,
  'groups'=>[]},
'Field1:[* TO *] AND Field2:[* TO *]'=>{
  'matches'=>22154,
  'doclist'=>{'numFound'=>1282,'start'=>0,'docs'=>[
  {
'255'=>2},
  {
'3042'=>3},
  {
'3428'=>1},
  {
'68'=>4}]
 }}}
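A hedged side note on the desired output above: since the second level is just per-UserId counts over the same result set, plain field faceting on UserId over the first-level query (a sketch, untested against this schema) produces those numbers, albeit under facet_fields rather than nested inside the grouped section:

```
?q=Field1:[* TO *] AND Field2:[* TO *]
&rows=0
&facet=true
&facet.field=UserId
```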



-- 
Dmitry Kan
Luke Toolbox: http://github.com/DmitryKey/luke
Blog: http://dmitrykan.blogspot.com
Twitter: http://twitter.com/dmitrykan
SemanticAnalyzer: www.semanticanalyzer.info


Re: Core deletion

2015-01-15 Thread Dominique Bejean
Hi,

Is there something in the Solr logs at startup that can explain the deletion?

How were the cores created? Using the Cores API?

Dominique
http://www.eolya.fr


2015-01-14 17:43 GMT+01:00 :

>
>
> Hello,
>
> I am running SOLR 4.10.0 on Tomcat 8.
>
> The solr.xml file in
> .../apache-tomcat-8.0.15_solr_8983/conf/Catalina/localhost looks like this:
>
>
> <Context docBase="…" crossContext="true">
>   <Environment name="solr/home" type="java.lang.String"
>    value="/archives/solr/example/solr" override="true"/>
> </Context>
>
> My SOLR instance contains four cores, including one whose instanceDir and
> dataDir have the following values:
>
>
> instanceDir:/archives/solr/example/solr/indexapdf0/
> dataDir:/archives/indexpdf0/data/
>
> Strangely enough, every time I restart Tomcat, this core's data, [and only
> this core's data,] get deleted, which is pretty annoying.
>
> How can I prevent it?
>
> Many thanks.
>
> Philippe
>
>
>
>
>
>
>
>
>
>
>
>
>


dynamic field value in ValueSource

2015-01-15 Thread Mathijs Corten

Hello,

At the moment I'm writing my own Solr plugin with a custom 
ValueSourceParser and ValueSource; its goal is to read a few fields 
from the document (for now).


I'm currently testing with the following fields:

<dynamicField name="price_discount_*" … multiValued="false" />
<field name="…" … stored="true" multiValued="false" />


The problem is that when I try to read the dynamic field with the following 
code, it does not return the correct value (it prints '`' where it should be 6):


public FunctionValues getValues(Map map, final AtomicReaderContext arc)
        throws IOException {
    final BinaryDocValues valueDynamic = FieldCache.DEFAULT.getTerms(
            arc.reader(), "price_discount_1390980_6", false);

    return new FunctionValues() {

        @Override
        public float floatVal(int doc) {
            final BytesRef dynamic = valueDynamic.get(doc);
            LOGGER.error(dynamic.utf8ToString()); // prints '`'
            LOGGER.error(dynamic.toString());     // prints [60 8 0 0 0 6]
            return 0f; // placeholder until a real value is computed
        }
    };
}

When I execute the query price_discount_1390980_6:6, it gives a result, 
so I know the value of the field should be 6.


When I try to read a normal field like active or roomids it works fine:

public FunctionValues getValues(Map map, final AtomicReaderContext arc)
        throws IOException {
    final BinaryDocValues valueActive = FieldCache.DEFAULT.getTerms(
            arc.reader(), "active", false);
    final BinaryDocValues valueRoomIds = FieldCache.DEFAULT.getTerms(
            arc.reader(), "roomids", false);

    return new FunctionValues() {

        @Override
        public float floatVal(int doc) {
            final BytesRef active = valueActive.get(doc);
            LOGGER.error(active.utf8ToString());
            return 0f; // placeholder until a real value is computed
        }
    };
}

So my question is:
Does anyone know how to get/fix the dynamic field value? It would be nice 
if it just returned 6 :P.
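One observation, offered as a guess: the bytes [60 8 0 0 0 6] look like Lucene's trie-encoded numeric terms rather than UTF-8 text, which would explain the '`'. If price_discount_* is a trie int field, an untested fragment (Lucene 4.x FieldCache API) that reads the numeric cache entry instead of getTerms could look like this:

```java
// Fragment replacing the getTerms() call above; FieldCache.Ints
// decodes trie-encoded int terms into plain ints.
final FieldCache.Ints discounts = FieldCache.DEFAULT.getInts(
        arc.reader(), "price_discount_1390980_6", false);

return new FunctionValues() {
    @Override
    public float floatVal(int doc) {
        return discounts.get(doc);
    }
};
```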


Hope to see an answer soon,

Mathijs


Re: OutOfMemoryError for PDF document upload into Solr

2015-01-15 Thread Siegfried Goeschl

Hi Ganesh,

you can increase the heap size, but parsing a 4 GB PDF document will very 
likely consume A LOT OF memory - I think you need to check whether such a 
large PDF can be parsed at all :-)
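For the Tomcat side, a common way to raise the heap is a setenv file in Tomcat's bin/ directory (a sketch only; the path and the sizes are illustrative, adjust to your machine):

```
# $CATALINA_HOME/bin/setenv.sh -- picked up automatically by catalina.sh
export CATALINA_OPTS="$CATALINA_OPTS -Xms1g -Xmx8g"
```

On Windows the equivalent would be a bin\setenv.bat containing set CATALINA_OPTS=-Xms1g -Xmx8g.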


Cheers,

Siegfried Goeschl

On 14.01.15 18:04, Michael Della Bitta wrote:

Yep, you'll have to increase the heap size for your Tomcat container.

http://stackoverflow.com/questions/6897476/tomcat-7-how-to-set-initial-heap-size-correctly

Michael Della Bitta

Senior Software Engineer

o: +1 646 532 3062

appinions inc.

“The Science of Influence Marketing”

18 East 41st Street

New York, NY 10017

t: @appinions  | g+:
plus.google.com/appinions

w: appinions.com 

On Wed, Jan 14, 2015 at 12:00 PM,  wrote:


Hello,

Can someone pass on some hints to get around the following error? Is there a
heap-size parameter I can set in Tomcat, or in the Solr web app that gets
deployed in it?

I am running the Solr webapp inside Tomcat on my local machine, which has
12 GB of RAM. I have a PDF document of up to 4 GB in size that needs to be
loaded into Solr.




Exception in thread "http-apr-8983-exec-6" java.lang.OutOfMemoryError: Java heap space
 at java.util.AbstractCollection.toArray(Unknown Source)
 at java.util.ArrayList.<init>(Unknown Source)
 at
org.apache.pdfbox.cos.COSDocument.getObjects(COSDocument.java:518)
 at org.apache.pdfbox.cos.COSDocument.close(COSDocument.java:575)
 at org.apache.pdfbox.pdfparser.PDFParser.parse(PDFParser.java:254)
 at org.apache.pdfbox.pdmodel.PDDocument.load(PDDocument.java:1238)
 at org.apache.pdfbox.pdmodel.PDDocument.load(PDDocument.java:1203)
 at org.apache.tika.parser.pdf.PDFParser.parse(PDFParser.java:111)
 at
org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:242)
 at
org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:242)
 at
org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:120)
 at
org.apache.solr.handler.extraction.ExtractingDocumentLoader.load(ExtractingDocumentLoader.java:219)
 at
org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:74)
 at
org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
 at
org.apache.solr.core.RequestHandlers$LazyRequestHandlerWrapper.handleRequest(RequestHandlers.java:246)
 at org.apache.solr.core.SolrCore.execute(SolrCore.java:1967)
 at
org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:777)
 at
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:418)
 at
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:207)
 at
org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:241)
 at
org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:208)
 at
org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:220)
 at
org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:122)
 at
org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:170)
 at
org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:103)
 at
org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:950)
 at
org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:116)
 at
org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:421)
 at
org.apache.coyote.http11.AbstractHttp11Processor.process(AbstractHttp11Processor.java:1070)
 at
org.apache.coyote.AbstractProtocol$AbstractConnectionHandler.process(AbstractProtocol.java:611)
 at
org.apache.tomcat.util.net.AprEndpoint$SocketProcessor.doRun(AprEndpoint.java:2462)
 at
org.apache.tomcat.util.net.AprEndpoint$SocketProcessor.run(AprEndpoint.java:2451)


Thanks
Ganesh