Searching words with spaces for word without spaces in solr

2014-05-28 Thread sunshine glass
Dear Team,

How can I handle compound-word searches in Solr?
How can I search for "hand bag" when I have "handbag" in my index? While using
a shingle filter in the query analyzer, the query "ice cube" creates three tokens:
ice, cube, and icecube. Only ice and cube are matched, but not
icecube, i.e. the pair is not working even though I am using the shingle filter.

Here's the schema config.


   <fieldType name="text" class="solr.TextField" positionIncrementGap="100">
     <analyzer type="index">
       <filter class="solr.SynonymFilterFactory" synonyms="synonyms_text_prime_index.txt"
               ignoreCase="true" expand="true"/>
       <charFilter class="solr.HTMLStripCharFilterFactory"/>
       <tokenizer class="solr.StandardTokenizerFactory"/>
       <filter class="solr.ShingleFilterFactory" maxShingleSize="2"
               outputUnigrams="true" tokenSeparator=""/>
       <filter class="solr.WordDelimiterFilterFactory"
               catenateWords="1" catenateNumbers="1" catenateAll="1" preserveOriginal="1"
               generateWordParts="1" generateNumberParts="1"/>
       <filter class="solr.LowerCaseFilterFactory"/>
       <filter class="solr.SnowballPorterFilterFactory"
               language="English" protected="protwords.txt"/>
     </analyzer>
     <analyzer type="query">
       <tokenizer class="solr.StandardTokenizerFactory"/>
       <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
               ignoreCase="true" expand="true"/>
       <filter class="solr.ShingleFilterFactory" maxShingleSize="2"
               outputUnigrams="true" tokenSeparator=""/>
       <filter class="solr.WordDelimiterFilterFactory" preserveOriginal="1"/>
       <filter class="solr.LowerCaseFilterFactory"/>
       <filter class="solr.SnowballPorterFilterFactory"
               language="English" protected="protwords.txt"/>
     </analyzer>
   </fieldType>

   Any help is appreciated.
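For reference, the token stream described above can be simulated outside Solr. This is a rough sketch of what ShingleFilterFactory with maxShingleSize=2, outputUnigrams=true and an empty tokenSeparator produces (token order simplified; this is not Solr's actual filter code):

```python
def shingles(tokens, separator=""):
    """Emit each unigram plus each adjacent pair joined by `separator`
    (empty here, matching tokenSeparator="" in the schema above)."""
    out = list(tokens)
    for i in range(len(tokens) - 1):
        out.append(separator.join(tokens[i:i + 2]))
    return out

# "ice cube" yields the three tokens the poster observed:
print(shingles(["ice", "cube"]))   # -> ['ice', 'cube', 'icecube']
```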


Re: Solr shut down by itself

2014-05-28 Thread rachun
HI Alex,
Thank you very much for your suggestion.
I found an OOM problem before it shut down. Now I will try to fix that
problem.

Best,
Chun.



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Solr-shut-down-by-itself-tp4138233p4138378.html
Sent from the Solr - User mailing list archive at Nabble.com.


aliasing for Stats component

2014-05-28 Thread Mohit Jain
Hi,

In a Solr request one can specify aliasing for returned fields using
key:fl_name in the fl param. I was looking at the stats component and found
that similar support is not available. I do not want to expose internal
field names to the external world. The plan is to do it in the fl fashion instead
of post-processing the response at an external layer.

I was wondering if exclusion of this feature is by choice or it's just that
it was not added till now.

Thanks
Mohit


Offline Indexes Update to Shard

2014-05-28 Thread Vineet Mishra
Hi All,

Has anyone tried building offline indexes with EmbeddedSolrServer and
posting them to shards?
FYI, I am done building the indexes but am looking for a way to post these
index files to the shards.
Copying the indexes manually to each shard's replica is possible and works
fine, but I don't want to go with that approach.

Thanks!


Re: How to Create a weighted function (dismax or otherwise)

2014-05-28 Thread rulinma
good.



--
View this message in context: 
http://lucene.472066.n3.nabble.com/How-to-Create-a-weighted-function-dismax-or-otherwise-tp3119977p4138401.html
Sent from the Solr - User mailing list archive at Nabble.com.


RE: Error enquiry- exceeded limit of maxWarmingSearchers=2

2014-05-28 Thread M, Arjun (NSN - IN/Bangalore)
Hi,

	Also, is there a way to check whether autowarming has completed, or to make
the next commit wait till the previous commit finishes?

Thanks & Regards,
Arjun M


-Original Message-
From: ext Shawn Heisey [mailto:s...@elyograg.org] 
Sent: Wednesday, May 28, 2014 10:31 AM
To: solr-user@lucene.apache.org
Subject: Re: Error enquiry- exceeded limit of maxWarmingSearchers=2

 Hi,

 I am getting the below error.

 org.apache.solr.client.solrj.impl.HttpSolrServer$RemoteSolrException:
 Error opening new searcher. exceeded limit of
 maxWarmingSearchers=2, try again later.

This error is usually a symptom of a problem, not the actual problem.

Either you are running into performance issues that are making your
commits slow, or you are committing too frequently. Either way, you've
got a situation where one commit (with openSearcher=true) is not able to
finish before the next commit starts.

Solr puts a limit on the number of searcher objects that can be starting
up (warming) at the same time. You've exceeded that limit. Here's a wiki
page about slow commits:

http://wiki.apache.org/solr/SolrPerformanceProblems#Slow_commits

The rest of that wiki page discusses other things that can cause Solr
performance issues.

Thanks,
Shawn






Re: How to Create a weighted function (dismax or otherwise)

2014-05-28 Thread rulinma
good



--
View this message in context: 
http://lucene.472066.n3.nabble.com/How-to-Create-a-weighted-function-dismax-or-otherwise-tp3119977p4138411.html
Sent from the Solr - User mailing list archive at Nabble.com.


(Issue) How improve solr group performance

2014-05-28 Thread Alice.H.Yang (mis.cnsh04.Newegg) 41493
Hi, all
Does anybody have some advice for me on Solr group performance? I have
no idea how to improve the group performance.

To David Smiley:
I am not responsible for Endeca; it's a pity, but I have no comment on
Endeca.

Best Regards,
Alice Yang
+86-021-51530666*41493
Floor 19,KaiKai Plaza,888,Wanhandu Rd,Shanghai(200042)

-----Original Message-----
From: david.w.smi...@gmail.com [mailto:david.w.smi...@gmail.com]
Sent: May 27, 2014 21:29
To: solr-user@lucene.apache.org
Subject: Re: Re: (Issue) How improve solr facet performance

Alice,

RE grouping, try Solr 4.8’s new “collapse” qparser w/ the “expand”
SearchComponent.  The ref guide has the docs.  It’s usually a faster equivalent 
approach to group=true.

Do you care to comment further on NewEgg’s apparent switch from Endeca to Solr? 
 (confirm true/false and rationale)

~ David Smiley
Freelance Apache Lucene/Solr Search Consultant/Developer 
http://www.linkedin.com/in/davidwsmiley


On Tue, May 27, 2014 at 4:17 AM, Alice.H.Yang (mis.cnsh04.Newegg) 41493  
alice.h.y...@newegg.com wrote:

 Hi, Token

 1.
 I set the 3 fields with hundreds of values uses fc and the 
 rest uses enum, the performance is improved 2 times compared with no 
 parameter, and then I add facet.method=20 , the performance is 
 improved about 4 times compared with no parameter.
 And I also tried setting 9 facet field to one copyfield, I 
 test the performance, it is improved about 2.5 times compared with no 
 parameter.
 So, It is improved a lot under your advice, thanks a lot.
 2.
 Now I have another performance issue, It's the group performance.
 The number of data is as same as facet performance scenario.
 When the keyword search hits about one million documents, the QTime is 
 about 600ms.(It doesn't query the first time, it's in cache)

 Query url:

 select?fl=item_catalog&q=default_search:paramter&defType=edismax&rows=50&group=true&group.field=item_group_id&group.ngroups=true&group.sort=stock4sort%20desc,final_price%20asc,is_selleritem%20asc&sort=score%20desc,default_sort%20desc

 It needs a QTime of about 600ms.

 This query has two parameters:
 1. fl with one field
 2. group=true,
 group.ngroups=true

 If I set group=false, the QTime is only 1 ms.
 But I need both group and group.ngroups. How can I improve the group
 performance under this requirement? Do you have some advice for me? I'm
 looking forward to your reply.

 Best Regards,
 Alice Yang
 +86-021-51530666*41493
 Floor 19,KaiKai Plaza,888,Wanhandu Rd,Shanghai(200042)


 -----Original Message-----
 From: Toke Eskildsen [mailto:t...@statsbiblioteket.dk]
 Sent: May 24, 2014 15:17
 To: solr-user@lucene.apache.org
 Subject: RE: (Issue) How improve solr facet performance

 Alice.H.Yang (mis.cnsh04.Newegg) 41493 [alice.h.y...@newegg.com] wrote:
  1.  I'm sorry, I have made a mistake, the total number of documents 
  is
 32 Million, not 320 Million.
  2.  The system memory is large for solr index, OS total has 256G, I 
  set
 the solr tomcat HEAPSIZE=-Xms25G -Xmx100G

 100G is a very high number. What special requirements dictates such a 
 large heap size?

  Reply:  9 fields I facet on.

 Solr treats each facet separately and with facet.method=fc and 10M 
 hits, this means that it will iterate 9*10M = 90M document IDs and 
 update the counters for those.

  Reply:  3 facet fields have one hundred unique values, other 6 facet
 fields' unique values are between 3 to 15.

 So very low cardinality. This is confirmed by your low response time 
 of 6ms for 2925 hits.

  And we test this scenario:  If the number of facet fields' unique 
  values
 is less we add facet.method=enum, there is a little to improve performance.

 That is a shame: enum is normally the simple answer to a setup like yours.
 Have you tried fine-tuning your fc/enum selection, so that the 3 
 fields with hundreds of values uses fc and the rest uses enum? That 
 might halve your response time.


 Since the number of unique facets is so low, I do not think that 
 DocValues can help you here. Besides the fine-grained 
 fc/enum-selection above, you could try collapsing all 9 facet-fields 
 into a single field. The idea behind this is that for facet.method=fc, 
 performing faceting on a field with (for example) 300 unique values 
 takes practically the same amount of time as faceting on a field with 
 1000 unique values: Faceting on a single slightly larger field is much faster 
 than faceting on 9 smaller fields.
 After faceting with facet.limit=-1 on the single super-facet-field, 
 you must match the returned values back to their original fields:


 If you have the facet-fields

 field0: 34
 field1: 187
 field2: 78432
 field3: 3
 ...

 then collapse them by or-ing a field-specific mask that is bigger than 
 the max in any field, then put it all into a single field:

 fieldAll: 0xA000 | 34
 fieldAll: 0xA100 | 187
 fieldAll: 0xA200 | 78432
 fieldAll: 0xA300 | 3
 ...

 perform the 
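Toke's masking idea above can be sketched as follows (the shift width and field indices are illustrative assumptions, not the truncated mask constants from the mail):

```python
FIELD_SHIFT = 20  # assumption: every field's ordinal fits in 20 bits

def collapse_value(field_idx, value):
    """Tag a facet value with its field index so 9 facet fields can
    share one combined super-facet-field."""
    assert 0 <= value < (1 << FIELD_SHIFT)
    return (field_idx << FIELD_SHIFT) | value

def split_value(tagged):
    """Map a combined-field value back to (field_idx, original_value)."""
    return tagged >> FIELD_SHIFT, tagged & ((1 << FIELD_SHIFT) - 1)

# field2's value 78432 round-trips through the combined field:
assert split_value(collapse_value(2, 78432)) == (2, 78432)
```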

Re: Grouping performance problem

2014-05-28 Thread arres
Hello there, 
I am facing the same problem. 
Did anyone find a solution yet?
Thank you,
arres



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Grouping-performance-problem-tp3995245p4138419.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: (Issue) How improve solr group performance

2014-05-28 Thread Joel Bernstein
Alice,

How many unique groups are there in the field that you are grouping on?

When testing out the CollapsingQParserPlugin, take a look at the nullPolicy
option. If you're working with a product catalog, there is often a scenario
where some products belong to a group and some don't. For products that
don't have a group you can place a null in the group field and use the
expand nullPolicy, which will place each null-group record in its own
group. Using the nullPolicy like this will be much more memory-efficient
than placing a fake group id in the grouping field.
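Putting David's and Joel's suggestions together, a collapse/expand request might look like the following (it reuses the item_group_id field from the query quoted earlier; the other parameter values are illustrative):

```
q=default_search:keyword&defType=edismax
&fq={!collapse field=item_group_id nullPolicy=expand}
&expand=true
&sort=score desc,default_sort desc
```

Unlike group=true with group.ngroups=true, the collapsed result set's numFound directly reflects the number of groups, which is part of why this is usually faster.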





Joel Bernstein
Search Engineer at Heliosearch


On Wed, May 28, 2014 at 6:42 AM, Alice.H.Yang (mis.cnsh04.Newegg) 41493 
alice.h.y...@newegg.com wrote:

 Hi, all
 Does anybody has some advice for me on solr group performance. I
 have no idea on the group performance.

 To David Smiley
 I am not responsible for endeca, It's a pity ,I have no comment on
 endeca.

 Best Regards,
 Alice Yang
 +86-021-51530666*41493
 Floor 19,KaiKai Plaza,888,Wanhandu Rd,Shanghai(200042)

 -----Original Message-----
 From: david.w.smi...@gmail.com [mailto:david.w.smi...@gmail.com]
 Sent: May 27, 2014 21:29
 To: solr-user@lucene.apache.org
 Subject: Re: Re: (Issue) How improve solr facet performance

 Alice,

 RE grouping, try Solr 4.8’s new “collapse” qparser w/ the “expand”
 SearchComponent.  The ref guide has the docs.  It’s usually a faster
 equivalent approach to group=true.

 Do you care to comment further on NewEgg’s apparent switch from Endeca to
 Solr?  (confirm true/false and rationale)

 ~ David Smiley
 Freelance Apache Lucene/Solr Search Consultant/Developer
 http://www.linkedin.com/in/davidwsmiley


 On Tue, May 27, 2014 at 4:17 AM, Alice.H.Yang (mis.cnsh04.Newegg) 41493 
 alice.h.y...@newegg.com wrote:

  Hi, Token
 
  1.
  I set the 3 fields with hundreds of values uses fc and the
  rest uses enum, the performance is improved 2 times compared with no
  parameter, and then I add facet.method=20 , the performance is
  improved about 4 times compared with no parameter.
  And I also tried setting 9 facet field to one copyfield, I
  test the performance, it is improved about 2.5 times compared with no
 parameter.
  So, It is improved a lot under your advice, thanks a lot.
  2.
  Now I have another performance issue, It's the group performance.
  The number of data is as same as facet performance scenario.
  When the keyword search hits about one million documents, the QTime is
  about 600ms.(It doesn't query the first time, it's in cache)
 
  Query url:
 
  select?fl=item_catalog&q=default_search:paramter&defType=edismax&rows=50&group=true&group.field=item_group_id&group.ngroups=true&group.sort=stock4sort%20desc,final_price%20asc,is_selleritem%20asc&sort=score%20desc,default_sort%20desc
 
  It need Qtime about 600ms.
 
  This query have two parameter:
  1. fl one field
  2. group=true,
  group.ngroups=true
 
  If I set group=false,, the QTime is only 1 ms.
  But I need do group and group.ngroups, How can I improve the group
  performance under this demand. Do you have some advice for me. I'm
  looking forward to your reply.
 
  Best Regards,
  Alice Yang
  +86-021-51530666*41493
  Floor 19,KaiKai Plaza,888,Wanhandu Rd,Shanghai(200042)
 
 
  -----Original Message-----
  From: Toke Eskildsen [mailto:t...@statsbiblioteket.dk]
  Sent: May 24, 2014 15:17
  To: solr-user@lucene.apache.org
  Subject: RE: (Issue) How improve solr facet performance
 
  Alice.H.Yang (mis.cnsh04.Newegg) 41493 [alice.h.y...@newegg.com] wrote:
   1.  I'm sorry, I have made a mistake, the total number of documents
   is
  32 Million, not 320 Million.
   2.  The system memory is large for solr index, OS total has 256G, I
   set
  the solr tomcat HEAPSIZE=-Xms25G -Xmx100G
 
  100G is a very high number. What special requirements dictates such a
  large heap size?
 
   Reply:  9 fields I facet on.
 
  Solr treats each facet separately and with facet.method=fc and 10M
  hits, this means that it will iterate 9*10M = 90M document IDs and
  update the counters for those.
 
   Reply:  3 facet fields have one hundred unique values, other 6 facet
  fields' unique values are between 3 to 15.
 
  So very low cardinality. This is confirmed by your low response time
  of 6ms for 2925 hits.
 
   And we test this scenario:  If the number of facet fields' unique
   values
  is less we add facet.method=enum, there is a little to improve
 performance.
 
  That is a shame: enum is normally the simple answer to a setup like
 yours.
  Have you tried fine-tuning your fc/enum selection, so that the 3
  fields with hundreds of values uses fc and the rest uses enum? That
  might halve your response time.
 
 
  Since the number of unique facets is so low, I do not think that
  DocValues can help you here. Besides the fine-grained
  fc/enum-selection above, you could try collapsing all 9 facet-fields
  into a 

Re: ExtractingRequestHandler indexing zip files

2014-05-28 Thread marotosg
I extended ExtractingDocumentLoader with this patch and it works.
https://issues.apache.org/jira/secure/attachment/12473188/SOLR-2416_ExtractingDocumentLoader.patch

It iterates through all documents and extracts the name and the content of
each document inside the zip file.

Regards,
Sergio 



--
View this message in context: 
http://lucene.472066.n3.nabble.com/ExtractingRequestHandler-indexing-zip-files-tp4138172p4138427.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Error enquiry- exceeded limit of maxWarmingSearchers=2

2014-05-28 Thread Shawn Heisey
On 5/28/2014 3:45 AM, M, Arjun (NSN - IN/Bangalore) wrote:
   Also is there a way to check if autowarming completed (or) how to make 
 the next commit wait till previous commit finishes?

With Solr, probably not.  There might be a statistic available from an
admin handler that I don't know about, but as far as I know, your code
must be aware of approximately how long a commit is likely to take, and
not send another commit until you can be sure that the previous commit
is done.  This includes the commitWithin parameter on an update request.

Now that I've just said that, you *can* do an all-documents query with
rows=0 and look for a change in numFound.  An update might actually
result in no change to numFound, so you would need to build a
time-based exit into the loop that looks for numFound changes.
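That polling loop, with the time-based exit, could be sketched like this (`num_found` is a caller-supplied callable wrapping a `*:*` rows=0 query against your client library; it is not an actual SolrJ API):

```python
import time

def wait_for_visibility(num_found, previous, timeout=30.0, poll=0.5):
    """Poll num_found() (e.g. a *:* query with rows=0) until the count
    differs from `previous`, or the timeout expires.  An update may
    leave numFound unchanged, so the timeout is mandatory."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        current = num_found()
        if current != previous:
            return current
        time.sleep(poll)
    return None  # timed out: the count never changed

# Toy stand-in for a Solr client: count changes on the third poll.
counts = iter([100, 100, 101])
print(wait_for_visibility(lambda: next(counts), previous=100, poll=0.01))  # -> 101
```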

In the case of commits done automatically by the configuration
(autoCommit and/or autoSoftCommit), there is definitely no way to detect
when a previous commit is done.

The general recommendation with Solr 4.x is to have autoCommit enabled
with openSearcher=false, with a relatively short maxTime -- from 5
minutes down to 15 seconds, depending on indexing rate.  These commits
will not open a new searcher, and they will not make new documents visible.

For commits that affect which documents are visible, you need to
determine how long you can possibly stand to go without seeing new data
that has been indexed.  Once you know that time interval, you can use it
to do a manual commit, or you can set up autoSoftCommit with that
interval.  It is not at all unusual to have an autoCommit time interval
that's shorter than autoSoftCommit.
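A solrconfig.xml sketch of this recommendation (the maxTime values are examples to tune for your indexing rate, not prescriptions):

```xml
<updateHandler class="solr.DirectUpdateHandler2">
  <!-- Hard commit: flush to disk often, but do not open a new searcher -->
  <autoCommit>
    <maxTime>15000</maxTime>
    <openSearcher>false</openSearcher>
  </autoCommit>
  <!-- Soft commit: controls document visibility; pick the longest
       interval you can stand to wait for new data -->
  <autoSoftCommit>
    <maxTime>300000</maxTime>
  </autoSoftCommit>
</updateHandler>
```

Here the 15-second hard commit keeps the transaction log small without triggering searcher warming, while the 5-minute soft commit is the only one that makes new documents visible.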

This blog post mentions SolrCloud, but it is also applicable to Solr 4.x
when NOT running in cloud mode:

http://searchhub.org/2013/08/23/understanding-transaction-logs-softcommit-and-commit-in-sorlcloud/

Thanks,
Shawn



Re: aliasing for Stats component

2014-05-28 Thread Shalin Shekhar Mangar
Support for keys, tagging and excluding filters in StatsComponent was added
with SOLR-3177 in v4.8.0.

You can specify e.g. stats.field={!key=xyz}id and the output will use xyz
instead of id.
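A full request using that local param might look like this (the field and key names are illustrative):

```
/select?q=*:*&rows=0&stats=true&stats.field={!key=price_stats}internal_price_field
```

The response then reports the statistics under price_stats, so the internal field name never reaches the external world.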


On Wed, May 28, 2014 at 1:55 PM, Mohit Jain mo...@bloomreach.com wrote:

 Hi,

 In a solr request one can specify aliasing for returned fields using
 key:fl_name in fl param. I was looking at stats component and found
 that similar support is not available. I do not want to expose internal
 field names to external world. The plan is to do it in fl fashion instead
 of post-processing the response at external layer.

 I was wondering if exclusion of this feature is by choice or it's just that
 it was not added till now.

 Thanks
 Mohit




-- 
Regards,
Shalin Shekhar Mangar.


How sorting works with multiple shards?

2014-05-28 Thread Amey Patil
Hi,

I wanted to know how sorting works with multiple shards.

Suppose I have queried with 4 shards specified, records per page specified
as 100, and the sort field as creationDate. So will it sort and fetch 100 documents
from each shard, which are then aggregated and sorted again, with the top 100
given as the result, discarding the remaining 300?

My use case is -

I want to fetch documents with doc-id say A (or B or C etc.) and category W
X Y Z. Solr shards are created based on field category, so all the
documents with category W are in shard-W, all the documents with type X are
in shard-X and so on...

1st approach - query will be
(doc-id:A AND category:(W OR X)) OR (doc-id:B AND category:(W OR Y)) OR
  (doc-id:C AND category:(W OR X OR Y OR Z)) sorted on creationDate
Hit the query on all the shards.

2nd approach - there will be multiple queries
category:W AND (doc-id:(A OR B OR C))... sorted on creationDate. Hit this
query on shard-W
category:X AND (doc-id:(A OR C))... sorted on creationDate. Hit this query
on shard-X
category:Y AND (doc-id:(B OR C))... sorted on creationDate. Hit this query
on shard-Y
category:Z AND (doc-id:(C))... sorted on creationDate. Hit this query on
shard-Z
So there will be 4 queries, but avoiding the sort on aggregation.

I am using solr 3.4

Which approach will be efficient? My assumption about the working of
sorting in solr shards, is it correct?

Thanks,
Amey


Re: Regex with local params is not working

2014-05-28 Thread Jack Krupansky
Post the parsed query itself. Yes, edismax should always generate a 
DisjunctionMaxQuery, in addition to the RegexpQuery.


-- Jack Krupansky

-Original Message- 
From: Lokn

Sent: Wednesday, May 28, 2014 1:53 AM
To: solr-user@lucene.apache.org
Subject: Re: Regex with local params is not working

I tried with debug on, solr generates disjunctionmaxquery instead of
regexquery.
Is there something that I need to add to the query?

Thanks,
Lokesh



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Regex-with-local-params-is-not-working-tp4138257p4138373.html
Sent from the Solr - User mailing list archive at Nabble.com. 



Solrj Block Join Bean support

2014-05-28 Thread Archie Sheran
Hi, I would like to use a Solrj Bean to save/query nested documents from
Solr. Is this possible? I see that since version 4.5 it is possible to
use SolrInputDocument#addChildDocument(SolrInputDocument child), but
looking at the source code for
org.apache.solr.client.solrj.beans.DocumentObjectBinder
I see that nested documents are not taken into consideration. A new
annotation in addition to the @Field annotation would also be required, I
guess. Does anyone have any experience with this?

Regards,

Archie.


Email Notification for Sucess/Failure of Import Process.

2014-05-28 Thread EXTERNAL Taminidi Ravi (ETI, Automotive-Service-Solutions)
Hi, I am using an XML file for indexing in Solr. I am planning to make this 
process more automated: creating the XML file and loading it into Solr.

I would like to get an email once the process is completed. Is there any way in Solr 
this can be achieved? I am not seeing much input on configuring notifications in Solr.

Also, I am trying DIH using MS SQL. Can someone help me by sharing the 
data-config.xml if you are already using one for MSSQL, with a few basic steps?

Thanks

Ravi


Re: Regex with local params is not working

2014-05-28 Thread Yonik Seeley
On Wed, May 28, 2014 at 1:41 AM, Lokn nlokesh...@gmail.com wrote:
 Thanks for the reply.
 I am using edismax for the query parsing. Still it's not working.
 Instead of using local params, if I use the field directly then regex is
 working fine.

It's not for me...

This does not work:
http://localhost:8983/solr/query?defType=edismaxq=/[A-Z]olr/debugQuery=true

But this does work:
http://localhost:8983/solr/query?defType=luceneq=/[A-Z]olr/debugQuery=true

edismax was developed before the lucene query parser syntax was
changed to include regex, so maybe that's the issue.
Not that / was a great character to use for regex... it's too widely
used in URLs, paths, etc.  I'd almost argue against following lucene
syntax in this case and enabling regex a different way.


-Yonik
http://heliosearch.org - facet functions, subfacets, off-heap filters & fieldCache


Contribute QParserPlugin

2014-05-28 Thread Pawel Rog
Hi,
I need a QParserPlugin that will use Redis as a backend to prepare filter
queries. There are several data structures available in Redis (hash, set,
etc.). For some reasons I cannot fetch data from the Redis data structures
and build and send big requests from the application. That's why I want to
build those filters on the backend (Solr) side.

I'm wondering what I have to do to contribute a QParserPlugin into the Solr
repository. Can you suggest a way (in a few steps) to publish it in the Solr
repository, probably as a contrib?

--
Paweł Róg


Re: Email Notification for Sucess/Failure of Import Process.

2014-05-28 Thread Stefan Matheis
How about using DIH’s EventListeners? 
http://wiki.apache.org/solr/DataImportHandler#EventListeners  

-Stefan  


On Wednesday, May 28, 2014 at 5:31 PM, EXTERNAL Taminidi Ravi (ETI, 
Automotive-Service-Solutions) wrote:

 Hi I am using the XML file for Indexing In SOLR. I am planning to make this 
 process more automation. Creating XML File and Loading to SOLR.
  
 I like to get email once the process is completed. Is there any way in solr 
 can this achieved, I am not seeing more inputs on configure notification in 
 SOLR.
  
 Also I am trying DIH, using MS SQL , Someone can help me on sharing the 
 data-config.xml if you are using already once for MSSQL with few basic steps.
  
 Thanks
  
 Ravi  
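Per the EventListeners wiki page Stefan links, DIH can invoke a listener class when an import starts or ends. A data-config.xml sketch combining that with the MSSQL setup Ravi asks about (the listener classes, table, and connection details are hypothetical; the onImportEnd listener would contain the email-sending code):

```xml
<dataConfig>
  <dataSource driver="com.microsoft.sqlserver.jdbc.SQLServerDriver"
              url="jdbc:sqlserver://dbhost;databaseName=mydb"
              user="solr" password="secret"/>
  <!-- Listener classes must implement DIH's EventListener interface -->
  <document onImportStart="com.example.dih.ImportStartListener"
            onImportEnd="com.example.dih.ImportEndMailer">
    <entity name="product" query="SELECT id, name FROM products">
      <field column="id" name="id"/>
      <field column="name" name="name"/>
    </entity>
  </document>
</dataConfig>
```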



Solr Cell Tika - date.formats

2014-05-28 Thread ienjreny
Hello everybody

How can we pass more than one value for the date.formats parameter in the URL?

Is it comma-separated? Or can we just define it in solrconfig.xml?
Example:
http://server:port/solr/update/extract?date.formats=yyyy-MM-dd'T'HH:mm:ss'Z',yyyy-MM-dd'T'HH:mm:ss

Thanks in advance



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Solr-Cell-Tika-date-formats-tp4138478.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Contribute QParserPlugin

2014-05-28 Thread Alan Woodward
Hi Pawel,

The easiest thing to do is to open a JIRA ticket on the Solr project, here: 
https://issues.apache.org/jira/browse/SOLR, and attach your patch.

Alan Woodward
www.flax.co.uk


On 28 May 2014, at 16:50, Pawel Rog wrote:

 Hi,
 I need QParserPlugin that will use Redis as a backend to prepare filter
 queries. There are several data structures available in Redis (hash, set,
 etc.). From some reasons I cannot fetch data from redis data structures,
 build and send big requests from application. That's why I want to build
 that filters on backend (Solr) side.
 
 I'm wondering what do I have to do to contribute QParserPlugin into Solr
 repository. Can you suggest me a way (in a few steps) to publish it in Solr
 repository, probably as a contrib?
 
 --
 Paweł Róg



Re: Solrj Block Join Bean support

2014-05-28 Thread Mikhail Khludnev
Hello,

Never heard about it. Please raise an issue. I don't mean I have a plan to
address it soon. It just makes sense to bookmark it.


On Wed, May 28, 2014 at 5:57 PM, Archie Sheran archie.she...@gmail.comwrote:

 Hi, I would like to use a Solrj Bean to save/query nested documents from
 Solr. Is this possible? I see that since version 4.5 it is possible to
  use SolrInputDocument#addChildDocument(SolrInputDocument child), but
  looking at the source code for
 org.apache.solr.client.solrj.beans.DocumentObjectBinder
 I see that nested documents are not taken into consideration. A new
 annotation additional to the @Field annotation would also be required I
 guess. Does anyone have any experience with this?

 Regards,

 Archie.




-- 
Sincerely yours
Mikhail Khludnev
Principal Engineer,
Grid Dynamics

http://www.griddynamics.com
 mkhlud...@griddynamics.com


Re: How sorting works with multiple shards?

2014-05-28 Thread Shalin Shekhar Mangar
Your understanding of the sorting mechanism with many shards is almost
right. In reality, Solr doesn't fetch the entire document from each shard.
Instead, it fetches just the uniqueKey and the sort field's value and then
merges them to get the top N and then fetches the actual doc content for
those docs from the respective shards.

If you never need to merge results from shards, then it may be faster
to go with the 2nd approach; but if you do want merged results from all shards,
then Solr can do this faster than you can, and you should use approach #1.

As always, it is best to benchmark yourself.
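The merge Shalin describes can be sketched in a few lines: each shard contributes only (sort value, uniqueKey) pairs for its local top N, and the coordinator keeps the global top N (a toy model, not Solr's implementation):

```python
import heapq
from itertools import islice

def merge_top_n(shard_results, n):
    """Merge per-shard result lists (each already sorted by the sort
    field) and keep the global top n uniqueKeys; document bodies would
    be fetched afterwards, only for these winners."""
    merged = heapq.merge(*shard_results)
    return [doc_id for _, doc_id in islice(merged, n)]

shard_w = [(1, "A"), (5, "D")]   # (creationDate, doc-id), ascending
shard_x = [(2, "B"), (9, "E")]
shard_y = [(3, "C")]
print(merge_top_n([shard_w, shard_x, shard_y], 3))  # -> ['A', 'B', 'C']
```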


On Wed, May 28, 2014 at 7:19 PM, Amey Patil amey.pa...@germin8.com wrote:

 Hi,

 I wanted to know how sorting works with multiple shards.

 Suppose I have queried with 4 shards specified. Records per page specified
 as 100  sort-field as creationDate. So will it sort  fetch 100 documents
 from each shard, and then they will be aggregated, sorted again  top 100
 will be given as a result discarding remaining 300?

 My use case is -

 I want to fetch documents with doc-id say A (or B or C etc.) and category W
 X Y Z. Solr shards are created based on field category, so all the
 documents with category W are in shard-W, all the documents with type X are
 in shard-X and so on...

 1st approach - query will be
 (doc-id:A AND category:(W OR X)) OR (doc-id:B AND category:(W OR Y)) OR
   (doc-id:C AND category:(W OR X OR Y OR Z)) sorted on creationDate
 Hit the query on all the shards.

 2nd approach - there will be multiple queries
 category:W AND (doc-id:(A OR B OR C))... sorted on creationDate. Hit this
 query on shard-W
 category:X AND (doc-id:(A OR C))... sorted on creationDate. Hit this query
 on shard-X
 category:Y AND (doc-id:(B OR C))... sorted on creationDate. Hit this query
 on shard-Y
 category:Z AND (doc-id:(C))... sorted on creationDate. Hit this query on
 shard-Z
 So there will be 4 queries, but avoiding the sort on aggregation.

 I am using solr 3.4

 Which approach will be efficient? My assumption about the working of
 sorting in solr shards, is it correct?

 Thanks,
 Amey




-- 
Regards,
Shalin Shekhar Mangar.


Re: Contribute QParserPlugin

2014-05-28 Thread Otis Gospodnetic
Hi,

I think the question is not really how to do it - that's clear -
http://wiki.apache.org/solr/HowToContribute

The question is really about whether something like this would be of
interest to Solr community, whether it is likely it would be accepted into
Solr core or contrib, or whether, perhaps because of potentially unwanted
dependency on Redis, Solr dev community might not want this in Solr and
this might be better done outside Solr.

Not sure what the answer is. Maybe active Solr developers can chime in
here?  Or maybe the dev list is a better place to ask?

Otis
--
Performance Monitoring * Log Analytics * Search Analytics
Solr & Elasticsearch Support * http://sematext.com/


On Wed, May 28, 2014 at 2:03 PM, Alan Woodward a...@flax.co.uk wrote:

 Hi Pawel,

 The easiest thing to do is to open a JIRA ticket on the Solr project,
 here: https://issues.apache.org/jira/browse/SOLR, and attach your patch.

 Alan Woodward
 www.flax.co.uk


 On 28 May 2014, at 16:50, Pawel Rog wrote:

  Hi,
  I need QParserPlugin that will use Redis as a backend to prepare filter
  queries. There are several data structures available in Redis (hash, set,
  etc.). From some reasons I cannot fetch data from redis data structures,
  build and send big requests from application. That's why I want to build
  that filters on backend (Solr) side.
 
  I'm wondering what do I have to do to contribute QParserPlugin into Solr
  repository. Can you suggest me a way (in a few steps) to publish it in
 Solr
  repository, probably as a contrib?
 
  --
  Paweł Róg




Re: Solr Cell Tika - date.formats

2014-05-28 Thread Jack Krupansky

Pass multiple instances of the date.formats parameter:

http://server:port/solr/update/extract?date.formats=yyyy-MM-dd'T'HH:mm:ss'Z'&date.formats=yyyy-MM-dd'T'HH:mm:ss

But as the doc says, it comes preconfigured with all these formats:

yyyy-MM-dd'T'HH:mm:ss'Z'
yyyy-MM-dd'T'HH:mm:ss
yyyy-MM-dd
yyyy-MM-dd hh:mm:ss
yyyy-MM-dd HH:mm:ss
EEE MMM d hh:mm:ss z yyyy
EEE, dd MMM yyyy HH:mm:ss zzz
EEEE, dd-MMM-yy HH:mm:ss zzz
EEE MMM d HH:mm:ss yyyy

See:
https://cwiki.apache.org/confluence/display/solr/Uploading+Data+with+Solr+Cell+using+Apache+Tika

-- Jack Krupansky

-Original Message- 
From: ienjreny

Sent: Wednesday, May 28, 2014 1:00 PM
To: solr-user@lucene.apache.org
Subject: Solr Cell Tika - date.formats

Hello everybody

How can we pass more than one value for the date.formats parameter in the URL?

Is it comma-separated? Or can we just define it in solrconfig.xml?
Example:
http://server:port/solr/update/extract?date.formats=yyyy-MM-dd'T'HH:mm:ss'Z',yyyy-MM-dd'T'HH:mm:ss

Thanks in advance



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Solr-Cell-Tika-date-formats-tp4138478.html
Sent from the Solr - User mailing list archive at Nabble.com. 



Percolator feature

2014-05-28 Thread Jorge Luis Betancourt Gonzalez
Is there some workaround in the Solr ecosystem to get something similar to the
percolator feature offered by Elasticsearch?

Greetings!

VII International Summer School at UCI, from June 30 to July 11, 2014. See www.uci.cu


Re: Grouping on a multi-valued field

2014-05-28 Thread Erick Erickson
What would grouping on a multivalued field mean? Count the same doc
separately for each value in the MV field? Use the first value?

This seems similar to the problem of sorting on fields with more than
one token, any approach I can think of will be wrong.

But this smells like an XY problem. What is the use-case you're trying
to support? If you tell us that there might be alternate approaches
that would work OOB.

To answer your question: not that I know of.

Best
Erick

On Mon, May 26, 2014 at 11:28 PM, Bhoomit Vasani bhoomit.2...@gmail.com wrote:
  Hi,

 Does latest release of solr supports grouping on a multi-valued field?

 According to this
 https://wiki.apache.org/solr/FieldCollapsing#Known_Limitations it doesn't,
 but the doc was last updated 14 months ago...

 --
 --
 Thanks & Regards,
 Bhoomit Vasani | SE @ Mygola
 WE are LIVE http://www.mygola.com/!
 91-8892949849


Re: Applying boosting for keyword search

2014-05-28 Thread Erick Erickson
The issue is absolute ordering (sort) and influencing (boosting).

Here's an example

         score
         (no boosts)   popularity
doc1     100           1
doc2      75           2
doc3      10           3

Sorting by popularity asc  will return doc1, doc2, doc3
Sorting by popularity desc will return doc3, doc2, doc1

It doesn't matter at all what the score is: sorting by popularity
ascending returns that order even if the score of doc3 is 10,000 and
the score of doc1 is 100. Sorting totally overrides ranking.


Boosting, on the other hand, only changes order if you sort by score
(which is the default, ranking). So sorting by score desc would
return doc1, doc2, doc3.

Now, say you boost the docs such that you add 50 to the score for
doc2. The returned order would be doc2, doc1, doc3.

The deal here is that boosting changes the _score_, but doesn't impose
an absolute ordering. Sorting by the value in a field imposes an
absolute, unchanging ordering.


Best,
Erick
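The distinction above can be sketched in a few lines of plain Python (this is an illustration of the ordering logic, not Solr code; the doc values are the ones from the example):

```python
# Three docs with a relevance score and a popularity field.
docs = [
    {"id": "doc1", "score": 100, "popularity": 1},
    {"id": "doc2", "score": 75,  "popularity": 2},
    {"id": "doc3", "score": 10,  "popularity": 3},
]

# Sorting on a field ignores the score entirely.
by_pop_asc = [d["id"] for d in sorted(docs, key=lambda d: d["popularity"])]
print(by_pop_asc)  # ['doc1', 'doc2', 'doc3'] regardless of any boosting

# A boost only changes the score, so it only matters under the default
# score sort: add 50 to doc2's score and re-sort by score descending.
boosted = [{**d, "score": d["score"] + (50 if d["id"] == "doc2" else 0)}
           for d in docs]
by_score_desc = [d["id"] for d in sorted(boosted, key=lambda d: -d["score"])]
print(by_score_desc)  # ['doc2', 'doc1', 'doc3']
```

In other words, `sort` imposes an absolute ordering, while `boost` only nudges the quantity that the default ordering happens to use.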


On Tue, May 27, 2014 at 12:30 AM, manju16832003 manju16832...@gmail.com wrote:
 Hi Erick,

 Your explanation leads me to one question :-)

 if
 */select?q=featured:true^100&fq=make:toyota&sort=featured_date desc,price
 asc*

 The above query, without edismax, works well because, If I'm not mistaken
 its boosting document by value method.

 So I'm boosting all my documents with the value featured=true and all those
 documents would be sorted by their featured date in descending order (Latest
 featured documents) and price (lower to higher).

 My question is,
 If we were to boost the documents based on a value, how could we make sure
 the order of the documents?

 For example :
 https://wiki.apache.org/solr/SolrRelevancyFAQ
 defType=dismax&qf=text&q=supervillians&bf=popularity

 In the above case, all the documents that contains the word *popularity*
 would be on top depends on their score.

 However, I want to order the documents by certain criteria that contains the
 word popularity So we would have to use *sort* to order the documents.

 if we say, boosting has no or almost no effect if we use sort, then whats
 the contradiction story between *sort* and *boost*

 :-) would be interesting to know the answer



 --
 View this message in context: 
 http://lucene.472066.n3.nabble.com/Applying-boosting-for-keyword-search-tp4137523p4138241.html
 Sent from the Solr - User mailing list archive at Nabble.com.


edge_ngram and short words containing digits

2014-05-28 Thread Kevin Murphy
Hi,

I’m using Django Haystack 2.1.0 with Solr 4.8.1 in an auto-complete
application.  I’ve noticed that words containing digits are not being matched.  
Examples are ‘B2B’, ‘PSG4’, and ‘5S_rRNA’.  The words match up to the 
occurrence of the digit and fail starting with the digit.

Below is what I believe to be the relevant chunk from the Haystack-generated 
Solr schema.xml.  If I need to include more, let me know.

COPB2

<fieldType name="edge_ngram" class="solr.TextField"
    positionIncrementGap="1">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory" />
    <filter class="solr.LowerCaseFilterFactory" />
    <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1"
        generateNumberParts="1" catenateWords="0" catenateNumbers="0"
        catenateAll="0" splitOnCaseChange="1"/>
    <filter class="solr.EdgeNGramFilterFactory" minGramSize="2"
        maxGramSize="15" side="front" />
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory" />
    <filter class="solr.LowerCaseFilterFactory" />
    <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1"
        generateNumberParts="1" catenateWords="0" catenateNumbers="0"
        catenateAll="0" splitOnCaseChange="1"/>
  </analyzer>
</fieldType>

Can I get this to work by tweaking the WordDelimiterFilterFactory attributes 
somehow, or do I need to do something else?

Thanks,
Kevin



Re: Percolator feature

2014-05-28 Thread Otis Gospodnetic
Yes - Luwak.  Stay tuned for more. :)

Otis
--
Performance Monitoring * Log Analytics * Search Analytics
Solr & Elasticsearch Support * http://sematext.com/


On Wed, May 28, 2014 at 4:44 PM, Jorge Luis Betancourt Gonzalez 
jlbetanco...@uci.cu wrote:

 Is there some work around in Solr ecosystem to get something similar to
 the percolator feature offered by elastic search?

 Greetings!



Transfer Existing Index to Core with Clean Index

2014-05-28 Thread ScottFree
Hello Solr Community!

I'm very new at Solr, and I have an issue I can't get around. I did a
'full-import' command on an existing solr server, but forgot to specify
'clean=false', so the entire index ended up getting deleted.

Luckily for me, I had previously made a copy of the index in case something
of this nature would happen.

How am I able to get the server to re-recognize its old index?

Thank you very much! This whole thing has been very stressful :P



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Transfer-Existing-Index-to-Core-with-Clean-Index-tp4138530.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: edge_ngram and short words containing digits

2014-05-28 Thread Kevin Murphy
On May 28, 2014, at 6:19 PM, Kevin Murphy murft...@gmail.com wrote:
 i’m using Django Haystack 2.1.0 with Solr 4.8.1 in an auto-complete 
 application.  I’ve noticed that words containing digits are not being 
 matched.  Examples are ‘B2B’, ‘PSG4’, and ‘5S_rRNA’.  The words match up to 
 the occurrence of the digit and fail starting with the digit.

I solved the problem by adding `splitOnNumerics="0"` to the
solr.WordDelimiterFilterFactory filter for both the index and query analyzers.
I don’t know if there is a potential downside to this.

Regards,
Kevin
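A rough pure-Python illustration of why the fix works (this only mimics the Lucene filters loosely, it is not the actual implementation): with splitOnNumerics enabled, the word-delimiter step breaks a token at every letter/digit boundary before edge n-grams are built, and the resulting one-character pieces are shorter than minGramSize=2.

```python
import re

def word_delimiter(token, split_on_numerics=True):
    # Approximation of WordDelimiterFilter's numeric handling.
    if split_on_numerics:
        return re.findall(r"[A-Za-z]+|[0-9]+", token)
    return re.findall(r"[A-Za-z0-9]+", token)

def edge_ngrams(token, min_gram=2, max_gram=15):
    # Front edge n-grams, as EdgeNGramFilter with side="front" would build.
    return [token[:n] for n in range(min_gram, min(max_gram, len(token)) + 1)]

# splitOnNumerics on: 'B2B' -> ['B', '2', 'B'], none long enough to gram.
split = [g for part in word_delimiter("B2B") for g in edge_ngrams(part)]
print(split)  # []

# splitOnNumerics="0": the whole token survives and prefix grams exist.
kept = [g for part in word_delimiter("B2B", split_on_numerics=False)
        for g in edge_ngrams(part)]
print(kept)  # ['B2', 'B2B']
```

So with the old setting there was simply nothing in the index for a prefix query starting at the digit to match.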



Problem with French stopword filter

2014-05-28 Thread Shamik Bandopadhyay
Hi,

  I'm having issues with the French stop filter factory. Search doesn't work
when I use a stop word in a phrase search. For example, if I search "arc de
cercle", Solr doesn't return any results. It works, however, if I use "arc
cercle". Here's my schema setting:

<field name="title_fra" type="adsktext_fra" indexed="true" stored="true"
    multiValued="true"/>
<field name="name_fra" type="adsktext_fra" indexed="true" stored="true"/>
<field name="description_fra" type="adsktext_fra" indexed="true"
    stored="true"/>


<fieldType name="adsktext_fra" class="solr.TextField"
    positionIncrementGap="100" autoGeneratePhraseQueries="true">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true"
        words="lang/stopwords_fr.txt" format="snowball" />
    <filter class="solr.ElisionFilterFactory" ignoreCase="true"
        articles="lang/contractions_fr.txt"/>
    <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1"
        generateNumberParts="1" catenateWords="1" catenateNumbers="1"
        catenateAll="0" splitOnCaseChange="0"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.FrenchLightStemFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true"
        words="lang/stopwords_fr.txt" format="snowball" />
    <filter class="solr.ElisionFilterFactory" ignoreCase="true"
        articles="lang/contractions_fr.txt"/>
    <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1"
        generateNumberParts="1" catenateWords="1" catenateNumbers="1"
        catenateAll="0" splitOnCaseChange="0"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.FrenchLightStemFilterFactory"/>
  </analyzer>
</fieldType>

Sample data
==

<doc>
  <field name="id">11!SOLR11091</field>
  <field name="name_fra">Fast server Enterprise</field>
  <field name="title_fra">ARC (commande)</field>
  <field name="description_fra">Crée un arc de cercle. Trouver Résumé Pour
créer un arc, vous pouvez également indiquer des combinaisons comprenant le
centre, l'extrémité, le point de départ, le rayon, l'angle, la longueur de
corde et la direction. Les arcs sont tracés par défaut dans le sens
trigonométrique. Maintenez la touche Ctrl enfoncée tout en déplaçant le
curseur afin d'effectuer le tracé dans le sens des aiguilles d'une montre.
Liste des invites Les invites suivantes s'affichent. Point de départ
Dessine un arc à partir des trois points spécifiés sur la circonférence de
l'arc. Le premier point est le point de départ (1). Remarque : Si vous
appuyez sur ENTREE sans spécifier de point, l'extrémité de la dernière
ligne ou du dernier arc tracé est utilisée, et vous êtes immédiatement
invité à spécifier l'extrémité du nouvel arc. Un arc tangent à la dernière
ligne, à la dernière polyligne ou au dernier arc tracé est ainsi créé.
Second point Permet de spécifier le deuxième (2) point comme point sur la
circonférence de l'arc. Extrémité Permet de spécifier le point final (3)
sur l'arc. Vous pouvez spécifier un arc à trois points dans le sens horaire
ou trigonométrique. Centre Commence par spécifier le centre du cercle dont
l'arc fait partie. Point de départ Permet de spécifier le point de départ
de l'arc. Extrémité A partir du centre (2), dessine un arc dans le sens
trigonométrique entre le point de départ (1) et un point d'arrivée situé
sur une demi-droite imaginaire tracée entre le centre et le troisième point
(3). Comme le montre l'illustration ci-contre, l'arc ne passe pas
nécessairement par ce troisième point. Angle Dessine un arc dans le sens
trigonométrique à partir du point de départ (1), en utilisant un centre
(2), avec un angle décrit spécifié. Si l'ange est négatif, un arc est tracé
dans le sens horaire. Longueur de corde Trace un grand ou un petit arc en
respectant la distance en ligne droite entre le point de départ et le point
d'arrivée. Si la longueur de corde est positive, le petit arc est tracé
dans le sens trigonométrique à partir du point de départ. Si la longueur de
corde est négative, le grand arc est tracé dans le sens trigonométrique.
Fin Commence par spécifier l'extrémité de l'arc. Centre Trace un arc dans
le sens trigonométrique depuis le point de départ (1) jusqu'à un point
final situé sur une demi-droite imaginaire obtenue en partant du centre (3)
vers le point que vous indiquez (2). Angle Dessine un arc dans le sens
trigonométrique du point de départ (1) au point d'arrivée (2), avec un
angle décrit spécifié. Si l'ange est négatif, un arc est tracé dans le sens
horaire. Angle décrit Entrez un angle en degrés ou spécifiez un angle en
déplaçant le périphérique de pointage dans le sens trigonométrique.
Direction Commence à dessiner l'arc tangent à la direction spécifiée. Cette
option permet de dessiner un arc, grand ou petit, dans le sens horaire ou
trigonométrique, entre un point de départ (1) et une extrémité (2). La
direction est déterminée à partir du point de départ. Rayon Dessine le
petit arc dans le sens trigonométrique entre le point de départ (1) et
l'extrémité (2). Si le rayon est négatif, le grand arc est tracé. Centre
Spécifie le 

Re: Contribute QParserPlugin

2014-05-28 Thread Alexandre Rafalovitch
Well, Solr just bundled a set of Hadoop jars that don't actually
contribute anything to Solr itself (not really integrated, etc.). So, I
am not sure how the "may not want" process happened there. It would be
nice to have one actually, because there is a slowly building wave of
external components for Solr which are completely not discoverable by
the Solr community at large.

So, I would love us to (re-?)start a serious discussion on the
plugin model for Solr. Probably on the dev list.

I would even commit to building an initial package discovery/search
website if the dev-list powers agree on how that mechanism
(packages/plugins/downloads) should look. Elasticsearch is very
obviously benefiting from having a plugin system. Solr's kitchen-sink
approach worked when it was the only one, but with the increased speed of
releases and the growing packages it is becoming very noticeably
pudgy. It even had to be excused during the Solr vs. Elasticsearch
presentation at Berlin Buzzwords a couple of days ago.

Regards,
   Alex.
P.s. Regarding the specific issue, I know of another Redis plugin. Not
sure how relevant or useful it is, but at least it exists:
https://github.com/dfdeshom/solr-redis-cache

Personal website: http://www.outerthoughts.com/
Current project: http://www.solr-start.com/ - Accelerating your Solr proficiency


On Thu, May 29, 2014 at 2:50 AM, Otis Gospodnetic
otis.gospodne...@gmail.com wrote:
 Hi,

 I think the question is not really how to do it - that's clear -
 http://wiki.apache.org/solr/HowToContribute

 The question is really about whether something like this would be of
 interest to Solr community, whether it is likely it would be accepted into
 Solr core or contrib, or whether, perhaps because of potentially unwanted
 dependency on Redis, Solr dev community might not want this in Solr and
 this might be better done outside Solr.

 Not sure what the answer is. maybe active Solr developers can chime in
 here?  Or maybe dev list is a better place to ask?

 Otis
 --
 Performance Monitoring * Log Analytics * Search Analytics
 Solr & Elasticsearch Support * http://sematext.com/


 On Wed, May 28, 2014 at 2:03 PM, Alan Woodward a...@flax.co.uk wrote:

 Hi Pawel,

 The easiest thing to do is to open a JIRA ticket on the Solr project,
 here: https://issues.apache.org/jira/browse/SOLR, and attach your patch.

 Alan Woodward
 www.flax.co.uk


 On 28 May 2014, at 16:50, Pawel Rog wrote:

  Hi,
  I need QParserPlugin that will use Redis as a backend to prepare filter
  queries. There are several data structures available in Redis (hash, set,
  etc.). From some reasons I cannot fetch data from redis data structures,
  build and send big requests from application. That's why I want to build
  that filters on backend (Solr) side.
 
  I'm wondering what do I have to do to contribute QParserPlugin into Solr
  repository. Can you suggest me a way (in a few steps) to publish it in
 Solr
  repository, probably as a contrib?
 
  --
  Paweł Róg




Re: Contribute QParserPlugin

2014-05-28 Thread Otis Gospodnetic
Hi,

On Wed, May 28, 2014 at 10:58 PM, Alexandre Rafalovitch
arafa...@gmail.comwrote:

 Well, Solr just bundled a set of Hadoop jars that does not actually
 contribute anything to Solr itself (not really integrated, etc). So, I


Good point about Hadoop jars.


 am not sure how the may not want process happened there. Would be
 nice to have one actually, because there is a slow building wave of
 external components for Solr which are completely not discoverable by
 the Solr community at large.


Agreed. And a Wiki page where people can add this, or Google, doesn't cut
it? (serious question)


 So, I would love us to (re-?)start the serious discussion on the
 plugin model for Solr. Probably on the dev list.


Sure.  Separate thread?

I would even commit to building an initial package discovery/search
 website if the dev-list powers would agree on how that mechanism
 (package/plugins/downloads) should look like. ElasticSearch is very
 obviously benefiting from having a plugin system. Solr's kitchen-sync
 approach worked when it was the only one. But with increased speed of
 releases and the growing packages, it is becoming very noticeably
 pudgy. It even had to be excused during the Solr vs. ElasticSearch
 presentation at the BerlinBuzz a couple of days ago.


For the curious - Alex is referring to
http://blog.sematext.com/2014/05/28/presentation-and-video-side-by-side-with-solr-and-elasticsearch/

Re building something - may be best to talk about that in that separate
thread.


 P.s. Regarding the specific issue, I know of another Redis plugin. Not
 sure how relevant or useful it is, but at least it exists:
 https://github.com/dfdeshom/solr-redis-cache


Thanks.  It's different from what Pawel was asking about.  Maybe Pawel can
provide a couple of examples so people can better understand what he is
looking to do.

Otis
--
Performance Monitoring * Log Analytics * Search Analytics
Solr & Elasticsearch Support * http://sematext.com/





 Personal website: http://www.outerthoughts.com/
 Current project: http://www.solr-start.com/ - Accelerating your Solr
 proficiency


 On Thu, May 29, 2014 at 2:50 AM, Otis Gospodnetic
 otis.gospodne...@gmail.com wrote:
  Hi,
 
  I think the question is not really how to do it - that's clear -
  http://wiki.apache.org/solr/HowToContribute
 
  The question is really about whether something like this would be of
  interest to Solr community, whether it is likely it would be accepted
 into
  Solr core or contrib, or whether, perhaps because of potentially unwanted
  dependency on Redis, Solr dev community might not want this in Solr and
  this might be better done outside Solr.
 
  Not sure what the answer is. maybe active Solr developers can chime
 in
  here?  Or maybe dev list is a better place to ask?
 
  Otis
  --
  Performance Monitoring * Log Analytics * Search Analytics
  Solr & Elasticsearch Support * http://sematext.com/
 
 
  On Wed, May 28, 2014 at 2:03 PM, Alan Woodward a...@flax.co.uk wrote:
 
  Hi Pawel,
 
  The easiest thing to do is to open a JIRA ticket on the Solr project,
  here: https://issues.apache.org/jira/browse/SOLR, and attach your
 patch.
 
  Alan Woodward
  www.flax.co.uk
 
 
  On 28 May 2014, at 16:50, Pawel Rog wrote:
 
   Hi,
   I need QParserPlugin that will use Redis as a backend to prepare
 filter
   queries. There are several data structures available in Redis (hash,
 set,
   etc.). From some reasons I cannot fetch data from redis data
 structures,
   build and send big requests from application. That's why I want to
 build
   that filters on backend (Solr) side.
  
   I'm wondering what do I have to do to contribute QParserPlugin into
 Solr
   repository. Can you suggest me a way (in a few steps) to publish it in
  Solr
   repository, probably as a contrib?
  
   --
   Paweł Róg
 
 



Re: Problem with French stopword filter

2014-05-28 Thread shamik
Turned out to be a weird exception. Apparently, the comments in
stopwords_fr.txt disrupt the stop filter factory. After I stripped off the
comments, it worked as expected.

Referred to this thread :
http://mail-archives.apache.org/mod_mbox/lucene-dev/201309.mbox/%3CJIRA.12668581.1379112889603.133757.1379118831671@arcas%3E
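For anyone hitting the same wall: with format="snowball", the stop filter reads the word list in Snowball style, where only text after a '|' is a comment and a line may carry several words. The sketch below only loosely mimics that reader, but it shows how comments written in any other style end up parsed as bogus stopwords:

```python
def load_snowball_stopwords(text):
    # Loose approximation of a snowball-format word-list reader:
    # strip '|' comments, then split each line on whitespace.
    words = []
    for line in text.splitlines():
        line = line.split("|", 1)[0]
        words.extend(line.split())
    return words

sample = "de | preposition\nla le les\n# not a snowball comment\n"
print(load_snowball_stopwords(sample))
# ['de', 'la', 'le', 'les', '#', 'not', 'a', 'snowball', 'comment']
```

Note how every token of the '#' line, including '#' itself, is treated as a stopword, which is exactly the kind of silent breakage described above.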



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Problem-with-French-stopword-filter-tp4138545p4138550.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Contribute QParserPlugin

2014-05-28 Thread Alexandre Rafalovitch
On Thu, May 29, 2014 at 10:25 AM, Otis Gospodnetic
otis.gospodne...@gmail.com wrote:
 am not sure how the may not want process happened there. Would be
 nice to have one actually, because there is a slow building wave of
 external components for Solr which are completely not discoverable by
 the Solr community at large.


 Agreed and a Wiki page where people can add this or Google don't cut
 it? (serious question)

It does not. The current Wiki is extremely out of date, and offers no
incentives or automation around having the plugins published there.
Plus, the Wiki itself is in a transition and no real decision has been
made about either the fate of the current wiki or the proper public
contribution model for the new Guide. This needs to be structured,
discoverable, and actionable. Not difficult, just structured in the
right way.

Google does not work, mostly because Solr articles/modules/extensions
have moved from information availability into information abundance,
with a significant need for curation.

Yes, something could be done through the heroic efforts of one individual
(subscribe to my solr-start.com mailing list for an announcement in a
relevant area), but proper package support is needed.

 So, I would love us to (re-?)start the serious discussion on the
 plugin model for Solr. Probably on the dev list.


 Sure.  Separate thread?
Done. On dev.

Regards,
   Alex.


Personal website: http://www.outerthoughts.com/
Current project: http://www.solr-start.com/ - Accelerating your Solr proficiency


Re: Solr Cell Tika - date.formats

2014-05-28 Thread ienjreny
Thanks for your fast answer


On Wed, May 28, 2014 at 11:23 PM, Jack Krupansky-2 [via Lucene] 
ml-node+s472066n4138505...@n3.nabble.com wrote:

 Pass multiple instances of the date.formats parameter:

 http://server:port/solr/update/extract?date.formats=yyyy-MM-dd'T'HH:mm:ss'Z'&date.formats=yyyy-MM-dd'T'HH:mm:ss

 But as the doc says, it comes preconfigured with all these formats:

 yyyy-MM-dd'T'HH:mm:ss'Z'
 yyyy-MM-dd'T'HH:mm:ss
 yyyy-MM-dd
 yyyy-MM-dd hh:mm:ss
 yyyy-MM-dd HH:mm:ss
 EEE MMM d hh:mm:ss z yyyy
 EEE, dd MMM yyyy HH:mm:ss zzz
 EEEE, dd-MMM-yy HH:mm:ss zzz
 EEE MMM d HH:mm:ss yyyy

 See:

 https://cwiki.apache.org/confluence/display/solr/Uploading+Data+with+Solr+Cell+using+Apache+Tika

 -- Jack Krupansky

 -Original Message-
 From: ienjreny
 Sent: Wednesday, May 28, 2014 1:00 PM
 To: [hidden email] http://user/SendEmail.jtp?type=nodenode=4138505i=0
 Subject: Solr Cell Tika - date.formats

 Hello everybody

 How can we pass more than value for date.formats parameter into the URL

 Is comma seperated? Or we ca just define it into solrconfig.xml?
 Example:
 http://server:port/solr/update/extract?date.formats=yyyy-MM-dd'T'HH:mm:ss'Z',yyyy-MM-dd'T'HH:mm:ss

 Thanks in advance



 --
 View this message in context:

 http://lucene.472066.n3.nabble.com/Solr-Cell-Tika-date-formats-tp4138478.html
 Sent from the Solr - User mailing list archive at Nabble.com.








--
View this message in context: 
http://lucene.472066.n3.nabble.com/Solr-Cell-Tika-date-formats-tp4138478p4138560.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: Problem with French stopword filter

2014-05-28 Thread shamik
I found the issue. It had to do with the edismax qf entry in the request
handler. I had the following entry:

<str name="qf">name_fra^1.2 title_fra^10.0 description_fra^5.0
author^1</str>

Except for author, all other fields are of type adsktext_fra, while author
is of type text_general, which uses the English stop filter.



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Problem-with-French-stopword-filter-tp4138545p4138561.html
Sent from the Solr - User mailing list archive at Nabble.com.


Solr High GC issue

2014-05-28 Thread bihan.chandu
Hi All

I am currently using Solr 3.6.1 and my system handles a lot of requests. Now
we are facing a high GC issue in the system. Please find the memory parameters
of my Solr system below. Can someone help me identify whether there is any
relationship between my memory parameters and the GC issue?

MEM_ARGS=-Xms7936M -Xmx7936M -XX:NewSize=512M -XX:MaxNewSize=512M -Xss1024k
-XX:ParallelGCThreads=8 -XX:+UseConcMarkSweepGC
-XX:CMSInitiatingOccupancyFraction=80 -XX:+UseCMSInitiatingOccupancyOnly
-XX:+CMSParallelRemarkEnabled -XX:+AggressiveOpts
-XX:LargePageSizeInBytes=2m -XX:+UseLargePages -XX:MaxTenuringThreshold=15
-XX:-UseAdaptiveSizePolicy -XX:PermSize=256M -XX:MaxPermSize=256M
-XX:SurvivorRatio=4 -XX:TargetSurvivorRatio=90 -XX:+PrintGCDetails
-XX:+PrintGCTimeStamps -XX:+PrintGCApplicationStoppedTime -XX:+PrintGC
-Xloggc:${GCLOG} -XX:-OmitStackTraceInFastThrow -XX:+DisableExplicitGC
-XX:-BindGCTaskThreadsToCPUs -verbose:gc -XX:StackShadowPages=20

Thanks 
Bihan 



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Solr-High-GC-issue-tp4138570.html
Sent from the Solr - User mailing list archive at Nabble.com.