Re: Suggester configuration queries.
Hi,

I am using the Solr Terms Component for auto-suggestion; it provides the functionality my requirements call for.
https://wiki.apache.org/solr/TermsComponent

Regards,
Sachin Vyas.
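For reference, a minimal prefix-suggestion request with the TermsComponent looks like the sketch below. This assumes a /terms request handler wired to the TermsComponent (present in the example solrconfig.xml of that era, or easy to add); the collection and field names are placeholders:

    http://localhost:8983/solr/collection1/terms?terms.fl=name&terms.prefix=ca&terms.limit=10

Each returned term/count pair can then be shown as a suggestion.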
Re: Suggester configuration queries.
Using the Terms Component to get auto-suggest is a very old approach and gives minimal features... If it is OK for you, OK! I would suggest these readings on auto-suggestion:

Suggester Solr wiki: https://cwiki.apache.org/confluence/display/solr/Suggester
Solr suggester: http://lucidworks.com/blog/solr-suggester/ (Erick's post)
http://alexbenedetti.blogspot.co.uk/2015/07/solr-you-complete-me.html (my post)

Hope they help!

Cheers

2015-07-13 11:51 GMT+01:00 ssharma7...@gmail.com:

  Hi,
  For my reply dated Jul 02, 2015; 4:47pm: actually *there is no difference in results* for the spellchecker & suggester components in Solr 4.6 and Solr 5.1. I was actually mixing up the two components.

  Thanks & Regards,
  Sachin Vyas.

--
Benedetti Alessandro
Visiting card - http://about.me/alessandro_benedetti
Blog - http://alexbenedetti.blogspot.co.uk

Tyger, tyger burning bright
In the forests of the night,
What immortal hand or eye
Could frame thy fearful symmetry?

William Blake - Songs of Experience -1794 England
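To make the switch concrete, a minimal SuggestComponent setup might look like the sketch below, following the linked reference guide page; the suggester name, source field, and analyzer field type are placeholders:

    <searchComponent name="suggest" class="solr.SuggestComponent">
      <lst name="suggester">
        <str name="name">mySuggester</str>
        <str name="lookupImpl">FuzzyLookupFactory</str>
        <str name="dictionaryImpl">DocumentDictionaryFactory</str>
        <str name="field">title</str>
        <str name="suggestAnalyzerFieldType">text_general</str>
      </lst>
    </searchComponent>

    <requestHandler name="/suggest" class="solr.SearchHandler" startup="lazy">
      <lst name="defaults">
        <str name="suggest">true</str>
        <str name="suggest.dictionary">mySuggester</str>
        <str name="suggest.count">10</str>
      </lst>
      <arr name="components">
        <str>suggest</str>
      </arr>
    </requestHandler>

Requests then go to /suggest?suggest.q=ca&wt=json.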
Re: Restore index API does not work in Solr 5.1.0?
Hi all,

How can we restore the index in Solr 5.1.0?

Best Regards,
Dinesh Naik

On Thu, Jul 9, 2015 at 6:54 PM, dinesh naik dineshkumarn...@gmail.com wrote:

  Hi all,
  How can we restore the index in Solr 5.1.0? We did the following:

  1. Started SolrCloud with: bin/solr start -e cloud -noprompt
  2. Posted some documents to Solr from the examples folder using: java -Dc=gettingstarted -jar post.jar *.xml
  3. Backed up the index using: http://localhost:8983/solr/gettingstarted/replication?command=backup
  4. Deleted one document using: http://localhost:8983/solr/gettingstarted/update?stream.body=<delete><query>id:IW-02</query></delete>&commit=true
  5. Restored the index using: http://localhost:8983/solr/gettingstarted/replication?command=restore

  The restore works fine with the same steps on 5.2, but not on 5.1. Is there any other way to restore an index in Solr 5.1.0?

  --
  Best Regards,
  Dinesh Naik
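A likely explanation: the replication handler's restore command appears to have been added only in Solr 5.2 (SOLR-6637), so on 5.1 the usual fallback is to copy the backup snapshot back over the index directory by hand. A rough sketch, with all paths illustrative:

    # stop the node (or unload the core) first
    rm -rf example/cloud/node1/solr/gettingstarted_shard1_replica1/data/index/*
    cp backup/snapshot.20150709.../* example/cloud/node1/solr/gettingstarted_shard1_replica1/data/index/
    # then restart the node / reload the core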
Re: Trouble getting a solr join query done
I was about to suggest the very same solution! I think this will satisfy the user requirement.

Thanks Antonio!

Cheers

2015-07-13 12:22 GMT+01:00 Antonio David Pérez Morales adperezmora...@gmail.com:

  Hi again Yusnel

  Just to confirm, I have tested your use case and the query which returns what you need is this one:

  http://localhost:8983/solr/category/select?q={!join from=categoryId fromIndex=product to=id}*:*&wt=json&indent=true&fq=name:clothes&hl=false

  Please check and let us know if it works for you.

  Regards

  2015-07-12 17:02 GMT+02:00 Antonio David Pérez Morales adperezmora...@gmail.com:

    Hi Yusnel

    I think the query is invalid. It should be q=clothes&fq={!join from=type_id to=id fromIndex=products} or q=*:*&fq={!join from=type_id to=id fromIndex=products}clothes, as long as you are using an edismax parser or a df param for the default field that the "clothes" query is matched against.

    Regards

    2015-07-11 2:23 GMT+02:00 Yusnel Rojas García yroj...@gmail.com:

      I have 2 indexes, products { id, name, type_id .. } and categories { id, name .. }, and I want to get all categories that match a name and have products in them. My best guess would be:

      http://localhost:8983/solr/categories/select?q=clothes&fl=*,score&fq={!join from=type_id to=id fromIndex=products}*:*

      but I always get an empty response. Help please! Is there a better way of doing this without using another index?

--
Benedetti Alessandro
Visiting card - http://about.me/alessandro_benedetti
Blog - http://alexbenedetti.blogspot.co.uk

Tyger, tyger burning bright
In the forests of the night,
What immortal hand or eye
Could frame thy fearful symmetry?

William Blake - Songs of Experience -1794 England
Re: Suggester configuration queries.
Hi,

For my reply dated Jul 02, 2015; 4:47pm: for my scenario / test data, the results of the Spellchecker in Solr 4.6 & 5.1 are fine. Also, the results of the Suggester in Solr 4.6 & 5.1 are fine. I was mixing up the two components.

Thanks & Regards,
Sachin Vyas.
Re: Trouble getting a solr join query done
Hi again Yusnel

Just to confirm, I have tested your use case and the query which returns what you need is this one:

http://localhost:8983/solr/category/select?q={!join from=categoryId fromIndex=product to=id}*:*&wt=json&indent=true&fq=name:clothes&hl=false

Please check and let us know if it works for you.

Regards

2015-07-12 17:02 GMT+02:00 Antonio David Pérez Morales adperezmora...@gmail.com:

  Hi Yusnel

  I think the query is invalid. It should be q=clothes&fq={!join from=type_id to=id fromIndex=products} or q=*:*&fq={!join from=type_id to=id fromIndex=products}clothes, as long as you are using an edismax parser or a df param for the default field that the "clothes" query is matched against.

  Regards

  2015-07-11 2:23 GMT+02:00 Yusnel Rojas García yroj...@gmail.com:

    I have 2 indexes, products { id, name, type_id .. } and categories { id, name .. }, and I want to get all categories that match a name and have products in them. My best guess would be:

    http://localhost:8983/solr/categories/select?q=clothes&fl=*,score&fq={!join from=type_id to=id fromIndex=products}*:*

    but I always get an empty response. Help please! Is there a better way of doing this without using another index?
Re: Solr search in different servers based on search keyword
Hi Arijit, let me clarify some points, OK?

2015-07-13 6:22 GMT+01:00 Arijit Saha arijitsaha...@gmail.com:

  Hi Solr/Lucene experts,
  We are planning to build a Solr/Lucene search application. As per the design requirement, the files (on which the search operation needs to be done) will be lying on separate servers.

OK, so the data sources for your search engine -- your sources of information -- will be files on different servers. This is perfectly fine. Lucene/Solr doesn't use the physical files you want to index when serving search. You feed Solr with Documents, which are indexed, producing an inverted index and a set of related data structures to provide search at query time. What you really care about is whether the index(es) built from your corpus of Documents will be distributed across different nodes or not.

  We want to use Solr/Lucene to perform search operations on files lying on different remote servers.

So, considering now the files to be index segments, the answer is yes. Lucene can search across different indexes, and Solr on top of it can as well. SolrCloud allows you to architect your search engine on a cluster of Solr instances. Each logical Collection can be partitioned into different shards (partitions of the whole index), and each shard can be replicated as much as you want. It is possible to implement your own routing strategy (the way your docs are assigned to shards), or use the routing strategies already available. You may be interested in compositeId routing, which relates to your search requirement. Take a look at these interesting docs:

https://lucidworks.com/blog/multi-level-composite-id-routing-solrcloud/
https://lucidworks.com/blog/solr-cloud-document-routing/

At indexing time you will be able to pick the shard to send your documents to, and to have your documents co-located according to a specific key (which can be the original server the doc is coming from).

  Do Solr/Lucene support the above feature of searching on different servers based on a search keyword?

Then, at query time, you can use the same key you configured at indexing time and query only a subset of documents, based on their original location.

  I am a newbie to Solr/Lucene. Please help. Also, let me know in case any additional details are required.

Happy to help again and with better details :)

Cheers

  Much appreciated.
  Thanks,
  Arijit

--
Benedetti Alessandro
Visiting card - http://about.me/alessandro_benedetti
Blog - http://alexbenedetti.blogspot.co.uk

Tyger, tyger burning bright
In the forests of the night,
What immortal hand or eye
Could frame thy fearful symmetry?

William Blake - Songs of Experience -1794 England
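A small sketch of what that looks like in practice, with collection, key, and field names as placeholders: with the default compositeId router, index each document with its origin server as a routing prefix in the id, e.g. id = "serverA!doc42". Documents sharing the prefix land on the same shard. At query time, restrict the search to that shard with the _route_ parameter:

    http://localhost:8983/solr/mycollection/select?q=keyword&_route_=serverA!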
Re: Suggester configuration queries.
Hi,

For my reply dated Jul 02, 2015; 4:47pm: actually *there is no difference in results* for the spellchecker & suggester components in Solr 4.6 and Solr 5.1. I was actually mixing up the two components.

Thanks & Regards,
Sachin Vyas.
Re: copying data from one collection to another collection (SolrCloud 5.2.1)
bq: does offline

No. I'm talking about collection aliasing. You can create an entirely new collection, index to it however you want, then switch to using that new collection.

bq: Any updates to EXISTING documents in the LIVE collection should NOT be replicated to the previous week(s) snapshot(s)

Then give it a new ID, maybe?

Best,
Erick

On Mon, Jul 13, 2015 at 3:21 PM, Raja Pothuganti rpothuga...@competitrack.com wrote:

  Thank you Erick

  > Actually, my question is why do it this way at all? Why not index directly to your live nodes? This is what SolrCloud is built for. You can use implicit routing to create shards, say, for each week, and age out the ones that are too old as well.

  Any updates to EXISTING documents in the LIVE collection should NOT be replicated to the previous week(s) snapshot(s). Think of the snapshot(s) as an archive of sorts, searchable independently of LIVE. We're aiming to support at most 2 archives of data in the past.

  > Another option would be to use collection aliasing to keep an offline index up to date then switch over when necessary.

  Does offline indexing refer to this link? https://github.com/cloudera/search/tree/0d47ff79d6ccc0129ffadcb50f9fe0b271f102aa/search-mr

  Thanks
  Raja

  On 7/13/15, 3:14 PM, Erick Erickson erickerick...@gmail.com wrote:

    Actually, my question is why do it this way at all? Why not index directly to your live nodes? This is what SolrCloud is built for.

    There's the new backup/restore functionality that's still a work in progress, see: https://issues.apache.org/jira/browse/SOLR-5750

    You can use implicit routing to create shards, say, for each week, and age out the ones that are too old as well.

    Another option would be to use collection aliasing to keep an offline index up to date then switch over when necessary.

    I'd really like to know this isn't an XY problem though; what's the high-level problem you're trying to solve?

    Best,
    Erick

    On Mon, Jul 13, 2015 at 12:49 PM, Raja Pothuganti rpothuga...@competitrack.com wrote:

      Hi,
      We are setting up a new SolrCloud environment with 5.2.1 on Ubuntu boxes. We currently ingest data into a large collection, call it LIVE. After the full ingest is done we then trigger a delta ingestion every 15 minutes to get the documents and data that have changed into this LIVE instance.

      In Solr 4.X, using a Master / Slave setup, we had slaves that would periodically (weekly, or monthly) refresh their data from the Master rather than every 15 minutes. We're now trying to figure out how to get this same type of setup using SolrCloud.

      Question(s):
      - Is there a way to copy data from one SolrCloud collection into another quickly and easily?
      - Is there a way to programmatically control when a replica receives its data, or possibly move it to another collection (without losing data) that updates on a different interval? It ideally would be another collection name, call it Week1 ... Week52 ... to avoid a replica in the same collection serving old data.

      One option we thought of was to create a backup and then restore that into a new clean cloud. This has a lot of moving parts and isn't nearly as neat as the Master / Slave controlled replication setup. It also has the side effect of potentially taking a very long time to backup and restore instead of just copying the indexes like the old M/S setup.

      Any ideas or thoughts? Thanks in advance for your help.

      Raja
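For reference, a sketch of the aliasing flow Erick describes, with collection names as placeholders: build the new collection offline, then atomically repoint the alias that clients query:

    http://localhost:8983/solr/admin/collections?action=CREATE&name=week29&numShards=3&replicationFactor=2
    ... index into week29, commit ...
    http://localhost:8983/solr/admin/collections?action=CREATEALIAS&name=live&collections=week29

Searches against "live" switch to week29 as soon as the alias is updated; the old collection can then be kept around as an archive or deleted.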
Re: Persistence problem with swapped cores after Solr restart -- 4.9.1
Uggghh. Not persistence again...

I'll stay tuned.

Erick

On Mon, Jul 13, 2015 at 2:44 PM, Shawn Heisey apa...@elyograg.org wrote:

  On Solr 4.9.1 with core discovery, I seem to be having trouble with core swaps not persisting through a full Solr restart.

  I apologize for the fact that this message is lean on details ... I've seen the problem twice now, but I don't have any concrete before/after information about what's in each core.properties file. I am attempting to set up the scenario again and gather that information.

  The entire directory structure is set up as a git repo, so I will be able to tell if any files (like core.properties) are modified for the rebuild/swap that I have started. The repo shows no changes at the moment, but I have done several of these rebuild/swap operations, so even if core.properties is being correctly updated, it might just have landed back on the original configuration.

  I have another copy of my index using Solr 4.7.2 with the old solr.xml format that seems to have no problems with core swapping and persistence. That works differently, though -- all cores are defined in solr.xml rather than with core.properties files.

  When I first set up these Solr instances, I don't recall having this problem, but full Solr restarts are really rare, so it's possible I just didn't create the right circumstances.

  Thanks,
  Shawn
Re: Why I get a hit on %, &, but not on !, @, #, $, ^, *
Oops... that's the "types" attribute.

-- Jack Krupansky

On Mon, Jul 13, 2015 at 11:11 PM, Jack Krupansky jack.krupan...@gmail.com wrote:

  The word delimiter filter is removing special characters. You can add a file containing a list of the special characters that you wish to treat as alpha, using the "type" parameter.

  -- Jack Krupansky

  On Mon, Jul 13, 2015 at 6:43 PM, Steven White swhite4...@gmail.com wrote:

    Hi Everyone,

    I think the subject line said it all. Here is the schema I'm using:

    <fieldType name="my_text" class="solr.TextField" positionIncrementGap="100" autoGeneratePhraseQueries="true">
      <analyzer>
        <tokenizer class="solr.WhitespaceTokenizerFactory"/>
        <filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_en.txt"/>
        <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="1" catenateNumbers="1" catenateAll="1" splitOnCaseChange="0" splitOnNumerics="1" stemEnglishPossessive="1" preserveOriginal="1"/>
        <filter class="solr.LowerCaseFilterFactory"/>
        <filter class="solr.KeywordMarkerFilterFactory" protected="protwords.txt"/>
        <filter class="solr.PorterStemFilterFactory"/>
        <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
      </analyzer>
    </fieldType>

    I'm guessing this is due to how solr.WhitespaceTokenizerFactory works, and the characters that are not indexed are removed because they are considered whitespace? If so, how can I include %, &, etc. in this non-indexed list? I would rather see all of these not indexed vs. some indexed and some not, causing confusion for my users.

    Thanks

    Steve
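A sketch of what that fix looks like; the file name is a placeholder, and the other WordDelimiterFilterFactory attributes stay as in the original schema:

    # wdfftypes.txt -- treat these characters as ordinary letters
    % => ALPHA
    & => ALPHA
    ! => ALPHA
    @ => ALPHA

    <filter class="solr.WordDelimiterFilterFactory" types="wdfftypes.txt" preserveOriginal="1" ...other attributes unchanged... />

With the types file in place, the listed characters survive word delimiting instead of being stripped.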
Re: Running Solr 5.2.1 on Windows using NSSM
Adrian,

Do you know if this script creates a config file somewhere?

Would it be possible/helpful to have a script in Solr's bin/ to run it as a service? E.g.:

  bin\install_solr_service.cmd

It would assume these defaults:

  -nssm "c:\Program Files\nssm\win64\nssm"
  -servicename Solr
  -start true

The rest of the parameters would be the same as bin\solr.cmd. It would, behind the scenes, run:

  nssm install Solr bin/solr.cmd -f %*
  nssm set Solr AppDirectory .

And possibly:

  nssm start Solr

I don't have a Windows setup to try this on right now, but I'd like to see such a script inside the bin/ directory. Would this work?

Upayavira

On Tue, Jul 14, 2015, at 02:53 AM, Adrian Liew wrote:

  Hi Edwin,
  Sorry for the late reply. Was caught up yesterday. Yes, I did not use the start.jar command and followed this article using solr.cmd - http://www.norconex.com/how-to-run-solr5-as-a-service-on-windows/. I am using a Windows Server 2012 R2 server.
  The article example shows that it passes "start -f -p 8983" as arguments to the service. I believe it is important to have the -f. Did you try this example? If it didn't work for you, have you tried to remove the service via nssm and add it again?
  Best regards,
  Adrian

  -----Original Message-----
  From: Zheng Lin Edwin Yeo [mailto:edwinye...@gmail.com]
  Sent: Monday, July 13, 2015 10:51 AM
  To: solr-user@lucene.apache.org
  Subject: Re: Running Solr 5.2.1 on Windows using NSSM

  Hi Adrian,
  I got this to work for Solr 5.1, but when I tried this in Solr 5.2.1, it gives the error "Windows could not start the solr5.2.1 service on Local Computer. The service did not return an error. This could be an internal Windows error or an internal service error."
  As Solr 5.2.1 is not using the start.jar command to run Solr, are we still able to use the same arguments to set up nssm?
  Regards,
  Edwin

  On 8 July 2015 at 17:38, Adrian Liew adrian.l...@avanade.com wrote:

    Answered my own question. :) It seems to work great for me by following this article: http://www.norconex.com/how-to-run-solr5-as-a-service-on-windows/
    Regards,
    Adrian

    -----Original Message-----
    From: Adrian Liew [mailto:adrian.l...@avanade.com]
    Sent: Wednesday, July 8, 2015 4:43 PM
    To: solr-user@lucene.apache.org
    Subject: Running Solr 5.2.1 on Windows using NSSM

    Hi guys,
    I am looking to run Apache Solr v5.2.1 on a Windows machine. I tried to set up a Windows service using NSSM (Non-Sucking Service Manager), pointing the service at the solr.cmd file path and installing the service. After installation, I tried to start the Windows service, but it gives back an alert message: "Windows could not start the SolrService service on Local Computer. The service did not return an error. This could be an internal Windows error or an internal service error."
    Most of the examples for older Apache Solr versions use the "java -jar start.jar" command to run Solr and seem to run okay with nssm. I am not sure if this could be a solr.cmd issue or NSSM's issue.
    Alternatively, I have tried to use Windows Task Scheduler to configure a task pointing to solr.cmd and run the task whenever the computer starts (regardless of whether a user is logged in or not). The task scheduler reports back 'Task Start Failed' with a level of 'Error'. Additionally, after checking Event Viewer, it returns the nssm error "Failed to open process handle for process with PID 3640 when terminating service Solr Service : The parameter is incorrect." Chances are this points back to the solr.cmd file itself.
    Thoughts?
    Regards,
    Adrian
Re: Why I get a hit on %, &, but not on !, @, #, $, ^, *
The word delimiter filter is removing special characters. You can add a file containing a list of the special characters that you wish to treat as alpha, using the "type" parameter.

-- Jack Krupansky

On Mon, Jul 13, 2015 at 6:43 PM, Steven White swhite4...@gmail.com wrote:

  Hi Everyone,

  I think the subject line said it all. Here is the schema I'm using:

  <fieldType name="my_text" class="solr.TextField" positionIncrementGap="100" autoGeneratePhraseQueries="true">
    <analyzer>
      <tokenizer class="solr.WhitespaceTokenizerFactory"/>
      <filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_en.txt"/>
      <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="1" catenateNumbers="1" catenateAll="1" splitOnCaseChange="0" splitOnNumerics="1" stemEnglishPossessive="1" preserveOriginal="1"/>
      <filter class="solr.LowerCaseFilterFactory"/>
      <filter class="solr.KeywordMarkerFilterFactory" protected="protwords.txt"/>
      <filter class="solr.PorterStemFilterFactory"/>
      <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
    </analyzer>
  </fieldType>

  I'm guessing this is due to how solr.WhitespaceTokenizerFactory works, and the characters that are not indexed are removed because they are considered whitespace? If so, how can I include %, &, etc. in this non-indexed list? I would rather see all of these not indexed vs. some indexed and some not, causing confusion for my users.

  Thanks

  Steve
Field collapsing on parent document
Hello,

I use a blockjoin document structure with 3 levels (base, path and attributes). I am performing a facet query to count the number of different attributes, but I would like to group or collapse them at the path level. I can easily collapse them on base (by using _root_), but I want them grouped or collapsed at the intermediate level. Can I do that? So basically a query that combines a parent/child query with a collapse-field query. Something like this:

{!child of=type:path}{!collapse field=_root_}

Gr
RE: Lingo3g-Solr integration - ClassNotFoundException: com.google.common.base.MoreObjects
Just a quick update,

The version of Lingo3G (1.12.0) does not seem compatible with the older version of Guava packaged with Solr. Switching to an older version of Lingo3G has resolved the issue.

Thanks for the help!

Collin

Collin Mandris
Associate Engineer, Software Defense Solutions Division
General Dynamics Information Technology
55 Dodge Road
Buffalo, New York 14068-1205
Phone: 1-716-243-4022
Fax: (716) 691-3642
collin.mand...@gdit.com

-----Original Message-----
From: Shawn Heisey [mailto:apa...@elyograg.org]
Sent: Friday, July 10, 2015 15:50
To: solr-user@lucene.apache.org
Subject: Re: Lingo3g-Solr integration - ClassNotFoundException: com.google.common.base.MoreObjects

On 7/10/2015 10:09 AM, Mandris, Collin wrote:

  Hello, I am trying to integrate Lingo3g with Solr. I have arrived at a ClassNotFoundException error using Lingo3g (version 1.12.0) with Solr 4.8.0. I located guava-18.0.jar, which contains the com.google.common.base.MoreObjects class, and have tried putting it in multiple locations within our Solr deployment, but have had no luck in getting past the error. So far, I have tried:
  1) Adding "Class-Path: guava-18.0.jar" to the manifest file in start.jar, solr.war and lingo3g-1.12.0.jar, with guava-18.0.jar copied to the same folder as each respective jar file.
  2) Putting guava-18.0.jar in the contrib\clustering\lib folder with the other lingo3g jar files.
  3) Putting guava-18.0.jar in the Java JDK bin folder.

Solr already includes Guava, but it's a very old version -- 14.0.1. This means that you can't simply add a newer Guava jar... but I've just tried upgrading Guava in the Solr source code to 18.0, and Solr won't compile.

We have an unresolved issue to upgrade Guava to version 15. Somebody mentioned kite-morphlines as a blocker for that, but I'm not sure what the full story is. I've updated the issue with a comment about this thread.

https://issues.apache.org/jira/browse/SOLR-5584

At this time, you can't use anything that depends on Guava 18. This is a textbook case of jar hell ... we need to get Guava upgraded in Solr.

https://en.wikipedia.org/wiki/Java_Classloader#JAR_hell

Thanks,
Shawn
Multiple facet fields Query
Hi,

If I want to facet on multiple fields, I typically add multiple facet.field parameters to the query:

facet=true&facet.field=field1&facet.field=field2

Is there another way to do this instead of using facet.field multiple times -- say, using only facet.field=field1,field2? I am running into an issue integrating the repeated parameter with our ESB layer.

Thanks,
Ajeet Phansalkar
Re: Multiple facet fields Query
On Mon, Jul 13, 2015, at 03:09 PM, Phansalkar, Ajeet wrote:

  Hi,
  If I want to facet on multiple fields, I typically add multiple facet.field parameters to the query:
  facet=true&facet.field=field1&facet.field=field2
  Is there another way to do this instead of using facet.field multiple times -- say, using only facet.field=field1,field2? I am running into an issue integrating the repeated parameter with our ESB layer.

It is a common pattern within Solr to use multiple request parameters with the same name.

You may be able to get around it, if you are using the latest Solr, using the JSON facet or the JSON query API, which encapsulate similar functionality in a JSON snippet.

Upayavira
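A sketch of the JSON Facet API alternative (Solr 5.1+; the facet labels, field names, and collection name are placeholders) -- both facets travel inside a single json.facet parameter, sidestepping the repeated facet.field entirely:

    curl http://localhost:8983/solr/mycollection/query -d 'q=*:*&rows=0&json.facet={
      byField1 : {type: terms, field: field1},
      byField2 : {type: terms, field: field2}
    }'

This assumes the stock /query handler; posting the same parameters to /select works as well.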
Re: Querying Nested documents
What about?

http://localhost:8983/solr/demo/select?q={!parent which='type:parent'}&fl=*,[child parentFilter=type:parent childFilter=-type:parent]&indent=true

--
Sincerely yours
Mikhail Khludnev
Principal Engineer,
Grid Dynamics
http://www.griddynamics.com
mkhlud...@griddynamics.com
RE: Running Solr 5.2.1 on Windows using NSSM
Hi Edwin,

Sorry for the late reply. Was caught up yesterday. Yes, I did not use the start.jar command and followed this article using solr.cmd - http://www.norconex.com/how-to-run-solr5-as-a-service-on-windows/. I am using a Windows Server 2012 R2 server.

The article example shows that it passes "start -f -p 8983" as arguments to the service. I believe it is important to have the -f. Did you try this example? If it didn't work for you, have you tried to remove the service via nssm and add it again?

Best regards,
Adrian

-----Original Message-----
From: Zheng Lin Edwin Yeo [mailto:edwinye...@gmail.com]
Sent: Monday, July 13, 2015 10:51 AM
To: solr-user@lucene.apache.org
Subject: Re: Running Solr 5.2.1 on Windows using NSSM

Hi Adrian,

I got this to work for Solr 5.1, but when I tried this in Solr 5.2.1, it gives the error "Windows could not start the solr5.2.1 service on Local Computer. The service did not return an error. This could be an internal Windows error or an internal service error."

As Solr 5.2.1 is not using the start.jar command to run Solr, are we still able to use the same arguments to set up nssm?

Regards,
Edwin

On 8 July 2015 at 17:38, Adrian Liew adrian.l...@avanade.com wrote:

  Answered my own question. :) It seems to work great for me by following this article: http://www.norconex.com/how-to-run-solr5-as-a-service-on-windows/
  Regards,
  Adrian

  -----Original Message-----
  From: Adrian Liew [mailto:adrian.l...@avanade.com]
  Sent: Wednesday, July 8, 2015 4:43 PM
  To: solr-user@lucene.apache.org
  Subject: Running Solr 5.2.1 on Windows using NSSM

  Hi guys,
  I am looking to run Apache Solr v5.2.1 on a Windows machine. I tried to set up a Windows service using NSSM (Non-Sucking Service Manager), pointing the service at the solr.cmd file path and installing the service. After installation, I tried to start the Windows service, but it gives back an alert message: "Windows could not start the SolrService service on Local Computer. The service did not return an error. This could be an internal Windows error or an internal service error."
  Most of the examples for older Apache Solr versions use the "java -jar start.jar" command to run Solr and seem to run okay with nssm. I am not sure if this could be a solr.cmd issue or NSSM's issue.
  Alternatively, I have tried to use Windows Task Scheduler to configure a task pointing to solr.cmd and run the task whenever the computer starts (regardless of whether a user is logged in or not). The task scheduler reports back 'Task Start Failed' with a level of 'Error'. Additionally, after checking Event Viewer, it returns the nssm error "Failed to open process handle for process with PID 3640 when terminating service Solr Service : The parameter is incorrect." Chances are this points back to the solr.cmd file itself.
  Thoughts?
  Regards,
  Adrian
Re: XML File Size for Post.jar
I don't think you can do files that big. The memory use would blow up. Are you sure you cannot chunk it into smaller document sets? Or parse it as a stream with DIH, in a pull fashion?

Regards,
  Alex.

Solr Analyzers, Tokenizers, Filters, URPs and even a newsletter: http://www.solr-start.com/

On 13 July 2015 at 14:56, EXTERNAL Taminidi Ravi (ETI, AA-AS/PAS-PTS) external.ravi.tamin...@us.bosch.com wrote:

  Hi, what do I have to change to support indexing an XML file larger than 2 GB in Solr, using the simple post tool (post.jar), with Jetty and Tomcat?

  Thanks
  Ravi
RE: XML File Size for Post.jar
I can break it into smaller files, but in that case the number of files grows into the hundreds. Can I parse XML files with DIH? Can you point me to a few examples?

Thanks
Ravi

-----Original Message-----
From: Alexandre Rafalovitch [mailto:arafa...@gmail.com]
Sent: Monday, July 13, 2015 3:01 PM
To: solr-user
Subject: Re: XML File Size for Post.jar

I don't think you can do files that big. The memory use would blow up. Are you sure you cannot chunk it into smaller document sets? Or parse it as a stream with DIH, in a pull fashion?

Regards,
  Alex.

Solr Analyzers, Tokenizers, Filters, URPs and even a newsletter: http://www.solr-start.com/

On 13 July 2015 at 14:56, EXTERNAL Taminidi Ravi (ETI, AA-AS/PAS-PTS) external.ravi.tamin...@us.bosch.com wrote:

  Hi, what do I have to change to support indexing an XML file larger than 2 GB in Solr, using the simple post tool (post.jar), with Jetty and Tomcat?

  Thanks
  Ravi
RE: Solr cloud error during document ingestion
Shawn, here are my responses:

> Is that the entire error, or is there additional error information? Do you have any way to know exactly what is in that request that's throwing the error?

That's the entire error stack. I don't see anything else in the Solr log. Probably need to turn on additional logging? I've identified the text in the email (.msg) that's causing it. This is it: (daños). The tilde on the n is the culprit. If I remove this and run the load, it works fine.

> You said 4.10.2 ... is this the Solr or SolrJ version? Are both of them the same version? Are you running Solr in the included jetty, or have you installed it into another servlet container? What Java vendor and version are you running, and is it 64-bit?

Solr version is 4.10.2. SolrJ version is 4.10.3. Using the built-in Jetty. Java(TM) SE Runtime Environment (build 1.7.0_67-b01).

> Can you share your SolrJ code, solrconfig, schema, and any other information you can think of that might be relevant?

Yes, absolutely. Where would you like to see it posted?

> Because the error is from a request, I doubt that autoCommit has anything to do with the problem, but I could be wrong about that.

Yes, agree. This is not related to autoCommit.

-----Original Message-----
From: Shawn Heisey [mailto:apa...@elyograg.org]
Sent: Sunday, July 12, 2015 6:25 PM
To: solr-user@lucene.apache.org
Subject: Re: Solr cloud error during document ingestion

On 7/11/2015 9:33 PM, Tarala, Magesh wrote:

  I'm using 4.10.2 in a 3-node SolrCloud setup. I have a collection with 3 shards and 2 replicas each. I'm ingesting Solr documents via SolrJ. While ingesting the documents, I get the following error:

  264147944 [updateExecutor-1-thread-268] ERROR org.apache.solr.update.StreamingSolrServers ? error
  org.apache.solr.common.SolrException: Bad Request
  request: http://10.222.238.35:8983/solr/serviceorder_shard1_replica2/update?update.distrib=TOLEADER&distrib.from=http%3A%2F%2F10.222.238.36%3A8983%2Fsolr%2Fserviceorder_shard2_replica1%2F&wt=javabin&version=2
    at org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrServer$Runner.run(ConcurrentUpdateSolrServer.java:241)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)

  I commit after every 100 documents in SolrJ. And I also have the following solrconfig.xml setting:

  <autoCommit>
    <maxTime>${solr.autoCommit.maxTime:15000}</maxTime>
    <openSearcher>false</openSearcher>
  </autoCommit>

Is that the entire error, or is there additional error information? Do you have any way to know exactly what is in that request that's throwing the error?

You said 4.10.2 ... is this the Solr or SolrJ version? Are both of them the same version? Are you running Solr in the included jetty, or have you installed it into another servlet container? What Java vendor and version are you running, and is it 64-bit?

Can you share your SolrJ code, solrconfig, schema, and any other information you can think of that might be relevant?

Because the error is from a request, I doubt that autoCommit has anything to do with the problem, but I could be wrong about that.

Thanks,
Shawn
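One possible angle, offered as an assumption rather than a confirmed diagnosis: "daños" breaking the request often means the .msg text was decoded with the wrong character set before being handed to SolrJ, producing bytes that are not valid UTF-8 on the wire. A hedged client-side sketch; the file name and field name are placeholders, and "server" is an already-constructed SolrServer:

    import java.nio.charset.StandardCharsets;
    import java.nio.file.Files;
    import java.nio.file.Paths;
    import org.apache.solr.common.SolrInputDocument;

    // decode the extracted message text with an explicit charset
    // instead of relying on the JVM platform default
    String body = new String(
        Files.readAllBytes(Paths.get("mail.msg.txt")), StandardCharsets.UTF_8);
    SolrInputDocument doc = new SolrInputDocument();
    doc.addField("body", body);
    server.add(doc);

If the source bytes are actually Windows-1252 rather than UTF-8, decode with that charset instead; the point is to never depend on the platform default.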
RE: Multiple facet fields Query
Indeed, it is built into the HTML Forms specification that any query parameter may be repeated any number of times. If your ESB tool didn't support this, it would be very broken. My expectation is that it does, and a bit more debugging and/or research into the product will yield results.

Are you using POST but not setting Content-Type: application/x-www-form-urlencoded? Also, check that you are encoding using the UTF-8 character set and have correctly escaped reserved characters.

Fwiw, SolrJ will do the right thing here. So, if nothing else, what it puts on the wire can be used as a reference.

See http://www.w3.org/TR/html401/interact/forms.html

Every professional Java/Perl/C/C++/etc. URL implementation I have ever worked with supports multiple values per name, encoded as name1=foo&name1=bar..., with a high degree of interoperability.

-----Original Message-----
From: Upayavira [mailto:u...@odoko.co.uk]
Sent: Monday, July 13, 2015 10:33 AM
To: solr-user@lucene.apache.org
Subject: Re: Multiple facet fields Query

On Mon, Jul 13, 2015, at 03:09 PM, Phansalkar, Ajeet wrote:

  Hi,
  If I want to facet on multiple fields, I typically add multiple facet.field parameters to the query:
  facet=true&facet.field=field1&facet.field=field2
  Is there another way to do this instead of using facet.field multiple times -- say, using only facet.field=field1,field2? I am running into an issue integrating the repeated parameter with our ESB layer.

It is a common pattern within Solr to use multiple request parameters with the same name.

You may be able to get around it, if you are using the latest Solr, using the JSON facet or the JSON query API, which encapsulate similar functionality in a JSON snippet.

Upayavira
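To illustrate the SolrJ reference point (core name and field names are placeholders): SolrJ builds the repeated parameters itself, so the ESB layer never has to:

    SolrQuery q = new SolrQuery("*:*");
    q.setFacet(true);
    // emitted on the wire as facet.field=field1&facet.field=field2
    q.addFacetField("field1", "field2");
    QueryResponse rsp = new HttpSolrServer("http://localhost:8983/solr/mycore").query(q);

Capturing that request (e.g. in the Solr request log) gives a known-good encoding to compare the ESB output against.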
XML File Size for Post.jar
Hi,

What do I have to change to support indexing an XML file larger than 2 GB in Solr, using the simple post tool (post.jar), with Jetty and Tomcat?

Thanks
Ravi
Re: Highlighting pre and post tags not working
You need to XML-encode the tags. So instead of <em>, put &lt;em&gt;, and instead of </em>, put &lt;/em&gt;.

Upayavira

On Mon, Jul 13, 2015, at 05:19 PM, Paden wrote:

  Hello, I'm trying to get some Solr highlighting going but I've run into a small problem. When I set the pre and post tags with my own custom tag, I get an XML error:

  XML Parsing Error: mismatched tag. Expected: </em>.
  Location: file:///home/paden/Downloads/solr-5.1.0/server/solr/Testcore2/conf/solrconfig.xml
  Line Number 476, Column 40: <str name="hl.simple.pre"><em></str>

  I've seen it done like this on a lot of other sites and I'm not sure if I'm missing an escape character or something. Just to emphasize that I did set a POST tag, I put it right after the pre in solrconfig.xml, like so:

  <str name="hl.simple.pre"><em></str>
  <str name="hl.simple.post"></em></str>

  What am I doing wrong here?
Re: Highlighting pre and post tags not working
Try

  <str name="hl.simple.pre">&lt;em&gt;</str>

or

  <str name="hl.simple.pre"><![CDATA[<em>]]></str>

The bare < and > confuse the XML parsing.

Best,
Erick

On Mon, Jul 13, 2015 at 9:19 AM, Paden rumsey...@gmail.com wrote:

  Hello, I'm trying to get some Solr highlighting going but I've run into a small problem. When I set the pre and post tags with my own custom tag, I get an XML error:

  XML Parsing Error: mismatched tag. Expected: </em>.
  Location: file:///home/paden/Downloads/solr-5.1.0/server/solr/Testcore2/conf/solrconfig.xml
  Line Number 476, Column 40: <str name="hl.simple.pre"><em></str>

  I've seen it done like this on a lot of other sites and I'm not sure if I'm missing an escape character or something. Just to emphasize that I did set a POST tag, I put it right after the pre in solrconfig.xml, like so:

  <str name="hl.simple.pre"><em></str>
  <str name="hl.simple.post"></em></str>

  What am I doing wrong here?
Re: Planning Solr migration to production: clean and autoSoftCommit
Hi Erick,

That status request shows whether the Solr instance is busy or idle. I think this is a workable option for checking whether the indexing process has completed (idle) or not (busy).

Now, I have some concern about the solution of not using the default polling mechanism from the slave instance to the master instance.

The load test showed that the initial batches of requests got much longer response times than later batches after the Solr server was started up. Gradually, the performance got much better, presumably due to the cache being warmed up. I understand that the indexing process will commit the changes and also autowarm queries in the existing cache. In this case, the indexing Solr instance will be in good shape to serve requests after the indexing process is completed.

The question: when the slave instances poll the indexing instance (master), do these slave instances also autowarm queries in the existing cache? If they do, then the polling mechanism will also make the slave instances more ready to serve requests (more performant) at any time.

When we talk about the forced replication solution, are we pushing/overwriting all the old index files with the new index files? Do we need to restart the Solr instance? In addition, will the slave instances be warmed up in any way?

If there are too many issues with forced replication, I might as well work out the incremental indexing option.

Thanks
Re: Planning Solr migration to production: clean and autoSoftCommit
bq: When the slave instances poll the indexing instance (master), do these slave instances also autowarm queries in the existing cache

Yes.

bq: When we talk about the forced replication solution, are we pushing/overwriting all the old index files with the new index files?

I believe so, but I don't know the entire details. In your situation this is what will happen anyway, since you're cleaning, right? So it really doesn't matter whether you do a fetchindex or just disable/enable polling; the work will essentially be the same.

bq: do we need to restart Solr instance?

No.

bq: In addition, will slave instances warmed up in any way?

All autowarming will be done.

Really, I'd just start by disabling replication on the master, doing the indexing, then re-enabling it. The rest should just happen.

Best,
Erick

On Mon, Jul 13, 2015 at 10:48 AM, wwang525 wwang...@gmail.com wrote:

  Hi Erick,

  That status request shows whether the Solr instance is busy or idle. I think this is a workable option for checking whether the indexing process has completed (idle) or not (busy).

  Now, I have some concern about the solution of not using the default polling mechanism from the slave instance to the master instance.

  The load test showed that the initial batches of requests got much longer response times than later batches after the Solr server was started up. Gradually, the performance got much better, presumably due to the cache being warmed up. I understand that the indexing process will commit the changes and also autowarm queries in the existing cache. In this case, the indexing Solr instance will be in good shape to serve requests after the indexing process is completed.

  The question: when the slave instances poll the indexing instance (master), do these slave instances also autowarm queries in the existing cache? If they do, then the polling mechanism will also make the slave instances more ready to serve requests (more performant) at any time.

  When we talk about the forced replication solution, are we pushing/overwriting all the old index files with the new index files? Do we need to restart the Solr instance? In addition, will the slave instances be warmed up in any way?

  If there are too many issues with forced replication, I might as well work out the incremental indexing option.

  Thanks
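A sketch of that disable/enable flow using the ReplicationHandler's HTTP API; host and core names are placeholders:

    # on the master, before the clean + full reindex:
    http://master:8983/solr/core1/replication?command=disablereplication
    # ... clean, reindex, commit ...
    http://master:8983/solr/core1/replication?command=enablereplication

    # optionally, force a slave to pull right away instead of waiting for its next poll:
    http://slave:8983/solr/core1/replication?command=fetchindex

Slaves polling a master with replication disabled simply see nothing new, so they keep serving the old index until replication is re-enabled.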
Re: Highlighting pre and post tags not working
Within XML, angle brackets must be escaped as &lt; and &gt;.

On Jul 13, 2015, at 12:19 PM, Paden rumsey...@gmail.com wrote:

  Hello, I'm trying to get some Solr highlighting going but I've run into a small problem. When I set the pre and post tags with my own custom tag, I get an XML error:

  XML Parsing Error: mismatched tag. Expected: </em>.
  Location: file:///home/paden/Downloads/solr-5.1.0/server/solr/Testcore2/conf/solrconfig.xml
  Line Number 476, Column 40: <str name="hl.simple.pre"><em></str>

  I've seen it done like this on a lot of other sites and I'm not sure if I'm missing an escape character or something. Just to emphasize that I did set a POST tag, I put it right after the pre in solrconfig.xml, like so:

  <str name="hl.simple.pre"><em></str>
  <str name="hl.simple.post"></em></str>

  What am I doing wrong here?
Re: FieldCache error for multivalued fields in json facets.
On Mon, Jul 13, 2015 at 1:55 AM, Iana Bondarska yana2...@gmail.com wrote:

  Hi,
  I'm using the JSON query API for Solr 5.2. When I query for metrics on multivalued fields, I get the error: "can not use FieldCache on multivalued field: sales". I've found in the Solr wiki that to avoid using the FieldCache I should set the facet.method parameter to enum. Now my question is: how can I add the facet.method=enum parameter to the query? My original query looks like this:

  {limit:0,offset:0,facet:{facet:{facet:{mechanicnumbers_sum:"sum(sales)"},limit:0,field:brand,type:terms}}}

sum(field) is currently only implemented for single-valued numeric fields. Can you make the sales field single-valued, or do you actually need multiple values per document?

-Yonik
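If single-valued works for the data, the schema change is just the following sketch; the type name is a placeholder, so match whatever numeric type the field already uses, and note the field must be re-indexed after changing multiValued:

    <field name="sales" type="tdouble" indexed="true" stored="true" multiValued="false"/>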
Highlighting pre and post tags not working
Hello,

I'm trying to get some Solr highlighting going but I've run into a small problem. When I set the pre and post tags with my own custom tag, I get an XML error:

XML Parsing Error: mismatched tag. Expected: </em>.
Location: file:///home/paden/Downloads/solr-5.1.0/server/solr/Testcore2/conf/solrconfig.xml
Line Number 476, Column 40: <str name="hl.simple.pre"><em></str>

I've seen it done like this on a lot of other sites and I'm not sure if I'm missing an escape character or something. Just to emphasize that I did set a POST tag, I put it right after the pre in solrconfig.xml, like so:

<str name="hl.simple.pre"><em></str>
<str name="hl.simple.post"></em></str>

What am I doing wrong here?
Querying Nested documents
Hi,

I have a question regarding nested documents. My document looks like this:

[sample document garbled in the archive; recoverable structure: a parent doc (id 1234, type:parent, with title, dates, and a URL) holding three children: "1234-images" (an image child with an image_uri_s field), "1234-platform-ios" and "1234-platform-android" (platform children with store links and date ranges)]

Right now I can query like this:

http://localhost:8983/solr/demo/select?q={!parent which='type:parent'}&fl=*,[child parentFilter=type:parent childFilter=image_uri_s:*]&indent=true

and get the parent and the child document matching the criteria (just the parent and the image child document). *But I want to get all the other children* (1234-platform-ios and 1234-platform-android) even if I query based on image_uri_s (1234-images), since they are other children that are part of the same parent document. Is that possible?

Appreciate your help!

Thanks,
Ramesh
Re: copying data from one collection to another collection (SolrCloud 5.2.1)
On 7/13/2015 1:49 PM, Raja Pothuganti wrote:

  We are setting up a new SolrCloud environment with 5.2.1 on Ubuntu boxes. We currently ingest data into a large collection, call it LIVE. After the full ingest is done we then trigger a delta ingestion every 15 minutes to get the documents and data that have changed into this LIVE instance.

  In Solr 4.X, using a Master / Slave setup, we had slaves that would periodically (weekly, or monthly) refresh their data from the Master rather than every 15 minutes. We're now trying to figure out how to get this same type of setup using SolrCloud.

  Question(s):
  - Is there a way to copy data from one SolrCloud collection into another quickly and easily?
  - Is there a way to programmatically control when a replica receives its data, or possibly move it to another collection (without losing data) that updates on a different interval? It ideally would be another collection name, call it Week1 ... Week52 ... to avoid a replica in the same collection serving old data.

  One option we thought of was to create a backup and then restore that into a new clean cloud. This has a lot of moving parts and isn't nearly as neat as the Master / Slave controlled replication setup. It also has the side effect of potentially taking a very long time to backup and restore instead of just copying the indexes like the old M/S setup.

SolrCloud works very differently than replication. When you send an indexing request, the documents are forwarded to the leader replica of the shard that will index them. The leader indexes the documents locally and sends a copy to all other replicas, each of which independently indexes those documents. There's no need to copy finished indexes (or even index segments) around -- each shard replica builds itself incrementally, in parallel with the others, as you index new documents. There is no polling interval -- replicas change at nearly the same time when you do an index update.

Rather than separate collections for each week, you might want to consider using the implicit router on a single collection and creating a new *shard* for each week. This would be done with the CREATESHARD action on the Collections API. The implicit router does create a new wrinkle for indexing -- you cannot index to the entire collection ... you must specifically index to one of the replicas of that specific shard. There might be some way to indicate on the update request which shard it should go to, but I haven't examined SolrCloud requests in that much detail.

As for copying indexes ... the newest versions of Solr include a backup/restore API, but if your indexes are very large, this will be quite slow.

TL;DR info: With enough digging, you will learn that SolrCloud *does* require a replication handler, which might be very confusing, since I've just told you that it's very different from replication. That handler is *only* used when a replica requires recovery. Recovery might be required because a replica has been down too long, has been newly created, or some similar situation. It is NOT used during normal SolrCloud operation.

Collections are made up of one or more shards. Shards have one or more replicas. Each replica is a core.

https://cwiki.apache.org/confluence/display/solr/How+SolrCloud+Works

There's a lot of info in a small space here. It will hopefully be enough for you to find more detail in the Solr documentation, the wiki, or possibly other locations.

Thanks,
Shawn
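A sketch of the implicit-router flow Shawn describes; all names are placeholders: create the collection with named shards, add a shard per week, and tell each update which shard it belongs to via the _route_ parameter (a router.field set at collection-creation time is the other documented option):

    http://localhost:8983/solr/admin/collections?action=CREATE&name=live&router.name=implicit&shards=week28,week29&replicationFactor=2&maxShardsPerNode=4
    http://localhost:8983/solr/admin/collections?action=CREATESHARD&collection=live&shard=week30

    curl 'http://localhost:8983/solr/live/update?commit=true&_route_=week30' \
      -H 'Content-Type: application/json' -d '[{"id":"doc1"}]'

Aging out a week is then a single DELETESHARD call rather than a delete-by-query.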
copying data from one collection to another collection (SolrCloud 5.2.1)
Hi,

We are setting up a new SolrCloud environment with 5.2.1 on Ubuntu boxes. We currently ingest data into a large collection, call it LIVE. After the full ingest is done we then trigger a delta ingestion every 15 minutes to get the documents and data that have changed into this LIVE instance.

In Solr 4.X, using a Master / Slave setup, we had slaves that would periodically (weekly, or monthly) refresh their data from the Master rather than every 15 minutes. We're now trying to figure out how to get this same type of setup using SolrCloud.

Question(s):
- Is there a way to copy data from one SolrCloud collection into another quickly and easily?
- Is there a way to programmatically control when a replica receives its data, or possibly move it to another collection (without losing data) that updates on a different interval? It ideally would be another collection name, call it Week1 ... Week52 ... to avoid a replica in the same collection serving old data.

One option we thought of was to create a backup and then restore that into a new clean cloud. This has a lot of moving parts and isn't nearly as neat as the Master / Slave controlled replication setup. It also has the side effect of potentially taking a very long time to backup and restore instead of just copying the indexes like the old M/S setup.

Any ideas or thoughts? Thanks in advance for your help.

Raja
Re: Planning Solr migration to production: clean and autoSoftCommit
Hi Erick,

I think this is a good solution. It is going to work, although I have not yet implemented it with the HTTP API, which I was able to find in https://wiki.apache.org/solr/SolrReplication.

On my local machine, a total of 800 MB of index files was downloaded to another folder within a minute. However, transferring the index files across the network could take longer. I will test it in a two-machine scenario.

Thanks
Re: XML File Size for Post.jar
If you have hundreds of files, the post command (SimplePostTool) can also push a directory of files up to Solr. (It is called Simple under the hood, but it is far from simple!)

Upayavira

On Mon, Jul 13, 2015, at 09:28 PM, Alexandre Rafalovitch wrote:

  Solr ships with an XML processing example for DIH in the examples directory (the RSS core). In your case, you will most probably read the file list or directory list and then run an XML processor as a nested entity. So, check the nested example at https://cwiki.apache.org/confluence/display/solr/Uploading+Structured+Data+Store+Data+with+the+Data+Import+Handler

  Regards,
    Alex.

  Solr Analyzers, Tokenizers, Filters, URPs and even a newsletter: http://www.solr-start.com/

  On 13 July 2015 at 15:12, EXTERNAL Taminidi Ravi (ETI, AA-AS/PAS-PTS) external.ravi.tamin...@us.bosch.com wrote:

    I can break it into smaller files, but in that case the number of files grows into the hundreds. Can I parse XML files with DIH? Can you point me to a few examples?

    Thanks
    Ravi

    -----Original Message-----
    From: Alexandre Rafalovitch [mailto:arafa...@gmail.com]
    Sent: Monday, July 13, 2015 3:01 PM
    To: solr-user
    Subject: Re: XML File Size for Post.jar

    I don't think you can do files that big. The memory use would blow up. Are you sure you cannot chunk it into smaller document sets? Or parse it as a stream with DIH, in a pull fashion?

    Regards,
      Alex.

    Solr Analyzers, Tokenizers, Filters, URPs and even a newsletter: http://www.solr-start.com/

    On 13 July 2015 at 14:56, EXTERNAL Taminidi Ravi (ETI, AA-AS/PAS-PTS) external.ravi.tamin...@us.bosch.com wrote:

      Hi, what do I have to change to support indexing an XML file larger than 2 GB in Solr, using the simple post tool (post.jar), with Jetty and Tomcat?

      Thanks
      Ravi
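For example; the core name and path are placeholders, and in 5.x the bin/post wrapper is the easiest entry point:

    bin/post -c mycore /path/to/xml/dir
    # or, with the raw tool:
    java -Dc=mycore -Dauto=yes -Drecursive=yes -jar post.jar /path/to/xml/dir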
Re: copying data from one collection to another collection (SolrCloud 5.2.1)
Actually, my question is why do it this way at all? Why not index directly to your live nodes? This is what SolrCloud is built for.

There's the new backup/restore functionality that's still a work in progress, see: https://issues.apache.org/jira/browse/SOLR-5750

You can use implicit routing to create shards, say, for each week, and age out the ones that are too old as well.

Another option would be to use collection aliasing to keep an offline index up to date then switch over when necessary.

I'd really like to know this isn't an XY problem though; what's the high-level problem you're trying to solve?

Best,
Erick

On Mon, Jul 13, 2015 at 12:49 PM, Raja Pothuganti rpothuga...@competitrack.com wrote:

  Hi,
  We are setting up a new SolrCloud environment with 5.2.1 on Ubuntu boxes. We currently ingest data into a large collection, call it LIVE. After the full ingest is done we then trigger a delta ingestion every 15 minutes to get the documents and data that have changed into this LIVE instance.

  In Solr 4.X, using a Master / Slave setup, we had slaves that would periodically (weekly, or monthly) refresh their data from the Master rather than every 15 minutes. We're now trying to figure out how to get this same type of setup using SolrCloud.

  Question(s):
  - Is there a way to copy data from one SolrCloud collection into another quickly and easily?
  - Is there a way to programmatically control when a replica receives its data, or possibly move it to another collection (without losing data) that updates on a different interval? It ideally would be another collection name, call it Week1 ... Week52 ... to avoid a replica in the same collection serving old data.

  One option we thought of was to create a backup and then restore that into a new clean cloud. This has a lot of moving parts and isn't nearly as neat as the Master / Slave controlled replication setup. It also has the side effect of potentially taking a very long time to backup and restore instead of just copying the indexes like the old M/S setup.

  Any ideas or thoughts? Thanks in advance for your help.

  Raja
Re: XML File Size for Post.jar
Solr ships with an XML processing example for DIH in the examples directory (the RSS core). In your case, you will most probably read the file list or directory list and then run an XML processor as a nested entity. So, check the nested example at https://cwiki.apache.org/confluence/display/solr/Uploading+Structured+Data+Store+Data+with+the+Data+Import+Handler

Regards,
  Alex.

Solr Analyzers, Tokenizers, Filters, URPs and even a newsletter: http://www.solr-start.com/

On 13 July 2015 at 15:12, EXTERNAL Taminidi Ravi (ETI, AA-AS/PAS-PTS) external.ravi.tamin...@us.bosch.com wrote:

  I can break it into smaller files, but in that case the number of files grows into the hundreds. Can I parse XML files with DIH? Can you point me to a few examples?

  Thanks
  Ravi

  -----Original Message-----
  From: Alexandre Rafalovitch [mailto:arafa...@gmail.com]
  Sent: Monday, July 13, 2015 3:01 PM
  To: solr-user
  Subject: Re: XML File Size for Post.jar

  I don't think you can do files that big. The memory use would blow up. Are you sure you cannot chunk it into smaller document sets? Or parse it as a stream with DIH, in a pull fashion?

  Regards,
    Alex.

  Solr Analyzers, Tokenizers, Filters, URPs and even a newsletter: http://www.solr-start.com/

  On 13 July 2015 at 14:56, EXTERNAL Taminidi Ravi (ETI, AA-AS/PAS-PTS) external.ravi.tamin...@us.bosch.com wrote:

    Hi, what do I have to change to support indexing an XML file larger than 2 GB in Solr, using the simple post tool (post.jar), with Jetty and Tomcat?

    Thanks
    Ravi
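A hedged sketch of that nested setup; the paths, entity names, and xpaths are placeholders for your actual document shape:

    <dataConfig>
      <dataSource type="FileDataSource" encoding="UTF-8"/>
      <document>
        <entity name="files" processor="FileListEntityProcessor"
                baseDir="/path/to/xml" fileName=".*\.xml"
                recursive="true" rootEntity="false">
          <entity name="docs" processor="XPathEntityProcessor"
                  url="${files.fileAbsolutePath}" forEach="/add/doc" stream="true">
            <field column="id" xpath="/add/doc/field[@name='id']"/>
          </entity>
        </entity>
      </document>
    </dataConfig>

stream="true" keeps XPathEntityProcessor from loading each whole file into memory, which is what matters for the 2 GB case.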
Re: Range Facet queries for date ranges with non-constant gaps
Are there any examples/documentation for Interval Faceting using dates that I could refer to?

On Mon, Jul 13, 2015 at 6:36 PM, Chris Hostetter hossman_luc...@fucit.org wrote:

  : Some of the buckets return with a count of '0' in the bucket even though
  : the facet.range.min is set to '1'.

  That is not the primary issue: facet.range.min has never been a supported (or documented) param -- you are most likely trying to use facet.mincount (which can be specified per field as a top-level f.my_field_name.facet.mincount, or as a localparam, ex: facet.range={!facet.mincount=1}my_field_name).

  : though. What I would like to get back are buckets of unevenly spaced
  : gaps. For example, counts for the last 7 days, last 30 days, last 90
  : days.

  What you are describing is exactly what the Interval Faceting feature provides...

  https://cwiki.apache.org/confluence/display/solr/Faceting#Faceting-IntervalFaceting

  -Hoss
  http://www.lucidworks.com/
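A hedged example of date intervals; the field name is a placeholder, and the field should have docValues enabled for interval faceting:

    q=*:*&rows=0&facet=true&facet.interval=event_date
    &f.event_date.facet.interval.set=[NOW-7DAYS,NOW]
    &f.event_date.facet.interval.set=[NOW-30DAYS,NOW]
    &f.event_date.facet.interval.set=[NOW-90DAYS,NOW]

Each interval.set produces one bucket, so the three "last N days" windows come back as three counts in a single request.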
Why do I get a hit on %, &, but not on !, @, #, $, ^, *
Hi Everyone, I think the subject line said it all. Here is the schema I'm using:

<fieldType name="my_text" class="solr.TextField" positionIncrementGap="100" autoGeneratePhraseQueries="true">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_en.txt"/>
    <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="1" catenateNumbers="1" catenateAll="1" splitOnCaseChange="0" splitOnNumerics="1" stemEnglishPossessive="1" preserveOriginal="1"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.KeywordMarkerFilterFactory" protected="protwords.txt"/>
    <filter class="solr.PorterStemFilterFactory"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
</fieldType>

I'm guessing this is due to how solr.WhitespaceTokenizerFactory works, and that the characters it is not indexing are removed because they are considered whitespace? If so, how can I include %, &, etc. in this non-indexed list? I would rather see all of these not indexed than have some indexed and some not, causing confusion for my users. Thanks Steve
Re: copying data from one collection to another collection (SolrCloud 5.2.1)
Thank you, Erick. Actually, my question is why do it this way at all? Why not index directly to your live nodes? This is what SolrCloud is built for. You can use implicit routing to create shards, say, one for each week, and age out the ones that are too old as well. Any updates to an EXISTING document in the LIVE collection should NOT be replicated to the previous week(s) snapshot(s). Think of the snapshot(s) as an archive of sorts, searchable independently of LIVE. We're aiming to support at most 2 archives of data in the past. Another option would be to use collection aliasing to keep an offline index up to date and then switch over when necessary. Does offline indexing refer to this link: https://github.com/cloudera/search/tree/0d47ff79d6ccc0129ffadcb50f9fe0b271f102aa/search-mr Thanks Raja On 7/13/15, 3:14 PM, Erick Erickson erickerick...@gmail.com wrote: Actually, my question is why do it this way at all? Why not index directly to your live nodes? This is what SolrCloud is built for. There's the new backup/restore functionality that's still a work in progress, see: https://issues.apache.org/jira/browse/SOLR-5750 You can use implicit routing to create shards, say, one for each week, and age out the ones that are too old as well. Another option would be to use collection aliasing to keep an offline index up to date and then switch over when necessary. I'd really like to know this isn't an XY problem though; what's the high-level problem you're trying to solve? Best, Erick On Mon, Jul 13, 2015 at 12:49 PM, Raja Pothuganti rpothuga...@competitrack.com wrote: Hi, We are setting up a new SolrCloud environment with 5.2.1 on Ubuntu boxes. We currently ingest data into a large collection, call it LIVE. After the full ingest is done, we then trigger a delta ingestion every 15 minutes to get the documents that have changed into this LIVE instance. In Solr 4.x, using a Master/Slave setup, we had slaves that would periodically (weekly, or monthly) refresh their data from the Master rather than every 15 minutes. We're now trying to figure out how to get this same type of setup using SolrCloud. Question(s): - Is there a way to copy data from one SolrCloud collection into another quickly and easily? - Is there a way to programmatically control when a replica receives its data, or possibly move it to another collection (without losing data) that updates on a different interval? It would ideally be another collection name, call it Week1 ... Week52 ..., to avoid a replica in the same collection serving old data. One option we thought of was to create a backup and then restore that into a new, clean cloud. This has a lot of moving parts and isn't nearly as neat as the Master/Slave controlled replication setup. It also has the side effect of potentially taking a very long time to backup and restore instead of just copying the indexes like the old M/S setup. Any ideas or thoughts? Thanks in advance for your help. Raja
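A sketch of the alias switch-over option (alias and collection names are hypothetical): clients always query the alias, and the alias is re-pointed once a freshly built collection is ready.

http://localhost:8983/solr/admin/collections?action=CREATEALIAS&name=live&collections=build_week28

Running CREATEALIAS again with the same name re-points it -- e.g. to build_week29 the following week -- after which the retired collection can either be kept around as a searchable archive, matching the snapshot requirement above, or removed with action=DELETE.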
Re: Range Facet queries for date ranges with non-constant gaps
: Some of the buckets return with a count of '0' in the bucket even though : the facet.range.min is set to '1'. That is not the primary issue facet.range.min has never been a supported (or documented) param -- you are most likely trying to use facet.mincount (which can be specified per field as a top-level f.my_field_name.facet.mincount, or as a localparam, ex: facet.range={!facet.mincount=1}my_field_name). : though. What I would like to get back are buckets of unevenly spaced : gaps. For example, counts for the last 7 days, last 30 days, last 90 : days. what you are describing is exactly what the Interval Faceting feature provides... https://cwiki.apache.org/confluence/display/solr/Faceting#Faceting-IntervalFaceting -Hoss http://www.lucidworks.com/
Range Facet queries for date ranges with non-constant gaps
I am trying to do a range facet query on date ranges. The query below executes and returns results (almost) as desired for 60DAY buckets. http://localhost:8983/solr/mykeyspace2.user_data/select?wt=json&fq:id=7465033&q=*:*&rows=0&indent=true&facet=on&facet.range=login_event&facet.range.gap=%2B60DAY&facet.range.start=NOW/YEAR&facet.range.end=NOW/MONTH%2B1MONTH&facet.range.min=1 Some of the buckets return with a count of '0' in the bucket even though the facet.range.min is set to '1'. That is not the primary issue though. What I would like to get back are buckets of unevenly spaced gaps. For example, counts for the last 7 days, last 30 days, last 90 days. What would be the best way to accomplish this? And is there something wrong with the facet.range.min usage?
Re: Querying Nested documents
Hi Rameshn, I would suggest you rewrite your mail. It is really hard to understand! Try to format your document and nested documents in a nice way (remember, a document is a map of field -> value); let's try not to overcomplicate things! Furthermore, try to express the query unencoded as well. It will let us help you much more efficiently, without losing 10 minutes decoding the mail :) Cheers 2015-07-13 17:03 GMT+01:00 rameshn ramesh.nuthalap...@gmail.com: Hi, I have question regarding nested documents.My document looks like below, 1234xger00parent 2015-06-15T13:29:07ZegeDuperhttp://www.domain.com zoome1234-images http://somedomain.com/some.jpg1:1 1234-platform-iosios https://somedomain.comsomelinkfalse 2015-03-23T10:58:00Z-12-30T19:00:00Z 1234-platform-androidandroid somedomain.comsomelinkfalse 2015-03-23T10:58:00Z-12-30T19:00:00Z Right now I can query like thishttp://localhost:8983/solr/demo/select?q={!parent%20which=%27type:parent%27}fl=*,[child%20parentFilter=type:parent%20childFilter=image_uri_s:*]indent=trueand get the parent and child document with matching criteria (just parent and image child document).*But, I want to get all other children* (1234-platform-ios and 1234-platform-andriod) even if i query based on image_uri_s (1234-images) although they are other children which are part of the parent document.Is it possible ?Appreciate your help !Thanks,Ramesh -- View this message in context: http://lucene.472066.n3.nabble.com/Querying-Nested-documents-tp4217088.html Sent from the Solr - User mailing list archive at Nabble.com. -- -- Benedetti Alessandro Visiting card - http://about.me/alessandro_benedetti Blog - http://alexbenedetti.blogspot.co.uk Tyger, tyger burning bright In the forests of the night, What immortal hand or eye Could frame thy fearful symmetry? William Blake - Songs of Experience -1794 England
Persistence problem with swapped cores after Solr restart -- 4.9.1
On Solr 4.9.1 with core discovery, I seem to be having trouble with core swaps not persisting through a full Solr restart. I apologize for the fact that this message is lean on details ... I've seen the problem twice now, but I don't have any concrete before/after information about what's in each core.properties file. I am attempting to set up the scenario again and gather that information. The entire directory structure is set up as a git repo, so I will be able to tell if any files (like core.properties) are modified for the rebuild/swap that I have started. The repo shows no changes at the moment, but I have done several of these rebuild/swap operations, so even if core.properties is being correctly updated, it might just have landed back on the original configuration. I have another copy of my index using Solr 4.7.2 with the old solr.xml format that seems to have no problems with core swapping and persistence. That works differently, though -- all cores are defined in solr.xml rather than with core.properties files. When I first set up these Solr instances, I don't recall having this problem, but full Solr restarts are really rare, so it's possible I just didn't create the right circumstances. Thanks, Shawn
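For anyone trying to reproduce this, a sketch of the operation involved (core names here are hypothetical):

http://localhost:8983/solr/admin/cores?action=SWAP&core=live&other=build

With core discovery, a swap that persists correctly should be visible on disk: the name property in the two cores' core.properties files should be exchanged (the directory that previously held name=build ends up holding name=live, and vice versa). If neither file changes after a SWAP, the swap cannot survive a restart.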
RE: Range Facet queries for date ranges with non-constant gaps
Try facet.mincount=1. It will still apply to range facets. -Original Message- From: JoeSmith [mailto:fidw...@gmail.com] Sent: Monday, July 13, 2015 5:56 PM To: solr-user Subject: Range Facet queries for date ranges with non-constant gaps I am trying to do a range facet query on date ranges. The query below executes and returns results (almost) as desired for 60DAY buckets. http://localhost:8983/solr/mykeyspace2.user_data/select?wt=json&fq:id=7465033&q=*:*&rows=0&indent=true&facet=on&facet.range=login_event&facet.range.gap=%2B60DAY&facet.range.start=NOW/YEAR&facet.range.end=NOW/MONTH%2B1MONTH&facet.range.min=1 Some of the buckets return with a count of '0' in the bucket even though the facet.range.min is set to '1'. That is not the primary issue though. What I would like to get back are buckets of unevenly spaced gaps. For example, counts for the last 7 days, last 30 days, last 90 days. What would be the best way to accomplish this? And is there something wrong with the facet.range.min usage?
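As a full request, the mincount hint might look like this (same query as in the quoted message, with the unsupported facet.range.min replaced by a per-field facet.mincount override):

http://localhost:8983/solr/mykeyspace2.user_data/select?wt=json&q=*:*&rows=0&facet=on&facet.range=login_event&facet.range.gap=%2B60DAY&facet.range.start=NOW/YEAR&facet.range.end=NOW/MONTH%2B1MONTH&f.login_event.facet.mincount=1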
Re: Querying Nested documents
Hi rameshn, Nabble has a nasty habit of stripping out HTML and XML markup before sending your mail out to the mailing list - see your message quoted below for how it appears to people who aren’t reading via Nabble. My suggestion: directly subscribe to the solr-user mailing list[1] and avoid Nabble. (They’ve known about the problem for many years and AFAICT have done nothing about it.) Steve [1] https://lucene.apache.org/solr/resources.html#mailing-lists On Jul 13, 2015, at 12:03 PM, rameshn ramesh.nuthalap...@gmail.com wrote: Hi, I have question regarding nested documents.My document looks like below, 1234xger00parent 2015-06-15T13:29:07ZegeDuperhttp://www.domain.com zoome1234-images http://somedomain.com/some.jpg1:1 1234-platform-iosios https://somedomain.comsomelinkfalse 2015-03-23T10:58:00Z-12-30T19:00:00Z 1234-platform-androidandroid somedomain.comsomelinkfalse 2015-03-23T10:58:00Z-12-30T19:00:00Z Right now I can query like thishttp://localhost:8983/solr/demo/select?q={!parent%20which=%27type:parent%27}fl=*,[child%20parentFilter=type:parent%20childFilter=image_uri_s:*]indent=trueand get the parent and child document with matching criteria (just parent and image child document).*But, I want to get all other children* (1234-platform-ios and 1234-platform-andriod) even if i query based on image_uri_s (1234-images) although they are other children which are part of the parent document.Is it possible ?Appreciate your help !Thanks,Ramesh -- View this message in context: http://lucene.472066.n3.nabble.com/Querying-Nested-documents-tp4217088.html Sent from the Solr - User mailing list archive at Nabble.com.
Querying Nested documents
(Duplicate post as the XML is not formatted well in Nabble, so posting directly to the list) Hi, I have a question regarding nested documents. My document looks like below:

<doc>
  <field name="id">1234</field>
  <field name="pk_id">xger</field>
  <field name="title_t"><![CDATA[title]]></field>
  <field name="description_t"><![CDATA[this is a test]]></field>
  <field name="specCount_i">0</field>
  <field name="viewCount_i">0</field>
  <field name="lastModifiedDate_dt">2015-06-15T13:29:07Z</field>
  <field name="vert_id_s">ege</field>
  <field name="vert_name_s">Duper</field>
  <field name="vert_url_s">http://www.domain.com</field>
  <field name="sere_s">zoome</field>
  <field name="type">parent</field>
  <doc>
    <field name="id">1234-images</field>
    <field name="image_uri_s">http://somedomain.com/some.jpg</field>
    <field name="image_flatten_s">1:1</field>
  </doc>
  <doc>
    <field name="id">1234-platform-ios</field>
    <field name="platform_s">ios</field>
    <field name="downloadU_s">https://somedomain.com</field>
    <field name="link_s">somelink</field>
    <field name="authRequired_s">false</field>
    <field name="startDate_s">2015-03-23T10:58:00Z</field>
    <field name="endDate_s">-12-30T19:00:00Z</field>
  </doc>
  <doc>
    <field name="id">1234-platform-android</field>
    <field name="platform_s">android</field>
    <field name="downloadU_s">somedomain.com</field>
    <field name="link_s">somelink</field>
    <field name="authRequired_s">false</field>
    <field name="startDate_s">2015-03-23T10:58:00Z</field>
    <field name="endDate_s">-12-30T19:00:00Z</field>
  </doc>
</doc>

Right now I can query like this: http://localhost:8983/solr/demo/select?q={!parent%20which=%27type:parent%27}&fl=*,[child%20parentFilter=type:parent%20childFilter=image_uri_s:*]&indent=true and get the parent and child document with matching criteria (just the parent and the image child document). *But I want to get all other children* (1234-platform-ios and 1234-platform-android) even if I query based on image_uri_s (1234-images), although they are other children which are part of the parent document. Is it possible? Appreciate your help! Thanks, Ramesh
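An untested sketch of one way to get what is asked for here: move the match on image_uri_s into the block-join query itself and drop the childFilter, since the [child] transformer returns all children of each matched parent when no childFilter is given (the limit parameter is shown because the transformer defaults to 10 children):

http://localhost:8983/solr/demo/select?q={!parent which='type:parent'}image_uri_s:*&fl=*,[child parentFilter=type:parent limit=100]&indent=true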
Re: Querying Nested documents
My sincere apologies. Re-submitted directly to the list. Thank you. -- View this message in context: http://lucene.472066.n3.nabble.com/Querying-Nested-documents-tp4217088p4217166.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: FieldCache error for multivalued fields in json facets.
On Mon, Jul 13, 2015, at 06:55 AM, Iana Bondarska wrote: Hi, I'm using the JSON query API for Solr 5.2. When I query for metrics on multivalued fields, I get the error: "can not use FieldCache on multivalued field: sales". I've found in the Solr wiki that to avoid using the FieldCache I should set the facet.method parameter to enum. Now my question is, how can I add the facet.method=enum parameter to the query? My original query looks like this:

{
  "limit": 0,
  "offset": 0,
  "facet": {
    "facet": {
      "facet": {
        "mechanicnumbers_sum": "sum(sales)"
      },
      "limit": 0,
      "field": "brand",
      "type": "terms"
    }
  }
}

Adding method:enum inside the facet doesn't help. Adding facet.method=enum outside the json parameter also doesn't help. Can you provide the whole exception, including stack trace? This looks like a bug to me, as it should switch to using the FieldValueCache for multivalued fields rather than fail to use the FieldCache. Upayavira
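For comparison, a sketch of the same request expressed via the json.facet parameter (the collection name is a placeholder; the field and function names are taken from the query above). Note, hedged as a general observation: the aggregation functions in the early JSON Facet API expected a single-valued numeric field, so re-indexing sales as single-valued, or pursuing this as a bug as suggested above, is more likely to help than facet.method=enum, which controls how terms are enumerated rather than how functions are computed:

curl http://localhost:8983/solr/mycollection/query -d 'q=*:*&rows=0&json.facet={
  brands: {
    type: terms,
    field: brand,
    limit: -1,
    facet: {
      mechanicnumbers_sum: "sum(sales)"
    }
  }
}'

(limit:-1 asks for all buckets; note that the original query's limit:0 on the terms facet would return no buckets at all.)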